In the present business environment, corporations are demanding more real-time information to make sound and time-critical business decisions. Business Intelligence (BI) has been used to support and improve such business enterprise decision making. BI tools are commonly applied to business data, such as sales revenue, costs, income, or other financial data. These tools are designed to spot, retrieve and analyze business data and to provide various historical, current and predictive future views of business operations. Common functions of BI tools include reporting, data exploration, data mining, data cleansing, information management, and business performance management. Many BI tools create, maintain, or consume files such as documents, reports, dashboards, and the like.
With the enormous amount of data and information available in data centers today, it is important for customers not only to be able to search the data in a matter of a few milliseconds, but also to be able to navigate through the information in a meaningful and organized manner. The data deluge that every enterprise experiences today calls for more efficient and accessible ways to find the needle in the haystack. As the overall number of documents stored in BI repositories grows, so does the average number of documents returned by each search.
FIG. 1 illustrates how the number of matching search results may mirror the exponential growth in overall content. Organizations may experience, or forecast, exponential data growth. As data grows, the number of matching documents returned by the search engine grows with it. For example, if a broad search matches 10% of the documents, then it would match 100 documents out of a total of 1,000 documents. If the same search is run later when there are 2,000 total documents, it will return on the order of 200 documents. Even though this may not be the case for every search (e.g., the overall makeup of the documents may change over time, or specific terms may come and go), the number of matching documents generally tends to mirror the overall data growth. Better relevancy or “document ranking algorithms” will certainly buy some time, but as the number of matching documents keeps increasing, even the best algorithms will eventually lose their effectiveness.
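The proportional arithmetic in the example above can be sketched as follows. This is only an illustration that assumes a constant match rate; the function name `expected_matches` and the repository sizes beyond 2,000 documents are hypothetical and do not come from FIG. 1.

```python
def expected_matches(total_documents: int, match_rate: float) -> int:
    """Estimate the number of search hits, assuming a constant match rate."""
    return round(total_documents * match_rate)


# 1,000 and 2,000 are the sizes used in the example above; the further
# doublings are assumed here purely to show the trend.
for total in (1_000, 2_000, 4_000, 8_000):
    print(f"{total} documents -> about {expected_matches(total, 0.10)} matches")
```

Under this assumption the result count doubles each time the repository doubles, so ranking alone leaves the user with an ever longer list to inspect.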
The generation of facets (or categories) along with the search results has added a new paradigm to the way users find information. Each facet associates a value with each item in the search results, which enables the grouping of the search results into different categories. There are two types of facets: metadata and content. Metadata facets are generated from metadata fields, such as type of document, document creation time, location of document, author, etc. Content facets are generated from the content inside BI artifacts. Although facets provide a useful tool for users to filter search results, there are many limitations with conventional facet generation techniques. For example, a large number of facets have to be generated for matching BI artifacts that have no particular structure. This can be overwhelming for the end user, and makes it difficult to navigate to the exact BI artifacts one is looking for. In addition, current implementations of facet hierarchies in search engines (e.g., Amazon, eBay, etc.) have a static structure and a fixed number of levels, providing no way of dynamically generating content facets based on the BI artifacts matched for a given search query.
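A minimal sketch of conventional metadata faceting, as described above, is given below. It assumes search results are available as simple records with metadata fields; the sample documents, the field names `type` and `author`, and the helper names `metadata_facets` and `filter_by_facet` are hypothetical and are not taken from any particular search engine.

```python
from collections import Counter, defaultdict

# Hypothetical search results; titles, field names and values are illustrative.
results = [
    {"title": "Q1 Sales", "type": "report", "author": "Lee"},
    {"title": "Q2 Sales", "type": "report", "author": "Kim"},
    {"title": "Revenue Overview", "type": "dashboard", "author": "Lee"},
]


def metadata_facets(docs, fields):
    """Count, for each metadata field, how many results carry each value."""
    facets = defaultdict(Counter)
    for doc in docs:
        for field in fields:
            value = doc.get(field)
            if value is not None:
                facets[field][value] += 1
    return {field: dict(counts) for field, counts in facets.items()}


def filter_by_facet(docs, field, value):
    """Keep only the results whose metadata field has the selected value."""
    return [doc for doc in docs if doc.get(field) == value]


facets = metadata_facets(results, ["type", "author"])
reports = filter_by_facet(results, "type", "report")
print(facets)
print([doc["title"] for doc in reports])
```

Each facet here is a flat list of values and counts. Nothing in this sketch relates one facet to another or derives facets from document content, which is the kind of limitation the paragraph above describes.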
Even further, there is no proper grouping of similar facets, even though there are many interrelated facets in databases. For example, a hierarchy found in the underlying data is not leveraged to dynamically link facets in other BI documents. Similar or incomplete hierarchies found in different BI artifacts are not grouped contextually to provide more meaning. Further, there is no provision for end users to manage the structure or name of the facets, especially with customer-specific or company-specific terminology for the data. For example, “Location” and “Geography” may mean the same thing for a specific customer. However, if one wants to group these facets, conventional technology does not allow for it.
Accordingly, there is a need for an improved indexing and searching technology that allows for more meaningful grouping of facets and provides context to search results.