The present invention relates generally to systems and methods for searching data, and more particularly to systems and methods for enhancing search results and text searches over structured data in a web application.
Due to the popularity and accuracy of current web search technology, users have come to expect quick, up-to-date presentation of search results, with the most relevant results presented at or near the top of the search results page. Web applications inevitably come with a similar set of expectations. In many regards the comparison is faulty; for example, a web application's data set looks very different from most web pages, with the exception of attachments, documents, and notes. Customers, however, almost certainly do not realize or care about these differences, and expect the same functionality out of a web application-based search. Despite the differences, there is much in common, from the search input box available on every page to its primary use, which is to find a specific record. This is in fact a common usage pattern of web search known as a navigational search, and it is something web search engines are quite good at. One reason they excel in this area is that they use information other than the text on the web page itself to do the scoring (link text is one good example of this).
The ordering of search results in web applications may not always rank the most relevant results at or near the top of the results page. Search results are typically ranked based solely on a “relevancy” score given by the search engine. One example of a search engine useful in web applications is Lucene from The Apache Software Foundation. Lucene is a text search engine library written in Java, and is suitable for nearly any application that requires full-text search. With Lucene, for example, the score is calculated using a standard information retrieval algorithm based on many factors. While this score may be quite useful in the overall rankings, the search engine does not take into consideration many factors outside the scope of the index.
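The limitation described above can be illustrated with a minimal sketch. It assumes each hit carries the engine's relevancy score plus two hypothetical factors that are not part of the index (whether the searching user owns the record, and how recently the user viewed it). The names `Hit`, `adjusted_score`, and `rerank`, and the boost and half-life values, are illustrative assumptions only and are not taken from Lucene or from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    record_id: str
    engine_score: float       # relevancy score returned by the search engine
    owned_by_user: bool       # hypothetical factor, not stored in the index
    days_since_viewed: float  # hypothetical factor, not stored in the index


def rank_by_engine_score(hits):
    """Baseline: order results solely by the engine's relevancy score."""
    return sorted(hits, key=lambda h: h.engine_score, reverse=True)


def adjusted_score(hit, owner_boost=1.5, half_life_days=7.0):
    """Combine the engine score with factors from outside the index.

    The boost and half-life values are arbitrary illustrative choices.
    """
    score = hit.engine_score
    if hit.owned_by_user:
        score *= owner_boost
    # Recency weight decays from 1.0 toward 0.0 as the last view ages.
    recency = 0.5 ** (hit.days_since_viewed / half_life_days)
    return score * (1.0 + recency)


def rerank(hits):
    """Order results by the adjusted score instead of the raw engine score."""
    return sorted(hits, key=adjusted_score, reverse=True)
```

With such a scheme, a record the user owns and viewed today can outrank a record with a slightly higher engine score that the user has not opened in a month, which the baseline ordering cannot express.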
Although the score provided by the search engine is quite useful, it does have limitations. Because a search is the most common means of end-user navigation to a specific record in a multi-tenant database system, such as that provided by salesforce.com, it is desirable to provide more relevant results in response to a search request, and thereby increase end-user productivity and satisfaction with the search functionality. More relevant results would also reduce the load on the system, because users could find the record they want without having to go to the detail pages of multiple results.
Also, in systems where searching of structured data is implemented, such as in a multi-tenant database system, search indexing latency can often be a problem, especially where a user who recently added or modified data immediately searches for items using a term that should return the recently added or modified data entry. In the salesforce.com system, for example, search queries are run against a search index that is a replica of an organization's data. As organization data is added or changed, a background process (a search indexer) asynchronously updates the search index. Under peak system load, the volume of data change in the system may be so high that the search index update process can run behind, e.g., 2 to 5 minutes or more behind. As a result, there may be a lag time, e.g., 2 to 5 minutes or more, between the time data is entered or changed in the system and the time it may appear in search results. This is especially inconvenient for users when they make a change to the data (e.g., modify, add, delete), then immediately search for the data, and are unable to find it because of search indexing latency.
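The latency problem described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not a description of any actual system: the class `LaggingSearchIndex` and its methods are hypothetical, the index is modeled as a plain dictionary replica, and the background indexer is modeled as an explicit call that drains a queue of pending changes.

```python
from collections import deque


class LaggingSearchIndex:
    """Toy model of a search index that is updated asynchronously."""

    def __init__(self):
        self.database = {}      # record_id -> text; the authoritative data
        self.index = {}         # record_id -> text; replica used for search
        self.pending = deque()  # changes not yet applied to the index

    def save(self, record_id, text):
        """Write to the database and queue the change for later indexing."""
        self.database[record_id] = text
        self.pending.append(record_id)

    def run_indexer(self, max_changes=None):
        """Stand-in for the background indexer: apply queued changes."""
        processed = 0
        while self.pending and (max_changes is None or processed < max_changes):
            record_id = self.pending.popleft()
            self.index[record_id] = self.database[record_id]
            processed += 1
        return processed

    def search(self, term):
        """Search only the index replica, never the database itself."""
        term = term.lower()
        return sorted(
            rid for rid, text in self.index.items() if term in text.lower()
        )
```

In this model, a record saved with `save` is invisible to `search` until `run_indexer` has processed it. Under heavy load the pending queue grows, which corresponds to the lag between a data change and its appearance in search results.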
Therefore, it is desirable to provide search systems and methods that overcome the above and other problems. For example, it is desirable to provide search systems and methods that eliminate or reduce search indexing latency. It is also desirable to provide search systems and methods that enhance the relevancy of results returned.