The computing industry has made it possible to access virtually any document or text with a computing device through the Internet or other computing network.
Although the computing processes required to store and retrieve electronic documents are well known, the sheer volume of documents and data stored in some databases can still make it difficult to properly index and find the desired content in a timely fashion. This is particularly true when considering that many databases contain documents with similar or identical content, thereby increasing the difficulty and processing required to distinguish between the various documents.
To facilitate the retrieval of electronic content from a database, several types of search engines and indexes have been developed. Initially, traditional databases were developed with fields of information that could only be searched through a rigid alphabetical ordering of associated fields, such as a last name field. Later, full text indexing was developed to provide greater searching flexibility. With a full text index, all words are indexed and can therefore be searched for within a document or record. With full text indexing, it is also possible to search for a plurality of search terms within the indexed documents, simultaneously or sequentially, which can improve the accuracy of a search.
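By way of illustration only, the full text indexing described above can be sketched as a simple inverted index that maps each word to the set of documents containing it. The function names `build_index` and `search_all`, the tokenization rule, and the sample documents are assumptions made for this sketch and are not drawn from any particular search engine.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map every word to the set of document ids in which it appears.

    `docs` is a hypothetical dict of {doc_id: text}.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumed tokenization: lowercase runs of letters and digits.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search_all(index, terms):
    """Return the ids of documents that contain every search term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return set()
    # Intersecting the per-term sets narrows the result to documents
    # that match all of the terms.
    return set.intersection(*postings)


# Illustrative data only.
docs = {
    1: "Smith filed the report",
    2: "The report was late",
    3: "Smith was late",
}
index = build_index(docs)
matches = search_all(index, ["report", "late"])
```

In this sketch, a search on a plurality of terms narrows the result to the documents containing all of them, which is one way additional terms can improve the accuracy of a search.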
While full text indexing and modified field indexing have provided significant improvements over traditional indexes that only permitted rigid alphabetical searching of fields, there is still a significant need for improvement in searching technology. This is true, not only for Internet and traditional databases, but for any computing process or application in which data is retrieved from a repository of any type.
For example, bottlenecks that slow down searching processes can be created by the limitations of computer hardware and network connections. In particular, computer processors are limited in the number of calculations per time unit (e.g., calculations per second) that can be performed. Networks are limited in the amount of data per time unit that can be transmitted across the network. Storage devices are limited by the number of I/O operations that can be performed within a given time. Memory devices are likewise limited in the amount of information that can be stored at a given time.
Accordingly, in view of the processing limitations of computing devices, networks and storage, there is a significant need for new searching techniques that improve the overall speed and accuracy of a search and that reduce the computational expense of performing the search or other related computing process. Improving the speed and accuracy of searching processes would also result in a noticeably better consumer experience.
Existing searching paradigms continue to be constrained by the philosophical approaches after which they were modeled. For example, existing search paradigms are designed to perform searching on demand or on-the-fly, only after the search query has been received. While this approach is somewhat logical, because it is unknown what to search for before the query is received, it defers computationally expensive processing until after the query arrives, which produces a delay that is noticeable to the consumer.
Existing philosophical approaches to searching also require a significant amount of unnecessary processing to ensure that the search is comprehensive. In effect, the existing searching techniques require a very deliberate and sequential sweep of the data repositories that are being searched, looking in every ‘nook and cranny’ to help ensure the search is comprehensive. This blanket searching wastes considerable processing time and expense looking for the data in places where the data is unlikely to be found. Because the existing searching techniques are directed to identifying where the data is, they fail to appreciate the value of knowing where the data probably is not.
In view of the foregoing, it is clear that the industry is in need of new approaches to searching for data.