Modern online information providers typically require the ability to search vast quantities of data. For example, the American legal system, as well as some other legal systems around the world, relies heavily on written judicial opinions, the written pronouncements of judges, to articulate or interpret the laws governing resolution of disputes. As a consequence, judges and lawyers within our legal system are continually researching an ever-expanding body of past opinions, or case law, for the ones most relevant to resolution or prevention of new disputes. Found cases are studied for relevance and are ultimately cited and discussed in documents, called work product, which, for example, advocate court action, counsel clients on likely court actions, or educate clients and lawyers on the state of the law in particular jurisdictions.
Additionally, knowledge management systems, document management systems, and other online data providers typically require information from data sets that vary widely in size. Data sets in the terabyte range are no longer uncommon. For example, some systems may utilize public records comprising approximately 1.2 terabytes (TB) of unique data, and tax and accounting (TA) data comprising approximately 20 gigabytes (GB) of unique data. Previous systems have had problems at both extremes: the system can typically store only five percent of the unique public record data, yet the same system is too big for the unique TA data, which typically shares server space with other data providers.
Such variances in data set and system sizes have an impact on search-engine performance, especially in enterprise-server implementations (including their inherent availability issues). For example, if a memory fault occurs within a system's CPU, the system typically cannot run the search service until the fault is resolved, and failover mechanisms are problematic. Because the search service is typically memory-intensive and not bound to the CPU, resources are wasted resolving these fault issues.
Furthermore, query processing at times forces the search engine to access a disk for data pages that are not available in the file system cache. If a data set is small enough to be held completely in RAM, its data can typically be found in the file system cache; often, however, data sets are so large that query processing occurs at the disk level rather than the file-system-cache level. Further, current architectures typically do not ensure that the same search engine will consistently process the same data, which negates search-engine caching advantages.
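The caching concern can be illustrated with a small simulation. The following is a minimal sketch only, not a description of any actual system: the `Engine` class, the LRU page cache, the two routing functions, and all sizes are hypothetical and chosen purely for illustration. It compares routing that always sends a given data page to the same engine against a deliberately worst-case routing that moves each page to a different engine on every pass.

```python
from collections import OrderedDict


class Engine:
    """Hypothetical search engine holding a small LRU cache of data pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page):
        if page in self.cache:
            self.cache.move_to_end(page)  # mark page as most recently used
            self.hits += 1
        else:
            self.misses += 1  # stands in for a disk access
            self.cache[page] = True
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used


def consistent(page, round_no, num_engines):
    """Assumed policy: a page is always handled by the same engine."""
    return page % num_engines


def rotating(page, round_no, num_engines):
    """Assumed worst case: a page lands on a different engine each round."""
    return (page + round_no) % num_engines


def hit_rate(route, num_engines=4, capacity=4, num_pages=16, rounds=10):
    """Replay the same page sequence `rounds` times; return the cache hit rate."""
    engines = [Engine(capacity) for _ in range(num_engines)]
    for round_no in range(rounds):
        for page in range(num_pages):
            engines[route(page, round_no, num_engines)].read(page)
    total = rounds * num_pages
    return sum(e.hits for e in engines) / total


if __name__ == "__main__":
    print("consistent routing hit rate:", hit_rate(consistent))
    print("rotating routing hit rate:  ", hit_rate(rotating))
```

Under these assumptions, consistent routing misses only on the first pass over the data, whereas the rotating policy never finds a page in cache, so every read falls through to disk. Real workloads fall between these extremes, but the sketch shows why cache benefits depend on the same engine seeing the same data.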
Accordingly, the present inventor has identified a need for better systems, tools, and methods of providing search functions within online delivery platforms.