Notwithstanding the significant advances made in the past decades, electronic document and database technology continues to suffer from a number of disadvantages preventing users from fully realizing the benefits that may flow from advances in computing and related technology.
Current Web search employs both query-independent and query-dependent processes. Query-independent processes such as Google™ PageRank™ focus search results on well-cited or otherwise significant portions of the Web. With such focus, query-dependent processes developed for text search perform reasonably well. However, Web content is far more highly configured than plain text documents are. Web pages typically contain complex content items that in turn contain other complex content items. By their nature, text search processes ignore a great deal of the useful information that this configuration carries. Similar observations apply to markup search more generally, to keyword search over databases, and to database search more generally, especially for databases that have been subject to data mining. Prior extensions of text search processes take note of the relatively simple hierarchical configurations of classic text documents, and of inter-document configurations within collections of text documents. However, these prior extensions are not equipped to fully and efficiently use available configurational information.
U.S. patent application (USPA) No. 2007-0288438 and USPA No. 2009-0254549 introduced, among other things, a new category of query-dependent search processes for configured content. These processes systematically apply configurational information and work in conjunction with text search processes, content valuation processes, database query processes, clustering processes, and other prior art technology. They enable more accurate and more focused search results, as well as more highly specified search expressions. They also support the application of automatically generated search expressions that indicate nested juxtapositions of sub-expressions, such as those generated by methods introduced in USPA No. 2013/0103662. The processes of USPA No. 2007-0288438 and USPA No. 2009-0254549 allow search matches within a given content item to influence the relevance scores of content items that neither contain nor are contained by the given content item. However, the sensitivity of these processes to the possibilities of mutual influence can potentially be further improved.
The use of configurational information necessarily requires computational resources, as does the evaluation of complex search expressions. The processes of USPA No. 2007-0288438 and USPA No. 2009-0254549 maintain efficiency in evaluating complex search expressions over complex content hierarchies. However, additional opportunities for optimization remain. So do opportunities for shifting the computational burden to systems that operate prior to search time, for distributing both search-time processing and pre-search-time processing among various system instances, and for maintaining consistency in the assignment of numerical relationships within and across content hierarchies.