In the prior art, it has been well known that computer systems can be used to manage indices to records of databases. Many techniques are known for parsing and indexing the records of databases, and for searching the resulting indices.
In recent years, a unique distributed database has emerged in the form of the World-Wide-Web (Web). The database records of the Web are in the form of pages accessible via the Internet. Here, tens of millions of pages are accessible by anyone having a communications link to the Internet.
The pages are dispersed over millions of different computer systems all over the world. Users of the Internet constantly desire to locate specific pages containing information of interest. The pages can be expressed in any number of different character sets such as English, French, German, Spanish, Cyrillic, Katakana, and Mandarin. In addition, the pages can include specialized components, such as embedded "forms," executable programs, Java applets, and hypertext.
Moreover, the pages can be constructed using various formatting conventions, for example, ASCII text, PostScript files, HTML files, and Acrobat files. The pages can include links to multimedia information content other than text, such as audio, graphics, and moving pictures.
Some of the information of the pages may be numeric in form. That is, the information may take any value within a numeric range, rather than matching a fixed string. For example, the size of a page may be a searchable item. Also, the date associated with the page can be expressed as a numeric term so that a user can search for a page having a date in a specific date range, e.g., this year, not older than five years, and so forth.
Most prior art systems treat the indexing of literal and numeric values separately. Literal values are typically stored in index entries as character strings, and numeric values are stored as numerals. The search operation used for literals is typically character comparison, whereas mathematical comparison techniques are used for numerics.
Having separate interfaces for literals and numeric range-based values increases the complexity of the system. In addition, the results for the literal and numeric indices are separately determined, and need to be combined in some type of merge operation.
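By way of illustration only, the separate treatment described above may be sketched as follows. The class names LiteralIndex and NumericIndex, the function search, and the sample page identifiers are hypothetical and are not drawn from any particular prior art system. The sketch merely shows two indices with two different interfaces whose results must be merged.

```python
from bisect import bisect_left, bisect_right, insort


class LiteralIndex:
    """Hypothetical literal index: words stored as strings, found by string comparison."""

    def __init__(self):
        self._postings = {}  # word -> set of page identifiers

    def add(self, word, page_id):
        self._postings.setdefault(word, set()).add(page_id)

    def lookup(self, word):
        return set(self._postings.get(word, ()))


class NumericIndex:
    """Hypothetical numeric index: values stored as numerals, found by range comparison."""

    def __init__(self):
        self._entries = []  # list of (value, page_id), kept in sorted order

    def add(self, value, page_id):
        insort(self._entries, (value, page_id))

    def range(self, low, high):
        # Locate the slice of entries whose value lies in [low, high].
        start = bisect_left(self._entries, (low, float("-inf")))
        stop = bisect_right(self._entries, (high, float("inf")))
        return {page_id for _, page_id in self._entries[start:stop]}


def search(literals, numerics, word, low, high):
    """Query each index through its own interface, then merge by intersection."""
    return sorted(literals.lookup(word) & numerics.range(low, high))


# Illustrative data only: three pages, each with one word and one year.
words = LiteralIndex()
years = NumericIndex()
for page_id, word, year in [(1, "index", 1995), (2, "index", 1997), (3, "search", 1997)]:
    words.add(word, page_id)
    years.add(year, page_id)

result = search(words, years, "index", 1996, 1998)
```

In this sketch the caller must know which index holds which kind of value, and the intersection in search is the extra merge step that a unified structure would avoid.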
It is desired to provide an index structure having a single unified data structure and a single interface for both literal and numeric values.