This invention relates to software that interfaces to information access platforms.
A search engine is a software program used for search and retrieval in database systems. The search engine often determines the searching capabilities available to a user. A web search engine is often an interactive tool that helps people locate information available over the World Wide Web (WWW). Web search engines are actually databases that contain references to thousands of resources. There are many search engines available on the web, from companies such as Alta Vista, Yahoo, Northern Light and Lycos.
In an aspect, the invention features a method of accessing information from a collection of data, the method including receiving a query, generating an inverse index of the collection of data, and generating results to the query in conjunction with the inverse index. Generating the inverse index includes storing a canonical non-terminal representation of the data in the inverse index. Generating the inverse index further includes storing hierarchical information generated from the collection of data, and applying a parser and grammar rules to the collection of data to produce the canonical non-terminal representation of the data.
Generating results includes applying the parser and the grammar rules to the query to produce a query canonical form and matching the query canonical form to the canonical non-terminal representation of the data in the inverse index.
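By way of illustration only, the matching of a query canonical form against canonical non-terminal representations stored in an inverse index might be sketched as follows. The grammar contents, the non-terminal names, and the functions `parse`, `build_inverse_index` and `answer` are hypothetical and are not taken from the specification; simple phrase matching stands in for whatever parser an embodiment actually uses.

```python
from collections import defaultdict

# Hypothetical grammar: each non-terminal lists the surface phrases it covers.
GRAMMAR = {
    "<PRICE_QUERY>": ["how much is", "what is the price of", "cost of"],
    "<LAPTOP>": ["laptop", "notebook computer"],
}


def parse(text):
    """Return the set of non-terminals whose phrases occur in the text.

    This is a stand-in for a real parser: it only checks for substrings.
    """
    normalized = " ".join(text.lower().split())
    found = set()
    for non_terminal, phrases in GRAMMAR.items():
        if any(phrase in normalized for phrase in phrases):
            found.add(non_terminal)
    return found


def build_inverse_index(documents):
    """Map each non-terminal to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for non_terminal in parse(text):
            index[non_terminal].add(doc_id)
    return index


def answer(query, index):
    """Canonicalize the query, then intersect the matching posting sets."""
    canonical = parse(query)
    if not canonical:
        return set()
    return set.intersection(*(index.get(nt, set()) for nt in canonical))


documents = {
    1: "What is the price of a notebook computer",
    2: "Laptop repair guide",
}
index = build_inverse_index(documents)
print(answer("how much is a laptop", index))
```

In this sketch the query and document 1 share no content words other than "a", yet they match because both reduce to the same pair of non-terminals.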
Embodiments of the invention may have one or more of the following advantages.
The purpose of an information retrieval process is to take a collection of documents containing words (the collection of data on a main server), generate an inverse index known as an IR index, and use the IR index to produce answers to a user query. The process may then leverage the grammar it develops for front-end processing when building the IR index to generate phrased synonyms (or phrased aliases) for each document. More specifically, the process may apply the parser and grammar rules to each document before the IR index is built.
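For reference, a plain word-level IR index of the kind described above, before any grammar processing is applied, might look like the following minimal sketch. The names `tokenize`, `build_ir_index` and `lookup` and the sample documents are assumptions made for illustration, and the all-words-must-match query semantics is one simple choice among many.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_ir_index(documents):
    """Build the inverse index: each word maps to the ids of documents containing it."""
    ir_index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            ir_index[word].add(doc_id)
    return ir_index


def lookup(ir_index, query):
    """Return ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(ir_index.get(word, set()) for word in words))


docs = {
    "a": "Search engines index web pages.",
    "b": "A parser applies grammar rules to pages.",
}
ir_index = build_ir_index(docs)
print(lookup(ir_index, "grammar pages"))
```

Phrased synonyms produced by the parser and grammar rules could then be added to a document's entries alongside its literal words before the index is built.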