Information retrieval for numeric data, particularly from unstructured content sources, presents special challenges to search engines. Many of those challenges are described and addressed in our co-pending U.S. patent application Ser. No. 12/496,199, System and Methods for Units-Based Numeric Information Retrieval, incorporated herein by this reference (hereinafter referred to as the “Co-Pending patent”). Due to the prominence of generic keyword-based document retrieval and search engines, users have been trained to enter only a keyword phrase, with no additional contextualizing information, and to retrieve documents in response. Thus, the type of data sought in a query, such as the relevant unit of data, may not be specified by a typical user. For applications where a user seeks information from a corpus containing numeric data but has not specified a relevant unit of data of interest, there is a need for an information retrieval system that can automatically determine this unit.
Therefore, a system that can automatically determine the appropriate unit to be associated with numeric data being searched and retrieved in response to a simple keyword-only query (i.e., with no explicitly identified unit of data, and preferably even with no numbers being specified), without requiring the user to provide any additional contextualization, is of great value and importance.