Full text search engines, commonly used for indexing and searching sets of documents such as web pages on the Internet, index “strings” of alphanumeric characters (usually whole words or strings of characters between delimiters such as white space and punctuation). Numeric information such as measurements (height, mass, length, temperature), monetary values, and other information is indexed in such search engines as its string representation, and searching for such values is performed on a character-by-character basis. For example, although the numeric values 01.23 and 1.230 are equivalent (the notation may, however, be indicative of measurement precision), searching for one string will not return the other string. Additionally, when the values 1, 2 and 15 are indexed as strings, 15 is sorted between 1 and 2: the first character of “15” is “1”, which is less than “2”, so “15” precedes “2”; and “15” extends “1” with a further character, so “15” follows “1”. Searching for data in numeric order, searching for numeric values between a lower and upper bound, and other operations in such an index are impossible, where a fixed width index is imposed.
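The ordering and matching problems described above can be reproduced in a few lines. The following is a minimal illustrative sketch in Python, not part of any particular search engine; the variable names (values, string_order, numeric_order, in_range) are hypothetical and chosen only for this example.

```python
# Values stored as plain strings, as a character-based index would hold them.
values = ["1", "2", "15"]

# Character-by-character (lexicographic) ordering versus numeric ordering.
string_order = sorted(values)
numeric_order = sorted(values, key=int)

print(string_order)   # ['1', '15', '2']
print(numeric_order)  # ['1', '2', '15']

# Equivalent numeric values written differently do not match as strings.
same_as_strings = "01.23" == "1.230"
same_as_numbers = float("01.23") == float("1.230")

print(same_as_strings)  # False
print(same_as_numbers)  # True

# A string range query from "1" to "2" incorrectly captures "15".
in_range = [v for v in values if "1" <= v <= "2"]
print(in_range)  # ['1', '2', '15']
```

The string comparison places “15” inside the range from “1” to “2”, which is the behavior that makes numeric range searches unreliable in a character-based index.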
What is needed is a system to represent all ASCII characters by a modified set of such characters in which no distinction is made, within the representation itself, between the representation of a numerical value, with or without delimiters such as decimal points and commas, and that of a non-numerical expression; and in which a numerical value such as “15” is automatically assigned a location such that “2” lies between “1” and “15”.