Inverted Index Databases (or IIDBs) are well known. An example IIDB is the well-known Open Source software “Lucene,” which uses an inverted index to perform rapid searches of a collection of records (Lucene is provided by “The Apache Software Foundation,” a not-for-profit Delaware corporation, with a registered office in Wilmington, Del., U.S.A.). IIDBs like Lucene are efficient and scalable enough to be used for searching a large-scale corpus, a function provided by web-accessed search engines.
A limitation of IIDBs like Lucene is that the only inherent structural relationship supported, between records, is the single-level linear collection. It should be noted that the basic item of indexed data, supported by generic Lucene (i.e., Lucene that lacks the present invention), is called a “document.” However, herein, for purposes of generality, we shall refer to the basic item of indexed data as a “record.” Each record of an IIDB is identified by a unique ID number (where Lucene currently has the capability to store up to 2^31 records, since the unique ID for each record is a 32-bit signed integer).
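As a rough illustration of the single-level linear collection described above, the following minimal sketch builds a toy inverted index in which each record is identified only by its integer position in a flat list. It is not Lucene's implementation; the sample records and the names `build_inverted_index` and `search` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical sample records. Each record's ID is simply its position
# in this flat list, i.e., a single-level linear collection.
records = [
    "inverted index databases are well known",
    "an inverted index maps terms to records",
    "search engines query a large corpus",
]


def build_inverted_index(records):
    """Map each lowercase term to the set of record IDs containing it."""
    index = defaultdict(set)
    for record_id, text in enumerate(records):
        for term in text.lower().split():
            index[term].add(record_id)
    return index


def search(index, *terms):
    """Return the sorted IDs of records that contain every given term."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


index = build_inverted_index(records)

# A 32-bit signed integer has 2**31 non-negative values (0 .. 2**31 - 1),
# which bounds the number of distinct record IDs.
MAX_RECORD_IDS = 2**31
```

In this sketch, `search(index, "inverted", "index")` returns the IDs of the first two records. Nothing in the index relates one record to another: the only structure is the flat sequence of IDs.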
It would therefore be desirable to augment IIDBs to permit efficient representation of structural relationships, between records of an IIDB, that are more complex than just a single-level linear collection.
An important use of IIDBs is the searching of a “Corpus of Interest” (or C_of_I) for mentions of an “Object of Interest” (or O_of_I). A particular type of O_of_I is a brand of consumer products (also referred to herein as a “Consumer Brand” or “C_Brand”). C_Brands can be the subject of large-scale database searches, particularly of Internet content, by Brand Managers (persons responsible for the continued success of a C_Brand). In particular, a Brand Manager is often interested, for example, in the sentiment of consumers toward his or her C_Brand.
The names of many C_Brands, however, can be ambiguous.
Ambiguity, in a lexical unit, means that the same lexical unit can have two or more distinctly different meanings. Some example C_Brands, with ambiguous names, include the following:

“Tide”:
    C_Brand meaning: a laundry detergent
    Example alternate meanings:
        the tide of the ocean
        a football team, called “Alabama Crimson Tide”

UPS:
    C_Brand meaning: a package-delivery service
    Example alternate meaning: a direction of motion away from the earth

Visa:
    C_Brand meaning: a credit card company
    Example alternate meaning: an official document allowing entry into a foreign nation
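The effect of such ambiguity on a search can be sketched with a small, entirely hypothetical corpus. In the example below, the sample texts, the relevance labels, and the helper names `query` and `precision_recall` are all assumptions made for illustration. A bare query on "tide" retrieves every meaning of the word, while adding a context term raises precision at the expense of recall.

```python
# Hypothetical corpus; the labels mark which records actually mention
# the C_Brand "Tide" (the laundry detergent).
corpus = [
    "I switched to Tide and my laundry smells great",
    "The tide came in early at the beach today",
    "Alabama Crimson Tide won the football game",
    "Tide pods are on sale this week",
]
is_brand_mention = [True, False, False, True]


def query(corpus, *terms):
    """Return IDs of records whose words include every query term."""
    wanted = [term.lower() for term in terms]
    return [
        record_id
        for record_id, text in enumerate(corpus)
        if all(term in text.lower().split() for term in wanted)
    ]


def precision_recall(hits, labels):
    """Compute (precision, recall) of retrieved IDs against the labels."""
    relevant = sum(labels)
    true_positives = sum(labels[record_id] for record_id in hits)
    precision = true_positives / len(hits) if hits else 0.0
    recall = true_positives / relevant if relevant else 0.0
    return precision, recall


broad = precision_recall(query(corpus, "tide"), is_brand_mention)
narrow = precision_recall(query(corpus, "tide", "laundry"), is_brand_mention)
```

Here `broad` is `(0.5, 1.0)`: the bare query finds both brand mentions but also the ocean and football records. `narrow` is `(1.0, 0.5)`: the added term removes the false matches but also misses one genuine brand mention.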
It would therefore be highly desirable to provide techniques for the formulation of queries that are more precise at the identification of an O_of_I (such as a C_Brand), while still achieving a high level of recall.