We have developed a system and various methods for creating and using interlocking trees datastores and various features of said interlocking trees datastores. We refer to an instantiation of these interlocking trees datastores that we have developed as a “KStore” or just “K”. In particular, these structures and methods have been described in copending patent applications U.S. Ser. No. 10/385,421 (now published as US 20040181547 A1) and U.S. Ser. No. 10/666,382, by inventor Mazzagatti. Additionally, we described a system in which such interlocking trees datastores could more effectively be used in U.S. Ser. No. 10/879,329. We hereby incorporate these referenced patent documents in their respective entireties into this patent by this reference. While the system and method we describe in this patent relate with particularity to the specific interlocking trees datastores which inventor Mazzagatti hereof described in the above-referenced patent applications, it should be readily apparent that the system and methods described herein may also be applicable to similar structures.
The interlocking trees datastore structure created by the system originally described in co-pending patent application U.S. Ser. No. 10/385,421, together with the means taught in co-pending patent application U.S. Ser. No. 10/879,329 for the Learn Engine to accept multiple fields (hereinafter also referred to as columns) of data within a single data record when creating the interlocking trees datastore structure, provides many useful traits for understanding and using the inventions described. Heretofore, however, the use of multiple fields within a single record has presented an inherent difficulty.
As taught by copending application U.S. Ser. No. 10/879,329, the KStore Learn Engine can accept multiple fields of data within a single data record, such as an entire transaction record consisting of fields such as salesperson, Day, Item number, disposition, and state. In the just-described example, if the “columns” of the record (i.e., the field names) contained data that were dissimilar in nature, e.g., numbers, days, names, etc., the previous method of creating and updating the interlocking trees datastore (hereinafter also referred to as “KStore”), as taught by the patent applications referenced above, was not capable of maintaining the distinction, or relationship, between the field variable and the field name context. This is a difficulty in the previous teachings because similar data can appear in different fields or columns, creating a potential for confusion in interpretation. As currently taught, the inherent human-comprehensible distinction between the fields can be lost when the Learn Engine accepts incoming data records with multiple columns, if one column contains data similar to (or with the same variable values as) that contained in another column within the same or a different record. For example, if one column of a record holds a price and a column in another record represents a price of something else, conflation and resultant confusion are a strong possibility when interpreting the data from the KStore.
Such confusion can hinder the ability of the user to properly analyze the data contained within the KStore. A number that represents one concept in a certain column can represent an entirely different concept in another column. Unless the distinction between the columns is integrated into the KStore data structure, there is a risk when performing analytic functions on the KStore that, for example, information from two columns may be recorded as a plurality of occurrences of a single event, even though it should be recorded as multiple events.
Given that the data structure did not exist prior to the teaching of the applications listed above, there did not exist a solution for recovering the distinction between the columns by integrating the knowledge of the column context into the KStore structure. Accordingly, a new method of integrating this knowledge into the KStore structure was designed. The ability to integrate the field context distinction in the KStore was therefore unavailable prior to the invention described in this patent.
It should be clear that a single data record recorded into a KStore might contain columns of data defined within varying contexts. If the user does not expect or anticipate that data contained within a column is similar in nature (e.g., both columns have numbers) to data contained within a different column in the same or a different data record, then it may not be useful to integrate the distinction between the columns into the KStore data structure. Take, for example, a data record that contains the fields “salesperson,” “Day,” “Item number,” “disposition” and “state.” It is readily apparent that the columns in this example contain dissimilar data, both within a single record and across accompanying data records stored into the same KStore. If, however, the user expects that different data columns can contain similar types of data but represent different meanings, or contexts, it may be important to integrate the distinction of the column contexts into the KStore data structure. Take, for example, data records containing the following fields: “salesperson,” “Day,” “Item number,” “disposition” and “price.” The user would likely anticipate the possibility that the price field may contain data similar in nature and amount to the Item number. In this case, if the data for the price and the Item number were the same across two different records, for example the number “60”, the number of occurrences of the price 60 could be combined with the number of occurrences of 60 where it represents an item number. Thus the root node with the value 60 would have a count of two. This loss of distinction may frustrate analytic queries focused on the data number (e.g., when trying to find how many 60s there are in an inventory by looking first or only at the root node 60).
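The conflation described above can be illustrated with a minimal sketch. It is not the KStore structure or the Learn Engine: it assumes only that occurrences are tallied per distinct value, using a plain Python counter in place of root nodes. The two records, the salesperson names, and the function names are hypothetical and chosen solely for illustration. Keying the tally by field name together with value is shown only to make the lost distinction visible, not as the method of the invention.

```python
from collections import Counter

# Two hypothetical records. The value 60 is an item number in the first
# record and a price in the second.
records = [
    {"salesperson": "Bill", "Day": "Monday", "Item number": 60,
     "disposition": "sold", "price": 100},
    {"salesperson": "Tom", "Day": "Tuesday", "Item number": 45,
     "disposition": "sold", "price": 60},
]


def count_by_value(records):
    """Tally each value with no regard to the field it came from."""
    counts = Counter()
    for record in records:
        for value in record.values():
            counts[value] += 1
    return counts


def count_by_field_and_value(records):
    """Tally each (field name, value) pair, keeping the column context."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            counts[(field, value)] += 1
    return counts


value_counts = count_by_value(records)
context_counts = count_by_field_and_value(records)

# Without column context, the item number 60 and the price 60 are merged.
print(value_counts[60])                     # 2
# With column context, each meaning of 60 keeps its own count.
print(context_counts[("Item number", 60)])  # 1
print(context_counts[("price", 60)])        # 1
```

In this sketch a query against the value-only tally cannot tell how many of the two occurrences of 60 are item numbers, which is the analytic difficulty described above.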