We have developed a system and various methods for creating and using interlocking trees datastores and various features of said interlocking trees datastores. We refer to an instantiation of these interlocking trees datastores that we have developed as a “KStore” or just “K”. In particular, these structures and methods have been described in copending patent applications U.S. Ser. Nos. 10/385,421 (now published as US 20040181547 A1) and 10/666,382, by inventor Mazzagatti. Additionally, we described a system in which such interlocking trees datastores could more effectively be used in U.S. Ser. No. 10/879,329. We hereby incorporate these referenced patent documents in their respective entireties into this patent by this reference. While the system and method we describe in this patent relate with particularity to the specific interlocking trees datastores which inventor Mazzagatti hereof described in the above-referenced patent applications, it should be readily apparent that the system and methods described herein may also be applicable to similar structures.
While the interlocking trees datastore structure created by the system originally described in co-pending patent application U.S. Ser. No. 10/385,421, and the means for the Learn Engine taught in co-pending patent application U.S. Ser. No. 10/879,329 to accept multiple fields (hereinafter also referred to as columns) of data within a single data record to create the interlocking trees datastore structure, provide many useful traits for understanding and using the inventions described, it has come to our attention that system throughput during learning could be greatly increased if there were some way to reduce the processing of duplicate information. In doing so we could reduce the need to particlize data, as well as the amount of traversal of the KStore needed to record certain events.
The problem we noticed can be generally stated as relating to record sets with ordered field variables, where only a relatively small number of the field variables change from one record or sequence to the next record or sequence. (A “record” or other grouping of variables is generally referred to herein as a sequence. This could be a tune in a database of audio files, a cache line in a memory dump, or any other “record”-like sequence. In a field record universe of, for example, sales data, the variables in a sales transaction would be such a record or sequence.) We felt we should look at these kinds of data streams and record sets to find solutions. A particular situation which has this problem is found where the data stream has tables within it, where each of the table entries will branch many times from one particular point within the record of the data stream. Another example would be where a single individual makes many purchases. The records of this individual's purchases would have a branch at each purchase, with perhaps a date, price, and item number for each record, but an initial (or other) part of the record, with name, address, etc., would be identical. For these data streams and record sets it was apparent to us that the time needed to learn an event particle sequence for the field record data that we were experimenting with was much higher than it needed to be. It also seemed that there should be a way to make learning faster in situations where much of the data from record to record was common and few variables changed. In the example, name, address, and other personal information would not change from one record to the next record, so we really only need to learn the new variable values from each record.
The potential to save and reuse K location pointers during KStore learning was mentioned in our co-pending application U.S. Ser. No. 10/879,329, but no specific mechanism for doing so was configured at that time. We describe an implementation that accomplishes reuse of K location pointers in this patent.
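By way of illustration only, the benefit of reusing saved location pointers can be sketched with a toy prefix tree. This sketch is not the KStore structure or the Learn Engine of the referenced applications; the names `Node`, `ToyLearner`, `learn_from_root`, `learn_with_saved_pointers`, `shape`, and the sample `RECORDS` are all hypothetical, and the sketch assumes that a record is an ordered list of field values and that cost can be approximated by counting node-to-node moves.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Node:
    """One node of a toy prefix tree (a stand-in, not an actual K node)."""
    children: Dict[str, "Node"] = field(default_factory=dict)


class ToyLearner:
    """Learns ordered records into a prefix tree and counts node moves."""

    def __init__(self) -> None:
        self.root = Node()
        self.steps = 0                 # node-to-node moves made while learning
        self._prev: List[str] = []     # field values of the previous record
        self._path: List[Node] = []    # saved pointers: node reached after each field

    def _step(self, node: Node, value: str) -> Node:
        """Move from node to its child for value, creating the child if absent."""
        self.steps += 1
        child = node.children.get(value)
        if child is None:
            child = Node()
            node.children[value] = child
        return child

    def learn_from_root(self, record: Sequence[str]) -> None:
        """Baseline: every record is traversed from the root."""
        node = self.root
        for value in record:
            node = self._step(node, value)

    def learn_with_saved_pointers(self, record: Sequence[str]) -> None:
        """Resume from the saved pointer at the end of the shared prefix."""
        shared = 0
        limit = min(len(record), len(self._prev))
        while shared < limit and record[shared] == self._prev[shared]:
            shared += 1
        path = self._path[:shared]                # pointers that remain valid
        node = path[-1] if path else self.root
        for value in record[shared:]:             # only the changed fields
            node = self._step(node, value)
            path.append(node)
        self._prev = list(record)
        self._path = path


def shape(node: Node) -> dict:
    """Nested-dict view of the tree, used to compare two learners."""
    return {value: shape(child) for value, child in node.children.items()}


# Hypothetical purchase records: name and address repeat, the rest changes.
RECORDS = [
    ["Smith", "12 Elm St", "2004-01-05", "9.99", "item-17"],
    ["Smith", "12 Elm St", "2004-01-09", "4.50", "item-03"],
    ["Smith", "12 Elm St", "2004-02-11", "20.00", "item-42"],
]

if __name__ == "__main__":
    baseline, reuse = ToyLearner(), ToyLearner()
    for rec in RECORDS:
        baseline.learn_from_root(rec)
        reuse.learn_with_saved_pointers(rec)
    print(baseline.steps, reuse.steps)
```

In this toy setting both learners build the same tree, but the second moves only through the fields that differ from the previous record. The sketch counts node moves only; it ignores the cost of comparing field values and any per-node bookkeeping that a real datastore might require.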