1. Field of the Invention
This invention relates to a method and apparatus for high speed data searching, including buffered data searching.
2. Background Art
Information can be manipulated using a computer system. Information is stored in a computer system in the form of binary digits (bits). Typically, information is stored on a medium such as a hard disk drive or floppy disk drive and read into the memory of a computer system for access by the system's central processing unit (CPU). The content of the information can be searched using the resources of the computer system (e.g., CPU time and memory). Efficiencies in searching technique can reduce the resources expended for a search.
One example of an application that searches information is compression. A compression application searches information to generate a new, more compact version of the information. Compression can be used to reduce the amount of memory needed to store information in a computer system. Information can also be compressed prior to its transmittal to another computer system, thereby reducing the time needed to transmit the information (e.g., in modem communications or when downloading Internet graphics). By increasing the efficiency of compression, a computer system's finite set of resources can be used more effectively.
In one compression scheme, referred to as run-length encoding, the compression mechanism searches information to find a repeating character sequence that can be replaced by one occurrence of the character and a number, x, that reflects the number of times the character is to be repeated. The string is decoded by inserting x occurrences of the character in the string.
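By way of illustration only, run-length encoding as described above might be sketched as follows. The function names `rle_encode` and `rle_decode`, and the choice to represent the output as a list of (character, count) pairs, are assumptions made for this example and are not taken from any particular product.

```python
def rle_encode(text):
    """Replace each run of a repeated character with a (character, count) pair."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            # Same character as the current run: increase its count x by one.
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            # A different character starts a new run of length 1.
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Rebuild the string by inserting x occurrences of each character."""
    return "".join(ch * count for ch, count in runs)
```

For example, under these assumptions the string "aaabcc" would be encoded as the pairs ("a", 3), ("b", 1), ("c", 2), and decoding those pairs restores the original string.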
A second type of string compression mechanism creates a dictionary that contains entries derived from the information. Entries in the dictionary can be used to encode subsequent string input. Each entry in the dictionary is a substring from the string input. As a substring of the string input is being processed, the dictionary is searched to determine whether the substring matches a substring contained in a dictionary entry. If a match is found, it is assumed that the substring occurred previously in the string input. Thus, the substring can be replaced in the compressed representation (i.e., the output of the encoding process) with a pointer to the previous occurrence. If it is determined that the substring is unique, it is copied into the compressed representation and is added to the dictionary and can be used in comparisons with subsequent substrings in the string input. A product, MagnaRam™, is available from Quarterdeck Corporation that creates a dictionary from a data buffer for the purpose of string compression.
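By way of illustration only, a dictionary-based scheme of the general kind described above might be sketched as follows. This is a simplified example and is not a description of MagnaRam or of any other product. The fixed substring width, the token labels "lit" and "ptr", and the function names `dict_encode` and `dict_decode` are all assumptions made for the example.

```python
def dict_encode(data, width=4):
    """Encode data as literal substrings and pointers to earlier occurrences.

    Assumption: the input is cut into fixed-width substrings. Real schemes
    may select substrings differently.
    """
    dictionary = {}  # substring -> offset of its first occurrence in the input
    output = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        if chunk in dictionary:
            # Match found: emit a pointer to the previous occurrence.
            output.append(("ptr", dictionary[chunk]))
        else:
            # Unique substring: copy it to the output and add it to the dictionary.
            dictionary[chunk] = offset
            output.append(("lit", chunk))
    return output


def dict_decode(tokens, width=4):
    """Rebuild the input by copying literals and following pointers."""
    text = ""
    for kind, value in tokens:
        if kind == "lit":
            text += value
        else:
            # The pointer is an offset into the text decoded so far.
            text += text[value:value + width]
    return text
```

Under these assumptions, the input "abcdefghabcdxyz" yields two literals ("abcd", "efgh"), a pointer to offset 0 for the repeated "abcd", and a final literal "xyz". The dictionary lookup performed for every substring is the searching step whose cost such a scheme must bear.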
Existing compression mechanisms can be used to optimize a computer system's finite set of resources. Compression applications, as well as other applications, need to search information stored in a computer system. It is, therefore, advantageous to develop more efficient searching techniques, thereby reducing the amount of finite resources consumed by applications that perform searching.