The present invention generally relates to a system for locating data, and more specifically a search function for efficiently performing a plurality of memory read operations.
In a typical wireless local area network (WLAN) configuration, a portable or mobile device (e.g., a laptop personal computer) normally includes a HOST processor and a PCI card or PCMCIA card. On this card resides a Medium Access Control (MAC) processing system, a PHY (physical layer) processing device (e.g., a digital signal processor), and a main memory. The MAC processing system includes a MAC processor (e.g., an embedded processor), which is a multi-functional processor engine responsible for a variety of different processing tasks associated with the wireless communications. The PHY processing device performs such functions as encoding/decoding waveforms. Data transferred between the PHY processing device and the MAC processing system (i.e., the PHY data stream) may be encrypted using an encryption algorithm, including, but not limited to: RC4, DES (Data Encryption Standard) or AES (Advanced Encryption Standard). Consequently, encrypted data received by the MAC processing system from the PHY processing device must be subsequently decrypted.
Similarly, in the case of a data transmission from the MAC processor to the PHY data processing device, the data originates from the HOST processor that writes the data as plaintext to the main memory. The MAC processor will at a later time read the data from the main memory and encrypt it, using an encryption algorithm. Then, the encrypted data is transmitted to the PHY processing device.
It should be appreciated that encryption algorithms typically require the use of a private key to ensure data security. One of the significant time consuming steps in decrypting data is searching memory to look up a private key associated with an incoming receive address. Beyond encryption, there are other needs for a fast lookup based on a network station address including, but not limited to, formulation of response frame (e.g. Acknowledge), multicast hashing, and searching the client database for other necessary information. Often times, there are many clients communicating with a station thus the database may be quite large having thousands of entries.
Wireless communications protocols such as 802.11 are highly demanding on the throughput requirements of modern day systems. Commonly, ASICs with embedded CPUs are needed in order to provide enough computing power to meet the requirements. The primary problem at the MAC layer with meeting the 802.11 timing is the turn around time between frames. In accordance with 802.11, the time interval between frames is known as the xe2x80x9cinterframe spacexe2x80x9d (IFS). Four different IFSs are defined to provide priority levels for access to a wireless media (listed in order, from the shortest to the longest): (a) SIFS short interframe space; (b) PIFS PCF interframe space; (c) DIFS DCF interframe space; and (d) EIFS extended interframe space.
In a receive scenario, where a communication device, such as a station, receives back-to-back encrypted frames, meeting the 10 xcexcs SIFS (xe2x80x9cshort interframe spacexe2x80x9d) time requirement is difficult. This is due to the need to finish processing of the decrypted frame before the next one arrives. Furthermore, with the advent of the 802.11A protocol, the SIFS time is even less, typically 6-8 xcexcs due to PHY delay associated with OFDM latency. FIG. 1 illustrates a timing scenario.
One of the primary delays in this system is the time it takes to lookup the private key for decrypting the encrypted frames. After receiving a xe2x80x9creceive addressxe2x80x9d in a MAC header, a key lookup procedure can commence. This can take a significant amount of time depending on how many keys the station has to search through to find the correct key. If the station is an access point, there could be thousands of MAC addresses to search through before finding the one having the correct key. After finding the private key, the decrypt algorithm initialization can commence, and then decryption of the frame can start. This may spill over into the time to process a subsequent frame. If private key lookup (based on MAC receive address) and decryption takes too long, the station will not complete the entire decryption process before the next packet arrives, causing the new packet to be dropped.
Another problem requiring fast data lookup is the ability to quickly respond to an incoming frame and send an acknowledgement back to the sender. Thus, a transmit after receive also requires tight timing. With 802.11 protocols an incoming reception often requires a send back to the origin node to acknowledge receipt of the message.
The receiving station also has turn around time issues with looking up the incoming address to decide if the message is coming from an authenticated and associated client. If so, further decoding of the message might be needed to decide if an acknowledgement is required or not. This all must be done before the SIFS interval has elapsed, hence the time to search the client address database might be critical.
In order to allow for speedy lookup, systems often use one of two approaches. A software-based approach known as xe2x80x9chashingxe2x80x9d can be used to calculate random starting search address based on a MAC receive address. This hash function will hopefully generate a suitable random distribution of data, so that the search space can be reduced. A perfectly done hash will splice the search out into a number of xe2x80x9cbucketsxe2x80x9d (i.e., subsets) each of which contains an equal number of entries corresponding to the hash address. Hence, for a search of 100 items with 10 buckets, each would ideally have 10 entries. With hashing, one just needs to calculate the hash to find the proper bucket, and search these smaller lists to find the target address. FIG. 2 illustrates the hashing procedure as described above.
The foregoing approach has drawbacks relating to the following: (a) the amount of time needed to calculate a good hash function, (b) the management of the hash table, and (c) a bad hash can impair performance, almost as bad as a straight linear search.
A hardware-based approach referred to as CAM (Content Addressable Memory) is often used to greatly speed up the search process. A is CAM is given content as an input (e.g., a unique data tag, such as a MAC rx address ID) and the output of the CAM provides the target (in this case another address pointer to the corresponding key) after a short delay. While performance can be quite good using CAM, there are several disadvantages to this approach as well: expensive memories, the time needed to load and manage the CAM, and the limits on CAM size (i.e., it may not be big enough to hold all key data without huge cost).
The present invention addresses these and other drawbacks of the prior art.
According to the present invention there is provided a method of operating a processor to determine the location of one or more tags respectively stored in one of a plurality of memory locations, comprising: fetching and executing a SEARCH instruction to execute an associated memory read instruction a plurality of times; fetching the associated memory read instruction to read the contents of one or more memory locations; loading a count register with a count value indicative of a maximum number of memory locations to be searched; and executing the associated memory read instruction one or more times to respectively read the contents of the one or more memory locations, wherein the associated memory read instruction is executed as many times as indicated by the count value, or until a search hit occurs.
According to another aspect of the present invention there is provided a processor for determining the location of one or more tags respectively stored in one of a plurality of memory locations, comprising: means for loading a count register with a count value indicative of a number of memory locations to be searched; means for fetching a SEARCH instruction; means for executing the SEARCH instruction to execute an associated memory read instruction a plurality of times; means for fetching the associated memory read instruction to read the contents of one or more memory locations; and means for executing the associated memory read instruction one or more times to respectively read the contents of the one or more memory locations, wherein the associated memory read instruction is executed as many times as indicated by the count value, or until a search hit occurs.
According to still another aspect of the present invention there is provided a processor for determining the location of one or more tags respectively stored in one of a plurality of memory locations, said processor comprising: a memory address register associated with a main memory; a memory data register associated with the main memory; a memory control for generating memory control signals; a program counter for storing a memory address location of the main memory where an instruction is to be fetched; an instruction register for storing an instruction that is to be executed; at least one general purpose register; decode and execute control logic for decoding and executing an instruction stored in the instruction register; and a state machine for controlling the fetching of a memory read instruction, and the execution thereof one or more times to respectively determine the contents of one or more memory locations.
An advantage of the present invention is the provision of a search function which provides fast and efficient searching of a plurality of memory locations.
Another advantage of the present invention is the provision of a search function which provides for fast and efficient locating of data.
Still another advantage of the present invention is the provision of a search function that can be conveniently added to an instruction and register set of a CPU (central processing unit).
Still another advantage of the present invention is the provision of a search function that minimizes the number of instructions cycles need to search a plurality of memory locations.
Yet another advantage of the present invention is the provision of a search function that is flexible and easy to use.
Still other advantages of the invention will become apparent to those skilled in the art upon a reading and understanding of the following detailed description, accompanying drawings and appended claims.