In a typical wireless local area network (WLAN) configuration, a portable or mobile device (e.g., a laptop personal computer) normally includes a HOST processor and a PCI card or PCMCIA card. On this card reside a Medium Access Control (MAC) processing system, a PHY (physical layer) processing device (e.g., a digital signal processor), and a main memory. The MAC processing system includes a MAC processor (e.g., an embedded processor), which is a multi-functional processor engine responsible for a variety of different processing tasks associated with the wireless communications. The PHY processing device performs such functions as encoding/decoding waveforms. For privacy, data transferred between the PHY processing device and the MAC processing system (i.e., the PHY data stream) may be encrypted using an encryption algorithm, including, but not limited to: RC4, DES (Data Encryption Standard) and AES (Advanced Encryption Standard). Consequently, encrypted data received by the MAC processing system from the PHY processing device must be subsequently decrypted.
Similarly, in the case of a data transmission from the MAC processor to the PHY processing device, the data originates from the HOST processor, which writes the data as plaintext to the main memory. The MAC processor later reads the data from the main memory and encrypts it using an encryption algorithm. The encrypted data is then transmitted to the PHY processing device.
It should be appreciated that encryption algorithms typically require the use of a private key to ensure data security. One of the significant time-consuming steps in decrypting data is searching memory to look up a private key associated with an incoming receive address and then initializing a table (using this private key) for use in a decryption algorithm (e.g., RC4). Often, many clients communicate with a station, so the key database may be quite large, having thousands of entries, each with a possibly different key.
It is widely recognized that wireless communications protocols, such as IEEE 802.11, are highly demanding on the throughput requirements of modern-day systems. Commonly, application specific integrated circuits (ASICs) with embedded processors are needed in order to provide enough computing power to meet the requirements. One problem at the MAC layer with meeting the IEEE 802.11 timing requirements is the turnaround time between frames. In accordance with IEEE 802.11, the time interval between frames is known as the “interframe space” (IFS). Four different IFSs are defined to provide priority levels for access to a wireless medium (listed in order, from the shortest to the longest): (a) SIFS, the short interframe space; (b) PIFS, the PCF interframe space; (c) DIFS, the DCF interframe space; and (d) EIFS, the extended interframe space.
In a receive scenario, where a communication device, such as a station or access point, receives back-to-back encrypted frames, meeting the 10 μs SIFS time requirement is difficult. This is because processing of the current frame, including its decryption, must finish before the next frame arrives. Furthermore, with the advent of the 802.11a protocol, the SIFS time is even less, typically 6–8 microseconds due to PHY delay associated with OFDM latency.
FIG. 1 illustrates an exemplary data stream from the PHY to the MAC processor. A typical data stream includes a plurality of frames (also referred to as “packets”) separated by an interframe gap. Each frame is a collection of bytes that constitutes a single message being transferred between the PHY and MAC processor. A typical frame may include a header (including an initialization vector (IV) and an integrity check vector (ICV)), one to several thousand bytes of encrypted data (also referred to as “ciphertext”), checksum data, as well as other information. The contents of a frame are referred to herein as the “frame content.”
In the exemplary embodiment, the IV portion of the header is used to partition frames between the header and the encrypted ciphertext, and also provides a portion of a lookup key for decryption. In this regard, the IV portion includes 3 bytes of a 16-byte private key, which is needed before initialization of an RC4 decryption state table can commence. Therefore, the decryption processing cannot begin until the IV portion of the frame has arrived. The ICV portion is used to authenticate the source of received data. It should be understood that the time to load the key and prepare the state table is unaffected by the frame size. Thus, it will take about the same amount of time (e.g., 20 microseconds) to initialize an RC4 state table, regardless of the size of the frame. Moreover, the time for performing a key lookup is independent of the frame size.
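The fixed cost of table initialization can be seen in a minimal sketch of the standard RC4 key-scheduling and keystream steps, given here in Python purely for illustration. The helper name build_frame_key, the 3 + 13 byte split, and the ordering of the IV ahead of the stored key bytes are assumptions made for this example and are not taken from the exemplary embodiment.

```python
def build_frame_key(iv: bytes, stored_key: bytes) -> bytes:
    """Combine the 3 IV bytes from the frame header with 13 stored key bytes.

    Assumption: the IV bytes come first, giving a 16-byte RC4 key.
    """
    if len(iv) != 3 or len(stored_key) != 13:
        raise ValueError("expected a 3-byte IV and a 13-byte stored key")
    return iv + stored_key


def rc4_init_state(key: bytes) -> list:
    """RC4 key scheduling: build the 256-entry state table from the key."""
    if not key:
        raise ValueError("key must not be empty")
    state = list(range(256))
    j = 0
    # Always exactly 256 swaps, no matter how long the frame is.
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(state: list, data: bytes) -> bytes:
    """XOR data with the RC4 keystream; the same call encrypts and decrypts."""
    s = list(state)  # work on a copy so the initialized table can be reused
    i = j = 0
    out = bytearray()
    # This loop is the only part whose cost grows with the frame size.
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical IV and stored key, chosen only to exercise the functions.
    key = build_frame_key(b"\x01\x02\x03", bytes(range(13)))
    table = rc4_init_state(key)
    ciphertext = rc4_crypt(table, b"short frame")
    print(rc4_crypt(table, ciphertext))
```

In this sketch rc4_init_state performs the same 256 iterations for a 10-byte frame as for a frame of several thousand bytes, while only rc4_crypt scales with the payload, which is why short frames leave the least time to absorb the setup cost.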
A worst case timing scenario occurs when a short frame (i.e., a frame having relatively few bytes (e.g., 10–20 bytes of data)) arrives at a station followed by another frame. If the key lookup and decryption processing take too long, a station will be unable to complete decryption processing before the next packet arrives. A lag always occurs, because the receiving memory storage device (e.g., a FIFO) accumulates ciphertext bytes until initialization of the decryption table (RC4) is complete and decryption can begin. If this lag is significant, an “overrun” situation may arise in which the FIFO fills up completely, causing received data to be lost.
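The effect of frame length on this worst case can be illustrated with a rough timing-budget sketch in Python. It assumes a simplified model in which the key lookup starts when the first payload byte arrives and the next frame starts one SIFS after the last payload byte. The function names, the 8 Mbit/s PHY rate, and the 5 microsecond key lookup time are hypothetical values chosen for the example; only the 20 microsecond table initialization time and the 10 microsecond SIFS come from the discussion above. The time needed to decrypt the payload itself is ignored.

```python
def time_until_next_frame_us(payload_bytes: int,
                             phy_rate_mbps: float,
                             sifs_us: float = 10.0) -> float:
    """Time from the first payload byte to the start of the next frame.

    1 Mbit/s equals 1 bit per microsecond, so airtime = bits / rate.
    """
    if phy_rate_mbps <= 0:
        raise ValueError("PHY rate must be positive")
    payload_airtime_us = payload_bytes * 8 / phy_rate_mbps
    return payload_airtime_us + sifs_us


def setup_finishes_in_time(payload_bytes: int,
                           phy_rate_mbps: float,
                           key_lookup_us: float,
                           table_init_us: float = 20.0,
                           sifs_us: float = 10.0) -> bool:
    """True if key lookup plus table initialization fit before the next frame."""
    setup_us = key_lookup_us + table_init_us
    return setup_us <= time_until_next_frame_us(payload_bytes, phy_rate_mbps, sifs_us)


if __name__ == "__main__":
    # Hypothetical figures: 8 Mbit/s PHY stream and a 5 us key lookup.
    for size in (10, 20, 1500):
        ok = setup_finishes_in_time(size, phy_rate_mbps=8.0, key_lookup_us=5.0)
        print(f"{size}-byte frame: setup completes before next frame = {ok}")
```

Under these assumed numbers the fixed setup time exceeds the window offered by a 10-byte frame, while a long frame leaves ample margin, which is consistent with short back-to-back frames being the worst case.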
In order to meet the timing demands, prior art systems have commonly employed FIFO memories to queue up and receive a frame as it arrives so it can be processed later. FIFOs are also referred to herein as queues. It should be appreciated that this technique may only delay an overrun situation. In this regard, it is necessary for the “traffic density” of a queuing system (also known as a “queuing network”) to be less than 1, in order for it to be mathematically stable. Traffic density is defined as: “arrival rate into the queue” divided by “departure rate out of the queue.” Thus, unless the decryption throughput that drains the queue is comparable to the PHY throughput of the arriving data, an overrun situation is inevitable as more and more packets arrive.
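The stability condition can be illustrated with a small deterministic sketch in Python. The function names and the per-step byte counts are hypothetical, and fixed arrival and departure amounts per time step are a simplification of real, bursty traffic.

```python
from typing import Optional


def traffic_density(arrival_rate: float, departure_rate: float) -> float:
    """Arrival rate into the queue divided by departure rate out of the queue."""
    if departure_rate <= 0:
        raise ValueError("departure rate must be positive")
    return arrival_rate / departure_rate


def steps_until_overrun(arrival_per_step: int,
                        departure_per_step: int,
                        fifo_depth: int,
                        max_steps: int = 1_000_000) -> Optional[int]:
    """Return the step at which the FIFO overflows, or None if it does not
    overflow within max_steps."""
    occupancy = 0
    for step in range(1, max_steps + 1):
        occupancy += arrival_per_step           # bytes written by the PHY side
        if occupancy > fifo_depth:
            return step                         # overrun: received data is lost
        occupancy = max(0, occupancy - departure_per_step)  # bytes decrypted
    return None


if __name__ == "__main__":
    # Hypothetical rates: 4 bytes arrive per step, 2 bytes are decrypted per step.
    print("traffic density:", traffic_density(4, 2))
    for depth in (64, 128, 256):
        print(f"FIFO depth {depth}: overrun at step", steps_until_overrun(4, 2, depth))
    # With the rates reversed the density is below 1 and the FIFO never fills.
    print("density 0.5:", steps_until_overrun(2, 4, 64, max_steps=10_000))
```

In this model, doubling the FIFO depth only postpones the overrun when the traffic density exceeds 1, whereas a density below 1 keeps the occupancy bounded regardless of depth.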
Another prior art approach has been to defer the decryption processing to a later time. In this regard, decryption processing becomes decoupled from a receive operation. Thus, a receive operation will only receive ciphertext, store it into memory, and leave decryption for later. However, this approach requires extra firmware processing and buffer space to store several frames of data. Also, since the frames are not processed in real time, it will take longer to authenticate the data and then offload it to the HOST. This long latency is not desirable for applications related to quality of service (e.g., streaming voice and video), in which it is important to minimize packet processing latency between the PHY and the HOST.
The present invention addresses these and other drawbacks of the prior art.