1. Field of the Invention
This invention relates to content addressable memories (CAMs) and, more particularly, to an improved CAM architecture with substantially fewer functional failures, less power consumption, and improved timing.
2. Description of the Related Art
The following descriptions and examples are given as background only.
Content Addressable Memory (CAM) devices are typically used in applications that require the capability to quickly search for patterns stored among a group of memory cells. Just like other memory devices, a CAM is arranged as an array of memory cells, each cell being capable of storing an electric charge. These charges can be manipulated to store and recall data through the use of special control circuitry in the memory. In some cases, each memory cell may be capable of storing one of two values, depending on whether or not the cell holds a charge. The values “1” and “0” are typically used to represent the charged and uncharged states, respectively, although the opposite may be true. CAM devices configured for storing one of two possible states (i.e., a 1 or a 0) in each cell are known as “Binary CAMs.” In addition to storing binary data, Ternary CAMs (or TCAMs) are configured for storing a mask bit, which may otherwise be referred to as a “wild card” or “don't care” bit. One or more mask bits may be used in a compare operation to search for a pattern having one or more unknown or “don't care” bit positions in the pattern. TCAMs are typically used in applications that benefit from increased flexibility.
In addition to a mechanism for storing data, each CAM cell may include a comparison circuit having a “compare” input signal and a “match” output signal. Typically, a column of cells share the same compare signal, while each cell outputs its own match signal. When a binary value or ternary value drives the compare input, each memory cell having the same value as the compare input activates (or “asserts”) its match signal. This single-cell matching mechanism is useful for quickly finding data patterns among a plurality of cells in the memory device.
FIG. 1 illustrates an exemplary Binary CAM 100 including an array of memory cells 110, 120, 130, 140, 150 and 160. In FIG. 1, the CAM cells are organized into two rows and three columns; however, the CAM cells may be organized into substantially any other number of rows and columns, as desired. In the embodiment shown, CAM cells 110-130 represent the three columns of row 0, and CAM cells 140-160 represent the three columns of row 1. Each cell includes storage logic (S) and comparison logic (C). The storage logic (S) stores the charge that identifies the data value. Each row of cells is coupled for receiving a word line (WL) signal that activates the storage logic of each cell for reading and writing data. The CAM cells 110-130 of row 0, for example, are coupled to word line WL0, while CAM cells 140-160 of row 1 are coupled to word line WL1.
The storage logic (S) in each cell is coupled for receiving complementary bit line signals, which are shared with other cells in the same column. For example, CAM cells 110 and 140 of column 0 are coupled to complementary bit lines BL0 and BLB0, CAM cells 120 and 150 of column 1 are coupled to complementary bit lines BL1 and BLB1, and CAM cells 130 and 160 of column 2 are coupled to complementary bit lines BL2 and BLB2. To read data from a particular cell, the word line coupled to that cell is asserted, causing the cell to dump data from the storage logic onto the bit lines. To write a data value into a particular cell, the data value is placed onto the bit lines coupled to the cell. Activating the cell's word line then causes the cell to store the data value from the bit lines into the storage logic.
The comparison logic (C) in each cell is coupled to the storage logic and to a pair of complementary compare lines, which are shared with other cells in the same column. For example, CAM cells 110 and 140 of column 0 are coupled to compare lines CL0 and CLB0, CAM cells 120 and 150 of column 1 are coupled to compare lines CL1 and CLB1, and CAM cells 130 and 160 of column 2 are coupled to compare lines CL2 and CLB2. In each cell, the comparison logic generates a match line (ML) signal based on the data value stored within the storage logic and the compare value supplied to the compare lines. For example, a MATCH or HIT signal may be generated if the compare value matches the stored data value; otherwise, a NO MATCH or MISS signal may be generated.
Memory cells are usually accessed in groups called “words.” Memory words comprise at least two contiguous cells on the same row and share a common word line, and in some cases, a common match line. The memory array shown in FIG. 1, for example, is constructed using three-bit words, including a first word consisting of cells 110-130 and a second word consisting of cells 140-160. The individual “bit match line signals” generated within each memory cell are supplied to a common match line (e.g., ML0 for row 0, or ML1 for row 1) to generate a match line signal for the entire word (referred to herein as a “word match line signal”). A “HIT” signal may be generated for the entire word, if the compare bit pattern exactly matches the sequence of bits in the data word. However, if at least one compare bit fails to match a respective data bit, a MISS signal will be generated for the entire word.
FIGS. 2 and 3 illustrate conventional Binary and Ternary CAM cell architectures, respectively. As shown in FIG. 2, Binary CAM cell 200 includes storage logic 210, comparison logic 220, a word line (WL), a match line (ML), a pair of complementary bit lines (BL/BLB) and a pair of complementary compare lines (CL/CLB). The storage logic 210 includes a storage cell, or bi-stable latch, implemented with cross-coupled p-channel load transistors (P1 and P2) and n-channel latch transistors (N1 and N2). A pair of n-channel access transistors (N3 and N4) provide access to the storage nodes (D/DB) of the bi-stable latch. The comparison logic 220 includes a pair of n-channel access transistors (N5 and N6) and an n-channel match detect transistor (N7). The source-drain path of match detect transistor (N7) is coupled between the match line (ML) and ground, while the gate terminal of transistor N7 is coupled to comparison node C.
In some cases, memory cell 200 may be accessed by applying a positive voltage to the wordline (often referred to as “raising the wordline”), which activates access transistors N3 and N4. This may enable one of the two bit lines (BL/BLB) to sense the contents of the memory cell based on the voltages present at the storage nodes. For example, if storage node D is at a relatively high voltage (e.g., logic 1) and node DB is at a relatively low voltage (e.g., logic 0) when the wordline is raised, latch transistor N1 and access transistor N3 are activated to pull the bit line complement (BLB) down toward the ground potential. At the same time, the bit line (BL) is pulled up by activation of latch transistor N2 and access transistor N4. In this manner, the state of the memory cell (either a 1 or 0) can be determined (or “read”) by sensing the potential difference between bit lines BL and BLB. Conversely, writing a 1 or 0 into the memory cell can be accomplished by forcing the bit line or bit line complement to either VDD or VSS and then raising the wordline. The potentials placed on the pair of bit lines will be transferred to respective storage nodes, thereby forcing the cell into either a logic 1 or 0 state.
During a compare operation, the data values stored at nodes D/DB are compared with compare values supplied to the pair of complementary compare lines (CL/CLB) for generating a match line signal (e.g., a HIT or MISS). In most cases, the match line (ML) is precharged to a logic 1 state (indicating a HIT) before the compare operation begins. If a HIT occurs during the compare operation (e.g., if the data values at node D and compare line CL are both logic 0 or both logic 1), comparison node C will be pulled to a logic 0 value, allowing the match line to remain at the precharged logic 1 level. However, if a MISS occurs (i.e., if the data values at node D and compare line CL are different), comparison node C will be logic 1 and the match line will be discharged to a logic 0 level. If the match line is shared by multiple bits, as shown in FIG. 1, the match line will be discharged to a logic 0 level if at least one bit misses.
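The compare operation described above can be sketched behaviorally as follows. This is a minimal logic-level illustration, not a circuit simulation: the function names are hypothetical, and the sketch assumes that comparison node C behaves as the exclusive-OR of the stored bit (D) and the compare bit (CL), as implied by the HIT and MISS conditions stated above.

```python
def comparison_node(d: int, cl: int) -> int:
    """Logic level at comparison node C: 0 on a bit HIT, 1 on a bit MISS.

    Assumption: C behaves as the XOR of stored bit D and compare bit CL.
    """
    return d ^ cl


def evaluate_match_line(stored_word, compare_word) -> int:
    """Return the match line level after a compare (1 = HIT, 0 = MISS)."""
    if len(stored_word) != len(compare_word):
        raise ValueError("stored word and compare word must be the same width")

    match_line = 1  # match line is precharged to logic 1 (HIT) before compare
    for d, cl in zip(stored_word, compare_word):
        if comparison_node(d, cl) == 1:
            # Match detect transistor N7 conducts and discharges the line.
            match_line = 0
    return match_line


if __name__ == "__main__":
    print(evaluate_match_line([1, 0, 1], [1, 0, 1]))  # every bit matches
    print(evaluate_match_line([1, 0, 1], [1, 1, 1]))  # middle bit misses
```

Because the match line is shared, a single missing bit is enough to discharge it; the line stays at its precharged level only when every bit in the word hits.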
As shown in FIG. 3, Ternary CAM cell 300 includes many of the same circuit elements included within CAM cell 200 of FIG. 2. For example, CAM cell 300 includes storage logic 310, comparison logic 320, a word line (WL), a match line (ML), a pair of complementary bit lines (BL/BLB) and a pair of complementary compare lines (CL/CLB). Storage logic 310 is similar to storage logic 210; therefore, description of storage logic 310 will not be repeated for purposes of brevity. In addition to the circuit elements included within comparison logic 220, comparison logic 320 includes n-channel transistors (N8 and N9). The source-drain paths of transistors N8 and N9 are coupled in series between comparison node C and ground. The match detect transistor N7 is activated or deactivated by the voltage present at the node (X) arranged between transistors N8 and N9.
Ternary CAM cell 300 includes another storage portion 330 for storing a complementary mask bit (M/MB) received from a pair of complementary mask lines (MASK/MASKB). When the mask bit (M) is set to logic 0 (i.e., not masked), transistor N9 will be turned off and transistor N8 will be turned on. This enables compare logic 320 to operate in a manner similar to compare logic 220. However, setting the mask bit (M) to logic 1 (i.e., masked) causes transistor N8 to turn off and transistor N9 to turn on. Activation of transistor N9 pulls node X down to ground, turning off match detect transistor N7 and maintaining the match line at its preset voltage level to indicate a HIT (regardless of the comparison value at node C).
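The masking behavior can be illustrated with a short logic-level sketch. The function names are hypothetical, and the sketch again assumes that comparison node C is the exclusive-OR of the stored bit and the compare bit; node X follows C when the bit is unmasked and is held at logic 0 when the bit is masked.

```python
def node_x(d: int, cl: int, m: int) -> int:
    """Logic level at node X, which drives the gate of match detect transistor N7.

    m == 0 (not masked): N8 is on and N9 is off, so X follows comparison node C.
    m == 1 (masked):     N8 is off and N9 is on, so X is pulled to ground.
    """
    c = d ^ cl  # assumption: comparison node C is the XOR of D and CL
    return 0 if m == 1 else c


def evaluate_ternary_match_line(stored_word, mask_word, compare_word) -> int:
    """Return the match line level after a ternary compare (1 = HIT, 0 = MISS)."""
    if not (len(stored_word) == len(mask_word) == len(compare_word)):
        raise ValueError("stored, mask and compare words must be the same width")

    match_line = 1  # precharged to the HIT level
    for d, m, cl in zip(stored_word, mask_word, compare_word):
        if node_x(d, cl, m) == 1:
            match_line = 0  # N7 conducts and discharges the match line
    return match_line


if __name__ == "__main__":
    stored = [1, 0, 1]
    compare = [1, 1, 1]  # differs from the stored word in the middle bit
    print(evaluate_ternary_match_line(stored, [0, 0, 0], compare))  # unmasked
    print(evaluate_ternary_match_line(stored, [0, 1, 0], compare))  # bit masked
```

With the middle bit masked, the mismatch in that position is ignored and the word reports a HIT; with no bits masked, the same compare word produces a MISS.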
The CAM cell architectures shown in FIGS. 2 and 3 present many problems, especially in newer technologies with lower power supplies and diminished transistor sizes. First of all, CAM cells 200 and 300 both use n-channel transistors (e.g., NMOS) in the comparison logic portion of the cell. Because NMOS is not particularly good at passing logic 1 values, the voltage at comparison node C is not full VDD when it should be a logic 1. Instead, the voltage at comparison node C is degraded to VDD-Vth (in FIG. 2), where Vth is the threshold voltage of NMOS transistor (N5 or N6). This means that, in the case of a MISS, the match detect transistor (N7) will not be fully turned on. In some cases, the degraded voltage at node C may cause functional failures, which occur when a compare operation is mis-sensed (i.e., sensed incorrectly) as either a HIT or a MISS.
The performance of memory cell 200 may further decrease over certain process, voltage and temperature (PVT) corners (especially when the temperature is low), and is often unacceptable in newer transistor technologies. For example, although the performance of memory cell 200 may be acceptable when using 0.18 micrometer (μm) technology, the performance steadily decreases in newer technologies utilizing lower power supplies and smaller gate lengths (e.g., 0.13 μm and below). Because the transistor threshold voltage (Vth) does not drop proportionately to VDD, the threshold voltage becomes a larger percentage of VDD in newer technologies, which increases the occurrence of functional failures. The problem is exacerbated in TCAM cell 300, where a series connection of two NMOS transistors further degrades the compare node C voltage to VDD−2*Vth.
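The effect of supply scaling on the degraded logic 1 level can be worked through numerically. The sketch below simply evaluates the expressions VDD−Vth and VDD−2*Vth given above. The supply and threshold values are hypothetical, chosen only to illustrate the trend; they are not measurements of any particular process.

```python
def degraded_high_level(vdd: float, vth: float, degradation_factor: int) -> float:
    """Logic 1 level at the compare node, per the expressions given above.

    degradation_factor = 1 gives VDD - Vth   (binary cell of FIG. 2)
    degradation_factor = 2 gives VDD - 2*Vth (ternary cell of FIG. 3)
    """
    return vdd - degradation_factor * vth


def fraction_of_supply(vdd: float, vth: float, degradation_factor: int) -> float:
    """Degraded logic 1 level expressed as a fraction of VDD."""
    return degraded_high_level(vdd, vth, degradation_factor) / vdd


if __name__ == "__main__":
    # Hypothetical (VDD, Vth) pairs: an older, higher-supply technology and a
    # newer, lower-supply technology whose Vth has not scaled proportionately.
    for label, vdd, vth in [("older", 1.8, 0.5), ("newer", 1.2, 0.4)]:
        binary = fraction_of_supply(vdd, vth, 1)
        ternary = fraction_of_supply(vdd, vth, 2)
        print(f"{label}: binary {binary:.0%} of VDD, ternary {ternary:.0%} of VDD")
```

Under these assumed values, the gate drive available to the match detect transistor shrinks as a fraction of VDD when the supply is lowered, and shrinks further under the VDD−2*Vth expression for the ternary cell, which is the trend described above.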
Another problem with memory cells 200 and 300 is that they tend to consume a relatively large amount of power. As shown in FIG. 4, for example, each match line (ML) must be precharged to a predetermined voltage level (usually a logic high voltage level, VH) before the compare operation begins. In most cases, the predetermined voltage level represents a “HIT” state, although the opposite may be true, in some cases. When the compare operation begins (at time t0), the match line voltage for a given row may be discharged to a logic low voltage level (VL) if at least one bit misses in the row. Near the end of the compare operation (at time t1), the match line voltage is restored to the predetermined voltage level (VH) in preparation for the next comparison cycle (which may begin at time t2). A CAM device operates by searching all rows in parallel. Misses are much more frequent than Hits. Therefore, the architectures shown in FIGS. 2 and 3 consume a considerable amount of power by continually charging, discharging and recharging the match lines after each and every MISS.
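The power argument can be illustrated by counting how many match lines are discharged, and must therefore be recharged, in a single search. The sketch below is a first-order illustration only: the function names, the match line capacitance and the voltage are hypothetical parameters, and the energy estimate assumes that each missed match line swings fully from 0 V back up to VH, drawing C·VH² from the supply per recharge.

```python
def count_discharged_match_lines(rows, compare_word) -> int:
    """Number of rows whose match line is discharged (a MISS) in one search."""
    return sum(1 for row in rows if list(row) != list(compare_word))


def recharge_energy(n_discharged: int, c_match_line: float, v_high: float) -> float:
    """First-order supply energy needed to restore the discharged match lines.

    Assumes each discharged line is recharged from 0 V to v_high, which draws
    c_match_line * v_high**2 from the supply per line.
    """
    return n_discharged * c_match_line * v_high ** 2


if __name__ == "__main__":
    # Hypothetical four-row array in which only one row matches.
    rows = [[1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0]]
    misses = count_discharged_match_lines(rows, [1, 0, 1])
    print(misses, "of", len(rows), "match lines must be recharged")
```

Since all rows are searched in parallel and most rows miss, nearly every match line in the array is discharged and restored on every compare cycle.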
In addition to increased power consumption, the need for restoring the match line to a predetermined state (after a MISS) complicates the design of the memory device and decreases the speed with which it operates. For example, a self-timing path is usually needed to determine when the compare operation is finished, so that the restore operation can begin. Since it takes time to perform the restore operation, timing margins are also required to account for the restore time. This undesirably increases the amount of delay within a critical path.
Additional problems arise in newer technologies as gate lengths are decreased to about 0.13 μm and below. As gate lengths get shorter, current leakage (e.g., IDS standby current) becomes a larger percentage of the total transistor current consumption. For example, although the current leakage may be small in the 0.18 μm technology used to form memory cells 200 and 300, it becomes a bigger problem as gate lengths decrease to 0.13 μm, and often becomes unmanageable in technologies 90 nm and below. When dynamic logic is used within the compare portion of the memory cell, as in the case of memory cells 200 and 300, more and more functional failures tend to occur as current leakage increases.
For example, a PMOS load transistor (not shown) is used in the memory cells of FIGS. 2 and 3 to precharge the match line during a precharge phase. During compare (i.e., evaluation) modes, the PMOS load transistor is turned off and NMOS transistors (e.g., N5, N6 and N7 in FIG. 2) are used to determine the state of the match line. If the comparison node is pulled to logic “0” (when a HIT occurs), transistor N7 is turned off to ensure that the match line remains at the precharged voltage level. Because the match line is not actively driven during the evaluation phase (i.e., because the PMOS precharge transistor is turned off), current leakage within transistor N7 may cause the match line voltage to decrease over time. In some cases, excessive leakage (or excessive PMOS off time) may cause the match line to be pulled down to a MISS when it should be a HIT.
One solution to the current leakage problem is to insert regeneration (“Regen”) cells along the match line (ML), as shown in FIGS. 5 and 6. The Regen cells are usually spaced along the match line between every N memory cells (e.g., where N=16, in the embodiment of FIG. 6). The Regen cells are used as “match line repeaters” to buffer the match line (ML) signal, since one memory cell usually cannot pull down the match line with all bits tied to it. The Regen cells are also used to reduce leakage along the match line by dividing the match line into “local match lines,” each tied to only a portion of the total number of bits. Although the solution shown in FIGS. 5 and 6 may suffice in some technologies (e.g., down to about 0.13 μm), the solution presents additional problems in newer technologies (e.g., 90 nm and below) where leakage constitutes a greater portion of the total transistor current consumption. For example, as gate lengths decrease, the number of Regen cells inserted along a match line can be increased to decrease the number of cells tied to each local match line. Although this may decrease the total leakage current, increasing the number of Regen cells increases the amount of area consumed by the CAM device and adds delay to the critical path.
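The trade-off introduced by segmenting the match line can be sketched as follows. The function names and the 64-bit word width are hypothetical; the sketch only shows how the number of local match lines grows as fewer cells are tied to each one, and models the word result as a HIT only when every local match line reports a HIT.

```python
import math


def local_match_line_count(word_bits: int, cells_per_segment: int) -> int:
    """Number of local match lines when one is formed every N cells."""
    if word_bits <= 0 or cells_per_segment <= 0:
        raise ValueError("word_bits and cells_per_segment must be positive")
    return math.ceil(word_bits / cells_per_segment)


def segmented_word_match(stored_word, compare_word, cells_per_segment: int) -> int:
    """Word-level result (1 = HIT, 0 = MISS) built from local match lines."""
    if len(stored_word) != len(compare_word):
        raise ValueError("stored word and compare word must be the same width")
    for start in range(0, len(stored_word), cells_per_segment):
        stop = start + cells_per_segment
        # A local match line is discharged if any bit in its segment misses.
        if stored_word[start:stop] != compare_word[start:stop]:
            return 0
    return 1


if __name__ == "__main__":
    # Hypothetical 64-bit word: halving N doubles the number of segments.
    for n in (16, 8, 4):
        print(f"N={n}: {local_match_line_count(64, n)} local match lines")
```

Each reduction in N lowers the number of leaking cells per local match line but multiplies the number of segments, and with it the Regen cell area and the stages a match signal passes through along the critical path.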
Therefore, a need exists for an improved CAM architecture and method of operating the same. Preferably, the improved architecture and method would improve timing and reduce power consumption, the occurrence of functional failures and the complexity of the CAM device. It is also preferred that the architecture and method be applicable to a wide range of technologies, including those 90 nm and below, without sacrificing accuracy.