For purposes of the present discussion, RAM devices may be divided into at least two general classes based on intended applications and cost/performance tradeoffs.
A first class (type one RAM) is comprised of devices whose design is optimized for high-density and access to large amounts of contiguous data, while a second class (type two RAM) is comprised of devices whose design is optimized for random access to small amounts of data that may be discontiguous within the total address space of the memory.
An example of type one RAM is Dynamic RAM (DRAM), which by definition includes, but is not limited to, Synchronous DRAM (SDRAM) and Double Data Rate Synchronous DRAM (DDR-SDRAM). Type one RAM memory cells may be packed relatively densely, so the large quantity of data that can be stored in such devices allows the cost per data unit stored to be minimized. Such devices are a typical choice for providing large amounts of memory in systems that require this. Since the performance of most such systems benefit from rapid access to large contiguous blocks of data, the designs are optimized to enable this, at the cost of providing relatively slower access to small blocks of discontiguous data. Such a design tradeoff is often appropriate because many business, scientific, engineering and graphics data processing applications have the characteristic of operating on relatively large blocks of contiguous data.
Static RAM (SRAM) is one example of type two RAM. Type two RAM memory cells cannot be packed as densely as type one RAM memory cells and dissipate more power than type one RAM memory cells. The consequence of the relatively low packing density and the higher power of type two RAM is that the quantity of data that can be stored is lower than type one RAM devices would provide and a higher cost per unit data stored. Current design practice is to accept this higher cost in order to gain uniformly low access latency over the total address space of the memory.
Certain data processing applications such as networking components inevitably need to operate on discontiguous data. The current design practice yields acceptable cost-effectiveness provided the quantity of memory which must be provided is relatively low, since the aggregate of the higher cost per data unit of the memory remains a low portion of the total system cost. But for systems requiring large amounts of memory, type two RAM can be infeasible due to cost, and the high power consumption and low density of type two RAM can create heat dissipation and physical size problems. The growing processing and memory needs of networking components provide one example of this situation.
Network infrastructure speeds have increased dramatically, often generation-to-generation being 10× in throughput from the previous. Historically the infrastructure itself only required the information related to routing or other transient data/statistics to be maintained in the wire speed equipment. The servers themselves or other general purpose CPUs in equipment were responsible for the processing of persistent state such as TCP, UDP, IPSec or SSL connection information.
General purpose CPUs with traditional memory systems or even specialized processors for routing (i.e., stand-alone Network Processors) do not have the memory subsystems to handle both the high-data-throughput and the high-simultaneous-connection specifications required. The aggregation of services at the edge of a data center can require one million or more TCP connections for an application such as SSL or similarly 500,000+ security associations for IPSec. Firewalls, load balancers, etc. could also be enhanced if there were a capability to either terminate or shadow TCP connections at wire speeds. A “shadow TCP connection” is one that does not terminate the TCP connection, but maintains state with the connection so as to monitor the terminated TCP connection. It would be valuable to provide sufficient memory to support such tasks, but they inherently need to access small blocks of discontiguous data. The cost of providing adequate amounts of suitable memory using existing design precepts can make such systems infeasible due to total cost.
In light of the above discussion, it would be desirable to provide a memory architecture and strategy that enabled the use of the high-density, low power and low cost devices such as type one RAM, while providing adequately low latency in accessing small blocks of discontiguous data. This disclosure provides such an architecture and strategy. These and other advantages, as well as additional inventive features, will be apparent from the present disclosure.