Many different types of computing systems have attained widespread use around the world. These computing systems include personal computers, servers, mainframes and a wide variety of stand-alone and embedded computing devices. Sprawling client-server systems exist, with applications and information spread across many PC networks, mainframes and minicomputers. In a distributed system connected by networks, a user may access many application programs, databases, network systems, operating systems and mainframe applications. Computers provide individuals and businesses with a host of software applications including word processing, spreadsheet, accounting, e-mail, voice over Internet protocol telecommunications, and facsimile.
In today's networked world, bandwidth is a critical resource. Very high network traffic is straining the capacity of network infrastructures. To keep pace, organizations are looking for better ways to support and manage traffic growth and the convergence of voice with data. The dramatic increase in network traffic can be attributed to the popularity of the Internet, a growing need for remote access to information, and emerging applications. The Internet alone, with its explosive growth in e-commerce, has placed a sometimes insupportable load on network backbones. The growing demands of remote access applications, including e-mail, database access, and file transfer, are further straining networks.
Eliminating network bottlenecks continues to be a top priority for service providers. Routers are often the source of these bottlenecks. However, network congestion in general is often misdiagnosed as a bandwidth problem and is addressed by seeking higher-bandwidth solutions. Today, manufacturers are recognizing this difficulty. They are turning to network processor technologies to manage bandwidth resources more efficiently and to provide, at wire speed, the advanced data services that are commonly found in routers and network application servers. These services include load balancing, quality of service (QoS), gateways, firewalls, security, and web caching.
A Network Processor (NP) may be defined as a programmable communications integrated circuit capable of performing one or more of the following functions:

Packet classification: identifying a packet based on known characteristics, such as address or protocol

Packet modification: modifying the packet to comply with IP, ATM, or other protocols (for example, updating the time-to-live field in the header for IP)

Queue/policy management: reflects the design strategy for packet queuing, de-queuing, and scheduling of packets for specific applications

Packet forwarding: transmission and receipt of data over the switch fabric and forwarding or routing the packet to the appropriate address
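The four functions listed above can be illustrated with a toy software model. This is only a sketch under stated assumptions: the `Packet` fields, the `ROUTES` table, the rule that UDP traffic is treated as voice, and the strict-priority scheduling are all hypothetical choices made for illustration. An actual NP would perform these steps in hardware or microcode.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    dst: str       # destination address (hypothetical field)
    protocol: str  # e.g. "tcp" or "udp" (hypothetical field)
    ttl: int       # IP time-to-live


# Hypothetical forwarding table: destination prefix -> output port.
ROUTES = {"10.0.": 1, "192.168.": 2}


def classify(pkt: Packet) -> str:
    """Packet classification: choose a flow class from header fields.
    Treating UDP as voice is an illustrative assumption only."""
    return "voice" if pkt.protocol == "udp" else "data"


def modify(pkt: Packet) -> bool:
    """Packet modification: decrement the TTL; False means drop."""
    pkt.ttl -= 1
    return pkt.ttl > 0


def forward(pkt: Packet) -> Optional[int]:
    """Packet forwarding: look up the output port by destination prefix."""
    for prefix, port in ROUTES.items():
        if pkt.dst.startswith(prefix):
            return port
    return None


# Queue/policy management: one FIFO per class, voice served first
# (strict priority is an assumed policy, not one named in the text).
queues = {"voice": deque(), "data": deque()}


def enqueue(pkt: Packet) -> None:
    """Apply modification, then queue surviving packets by class."""
    if modify(pkt):
        queues[classify(pkt)].append(pkt)


def dequeue() -> Optional[Packet]:
    """Return the next packet to transmit, or None if all queues are empty."""
    for cls in ("voice", "data"):
        if queues[cls]:
            return queues[cls].popleft()
    return None
```

In this model a packet passes through `enqueue`, waits in its class queue, and is later selected by `dequeue` and mapped to a port by `forward`.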
Although this definition accurately describes the basic features of early NPs, the full potential capabilities and benefits of NPs are yet to be realized. Network processors can increase bandwidth and solve latency problems in a broad range of applications by allowing networking tasks previously handled in software to be executed in hardware. In addition, NPs can provide speed improvements through certain architectures, such as parallel distributed processing and pipeline processing designs. These capabilities can enable efficient search engines, increase throughput, and provide rapid execution of complex tasks.
Network processors are expected to become the fundamental building block for networks in the same fashion that CPUs are for PCs. Typical capabilities offered by an NP are real-time processing, security, store and forward, switch fabric, and IP packet handling and learning capabilities. The processor-model NP incorporates multiple general purpose processors and specialized logic. Suppliers are turning to this design to provide scalable, flexible solutions that can accommodate change in a timely and cost-effective fashion. A processor-model NP allows distributed processing at lower levels of integration, providing higher throughput, flexibility and control. Programmability can enable easy migration to new protocols and technologies, without requiring new ASIC designs.
A network processor comprises a data flow unit to handle the movement of data at a network node. To keep pace with the speed of packet transmission, the network processor must implement data buffering at a frame recurrence interval of 40 ns for a SONET link operating at 9.95328 gigabits per second (Gbps). Buffering of large quantities of data calls for a large data store that is implemented in DRAM (Dynamic Random Access Memory). A large data store calls for a large control store to maintain information about each packet of data handled by the data flow unit. This information includes packet size, location in the data store, etc. The information for a packet is organized into a Frame Control Block (FCB), and the frame control blocks are stored in the control store. Because DRAM is inexpensive relative to the cost of higher speed memory such as SRAM (Static RAM), implementation of the control store in DRAM is desirable. Also, implementation of the control store in the same type of memory that implements the data store allows scalability of buffering, since the control store size is generally proportional to data store size.
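The relationship between the 40 ns figure and the line rate can be checked with a short calculation. The sketch below assumes, for simplicity, that the full 9.95328 Gbps line rate carries packet data, which ignores SONET framing overhead. The variable names are illustrative.

```python
# Figures taken from the text.
LINE_RATE_BPS = 9.95328e9   # SONET link rate in bits per second
FRAME_INTERVAL_S = 40e-9    # one frame every 40 ns

# Bits (and bytes) that arrive on the link during one frame interval.
# Assumption: the whole line rate is payload (framing overhead ignored).
bits_per_interval = LINE_RATE_BPS * FRAME_INTERVAL_S
bytes_per_interval = bits_per_interval / 8

print(f"{bits_per_interval:.1f} bits, {bytes_per_interval:.1f} bytes per 40 ns")
```

Under this assumption about 398 bits, or roughly 50 bytes, arrive every 40 ns, so the 40 ns interval corresponds to a stream of back-to-back frames of about that size.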
However, the relatively long memory access time for DRAM interferes with the performance of some network processing functions. When a packet is received, its FCB is placed in a queue corresponding to the flow of packets to which it belongs. Thus, there is a queue of FCBs corresponding to a queue of packets. The FCBs are stored in a linked list format. Each FCB holds a pointer to the next FCB in the list, thereby forming a chain. When a packet is to be transmitted, the data flow unit reads the FCB of the packet from the control store and obtains the address of the next FCB in the chain. For a control store implemented in DRAM, this read typically takes longer than the 40 ns packet interval. Thus, there is a need for systems and methods to reduce long latency accesses to a control store implemented in DRAM or similar cost-effective memory.
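The dependency described above can be made concrete with a small model of an FCB chain. This is a minimal sketch, not the design of any particular data flow unit: the class names, the field names, and the use of a dictionary as the control store are assumptions made for illustration. The read counter marks where a long-latency DRAM access would occur.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class FCB:
    size: int                        # packet size in bytes
    data_addr: int                   # packet location in the data store
    next_addr: Optional[int] = None  # control-store address of the next FCB


class ControlStore:
    """Toy control store mapping addresses to FCBs and counting reads."""

    def __init__(self) -> None:
        self.mem: Dict[int, FCB] = {}
        self.reads = 0

    def write(self, addr: int, fcb: FCB) -> None:
        self.mem[addr] = fcb

    def read(self, addr: int) -> FCB:
        self.reads += 1  # each read stands in for one slow memory access
        return self.mem[addr]


class FlowQueue:
    """Queue of FCBs for one flow, kept as a linked list in the store."""

    def __init__(self, store: ControlStore) -> None:
        self.store = store
        self.head: Optional[int] = None
        self.tail: Optional[int] = None

    def enqueue(self, addr: int, fcb: FCB) -> None:
        fcb.next_addr = None
        self.store.write(addr, fcb)
        if self.tail is None:
            self.head = addr
        else:
            # Link the previous tail to the new FCB.
            self.store.mem[self.tail].next_addr = addr
        self.tail = addr

    def dequeue(self) -> Optional[FCB]:
        if self.head is None:
            return None
        # The next head is unknown until this read completes.
        fcb = self.store.read(self.head)
        self.head = fcb.next_addr
        if self.head is None:
            self.tail = None
        return fcb
```

Because `dequeue` learns the address of the next FCB only from the FCB it has just read, consecutive transmissions from the same queue cannot overlap their control-store reads. If one such read takes longer than 40 ns, the queue cannot be drained at the packet rate.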