In the field of packet-switched communications, switching network nodes employ a store-and-forward discipline in processing the packets they convey. Packets are stored at a switching network node in accordance with a packet storage discipline, which requires memory storage.
The amount of storage memory required to store packets at a switching node is related to the network environment in which the switching node is employed.
For a switching node employed in a Local Area Network (LAN), typically the total bandwidth available to convey packets in and out of the switching node far exceeds the switching capacity of the switching node; therefore, only a small amount of memory storage is required to store packets pending a processing response at the switching node.
By contrast, for access switching network nodes which aggregate uplink traffic and deaggregate downlink traffic, the speed mismatch between access ports (typically 100 Mbps or less) and the uplink ports (1 Gbps or more) often creates temporary congestion conditions at, and data accumulation in, the access switching nodes, requiring switching architecture designs supporting much larger memory stores to prevent packet drops.
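By way of a non-limiting illustration, the data accumulation caused by such a speed mismatch can be estimated with simple arithmetic. The sketch below assumes a hypothetical 10 ms downlink burst arriving at 1 Gbps and destined for a single 100 Mbps access port; the function name, the burst duration, and the single-port assumption are illustrative only and do not appear in the description above.

```python
def backlog_bytes(ingress_bps: int, egress_bps: int, burst_us: int) -> int:
    """Bytes that accumulate while a burst arrives faster than it can drain.

    Integer arithmetic is used throughout: rates are in bits per second
    and the burst duration is in microseconds.
    """
    excess_bps = max(ingress_bps - egress_bps, 0)
    return excess_bps * burst_us // (8 * 1_000_000)


# Assumed scenario: 10 ms burst at 1 Gbps toward one 100 Mbps access port.
print(backlog_bytes(1_000_000_000, 100_000_000, 10_000))  # 1125000
```

Under these assumptions roughly 1.1 MB accumulates during a single burst, which suggests why access switching nodes call for much larger memory stores than LAN switching nodes.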
Also, for a transport switching network node at the core of a Metropolitan Area Network (MAN), like that of Boston or New York City, the concurrent conveyance of millions of independent traffic flows causes random fluctuations in memory storage occupancy over time. A large packet memory storage is therefore required to withstand any erratic variations in content throughput.
In designing a single-chip switching network node architecture, a system architect takes into consideration the network environment in which the designed switching network node will ultimately be placed, and therefore knows approximately how much packet memory storage is needed. A single-chip switching network node having a large memory storage may cost an additional $30 or more. Therefore, it is essential that the single-chip switching node manufacturer offer solutions that specifically meet varied customers' needs, as opposed to providing a “one-size-fits-all” solution, because a customer is just as likely to reject a switching node device with too much memory because of the price tag as to reject a switching node device with too little memory because of a lack of performance.
Therefore, to remain competitive in multiple markets, a switching node system manufacturer typically provides at least two solutions: a switching node device with a small packet memory, typically embedded on the single-chip switch, and another switching node device with a large packet memory connected externally. Designing and fabricating chips for two switching node devices intended for different market segments is good in the sense that small packet storage memory issues are addressed in respect of the former solution only, and large packet storage memory issues are addressed in respect of the latter solution only. However, providing two independent solutions incurs high development costs. The differential cost of fabricating two different switching nodes, instead of one, is high, currently estimated at an additional half a million dollars. Additionally, the engineering costs for both pre-fabrication (layout, routing, etc.) and post-fabrication (validation) are also doubled.
Because of market drivers in the communications and semiconductor industries, switching node manufacturers are under pressure to cut costs wherever possible. One method of cutting costs in designing embedded packet switching nodes is to fabricate a single die supporting, in addition to the switching logic, both a small embedded memory store and an option for connecting a large external memory store thereto. A few switching node manufacturers have already applied this “two switching chips, one die” technique, in which both the small memory store switching node logic and the large memory storage switching node logic are co-manufactured on the same die.
FIG. 1 illustrates a typical two-chips/single-die switch architecture 100/200.
Accordingly, a first packet data flow for a generic packet switch 100 portion on the single die employing a small internal memory store 102 (module) is shown. Packets are received via physical links 104 by a Media Access Control (MAC)/Gigabit MAC (GMAC) block 106, in accordance with the Institute of Electrical and Electronics Engineers (IEEE) 802.3 standard, the specification of which is incorporated herein by reference. A packet reception block 108 transfers the packet data from the MAC block 106 into the internal memory store 102 via an internal memory interface 103. The internal memory store 102 is small, typically not greater than 1 MB, and fast, typically an embedded Static Random Access Memory (SRAM).
Typical Ethernet packet frames vary in size between 64 and 1518 bytes. The internal memory store 102 is subdivided into smaller memory storage regions, referred to herein as granules. In most implementations, each granule is much smaller than the largest packet frame. Packets exceeding the length of a single granule are stored in multiple granules linked together using linking data structures tracked by an internal memory manager block 110. The internal memory manager block 110 maintains a list of free granules in the internal memory store 102, which are requested for use one-by-one by the packet reception block 108 as new packets arrive.
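By way of a non-limiting illustration, granule-based storage with a free list might be modeled in software as sketched below. The class name `GranulePool`, its methods, and the 256-byte granule size used in the usage example are assumptions made for illustration; an actual memory manager block would be implemented in hardware logic.

```python
class GranulePool:
    """Software model of a packet store divided into fixed-size granules."""

    def __init__(self, num_granules: int, granule_size: int):
        self.granule_size = granule_size
        self.data = [b""] * num_granules        # payload held by each granule
        self.next = [None] * num_granules       # link to the packet's next granule
        self.free = list(range(num_granules))   # list of free granule indices

    def store(self, packet: bytes) -> int:
        """Write a packet into linked granules; return the head granule index."""
        if not packet:
            raise ValueError("empty packet")
        needed = -(-len(packet) // self.granule_size)  # ceiling division
        if needed > len(self.free):
            raise MemoryError("not enough free granules")
        head = prev = None
        for offset in range(0, len(packet), self.granule_size):
            g = self.free.pop()                 # request one free granule
            self.data[g] = packet[offset:offset + self.granule_size]
            self.next[g] = None
            if prev is None:
                head = g
            else:
                self.next[prev] = g             # link previous granule to this one
            prev = g
        return head

    def load(self, head: int) -> bytes:
        """Reassemble a packet by walking its granule chain."""
        chunks = []
        g = head
        while g is not None:
            chunks.append(self.data[g])
            g = self.next[g]
        return b"".join(chunks)

    def release(self, head: int) -> None:
        """Return every granule of a packet to the free list."""
        g = head
        while g is not None:
            following = self.next[g]
            self.next[g] = None
            self.free.append(g)
            g = following


pool = GranulePool(num_granules=8, granule_size=256)
head = pool.store(bytes(612))       # a 612-byte packet spans three granules
assert len(pool.load(head)) == 612
pool.release(head)                  # granules become available again
```

In this sketch a packet is identified by the index of its head granule, so later stages only need to pass that index along rather than the packet data itself.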
Once a valid packet has been fully written to the internal memory store 102, the packet is said to be pending processing and a packet processing job request is sent to a packet processing block 112. Simply put, the packet processing block 112 implements packet switching. The packet processing block 112 takes as input the first portion of the packet, known as the packet header (typically the first 128 bytes or less), extracts packet header information from packet header fields, and then uses the extracted information to determine the packet's processing priority and at least one destination port 106 via which the packet is to be transmitted out of the switching network node 100.
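By way of a non-limiting illustration, header-based classification might be sketched as follows. The sketch assumes a layer-2 lookup keyed on the destination MAC address, a priority taken from an IEEE 802.1Q tag when one is present, and flooding to all other ports when the destination is unknown; the function name `classify`, the table layout, and the flooding policy are illustrative assumptions rather than the classification performed by the packet processing block 112.

```python
import struct


def classify(frame: bytes, mac_table: dict, num_ports: int, ingress_port: int):
    """Return (priority, destination_ports) for an Ethernet frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst = frame[0:6]                                  # destination MAC address
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    priority = 0                                      # assumed default priority
    if ethertype == 0x8100 and len(frame) >= 16:      # IEEE 802.1Q tag present
        (tci,) = struct.unpack_from("!H", frame, 14)
        priority = tci >> 13                          # 3-bit priority code point
    if dst in mac_table:
        ports = [mac_table[dst]]
    else:
        # Unknown destination: flood to every port except the ingress port.
        ports = [p for p in range(num_ports) if p != ingress_port]
    return priority, ports
```

A frame whose destination address appears in `mac_table` yields a single destination port, while an unknown destination yields several, which corresponds to the "at least one destination port" outcome described above.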
In addition, the packet may need to be modified based on the classification results; examples include source and/or destination address replacement, Type Of Service (TOS) reassignment, decrementing a packet's Time To Live (TTL) value, and checksum recalculation for the Internet Protocol (IP), the Transmission Control Protocol (TCP), or the User Datagram Protocol (UDP). The packet modification function requires additional memory access 114 to the internal memory store 102.
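By way of a non-limiting illustration, one such modification, decrementing the TTL of an IPv4 header and recalculating the header checksum, might be sketched as follows. The sketch covers only the IPv4 header checksum (a one's-complement sum of 16-bit words); TCP and UDP checksums, which also cover a pseudo-header and the payload, are not shown. The function names are illustrative assumptions.

```python
import struct


def ipv4_checksum(header: bytes) -> int:
    """One's-complement checksum over the 16-bit words of an IPv4 header."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def decrement_ttl(header: bytes) -> bytes:
    """Return a copy of the IPv4 header with TTL - 1 and a fresh checksum."""
    hdr = bytearray(header)
    if hdr[8] == 0:                          # byte 8 holds the TTL
        raise ValueError("TTL already zero; packet should be discarded")
    hdr[8] -= 1
    hdr[10:12] = b"\x00\x00"                 # zero the checksum field first
    struct.pack_into("!H", hdr, 10, ipv4_checksum(bytes(hdr)))
    return bytes(hdr)
```

A header produced by `decrement_ttl` verifies correctly: computing `ipv4_checksum` over the whole modified header, checksum field included, yields zero.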
After a packet has been classified, and a destination port determined, a queue manager 116 inserts a corresponding packet transmission job request into the correct forwarding queue, typically one queue per destination port and priority pair (not shown). Each packet is scheduled for transmission when the intended output port 106 is idle and a job for the packet is waiting in one of that port's queues. Some packets may have to be broadcast via multiple ports 106. In a switching network node 100 that guarantees Quality-of-Service (QoS), a scheduler (associated with the queue manager module 116) selects the next packet to be transmitted among waiting packet transmission jobs by applying a scheduling algorithm (described elsewhere) that takes into account factors such as packet forwarding priorities and the delay-sensitivity of the queued packets.
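By way of a non-limiting illustration, forwarding queues kept per destination port and priority pair might be sketched as follows. Because the scheduling algorithm itself is described elsewhere, the sketch substitutes simple strict-priority selection as a stand-in; the class name `QueueManager`, its methods, and the convention that a larger number means a higher priority are illustrative assumptions.

```python
from collections import deque


class QueueManager:
    """One FIFO of transmission jobs per (destination port, priority) pair."""

    def __init__(self, num_ports: int, num_priorities: int):
        self.num_priorities = num_priorities
        self.queues = {
            (port, prio): deque()
            for port in range(num_ports)
            for prio in range(num_priorities)
        }

    def enqueue(self, packet_handle, ports, priority: int) -> None:
        """Queue one job per destination port (several ports for a broadcast)."""
        for port in ports:
            self.queues[(port, priority)].append(packet_handle)

    def next_job(self, port: int):
        """Strict priority (assumed stand-in): serve the highest non-empty queue."""
        for prio in reversed(range(self.num_priorities)):
            queue = self.queues[(port, prio)]
            if queue:
                return queue.popleft()
        return None                          # nothing waiting for this port


qm = QueueManager(num_ports=2, num_priorities=2)
qm.enqueue("low", ports=[0], priority=0)
qm.enqueue("high", ports=[0], priority=1)
assert qm.next_job(0) == "high"              # higher priority is served first
```

Strict priority ignores delay-sensitivity and can starve low-priority queues, so it is shown only to make the queue structure concrete.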
In transmitting a packet scheduled for transmission via a corresponding determined output port 106, the packet is retrieved from the internal memory store 102 by the packet transmission module 118 via the internal memory interface 103, and is transferred to the appropriate port MAC/GMAC 106 (or a CPU interface 120). When the entire packet has been transmitted over the physical link 104, the granules used to store the packet in the internal memory store 102 can be recycled by adding them to the list of free granules maintained by the internal memory manager block 110, for use in storing subsequent incoming packets.
FIG. 1 further illustrates a second packet data flow for a generic packet switch 200 portion on the die employing a large external memory storage 202. Conceptually, the packet data flows are nearly identical, except that interface 203 to the external memory storage 202 is used in storing received packets instead of the internal memory interface 103 to the internal memory store 102. Currently, Double Data Rate Synchronous Dynamic Random Access Memory (DDR SDRAM) is the preferred choice for large external memory storage 202 implementations, because the size-to-cost ratio is most favorable. Support for 128 MB of external memory storage is common in such a switching network node 200.
However, DDR SDRAM is not nearly as efficient as SRAM at transferring packet data. Therefore, a significant amount of jitter buffering at DDR SDRAM ingress and egress is needed to prevent overflows and underruns in the (G)MAC block 106. For this purpose, the SRAM memory block 102 is also employed in the large memory storage configuration/operation and is typically divided into two regions: a receive buffer 222 and a transmit buffer 224, for use in dejittering the DDR SDRAM 202.
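By way of a non-limiting illustration, the role of the receive buffer 222 can be modeled with a bounded first-in, first-out buffer placed between a source that delivers data at a steady rate and a memory that accepts data only in periodic bursts. The timing model, in which one word arrives per tick and the memory drains up to `burst` words every `stall` ticks, and all names below are assumptions chosen for illustration; they are not DDR SDRAM timing parameters.

```python
from collections import deque


class JitterBuffer:
    """Bounded FIFO that counts overflows and underruns."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.fifo = deque()
        self.overflows = 0
        self.underruns = 0

    def push(self, word) -> bool:
        if len(self.fifo) >= self.capacity:
            self.overflows += 1              # no room: the word is lost
            return False
        self.fifo.append(word)
        return True

    def pop(self):
        if not self.fifo:
            self.underruns += 1              # nothing to deliver
            return None
        return self.fifo.popleft()


def simulate_receive(capacity: int, ticks: int, stall: int, burst: int) -> int:
    """Count overflows when one word arrives per tick and the memory
    drains up to `burst` words once every `stall` ticks."""
    buf = JitterBuffer(capacity)
    for t in range(ticks):
        buf.push(t)
        if (t + 1) % stall == 0:             # memory becomes available
            for _ in range(min(burst, len(buf.fifo))):
                buf.pop()
    return buf.overflows


print(simulate_receive(capacity=8, ticks=16, stall=8, burst=8))  # 0
print(simulate_receive(capacity=4, ticks=16, stall=8, burst=8))  # 8
```

In this model the buffer must hold at least as many words as arrive while the memory is unavailable; halving the capacity causes half of the arriving words to be lost. The transmit buffer 224 plays the mirror-image role, where the failure to avoid is an underrun rather than an overflow.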
It is important to re-emphasize that the architecture depicted in FIG. 1 actually reflects two applications of the single-die architecture. To support both modes of operation, all blocks 108, 110, 112, and 118 operating in accordance with internal memory mode logic actually have corresponding blocks 208, 210, 212, and 218 operating in accordance with external memory mode logic. Both internal memory mode logic and external memory mode logic are implemented as hardware logic, during manufacturing, on the same die. The two modes of operation have entirely different implementations to enable access to the SRAM (internal memory store 102) and DDR SDRAM (external memory storage 202) memories, typically requiring two different addressing schemes, two different granule sizes, two different sets of timing constraints, etc.
For example, when the external memory storage 202 is to be used, the packet reception block 108/208 must be able to transfer packet data into internal memory store SRAM 102 while operating in the internal memory mode for dejitter buffering, retrieve the packet data therefrom, and then transfer the packet data to external DDR SDRAM 202 via the external memory interface 203 while operating in the external memory mode. Blocks that must implement both internal and external memory mode logic are “dual mode” blocks.
As nearly all blocks 108/208, 110/210, 112/212, and 118/218 support dual mode operation, such implementations suffer from: very high development costs; a complicated dual block design prone to errors; block-level verification that takes at least twice as long to perform; system-level verification that requires more effort; a larger die size because of the multiple dual logic blocks; and reduced production yields because of the bigger die size.
There is therefore a need to solve the above-mentioned issues.