The present invention relates to a memory apparatus and method of a type which can be useful for devices used in multicasting and, in particular, to a memory configuration for a network or other switch or similar device which can provide data to two or more output ports.
Many systems and devices are configured to impart the capability of multicasting in which a data stream or other signal is received and is output to two or more output ports. One example is a network switch which typically has multiple input ports and multiple output ports. In many situations, it is desired to provide a signal received on one of the input ports to two or more of the output ports, thus providing a form of multicasting.
Typically, multicasting devices, including switches, include memory which stores some or all of the data received on the input ports before the data is output through the output ports. Such memory can provide useful buffering when it is desired, for example, to store portions of the data destined for a particular output port (or ports) while other data is being output to different output ports. Such buffering memory can also be useful for accommodating varying data rates and the like.
Characteristics of such memory can affect the cost and/or performance of the switch or similar device. Other factors being equal, the cost of the switch will increase as the amount of memory provided in the switch increases. The amount of memory which is provided is a function not only of the characteristics of the data (and the data rates) to be received and transmitted, but also of how efficiently the memory is used. For example, the memory may be configured in such a way that it is necessary to provide a relatively large amount of memory to accommodate situations that only rarely occur, such that during the majority of the time the device is being used, most of the memory is superfluous. Cost and performance can also be affected by the bandwidth of the memory and/or related components, and can be adversely impacted if a particular memory configuration requires a relatively large bandwidth to perform memory reads or writes. Bandwidth considerations have been particularly problematic in systems in which, as has most typically been the situation, the entire memory system is relatively less integrated, such as when a memory system is provided by coupling (typically commoditized) memory arrays (provided on one or more discrete chips) to certain control (or other) circuitry, often on other chips, requiring buses or similar devices for carrying the bandwidth necessary to achieve read and write operations.
In typical previous systems, certain factors such as memory size and bandwidth requirements were at least partially in opposition, in the sense that systems which were configured for efficiency of memory use (thus requiring smaller memory arrays) typically achieved this benefit at the price of a relatively higher bandwidth (for at least some components of the system). Accordingly, it would be useful to provide a memory configuration which can achieve relatively high efficiency (i.e. avoid the need for memory which may be unused much of the time) without increasing bandwidth requirements to an undesirable or unacceptable level.
The present invention provides for a memory configuration with relatively high usage efficiency while avoiding a substantial increase in bandwidth requirements. In one embodiment, the memory array is subdivided into subarrays, each subarray being individually addressable and controllable (such as read/write controllable). Such a configuration facilitates parallel writes, such as storing two copies of a frame (or other data portion), e.g. in two separate subarrays, when the frame is destined for two output ports. The subarrays, in this way, can be associated with the various output ports, preferably in a dynamic fashion so that the total available amount of memory can be allocated to the various output ports as needed. This provides enhanced usage efficiency since there is no need to allocate the maximum anticipated required memory for each of the output ports. Rather, a “pool” of memory subarrays is provided sufficient to accommodate the anticipated memory requirements of the system as a whole (rather than for each output port), and subarrays can be dynamically allocated from the pool as needed.
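By way of illustration only, the pooled allocation described above might be modeled in software roughly as follows. This is a minimal sketch under stated assumptions, not a description of the disclosed circuitry: the class name `SubarrayPool`, its methods, and the fixed frames-per-subarray capacity are all hypothetical, and the loop that writes each copy stands in for writes that hardware with individually controllable subarrays could perform in parallel.

```python
from collections import deque


class SubarrayPool:
    """Hypothetical model of a shared pool of individually addressable subarrays."""

    def __init__(self, num_subarrays, frames_per_subarray):
        # Assumption: every subarray holds the same fixed number of frames.
        self.frames_per_subarray = frames_per_subarray
        self.free = deque(range(num_subarrays))        # unallocated subarrays
        self.storage = {i: [] for i in range(num_subarrays)}
        self.by_port = {}   # output port -> deque of its subarrays, oldest first

    def _has_room(self, port):
        chain = self.by_port.get(port)
        return bool(chain) and len(self.storage[chain[-1]]) < self.frames_per_subarray

    def write_multicast(self, frame, ports):
        """Store one copy of `frame` per destination port (ports assumed distinct)."""
        need_new = [p for p in ports if not self._has_room(p)]
        if len(need_new) > len(self.free):
            raise MemoryError("subarray pool exhausted")
        # Dynamic allocation: take subarrays from the pool only when a port needs one.
        for p in need_new:
            self.by_port.setdefault(p, deque()).append(self.free.popleft())
        targets = [self.by_port[p][-1] for p in ports]
        # Sequential here; separate subarrays would allow these writes in parallel.
        for idx in targets:
            self.storage[idx].append(frame)
        return targets

    def release_oldest(self, port):
        """Return a port's oldest subarray to the pool once its frames are output."""
        idx = self.by_port[port].popleft()
        self.storage[idx].clear()
        self.free.append(idx)
        return idx
```

In this sketch a port consumes pool capacity only while it actually holds buffered frames, which is the sense in which the pool is sized for the system as a whole rather than for the worst case of each port.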
Furthermore, the size of the subarrays can be selected to be different from frame sizes, if desired. This provides the ability to allocate memory with a granularity different from frame sizes (e.g. to allocate subarrays large enough to hold several frames), which reduces the size and complexity of the pointer or other buffer management system, contributing to a smaller requirement for read/write bandwidth.
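The effect of allocation granularity on pointer management can be illustrated with simple arithmetic. The figures below (a 1 MiB memory, 1536-byte frame-sized buffers, and 16 KiB subarrays) are purely hypothetical and are not taken from the present disclosure, and the helper `pointer_overhead` is an assumed name; the sketch only counts how many allocation units must be tracked and how many bits a pointer to one of them needs.

```python
def pointer_overhead(total_bytes, unit_bytes):
    """Return (number of allocation units, total pointer bits) for one granularity."""
    units = total_bytes // unit_bytes
    # Bits needed to address any one unit: ceil(log2(units)), at least 1.
    bits_each = max(1, (units - 1).bit_length())
    return units, units * bits_each


# Hypothetical sizes chosen only for illustration.
per_frame = pointer_overhead(1 << 20, 1536)        # one pointer per frame buffer
per_subarray = pointer_overhead(1 << 20, 16384)    # one pointer per subarray
print(per_frame, per_subarray)
```

Under these assumed sizes, tracking subarrays rather than frame-sized buffers reduces the number of pointers from 682 to 64, with correspondingly fewer pointer reads and writes.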
Preferably, each subarray can store several frames. By configuring the system to always store the frames in the proper order within a subarray, there is an inherent queuing capability, i.e. at the proper time, frames may be output to the proper output port from a given subarray simply in the order in which they are stored, without the need for storing and managing a pointer for every buffer. The inherent queuing capability provided in subarrays having a larger granularity than one frame reduces the read/write bandwidth requirements.
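The inherent queuing described above can be sketched as follows, assuming a byte-addressed subarray in which each frame is stored behind a two-byte length prefix. The length prefix, the class name `Subarray`, and its methods are assumptions made for illustration only; the point of the sketch is that frames come out in the order they were stored using just one read offset and one write offset per subarray, with no per-frame pointer.

```python
class Subarray:
    """Hypothetical subarray holding several frames back to back, in arrival order."""

    def __init__(self, capacity_bytes):
        self.buf = bytearray(capacity_bytes)
        self.write_off = 0   # where the next frame will be stored
        self.read_off = 0    # where the next frame to output begins

    def append_frame(self, payload):
        """Store a frame after the previous one; return False if it does not fit."""
        if len(payload) > 0xFFFF:
            raise ValueError("frame too long for the assumed 2-byte length prefix")
        need = 2 + len(payload)
        if self.write_off + need > len(self.buf):
            return False
        self.buf[self.write_off:self.write_off + 2] = len(payload).to_bytes(2, "big")
        self.buf[self.write_off + 2:self.write_off + need] = payload
        self.write_off += need
        return True

    def next_frame(self):
        """Output frames strictly in stored order; return None when none remain."""
        if self.read_off == self.write_off:
            return None
        n = int.from_bytes(self.buf[self.read_off:self.read_off + 2], "big")
        start = self.read_off + 2
        self.read_off = start + n
        return bytes(self.buf[start:start + n])
```

For example, after `append_frame(b"aaa")` and `append_frame(b"bb")`, successive calls to `next_frame()` yield the two frames in that same order and then `None`.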
Preferably, the present invention exploits opportunities presented by more highly integrated memory systems (so-called “embedded memory”) which can more easily accommodate a memory system configured as described herein than previously commonly-used commoditized memory. Thus, according to one embodiment, the present invention exploits opportunities presented by the availability of embedded memory by configuring a system which can provide parallel writes to subarrays, dynamic subarray allocation, multi-frame subarray granularity and/or inherent frame queuing to provide a system which has relatively high usage efficiency of memory without commensurately high read/write bandwidth requirements.