The present invention relates generally to data communication switching systems and more particularly relates to an apparatus for and method of flow switching in a data communications network.
More and more reliance is being placed on data communication networks to carry increasing amounts of data. In a data communications network, data is transmitted from end to end in groups of bits which are called packets, frames, cells, messages, etc. depending on the type of data communication network. For example, Ethernet networks transport frames, X.25 and TCP/IP networks transport packets and ATM networks transport cells. Regardless of what the data unit is called, each data unit is defined as part of the complete message that the higher level software application desires to send from a source to a destination. Alternatively, the application may wish to send the data unit to multiple destinations.
Currently, many switching systems utilize switching fabrics or matrixes that are designed to switch either fixed or variable length frames or packets of data. The frames are received from one or more input ports and fed into the switching matrix. The header of each frame is used to calculate a forwarding decision for the particular frame. The forwarding decision is used by the switching matrix to steer the frame to one or more output port(s).
A bottleneck in prior art switching systems is the time delay required to calculate a forwarding decision for each frame. Frames input to the matrix must wait for the forwarding decision before they can be switched, i.e., output to a port. Thus, there is a strongly felt need for a mechanism whereby frames can be switched at line or wire speed without the delays associated with the calculation of the forwarding decision.
The present invention is an apparatus and associated method for performing flow switching in a frame-switching environment. Flow switching is applicable to data communication systems wherein data is switched on a frame basis utilizing separate data and control paths. The invention provides a means of implementing flow switching which functions in the control path of the frame switching process.
Three main operations are involved in switching frames at very high rates. The first operation is the control path that involves analysis of the frame header and subsequently making a forwarding decision as to where to send the frame. The second operation involves the movement of the data from the input port to an output port. Typically, this operation is performed within the data path structure due to the necessity to read and write bits and/or bytes to and from high bandwidth memories, backplanes, etc. at very high speed. The third operation is the frame formatting manipulation that is based on the forwarding decision and the type of the frame to be output. All three operations must be performed fast enough to keep up with the data rate of the incoming frames and the required output data rate of the frames to be output.
The flow switching apparatus of the present invention is mainly applicable to the first and third operations described above, i.e., the control path and frame formatting. The invention assumes that the second operation is performed using well-known standard techniques and is thus outside the scope of the present invention.
The flow switching apparatus provides a mechanism whereby flows are identified and tags are assigned to them. The tags assigned are relatively short in length, e.g., 16 bits, while the frame headers may be tens of bytes. The forwarding decisions are calculated once and stored in a cache or a look-up table (LUT) that can be accessed very quickly. The tag is used as an index into the cache or LUT such that if an entry is found in the cache or LUT for that particular tag, the forwarding decision does not have to be made again. This serves to greatly reduce the time to obtain a forwarding decision that is used by the switching fabric in steering the input frame to the appropriate output port(s). The invention exploits locality in space and time, whereby only a finite number of tags are applicable, i.e., in use, at any one time. The principle of locality implies that in most cases, for a given port and during a given time period, only a subset of the entire range of possible flows is present in the system. It is in cases such as this that caches become very effective, since they expedite processing for those flows captured inside the cache.
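By way of illustration only, the use of a short tag as a direct index into a look-up table may be sketched as follows. The 16-bit tag width is taken from the example above; the representation of a forwarding decision as a tuple of output port numbers and the names `store_decision` and `lookup_decision` are assumptions made for this sketch and are not part of the described apparatus.

```python
from typing import List, Optional, Tuple

TAG_BITS = 16                     # tag width taken from the text's example
LUT_SIZE = 1 << TAG_BITS          # one slot per possible tag value

# Each slot holds a forwarding decision, modelled here (an assumption) as a
# tuple of output port numbers; None marks a tag with no decision cached yet.
Decision = Tuple[int, ...]
lut: List[Optional[Decision]] = [None] * LUT_SIZE


def store_decision(tag: int, decision: Decision) -> None:
    """Record the decision that was calculated once for this tag."""
    lut[tag & (LUT_SIZE - 1)] = decision


def lookup_decision(tag: int) -> Optional[Decision]:
    """Return the cached decision, or None on a miss."""
    return lut[tag & (LUT_SIZE - 1)]


store_decision(0x2A17, (3, 5))
hit = lookup_decision(0x2A17)     # found: no recalculation needed
miss = lookup_decision(0x0001)    # not cached: a decision must be calculated
```

Because the tag itself is the index, a hit costs a single array access regardless of how long the original frame header is.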
The assignment of tags to flows expedites the processing of the frames in network entities that implement the apparatus and method of the present invention. The use of a cache provides a means of identifying data items or entries in the cache, such as forwarding decisions and header substitutions.
The present invention functions to assign tags based on a comparison of the frame headers without the need to process the contents of the frame or packet itself in accordance with one or more complex rules. Thus, the assignment of tags is not complex. The forwarding decisions, however, may be governed by one or more complex rules but this processing occurs in the switching controller portion of the invention and not the I/O processor portion. The tag, unrecognized at first, is processed by the switching controller (typically using slower speed hardware or software processing) and a forwarding decision is made. The results of the decisions are placed in a cache for future reference. In particular, the forwarding decisions and the new header to be substituted for the old one are placed in the cache.
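The handling of a tag that is unrecognized at first may be sketched as below. This is a minimal sketch under stated assumptions: the class `SwitchingController`, the record `CacheEntry` and the toy rule inside `_apply_rules` are hypothetical stand-ins for the complex forwarding rules, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CacheEntry:
    out_ports: Tuple[int, ...]   # forwarding decision
    new_header: bytes            # header substituted for the old one


class SwitchingController:
    """Hypothetical controller: slow rule evaluation plus a result cache."""

    def __init__(self) -> None:
        self.cache: Dict[int, CacheEntry] = {}
        self.slow_path_calls = 0

    def _apply_rules(self, header: bytes) -> CacheEntry:
        # Stand-in for the complex forwarding rules (an assumption): the
        # output port is the last header byte modulo 4, and the substitute
        # header is the old header with its first byte replaced by 0x01.
        self.slow_path_calls += 1
        return CacheEntry((header[-1] % 4,), b"\x01" + header[1:])

    def resolve(self, tag: int, header: bytes) -> CacheEntry:
        entry = self.cache.get(tag)
        if entry is None:                 # tag unrecognized at first
            entry = self._apply_rules(header)
            self.cache[tag] = entry       # kept for future reference
        return entry


ctrl = SwitchingController()
hdr = bytes([0x00, 0xAA, 0xBB, 0x07])
first = ctrl.resolve(tag=9, header=hdr)    # miss: rules are evaluated
second = ctrl.resolve(tag=9, header=hdr)   # hit: served from the cache
```

In this sketch the slow rule evaluation runs only for the first frame of the flow; the second call returns the cached decision and substitute header.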
To achieve fast substitution of tags, i.e., fast processing of frames, a fast cache, preferably hardware based, is used. Note that the invention may be implemented using multiple levels of cache. A tag assignment cache is located in the I/O processor and is typically limited in size. Preferably, the primary cache is hardware based and the fastest, followed by fast hardware forwarding, then a fast software cache and then a slow software forwarding process. The primary cache is preferably bigger than the individual tag assignment caches in the I/O processors, but may be smaller than the aggregate sum of all the tag assignment caches. The secondary cache is preferably at least equal in size to the sum of all the tag assignment caches in the I/O processors. Thus, together, the primary and secondary caches cover all the tag assignment caches in the individual I/O processors. Note that it is not mandatory that the secondary software cache be equal to or bigger than the aggregate sum. If it is, however, the cache update mechanism is simplified: when a new tag is assigned, replacing a previous entry becomes simpler.
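A multi-level lookup of the kind described may be sketched as follows. The sketch is simplified to three levels (a small primary cache, a larger secondary cache and a slow forwarding function) and omits the fast hardware forwarding stage. The class name `TieredForwardingCache`, the least-recently-used eviction policy and the promotion of entries into the primary cache are assumptions chosen for illustration; the text does not prescribe a replacement policy.

```python
from collections import OrderedDict
from typing import Callable, Dict, Tuple

Decision = Tuple[int, ...]


class TieredForwardingCache:
    def __init__(self, primary_size: int,
                 slow_forward: Callable[[int], Decision]) -> None:
        self.primary_size = primary_size
        self.primary: "OrderedDict[int, Decision]" = OrderedDict()
        self.secondary: Dict[int, Decision] = {}   # assumed large enough
        self.slow_forward = slow_forward

    def _promote(self, tag: int, decision: Decision) -> None:
        self.primary[tag] = decision
        self.primary.move_to_end(tag)
        if len(self.primary) > self.primary_size:
            self.primary.popitem(last=False)       # evict least recently used

    def lookup(self, tag: int) -> Tuple[Decision, str]:
        """Return the decision and the level that supplied it."""
        if tag in self.primary:
            self.primary.move_to_end(tag)
            return self.primary[tag], "primary"
        if tag in self.secondary:
            decision = self.secondary[tag]
            self._promote(tag, decision)
            return decision, "secondary"
        decision = self.slow_forward(tag)          # slowest path
        self.secondary[tag] = decision
        self._promote(tag, decision)
        return decision, "slow"


cache = TieredForwardingCache(primary_size=2,
                              slow_forward=lambda tag: (tag % 8,))
levels = [cache.lookup(t)[1] for t in (1, 1, 2, 3, 1)]
```

With a primary cache of two entries, tag 1 is evicted when tag 3 arrives, so the final lookup of tag 1 is served by the secondary cache rather than by the slow forwarding process.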
The present invention also comprises various tag queues that are used in the forwarding process. Frames whose tags are found in the primary cache are placed on a fast queue where they are processed and output quickly. Frames whose associated tags cause a miss on the primary cache are placed on a slower queue whereby the forwarding decision may be calculated using hardware or software. A plurality of tag queues are used with each tag queue storing frames associated with an individual tag. Thus, all frames of the same flow are switched together once their forwarding is resolved.
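The fast queue and the per-tag slow queues may be sketched as below. The class `TagQueues` and its method names are hypothetical, and frames are represented simply as byte strings; the sketch only shows how frames of the same flow are held together and released together once their forwarding decision is resolved.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Decision = Tuple[int, ...]


class TagQueues:
    def __init__(self) -> None:
        self.primary_cache: Dict[int, Decision] = {}
        self.fast_queue: Deque[Tuple[bytes, Decision]] = deque()
        # One slow queue per tag, created on first use.
        self.slow_queues: Dict[int, Deque[bytes]] = defaultdict(deque)

    def enqueue(self, tag: int, frame: bytes) -> None:
        decision = self.primary_cache.get(tag)
        if decision is not None:                   # hit: fast queue
            self.fast_queue.append((frame, decision))
        else:                                      # miss: wait with its flow
            self.slow_queues[tag].append(frame)

    def resolve(self, tag: int,
                decision: Decision) -> List[Tuple[bytes, Decision]]:
        """Store the new decision and release every frame waiting on it."""
        self.primary_cache[tag] = decision
        pending = self.slow_queues.pop(tag, deque())
        return [(frame, decision) for frame in pending]


q = TagQueues()
q.enqueue(7, b"f1")
q.enqueue(7, b"f2")
q.enqueue(8, b"g1")
released = q.resolve(7, (2,))    # both frames of flow 7 leave together
q.enqueue(7, b"f3")              # later frames of flow 7 take the fast queue
```

Keeping one queue per tag preserves frame order within a flow while a decision is pending, and frames of other flows are not released by an unrelated resolution.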
Note that an important aspect of the invention is that the various components within a network element and between different network elements do not need to communicate with each other in order to synchronize tags. Each entity, i.e., input port, network element box, etc. is adapted to analyze, identify and assign tags to the respective flows, i.e., frames, passing through each.
There is provided in accordance with the present invention, in a data communications network, a method of switching data utilizing tags, the method comprising the steps of identifying flows from a plurality of data frames received from the network via a plurality of input ports, assigning a tag to each unique flow identified and storing each unique flow and associated tag in a tag assignment cache, calculating a forwarding decision and substitute header for each unique flow received and storing the results of the forwarding decision, substitute header and the tag associated therewith in a tag forwarding cache, retrieving a forwarding decision and substitute header associated with a tag from the tag forwarding cache upon the occurrence of a cache hit on the tag, modifying the received data frame in accordance with the forwarding decision and substitute header corresponding to the tag previously assigned to the data frame and forwarding the data frame with the substitute header to the appropriate output port in accordance with the forwarding decision corresponding thereto.
The steps of identifying and assigning are performed independently of and without any communications required to other network entities. The tag forwarding cache comprises a primary tag forwarding cache implemented in hardware and a secondary tag forwarding cache implemented in software adapted to operate at a slower speed than the primary tag forwarding cache. Each flow comprises the frame header or portions thereof that embody the criteria for identifying the particular flow. The tag assignment cache comprises records containing fields representing a flow ID, flow description and tag assignment. The tag forwarding cache comprises records containing fields representing a tag ID, forwarding data and substitute header. The method further comprises the step of providing a third cache adapted to store the results of forwarding decisions and their associated tags, the third cache having a number of entries at least equal to the aggregate sum capacity of the first caches in each of the plurality of input ports, and whereupon a miss in the second cache the forwarding decision is retrieved from the third cache upon a hit on the tag.
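The two record layouts and the sizing rule for the third cache may be illustrated as follows. The field names follow the text; the field types, the helper `min_third_cache_entries` and the example capacities are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TagAssignmentRecord:
    flow_id: int
    flow_description: bytes            # header bytes identifying the flow
    tag: int


@dataclass(frozen=True)
class TagForwardingRecord:
    tag_id: int
    forwarding_data: Tuple[int, ...]   # assumed: output port numbers
    substitute_header: bytes


def min_third_cache_entries(first_cache_capacities: Sequence[int]) -> int:
    """Smallest third-cache size that covers every per-port first cache."""
    return sum(first_cache_capacities)


assignment = TagAssignmentRecord(flow_id=1,
                                 flow_description=b"\xaa\xbb",
                                 tag=0x0042)
forwarding = TagForwardingRecord(tag_id=assignment.tag,
                                 forwarding_data=(4,),
                                 substitute_header=b"\xcc\xdd")
required = min_third_cache_entries([256, 256, 512])
```

The tag is the only field shared by the two records, which is what allows the forwarding side to operate without access to the original flow description.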
In addition, the method further comprises the steps of storing incoming frames in one of a plurality of slow queues in the event a frame generates a miss on the tag forwarding cache requiring the calculation of a forwarding decision, each slow queue being associated with a separate tag, and forwarding all frames within a slow queue at one time once a forwarding decision has been calculated for the flow corresponding to the frames.
There is also provided in accordance with the present invention, in a data communications network, a method of switching data utilizing tags, the method comprising the steps of identifying flows from a plurality of data frames received from the network via an input port, assigning a tag at random to each unique flow identified and storing each unique flow and associated tag in a first cache, retrieving a forwarding decision associated with the tag from a second cache upon the occurrence of a cache hit on the tag, calculating a forwarding decision for the flow associated with a tag if the tag is not found in the second cache, storing the results of the forwarding decision and the tag associated therewith in the second cache and forwarding the data frame to the appropriate output port in accordance with the forwarding decision corresponding thereto.
There is further provided in accordance with the present invention a tag switching apparatus for use in a data communications network comprising an I/O processor comprising a plurality of input ports, each input port comprising a first cache for storing tags and flows associated therewith, a tag processor for identifying flows from a plurality of data frames received from the network via an input port, and for assigning a tag at random to each unique flow identified such that no two flows are assigned the same tag and storing each unique flow and associated tag in the first cache, a controller comprising a second cache and a header processor adapted to retrieve a forwarding decision associated with a tag from the second cache upon the occurrence of a cache hit on the tag, the header processor adapted to calculate a forwarding decision for the flow associated with a tag upon a miss in the second cache and to store the results of the forwarding decision and the tag associated therewith in the second cache.
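Random assignment of tags such that no two flows receive the same tag may be sketched as below. The class `RandomTagAssigner`, the use of rejection sampling and the seeded generator are assumptions chosen for illustration; the text does not specify how the random tag is drawn or how collisions are avoided.

```python
import random
from typing import Dict, Optional, Set


class RandomTagAssigner:
    def __init__(self, tag_bits: int = 16,
                 seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)
        self._space = 1 << tag_bits          # number of distinct tags
        self._tag_of: Dict[bytes, int] = {}  # flow key -> assigned tag
        self._in_use: Set[int] = set()

    def assign(self, flow_key: bytes) -> int:
        """Return the flow's tag, drawing a fresh random one if needed."""
        tag = self._tag_of.get(flow_key)
        if tag is not None:
            return tag                       # known flow keeps its tag
        if len(self._in_use) >= self._space:
            raise RuntimeError("tag space exhausted")
        while True:                          # redraw until the tag is free
            tag = self._rng.randrange(self._space)
            if tag not in self._in_use:
                break
        self._in_use.add(tag)
        self._tag_of[flow_key] = tag
        return tag


assigner = RandomTagAssigner(tag_bits=3, seed=0)
tags = [assigner.assign(bytes([i])) for i in range(8)]
repeat = assigner.assign(bytes([0]))
```

The assigner keeps no state shared with any other port or network element, consistent with tags being assigned independently by each entity. A practical implementation would also need to reclaim tags of flows that are no longer active, which this sketch omits.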
The apparatus further comprises a switching fabric adapted to forward data frames to their appropriate output port in accordance with the forwarding decision corresponding thereto. The second cache comprises a cache implemented using hardware. The apparatus further comprises a third cache adapted to store the results of forwarding decisions and their associated tags, the third cache having a number of entries at least equal to the aggregate sum capacity of the first caches in each of the plurality of input ports, and whereupon a miss in the second cache the forwarding decision is retrieved from the third cache upon a hit on the tag.
The apparatus further comprises a plurality of slow queues for storing incoming frames in the event a frame generates a miss on the second cache requiring the calculation of a forwarding decision, each slow queue being associated with a separate tag, and once a forwarding decision has been calculated for the flow corresponding to the frames, all frames within a slow queue are forwarded at one time.