Telecommunications channels often carry traffic that is multiplexed from several sources. For example, a 2.488 Gb/s SONET STS-48 channel carries 48 SONET STS-1 channels, each at 51.84 Mb/s, that are time multiplexed on a byte-by-byte basis. That is, the channel carries bytes 1.1, 2.1, 3.1, . . . , 48.1, 1.2, 2.2, 3.2, . . . , 48.2, 1.3, 2.3, 3.3, . . . where n.m denotes byte m of subchannel n. Details of the SONET format can be found in Ming-Chwan Chow, Understanding SONET/SDH: Standards and Applications, Andan Pub, ISBN 0965044823, 1995 and in ANSI Standard T1.105-1995.
An STS-1 SONET frame is a repeating structure of 810 bytes arranged into 9 rows of 90 columns. The frame structure is transmitted in row-major order. That is, all 90 bytes of row 0 are transmitted, then all 90 bytes of row 1, and so on. At higher multiplexing rates, each byte of the STS-1 frame is replaced by a number of bytes, one from each of several multiplexed sources. For example, at STS-48, 48 bytes, one from each of 48 STS-1 subframes, are transmitted during each column interval. In this case, all 48 subframe bytes for one column are sent before moving on to the next column, and all of the columns of a row are sent before moving on to the next row.
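The transmission order described above can be sketched as three nested loops. This is only an illustration; the function name and the 0-based (row, column, subframe) tuples are assumptions made for the example, not part of the SONET format.

```python
def sts_n_byte_order(n_subframes, rows=9, columns=90):
    """Yield (row, column, subframe) tuples in transmission order (0-based).

    Rows are sent in order, every column of a row is sent before the next
    row, and all subframe bytes of a column are sent before the next column.
    """
    for row in range(rows):
        for col in range(columns):
            for sub in range(n_subframes):
                yield (row, col, sub)


order = list(sts_n_byte_order(48))
print(len(order))            # 9 * 90 * 48 bytes per STS-48 frame
print(order[47], order[48])  # last byte of column 0, first byte of column 1
```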
A digital cross connect is a network element that accepts a number of multiplexed data channels (e.g., 72 STS-48 channels) and generates a number of multiplexed output channels where each output channel carries an arbitrary set of the subchannels from across all of the input ports. For example, one of the STS-48 output channels may contain STS-1 channels from different input channels in a different order than they were originally input.
An example of digital cross connect operation is shown in FIG. 1. The figure shows a cross connect 30 with two input ports and two output ports. Each of these ports contains four timeslots. Input port 1 (the top input port) carries subchannels A, B, C, and D in its four timeslots and input port 2 (the bottom port) carries subchannels E, F, G, and H in its four timeslots. Each timeslot of each output port can select any timeslot of any input port. For example, output port 1 (top) carries subchannels H, D, F, and A from 2.4, 1.4, 2.2, and 1.1, where x.y denotes input port x, timeslot y. Input timeslots must be switched in both space and time. The first timeslot of output port 1, for example, must be switched in time from slot 4 to slot 1 and in space from port 2 to port 1. Also, some timeslots may be duplicated (multicast) and others dropped. Subchannel A, for example, appears in output timeslots 1.4 and 2.2, and subchannel G is dropped, appearing on no output timeslot.
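The behavior in this example can be modeled as a lookup table from each output timeslot to an (input port, timeslot) pair. The sketch below is a functional model only, with hypothetical names; the selection for output port 2 is inferred from the output stream E, A, B, C given for the same configuration in the discussion of FIG. 4.

```python
# Subchannels on each input port, indexed by timeslot 1..4.
inputs = {1: ["A", "B", "C", "D"], 2: ["E", "F", "G", "H"]}

# config[output_port][k] = (input port, input timeslot) feeding output timeslot k+1.
config = {
    1: [(2, 4), (1, 4), (2, 2), (1, 1)],
    2: [(2, 1), (1, 1), (1, 2), (1, 3)],  # inferred from the stream E, A, B, C
}


def cross_connect(inputs, config):
    """Return the subchannels carried by each output port."""
    return {
        out: [inputs[port][slot - 1] for (port, slot) in selection]
        for out, selection in config.items()
    }


outputs = cross_connect(inputs, config)
carried = {ch for stream in outputs.values() for ch in stream}
dropped = {ch for stream in inputs.values() for ch in stream} - carried
print(outputs)
print(dropped)
```

Because each output timeslot selects its source independently, multicast (A selected twice) and dropped subchannels (G never selected) need no special handling in this model.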
A digital cross connect can be implemented in a straightforward manner by demultiplexing each input port, switching all of the timeslots of all of the input ports with a space switch, and then multiplexing each output port. This approach is illustrated in FIG. 2. The four timeslots of input port 1 are demultiplexed in demultiplexers (Demux) 32 such that each is carried on a separate line. All of these demultiplexed lines are then switched by a space switch 34 to the appropriate output timeslots. Finally, a set of multiplexers (Mux) 36 multiplexes the timeslots of each output channel onto each output port. This approach is used, for example, in the systems described in U.S. Pat. Nos. 3,735,049 and 4,967,405.
The space-switch architecture for a digital cross connect as shown in FIG. 2 has the advantage that it is conceptually simple and strictly non-blocking for arbitrary unicast and multicast traffic. However, it results in space switches that are too large to be economically used for large cross connects. For example, a digital cross connect with R=72 ports and T=48 timeslots requires an RT×RT (3456×3456) space switch with R²T²=11,943,936 cross points. Further, this large switch will be operated at a very slow rate. It will only need to switch a new batch of input timeslots after T bytes have been received. Thus, it operates at 1/T the byte rate.
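The size figures above can be checked with a few lines of arithmetic. The function name is illustrative only.

```python
def space_switch_size(ports, timeslots):
    """Return (lines per side, cross points) of a fully demultiplexed space switch."""
    lines = ports * timeslots   # one line per timeslot of every port
    return lines, lines * lines # every input line can reach every output line


lines, cross_points = space_switch_size(72, 48)
print(lines, cross_points)
```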
A more economical digital cross connect can be realized using a three-stage time-space-time (T-S-T) switch architecture as illustrated in FIG. 3. Here each input port is input to a time-slot interchanger (TSI) 38. A TSI switches a multiplexed input stream in time by interchanging the positions of the timeslots. To switch time-slot i to time-slot j, for example, slot i is delayed by T+j−i byte times. The multiplexed streams out of the input TSIs are then switched by an R×R space switch 40 that is reconfigured on each timeslot. The outputs of this space switch are switched in time again by a set of output TSIs 42. This T-S-T architecture is employed, for example, by the systems described in U.S. Pat. Nos. 3,736,381 and 3,927,267.
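The delay rule for a TSI can be checked numerically. The sketch below assumes bytes are numbered by absolute byte time, so that slot i of frame k arrives at time kT + i; under that assumption a delay of T + j − i always lands the byte in slot j of the following frame. The function names are hypothetical.

```python
def tsi_delay(i, j, T):
    """Delay, in byte times, that moves timeslot i to timeslot j."""
    return T + j - i


def departure(frame, i, j, T):
    """Return (frame, slot) at which a byte arriving in slot i of `frame` departs."""
    arrival_time = frame * T + i
    return divmod(arrival_time + tsi_delay(i, j, T), T)


T = 4
print(departure(0, 3, 0, T))  # slot 3 moved to slot 0: shortest delay, 1 byte time
print(departure(0, 0, 3, T))  # slot 0 moved to slot 3: longest delay, 2T - 1 byte times
```

The delay is always between 1 and 2T − 1 byte times, which is why a TSI needs to buffer more than one frame's worth of timeslots.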
An example of the operation of a T-S-T digital cross connect on the configuration of FIG. 2 is shown in FIG. 4. Here the TSI for input port 1 does not change the positions of its input timeslots. The input TSI for port 2, however, reorders its timeslots from E, F, G, H to -, F, H, E. The G here is dropped as it is not used by any output ports. The space switch takes the outputs of the two input TSIs and switches them, without changing timeslots, to create the streams A, F, H, D and A, B, C, E. Note that this involves a multicast of timeslot A to both outputs. Finally, the output TSIs reorder these streams to give the output streams H, D, F, A and E, A, B, C.
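The three stages of this example can be traced with a small functional model. The helper names and the index tables are assumptions chosen to reproduce the streams listed above; timeslots and streams are 0-based in the code.

```python
EMPTY = "-"  # marks a timeslot that carries no subchannel


def tsi(stream, order):
    """Time stage: output slot k carries input slot order[k] (None = empty)."""
    return [stream[s] if s is not None else EMPTY for s in order]


def space_switch(streams, select):
    """Space stage: during slot k, output o carries streams[select[o][k]][k]."""
    return [[streams[src][k] for k, src in enumerate(row)] for row in select]


port1 = ["A", "B", "C", "D"]
port2 = ["E", "F", "G", "H"]

# Input TSIs: port 1 unchanged, port 2 reordered with G dropped.
stage1 = [tsi(port1, [0, 1, 2, 3]), tsi(port2, [None, 1, 3, 0])]
# Space switch, reconfigured every timeslot; A is multicast in slot 0.
stage2 = space_switch(stage1, [[0, 1, 1, 0], [0, 0, 0, 1]])
# Output TSIs put each stream into its final order.
stage3 = [tsi(stage2[0], [2, 3, 1, 0]), tsi(stage2[1], [3, 0, 1, 2])]

print(stage1)
print(stage2)
print(stage3)
```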
A three-stage T-S-T digital cross connect is logically equivalent to a 3-stage Clos network with R T×T input stages, T R×R middle stages, and R T×T output stages. To route a configuration of input timeslots to output timeslots on such a switch, a middle-stage timeslot must be assigned to each connection. This routing is described in detail in Clos, Charles, “A Study of Non-Blocking Switching Networks”, Bell System Technical Journal, March 1953, pp. 406-424, and V. E. Benes, “On Rearrangeable Three-Stage Connecting Networks”, The Bell System Technical Journal, vol. XLI, No. 5, September 1962, pp. 1481-1492.
Digital cross connects, including grooming switches, typically have several disadvantages. First, as illustrated in FIG. 2, the size of a fully demultiplexed grooming switch typically increases quadratically with the number of timeslots times the number of ports. For example, with the simple DEMUX/MUX architecture, multiplexed input traffic is demultiplexed into its constituent timeslots. For STS-48 traffic, 48 individual byte-wide buses corresponding to 48 timeslots must be input into the switch for each port. Thus, if the port count is 72 ports, 3456 byte-wide buses must be coupled to the inputs of the switch. This results in some switch architectures being physically unrealizable due to size requirements.
With multi-stage switch architectures, as illustrated in FIGS. 3 and 4, the layout size issues are less dramatic. However, high latency, on the order of milliseconds, is associated with reconfiguration of input-output connections. Input-output connections are associations between input timeslots and output timeslots that define data paths through the switch in space and time. Such input-output connections may include input-output permutations and multicast connections. This latency typically stems from the complex scheduling computations that multi-stage cross connects use to reconfigure these connections. Such computations typically involve the selection of a middle-stage timeslot to route calls from a particular input timeslot to a particular output timeslot.
Embodiments of the invention provide a switch that switches streams of multiplexed traffic in both time and space domains. Such embodiments implement a distributed demultiplexing architecture for switching from any input timeslot to any output timeslot at a reduced layout size. Furthermore, such embodiments also reduce the latency associated with reconfiguration of input-output connections to the order of nanoseconds.
Embodiments of the invention include a number of inputs receiving data from external input links and a number of outputs transmitting data to external output links. A distributed demultiplexing switch architecture is implemented that includes intermediate storage units that are coupled to each of the inputs. Each intermediate storage unit stores input data from an input and provides an interface between the input and a subset of the outputs. The subset of outputs may include multiple outputs. Programmable selection storage enables the transfer of selected data from the intermediate storage units to the outputs.
Each intermediate storage unit may include P read ports with R/P intermediate storage units coupled to each input. According to one embodiment, P may be equal to eight (8) ports.
Each intermediate storage unit may include 2N locations, where N is the number of multiplexing intervals in a multiplexing cycle. According to one embodiment, N is equal to forty-eight (48) multiplexing intervals. For each intermediate storage unit, a first portion of the 2N locations stores a current column from an N STS-1 frame, while a second portion of the 2N locations stores a previous column from an N STS-1 frame. The second portion may be addressable as N STS-1 timeslots.
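One way to picture the 2N-location arrangement is as two halves that swap roles at each column boundary. The sketch below is a behavioral model under that assumption; the class and method names are hypothetical, and the text does not specify how the two portions are actually selected.

```python
class DoubleBufferedStore:
    """Behavioral model of a 2N-location intermediate storage unit."""

    def __init__(self, n):
        self.n = n
        self.cells = [None] * (2 * n)
        self.write_half = 0  # half currently receiving the current column

    def write(self, timeslot, byte):
        """Store one byte of the current column."""
        self.cells[self.write_half * self.n + timeslot] = byte

    def end_of_column(self):
        """Swap halves (assumed): the current column becomes the previous one."""
        self.write_half ^= 1

    def read_previous(self, timeslot):
        """Read any timeslot of the previous column, in any order."""
        return self.cells[(self.write_half ^ 1) * self.n + timeslot]


store = DoubleBufferedStore(4)
for slot in range(4):
    store.write(slot, f"col0.slot{slot}")
store.end_of_column()
store.write(0, "col1.slot0")   # the new column is arriving...
print(store.read_previous(3))  # ...while the previous column is still readable
```

Keeping the previous column intact while the current one is written is what lets an output read its timeslots in an arbitrary order without racing the input.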
According to a further embodiment, each intermediate storage unit may include N locations, where N is the number of multiplexing intervals within a multiplexing cycle. According to one embodiment N is equal to forty-eight (48) multiplexing intervals. Since reads and writes of such intermediate storage units access the same locations, delay memory is coupled to each output. When the output reads current data from the selected intermediate storage unit, the output reads from the delay memory. When the output reads previous data from the selected intermediate storage unit, the output reads from the selected intermediate storage unit.
The programmable selection storage provides an address signal to select data from an intermediate storage unit and an enable signal to enable output from one of the intermediate storage units that are coupled to different inputs. According to one embodiment, the selection storage includes a number of selection storage units with each being associated with an output.
Further embodiments of the invention provide additional reductions in the size of a switch layout through “multi-pumping.” With multi-pumping, each read port of the intermediate storage unit is coupled to multiple outputs, which are enabled successively. According to one embodiment, two or more outputs are coupled to each of the P read ports of an intermediate storage unit. The intermediate storage unit is read from the two or more outputs within a single clock cycle, reducing the number of intermediate storage units per input.
The intermediate storage unit may be a demultiplexing register file (DRF). According to one embodiment, a demultiplexing register file may comprise a cell array including at least N locations for storing data from an input timeslot and a write select coupled to the cell array for enabling a location in the cell array to be written with data from one of the input timeslots. A DRF may further include a number of read decoders coupled to the cell array with each read decoder coupled to a selection storage unit. Each read decoder receives an address signal from the selection storage unit and selects data from a location in the cell array with the address signal for reading to an output. A DRF may further include a comparator that receives an enable signal from the selection storage unit and compares the enable signal to an input port identifier. If the enable signal matches the input port identifier, the comparator enables the selected data from the cell array onto the output.
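The read path of a DRF described above can be modeled behaviorally as follows. This is a sketch with hypothetical names, not a circuit description: each read takes an address and an enable value from a selection storage entry, the read decoder picks a location, and the comparator lets the value through only when the enable value matches the DRF's input port identifier. The model uses None to represent a DRF that is not driving the output.

```python
class DemuxRegisterFile:
    """Behavioral model of one demultiplexing register file (DRF)."""

    def __init__(self, input_port_id, n_locations):
        self.input_port_id = input_port_id
        self.cells = [None] * n_locations

    def write(self, timeslot, byte):
        """Write select: store the byte of one input timeslot."""
        self.cells[timeslot] = byte

    def read(self, address, enable):
        """Read decoder plus comparator for one read port."""
        selected = self.cells[address]            # read decoder
        if enable == self.input_port_id:          # comparator
            return selected
        return None                               # output not driven


def read_output(drfs, address, enable):
    """Value seen by an output shared by DRFs coupled to different inputs."""
    driven = [v for v in (d.read(address, enable) for d in drfs) if v is not None]
    return driven[0] if driven else None


drfs = [DemuxRegisterFile(1, 4), DemuxRegisterFile(2, 4)]
for slot, byte in enumerate("ABCD"):
    drfs[0].write(slot, byte)
for slot, byte in enumerate("EFGH"):
    drfs[1].write(slot, byte)

# Selection storage for one output: (address, enable) per output timeslot.
selection = [(3, 2), (3, 1), (1, 2), (0, 1)]
print([read_output(drfs, addr, en) for addr, en in selection])
```

With the selection entries shown, the output reads H, D, F, A, matching output port 1 of the FIG. 1 example (addresses are 0-based in the code).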
Embodiments of a cell array for a DRF include a read circuit, at least one storage cell, and at least one write circuit. The write circuit transfers data from an input into the storage cell, while the read circuit drives the value in the storage cell onto an output. The cell array may include two or more storage cells with the read circuit being shared across the two or more storage cells. The read circuit is driven by a multiplexer, which selects a storage cell from the two or more storage cells having a value to be read onto an output.
Embodiments of the cell array further include a write select circuit and two or more write circuits. The write select circuit enables the two or more write circuits to write in succession. According to a further embodiment, the two or more storage cells may include a master storage cell and a slave storage cell. The at least one write circuit writes data into the master storage cell. The master storage cell, in turn, transfers the data into the slave storage cell. Finally, the data is read from the slave storage cell onto an output by the read circuit.
According to another embodiment of the invention, the switch may be reconfigured such that input-output connections are modified dynamically without corrupting frame data. Such embodiments are referred to as hitless configuration switching. Configuration switching may be implemented by rewriting the input-output connections defined within the selection storage units for each output. Embodiments for hitless configuration switching may include each output processor of each output overwriting all of the subframes of a first column of a frame with a fixed value (e.g., ‘F6’ for SONET frames). This ensures that the beginning of a new input frame is not corrupted due to the reconfiguration of the input-output connections.
According to an alternative embodiment for hitless configuration switching, each of the inputs includes an input processor, while each of the outputs includes an output processor. Each input processor writes columns of an input frame to intermediate storage units coupled to the input. On the output side, each output processor reads a column for an output frame from intermediate storage units or delay memory, which are coupled to the output. To ensure hitless configuration switching, the intermediate storage units may operate at a higher frequency than the frequency of the input processor and the output processor. According to one embodiment, the intermediate storage units may operate at a frequency that is (C+1)/C times the frequency of the input processors and the output processors, where C is the number of column intervals in a frame. In other words, the intermediate storage units may operate at a frequency such that the intermediate storage units have C+1 columns during a frame period, while the input processors and the output processors have C columns during the same frame period. According to one embodiment, C is equal to 810 columns. No writes are made to the intermediate storage unit during the (C+1)th column of a frame and no data is output to the output processors during the first column of a frame. According to one embodiment, an input FIFO (first-in, first-out queue) is coupled between the input processor and an intermediate storage unit and an output FIFO is coupled between the intermediate storage unit and the output processor.
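The clock relationship in this embodiment can be worked through with exact fractions. The sketch below uses C=810 as in the embodiment above; the function name is illustrative only.

```python
from fractions import Fraction


def storage_clock_ratio(c):
    """Ratio of intermediate-storage frequency to processor frequency."""
    return Fraction(c + 1, c)


C = 810
ratio = storage_clock_ratio(C)
# Column intervals the storage units complete while the processors complete C.
storage_columns = ratio * C
print(ratio, storage_columns)
print(float(ratio - 1) * 100)  # speed-up of the storage clock, in percent
```

The storage units therefore gain exactly one spare column interval per frame period, at the cost of a clock that is faster by 1/C (roughly 0.12% for C=810).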
The invention is particularly applicable to grooming switches, which are cross-connect switches that internally aggregate and segregate data for efficient traffic routing. Aggregation is the combining of traffic from different locations onto one facility. Segregation is the separation of traffic. For instance, a SONET grooming switch having 72 STS-48 input and output ports with STS-1 granularity routes any one of the 72×48=3,456 input STS-1 signals to any one of the 3,456 output STS-1s. Such a grooming switch is non-blocking for unicast traffic, where “blocking” occurs when an active input cannot be connected to an output.