Serializers (also known as parallel-to-serial converters or parallel-in serial-out [PISO] circuits) are widely used in data communication systems to convert a relatively low speed parallel data stream into a relatively high speed serial data stream. Because a serializer produces high speed serial data, it consumes a significant amount of power in a serial communication network. Improvements that reduce the power consumed by a serializer will therefore generally reduce the power consumed by serial communication network equipment.
Serial communication systems often employ an 8b/10b encoding scheme. 8b/10b encoding encodes 8-bit data into 10 bits. The encoding generally improves the physical signal and facilitates bit synchronization, error detection, and control character (i.e., the Special Character) encoding. 8b/10b encoding is used in high speed data communication protocols including Fibre Channel, Gigabit Ethernet, 10 Gigabit Ethernet, and ATM transmission interfaces. An 8b/10b encoder typically provides a 10-bit parallel output. Therefore, serializers used in such systems generally must serialize 10-bit parallel data.
FIG. 1 shows a conventional parallel load and shift register circuit 100 for serializing 10-bit parallel data. This circuit loads the 10-bit parallel data during the first clock cycle and then shifts the data for the next 9 clock cycles. Output 130 of last flip-flop 120 thereby produces a serial data stream corresponding to the parallel input data.
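The load-then-shift behavior described above can be illustrated with a minimal behavioral sketch. It is not a model of circuit 100 itself: the function name `parallel_load_shift` is hypothetical, and the bit order (the first element of the word leaves first) is an assumption, since the text does not state which input bit feeds the last flip-flop.

```python
def parallel_load_shift(word):
    """Behavioral sketch of a parallel load and shift serializer.

    Assumption: word[0] sits in the last flip-flop and is sent first.
    """
    if not word:
        return []
    regs = list(word)       # load cycle: each flip-flop captures one input bit
    out = [regs[0]]         # the last flip-flop drives the serial output
    for _ in range(len(word) - 1):
        regs = regs[1:] + [0]   # shift cycle: every bit moves one stage toward the output
        out.append(regs[0])
    return out


serial = parallel_load_shift([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])
```

For a 10-bit word, the sketch performs one load followed by nine shifts, matching the cycle count given above.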
FIG. 2 shows a timing diagram corresponding to serializer circuit 100 of FIG. 1. When load signal 203 is high, circuit 100 loads a sample of the 10-bit parallel input stream (e.g., Data<0:9>) into flip-flops 120 to 129. When the load signal is low, the flip-flops are connected as a shift register. Thus, during 10 clock cycles, 10 bits are shifted out of the last flip-flop, generating the serialized data stream. Serializer 100 also includes divider 154, configured to generate a divide-by-10 signal (e.g., signal 202 of FIG. 2) and a load signal 150 (e.g., signal 203 of FIG. 2). Divider 154 generally runs at the clock frequency. A divide-by-10 divider typically employs 4 flip-flops, so serializer circuit 100 typically requires 14 flip-flops (10 flip-flops 120-129 in the data path and 4 flip-flops in divider 154) operating at the clock frequency. Switching of the clock signal generally accounts for the majority of the power consumed in such a circuit. Therefore, the average power dissipated by the clock signal, Ps, may be calculated according to the equation:
$$P_s = 10 \cdot C V^2 f + 4 \cdot C V^2 f = 14 \cdot C V^2 f; \tag{1}$$

where C is the input capacitance of the clock pin of the flip-flop,
V is the power supply voltage, and
f is the clock frequency.
A similar equation can be derived for an 8-bit parallel load and shift serializer:
$$P_s = 8 \cdot C V^2 f + 3 \cdot C V^2 f = 11 \cdot C V^2 f. \tag{2}$$
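Equations (1) and (2) both count the flip-flops clocked at the full frequency f. A minimal sketch of that count, with power expressed in units of CV²f, follows. The function name is hypothetical, and the default divider size (the smallest number of flip-flops able to count M states) is an assumption that happens to agree with the 4 and 3 flip-flops stated in the text.

```python
def load_shift_clock_power(m_bits, divider_ffs=None):
    """Clock power of a parallel load and shift serializer, in units of C*V^2*f.

    Assumption: unless given, the divide-by-M divider uses ceil(log2(M))
    flip-flops, all clocked at the full clock frequency.
    """
    if divider_ffs is None:
        divider_ffs = (m_bits - 1).bit_length()  # integer form of ceil(log2(m_bits))
    return m_bits + divider_ffs  # data-path flip-flops plus divider flip-flops


p_10bit = load_shift_clock_power(10)  # 10 + 4, as in Equation (1)
p_8bit = load_shift_clock_power(8)    # 8 + 3, as in Equation (2)
```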
A tree-based serializer generally dissipates or consumes less power than a conventional parallel load and shift serializer. FIG. 3 shows conventional tree-based serializer circuit 300, which includes MUX 311 operating at half the clock frequency (Cdiv2), and MUXes 312 and 313 operating at one quarter of the clock frequency (Cdiv4). In circuit 300, only the last flip-flop 301 operates at the clock frequency (Clock). Flip-flops 302 and 303, in the previous stage, operate at Cdiv2. By extension of the 4-bit serializer shown in FIG. 3, an 8-bit serializer may have one flip-flop (e.g., flip-flop 301) operating at the clock frequency (Clock), two flip-flops (e.g., flip-flops 302 and 303) operating at half the clock frequency (Cdiv2), and 4 flip-flops (not shown in 4-bit serializer 300) operating at one quarter of the clock frequency (Cdiv4). In the dividers (e.g., divide-by-2 dividers 320 and 321), one flip-flop operates at the clock frequency (Clock), one at half the clock frequency (Cdiv2), and another at one quarter of the clock frequency (Cdiv4). Therefore, the average power dissipated by the clock signal, Pt, may be calculated according to the equation:
$$\begin{aligned}
P_t &= C V^2 f + \left(2 \cdot C_1 V^2 f/2\right) + \left(4 \cdot C_1 V^2 f/4\right) + C V^2 f + C V^2 f/2 + C V^2 f/4 \\
    &= \left(2.75 \cdot C V^2 f\right) + \left(2 \cdot C_1 V^2 f\right);
\end{aligned} \tag{3}$$

where C is the input capacitance of the clock pin of the flip-flop,
C1 is the sum of C plus the capacitance of the select pin of the multiplexer (normally C1<2*C),
V is the power supply voltage, and
f is the clock frequency.
The first three terms in Equation (3), $C V^2 f + (2 \cdot C_1 V^2 f/2) + (4 \cdot C_1 V^2 f/4)$, correspond to the clock power dissipated in the data path of the tree-based serializer. The last three terms of Equation (3), $C V^2 f + C V^2 f/2 + C V^2 f/4$, correspond to the clock power dissipated in the divider. Comparing Equation (2) for an 8-bit parallel load and shift serializer with Equation (3) for an 8-bit tree-based serializer, the tree-based architecture reduces the average power dissipated by roughly 40% in the worst case (when C1=2*C, giving $6.75 \cdot C V^2 f$ versus $11 \cdot C V^2 f$) and by roughly 55% in the best case (when C1=C, giving $4.75 \cdot C V^2 f$). Unlike the parallel load and shift serializer, the tree-based serializer includes only one flip-flop operating at the clock frequency in the data path. In addition to reducing power consumption, this feature relaxes constraints (e.g., timing constraints) on the circuit layout of the tree-based serializer, in comparison to the parallel load and shift serializer.
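The comparison above can be checked with a short calculation. The sketch below evaluates Equation (3) in units of CV²f for a given ratio C1/C and compares it against the 11 units of Equation (2). The function names are hypothetical, and exact fractions are used only to avoid rounding in the intermediate steps.

```python
from fractions import Fraction


def tree_clock_power(c1_over_c):
    """Equation (3) for an 8-bit tree-based serializer, in units of C*V^2*f."""
    ratio = Fraction(c1_over_c)
    # Data path: 1 flip-flop at f, 2 at f/2, 4 at f/4 (the last two load C1).
    data_path = 1 + 2 * ratio / 2 + 4 * ratio / 4
    # Divider: one flip-flop each at f, f/2 and f/4.
    divider = 1 + Fraction(1, 2) + Fraction(1, 4)
    return data_path + divider


def reduction_vs_load_shift(c1_over_c, load_shift_power=11):
    """Fractional clock-power saving relative to Equation (2)."""
    return 1 - tree_clock_power(c1_over_c) / load_shift_power


worst = reduction_vs_load_shift(2)  # C1 = 2*C
best = reduction_vs_load_shift(1)   # C1 = C
```

Under these assumptions the saving is 17/44 (about 39%) for C1=2*C and 25/44 (about 57%) for C1=C, in line with the approximate range quoted above.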
A tree-based serializer, however, generally requires parallel input data of $2^r$ bits, where r is an integer of at least 1 (e.g., 2 bits, 4 bits, 8 bits, 16 bits, etc.). As described above, many digital communication systems demand serializers for 10-bit parallel data. A conventional tree-based serializer cannot be used in such systems, because 10 is not a power of two. Therefore, it is desirable to provide high speed and relatively low power serialization of M-bit parallel data streams, where M is not a power of two.
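The power-of-two restriction follows from the structure of the tree: each 2:1 multiplexer stage halves the number of parallel lanes, so the lane count must halve cleanly down to one. A minimal behavioral sketch of this is shown below. The function name `tree_serialize` and the interleaving order (even-indexed bits on one branch, odd-indexed bits on the other) are assumptions for illustration, not details taken from FIG. 3.

```python
def tree_serialize(bits):
    """Behavioral sketch of a 2:1 multiplexer tree serializer.

    Assumption: each stage interleaves an 'even' and an 'odd' half-rate stream.
    """
    n = len(bits)
    if n == 0 or n & (n - 1):
        raise ValueError("tree serializer needs a power-of-two word width")
    if n == 1:
        return list(bits)   # a single lane needs no multiplexing
    even = tree_serialize(bits[0::2])   # half-rate stream feeding one mux input
    odd = tree_serialize(bits[1::2])    # half-rate stream feeding the other input
    out = []
    for e, o in zip(even, odd):         # the mux alternates between its two inputs
        out += [e, o]
    return out


serial = tree_serialize([1, 0, 1, 1, 0, 0, 1, 0])
```

Calling the sketch with a 10-bit word raises `ValueError`, which mirrors the limitation described above.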