1. Field of the Invention
The present invention relates to a circuit and a method for controlling data transmission. More specifically, the present invention discloses a circuit and a method for aligning transmitted data by adjusting transmission timing for a plurality of lanes.
2. Description of the Prior Art
Generally speaking, data transmission in a computer system requires a data bus for transferring predetermined data from a source device to a target device. For instance, the widely used PCI bus is capable of providing a bandwidth of 133 MB/s. However, with the development of disk arrays and gigabit Ethernet, the PCI bus is unable to meet the bandwidth requirements of users. Because chip manufacturers anticipated this situation, new bus architectures have been developed to relieve the load on the PCI bus. For example, with the development of 3D graphics processing, the image data transmitted between a graphics card and a system memory came to occupy almost all of the limited bandwidth of the PCI bus. As a result, other peripheral devices connected to the same PCI bus were greatly affected. An accelerated graphics port (AGP) architecture was therefore adopted in place of the PCI bus for delivering image data. Not only is the load on the PCI bus reduced, but the performance of 3D graphics processing is also further improved.
As mentioned above, the load on the PCI bus increases as the data processing capability of components within the computer system improves. Therefore, a 3rd generation I/O (3GIO) bus, that is, the PCI Express bus, is being developed to replace the prior art PCI bus and provide the required large bandwidth. It is well known that the PCI Express bus makes use of a higher operating clock and more lanes to boost bus performance. Please refer to FIG. 1, which is a diagram of a prior art PCI Express bus 11 utilizing a plurality of lanes to transmit data. Suppose that a transmitting device 10 wants to transfer a data stream 14a to a receiving device 12. Because the PCI Express bus 11 provides four lanes Lane0, Lane1, Lane2, and Lane3, the bytes B0-B7 included in the data stream 14a are distributed across the lanes Lane0, Lane1, Lane2, and Lane3 when the transmitting device 10 outputs the data stream 14a. In other words, bytes B0 and B4 are passed to the receiving device 12 through the lane Lane0, bytes B1 and B5 are passed to the receiving device 12 through the lane Lane1, bytes B2 and B6 are passed to the receiving device 12 through the lane Lane2, and bytes B3 and B7 are passed to the receiving device 12 through the lane Lane3. In the end, the receiving device 12 is capable of acquiring the wanted data stream 14a.
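For illustration only, the distribution of the bytes B0-B7 across the four lanes described above can be sketched in Python as follows. The function names stripe_bytes and merge_lanes are hypothetical and are introduced solely for this example; the sketch assumes a simple round-robin assignment that matches the byte-to-lane mapping described for FIG. 1.

```python
def stripe_bytes(stream, num_lanes):
    """Distribute items round-robin: item i goes to lane i % num_lanes."""
    lanes = [[] for _ in range(num_lanes)]
    for i, item in enumerate(stream):
        lanes[i % num_lanes].append(item)
    return lanes


def merge_lanes(lanes):
    """Rebuild the stream by taking one item from each lane in turn."""
    depth = max(len(lane) for lane in lanes)
    stream = []
    for row in range(depth):
        for lane in lanes:
            if row < len(lane):
                stream.append(lane[row])
    return stream


# Bytes B0-B7 of the data stream 14a, represented as labels.
data_stream = [f"B{i}" for i in range(8)]
lanes = stripe_bytes(data_stream, 4)
for n, lane in enumerate(lanes):
    print(f"Lane{n}: {lane}")
```

In this sketch, merging the lanes recovers the original order only if the items of all lanes are read in step with one another, which is the assumption that lane-to-lane timing differences can violate.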
The operating clock applied to the transmitting device 10 is different from the operating clock of the receiving device 12. If the operating clock of the transmitting device 10 has a frequency greater than that of the operating clock applied to the receiving device 12, the rate at which the transmitting device 10 outputs the data stream 14a is greater than the rate at which the receiving device 12 receives the data stream 14a. Therefore, the well-known overflow problem occurs. On the contrary, if the operating clock of the transmitting device 10 has a frequency less than that of the operating clock applied to the receiving device 12, the rate at which the transmitting device 10 outputs the data stream 14a is less than the rate at which the receiving device 12 receives the data stream 14a. Therefore, the well-known underflow problem occurs.
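The overflow and underflow conditions described above can be illustrated with a minimal Python sketch of a fixed-depth buffer that is written at the transmitter rate and read at the receiver rate. The function simulate_fifo, its parameters, the buffer depth, and the rate values are all assumptions chosen for this example; they do not correspond to any figure given in this description.

```python
def simulate_fifo(tx_per_1000, rx_per_1000, depth, ticks):
    """Model a buffer written tx_per_1000 times and read rx_per_1000
    times in every 1000 ticks. The buffer starts half full."""
    fill = depth // 2
    tx_acc = 0
    rx_acc = 0
    for _ in range(ticks):
        tx_acc += tx_per_1000
        rx_acc += rx_per_1000
        if tx_acc >= 1000:      # transmitter clock edge: one symbol written
            tx_acc -= 1000
            fill += 1
        if rx_acc >= 1000:      # receiver clock edge: one symbol read
            rx_acc -= 1000
            fill -= 1
        if fill > depth:
            return "overflow"   # writes outpace reads
        if fill < 0:
            return "underflow"  # reads outpace writes
    return "ok"


print(simulate_fifo(1000, 999, depth=8, ticks=10000))
print(simulate_fifo(999, 1000, depth=8, ticks=10000))
print(simulate_fifo(1000, 1000, depth=8, ticks=10000))
```

Even a small frequency difference eventually exhausts a buffer of any finite depth, which is why some form of rate compensation is needed in addition to buffering.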
In order to solve the problems caused by a mismatch between the operating clocks of the transmitting device 10 and the receiving device 12, the receiving device 12 has a plurality of elastic buffers to regulate the data outputted from the transmitting device 10 and transferred through the lanes Lane0, Lane1, Lane2, and Lane3. Based on the specification of the PCI Express bus, the transmitting device 10 outputs ordered sets that enable the elastic buffers to compensate for the different operating clocks adopted by the transmitting device 10 and the receiving device 12. For example, each ordered set outputted from the transmitting device 10 includes a COM symbol and three SKP symbols. When an elastic buffer positioned on the receiving device 12 receives a plurality of ordered sets, the elastic buffer reduces the number of SKP symbols in these ordered sets if the operating clock of the transmitting device 10 has a frequency greater than that of the operating clock applied to the receiving device 12. Therefore, the data transfer rate of the transmitting device 10 is effectively reduced, and the above overflow problem is resolved. Conversely, the elastic buffer increases the number of SKP symbols in these ordered sets if the operating clock of the transmitting device 10 has a frequency less than that of the operating clock applied to the receiving device 12. Therefore, the data transfer rate of the transmitting device 10 is effectively boosted, and the above underflow problem is resolved.
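The adjustment of SKP symbols by an elastic buffer can be sketched in Python as follows. This is a simplified illustration only: the function names make_ordered_set and adjust_ordered_set and the signed clock_delta parameter are hypothetical, and the sketch assumes that one SKP symbol is removed or added per ordered set, which the description above does not specify.

```python
COM = "COM"
SKP = "SKP"


def make_ordered_set(num_skp=3):
    """Build an ordered set: one COM symbol followed by SKP symbols."""
    return [COM] + [SKP] * num_skp


def adjust_ordered_set(ordered_set, clock_delta):
    """Return a copy of the ordered set with its SKP count adjusted.

    clock_delta > 0: transmitter clock is faster, so one SKP is dropped.
    clock_delta < 0: transmitter clock is slower, so one SKP is added.
    clock_delta == 0: the ordered set is passed through unchanged.
    """
    adjusted = list(ordered_set)
    if clock_delta > 0 and SKP in adjusted:
        adjusted.remove(SKP)   # removes a single SKP symbol
    elif clock_delta < 0:
        adjusted.append(SKP)   # appends a single SKP symbol
    return adjusted


original = make_ordered_set()
print(adjust_ordered_set(original, clock_delta=+1))
print(adjust_ordered_set(original, clock_delta=-1))
```

Because SKP symbols carry no payload, changing their number alters how many symbols the buffer holds without altering the data stream itself.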
Generally, the transmitting device 10 outputs ordered sets to the lanes Lane0, Lane1, Lane2, and Lane3 at the same time. However, the lanes Lane0, Lane1, Lane2, and Lane3 might have different lengths and impedances owing to different circuit layouts. That is, during data transmission, the lanes Lane0, Lane1, Lane2, and Lane3 might introduce different delays, so that the transmission timing of the lanes Lane0, Lane1, Lane2, and Lane3 is skewed. In other words, the receiving device 12 is unable to process the bytes B0, B1, B2, and B3 transmitted via the lanes Lane0, Lane1, Lane2, and Lane3 at the same time. Therefore, for the receiving device 12 to acquire the wanted data stream 14a, how to align the transmitted data of the lanes Lane0, Lane1, Lane2, and Lane3 becomes an important issue when implementing the PCI Express bus.
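The skew problem described above can be illustrated with a short Python sketch that models each lane as delaying the same ordered set by a different number of symbol times. The delay values, the IDLE placeholder, and the function names delay_lane and com_arrival_times are assumptions made purely for illustration; the sketch only measures the skew and does not represent any particular alignment method.

```python
IDLE = "---"  # hypothetical placeholder for "nothing received yet"


def delay_lane(symbols, delay):
    """Model a lane that delivers its symbols `delay` symbol times late."""
    return [IDLE] * delay + list(symbols)


def com_arrival_times(received_lanes):
    """Return, per lane, the symbol time at which COM is first seen."""
    return [lane.index("COM") for lane in received_lanes]


# The same ordered set is sent on all four lanes at the same time.
ordered_set = ["COM", "SKP", "SKP", "SKP"]
lane_delays = [0, 2, 1, 3]  # assumed per-lane delays, in symbol times

received = [delay_lane(ordered_set, d) for d in lane_delays]
arrivals = com_arrival_times(received)
skew = max(arrivals) - min(arrivals)

for n, lane in enumerate(received):
    print(f"Lane{n}: {lane}")
print("COM arrival times:", arrivals, "lane-to-lane skew:", skew)
```

In this model, symbols that were transmitted together no longer occupy the same position in each received lane, so reading across the lanes at a single instant would mix bytes belonging to different positions of the data stream.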