1. Field of the Invention
The present invention relates generally to a cross bar apparatus for mutually transferring data in packets between a plurality of LSI modules such as MPUs and subsystems, and to a control method and a program thereof, and more particularly, to a cross bar apparatus that improves the throughput of data packets written into input queues inside the cross bar, and to a control method and a program thereof.
2. Description of the Related Art
Conventionally, a cross bar apparatus has a function for relaying data packets between different LSI modules. Such a cross bar apparatus is provided with input queues for storing packets transferred from the LSI modules. The cross bar apparatus is adapted to write the received packets into the input queues, classifying the packets by destination, and thereafter to select the packets read from the input queues, sorting them into output queues, one of which is provided for each destination, and to transmit the packets from the output queues to the destination LSI modules. FIG. 1 is a block diagram of an input queue unit provided to a conventional cross bar apparatus. In FIG. 1, data packets transferred according to an external clock from an originator LSI module through an external bus are received by a packet receiving unit (not shown). In this case, the frequency of the internal clock of the cross bar apparatus is set at ½ of that of the external clock used for transferring the packets mutually between the LSI modules, and the width of the internal buses is configured to be two (2) times that of the external buses. For example, when the width of the external buses is 36 bits, that of the internal buses is 72 bits, obtained by using parallel internal buses. A transferred packet consists of a header and a plurality of words, and has a length equal to an odd number of words. Having received packets from the external buses, the packet receiving unit outputs the headers of the received packets and the words following the headers in parallel to the internal buses, classifying these headers and words into those received at even-numbered reception timing and those received at odd-numbered reception timing as defined by the external clock. The headers and the words received at even-numbered reception timing are inputted into an even-numbered latch unit 200, and the headers and the words received at odd-numbered reception timing are inputted into an odd-numbered latch unit 202.
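The relationship between the clock ratio and the bus widths described above can be checked with a short calculation. The following is only an illustrative sketch: the constant and function names are hypothetical, and the external clock period is used as the unit of time, which is an assumption made for the example.

```python
from fractions import Fraction

# Values taken from the example in the text.
EXTERNAL_BUS_BITS = 36
INTERNAL_BUS_BITS = 2 * EXTERNAL_BUS_BITS      # 72 bits, two parallel internal buses
INTERNAL_CLOCK_RATIO = Fraction(1, 2)          # internal clock = 1/2 of external clock


def bits_per_external_cycle(bus_bits, clock_ratio):
    """Bits moved per external clock period (hypothetical helper)."""
    return bus_bits * clock_ratio


external_rate = bits_per_external_cycle(EXTERNAL_BUS_BITS, Fraction(1))
internal_rate = bits_per_external_cycle(INTERNAL_BUS_BITS, INTERNAL_CLOCK_RATIO)

# Halving the clock while doubling the width keeps the transfer rate unchanged.
print(external_rate, internal_rate)
```

Under these assumptions, one internal clock cycle carries exactly the two words that arrive during two external clock cycles, which is why the words are split by even-numbered and odd-numbered reception timing.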
The even-numbered latch unit 200 branches an input path thereof into a header portion passing path 204 and a data portion passing path 206. A header ECC detection/correction unit 208 and a header latch 210 are provided to the header portion passing path 204, and a data latch 212 is provided to the data portion passing path 206. The packets are inputted from a selector 216 through a path 215 into an input queue unit 230. Similarly, the odd-numbered latch unit 202 branches an input path thereof into a header portion passing path 217 and a data portion passing path 218. A header ECC detection/correction unit 220 and a header latch 222 are provided to the header portion passing path 217, and a data latch 224 is provided to the data portion passing path 218. The packets are inputted from a selector 226 through a path 225 into the input queue unit 230. The header ECC detection/correction units 208 and 220 detect and correct errors in the headers of the packets and, after retaining the headers in the header latches 210 and 222, write the headers into the input queue unit 230. Concurrently, having detected that the headers are valid, the header ECC detection/correction units 208 and 220, based on data length information contained in the headers, select the words of the packets corresponding to the data length through the data portion passing paths 206 and 218 and send these words to the input queue unit 230. The input queue unit 230 is provided with an even-numbered input queue unit 232 and an odd-numbered input queue unit 234. Transfer addresses are divided into four transfer address groups in order to improve the throughput by reducing the rate of occurrence of same addresses, and the even-numbered input queue unit 232 is provided with a first even-numbered queue 236-1, a second even-numbered queue 236-2, a third even-numbered queue 236-3 and a fourth even-numbered queue 236-4, each of which uses a FIFO buffer, one for each of the transfer address groups.
Similarly, the odd-numbered input queue unit 234 is also provided with a first odd-numbered queue 238-1, a second odd-numbered queue 238-2, a third odd-numbered queue 238-3 and a fourth odd-numbered queue 238-4, one for each of the four transfer address groups created by the dividing. Considering the input queue unit 230 for each of the transfer address groups, and taking the first transfer address group as an example, the input queue unit 230 for the first transfer address group consists of two (2) queues: the first even-numbered queue 236-1, into which the header of a packet and the words following the header are written when their reception timing is even-numbered reception timing of the external clock, and the first odd-numbered queue 238-1, into which they are written when their reception timing is odd-numbered reception timing. Similarly, the second transfer address group consists of two (2) queues that are the second even-numbered queue 236-2 and the second odd-numbered queue 238-2, the third transfer address group consists of two (2) queues that are the third even-numbered queue 236-3 and the third odd-numbered queue 238-3, and the fourth transfer address group consists of two (2) queues that are the fourth even-numbered queue 236-4 and the fourth odd-numbered queue 238-4. Furthermore, each of the queues 236-1 to 236-4 and 238-1 to 238-4 provided to the input queue unit 230 uses a register file that can be written and read in one (1) clock cycle, and consists of eight (8) stages of packet storage sectors to allow up to eight (8) packets to be stored therein.
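The organization of the input queue unit 230 described above (four transfer address groups, each with one even-numbered queue and one odd-numbered queue of eight stages) can be sketched as a simple data structure. This is a minimal illustration only: the class names, method names and the choice to treat one queue entry as the portion of one packet stored in that queue are assumptions made for the example and do not describe the actual register file hardware.

```python
NUM_ADDRESS_GROUPS = 4   # four transfer address groups
QUEUE_DEPTH = 8          # eight packet storage stages per queue


class PacketQueue:
    """FIFO with a fixed number of packet storage stages (hypothetical model)."""

    def __init__(self, depth=QUEUE_DEPTH):
        self.depth = depth
        self.entries = []

    def is_full(self):
        return len(self.entries) >= self.depth

    def write(self, packet_portion):
        # Reject the write instead of silently dropping stored data.
        if self.is_full():
            raise OverflowError("queue is full")
        self.entries.append(packet_portion)

    def read(self):
        # First in, first out.
        return self.entries.pop(0)


class InputQueueUnit:
    """One even-numbered and one odd-numbered queue per transfer address group."""

    def __init__(self):
        self.even = [PacketQueue() for _ in range(NUM_ADDRESS_GROUPS)]
        self.odd = [PacketQueue() for _ in range(NUM_ADDRESS_GROUPS)]

    def queue_for(self, group, timing):
        if timing not in ("even", "odd"):
            raise ValueError("timing must be 'even' or 'odd'")
        queues = self.even if timing == "even" else self.odd
        return queues[group]


unit = InputQueueUnit()
first_even = unit.queue_for(0, "even")   # corresponds to queue 236-1
first_odd = unit.queue_for(0, "odd")     # corresponds to queue 238-1
```

In this sketch, group index 0 stands for the first transfer address group, so `first_even` and `first_odd` play the roles of the queues 236-1 and 238-1.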
FIG. 2 shows the latch timing of a packet whose header is received at even-numbered reception timing of the external clock, and the writing of the packet into the input queue unit, in terms of the even-numbered latch unit 200, the odd-numbered latch unit 202, and the first even-numbered queue 236-1 and the first odd-numbered queue 238-1 of the first transfer address group of FIG. 1. For simplicity of description, the even-numbered latch unit 200 and the odd-numbered latch unit 202 are each shown as one (1) latch, though each of the latch units 200 and 202 has a header latch and a data latch. In FIG. 2, it is assumed that a packet comprising a header H and data words D0 to D7, and thus having a length of nine (9) words, is received in synchronization with the external clock. In this case, the header H and the words D1, D3, D5 and D7 are received at even-numbered timing of the external clock and, therefore, are inputted into the even-numbered latch unit 200 one after another, while the words D0, D2, D4 and D6 are received at odd-numbered timing of the external clock and, therefore, are inputted into the odd-numbered latch unit 202 one after another. Thus, the even-numbered latch unit 200 and the odd-numbered latch unit 202 latch the words of the received packet, including the header, two (2) words at a time, and write the header and the words two (2) words at a time into the first even-numbered queue 236-1 and the first odd-numbered queue 238-1 using the paths 215 and 225 alternately between the latches. The word D7 at the end of the packet is written as “one (1)-word writing” into the first even-numbered queue 236-1.
FIG. 3 shows the latch timing of a received packet whose header is received at odd-numbered reception timing of the external clock, and the writing of the packet into the input queue unit. In this case, the header H and the words D1, D3, D5 and D7 are received at odd-numbered timing of the external clock and, therefore, are inputted into the odd-numbered latch unit 202 one after another, while the words D0, D2, D4 and D6 are received at even-numbered timing of the external clock and, therefore, are inputted into the even-numbered latch unit 200 one after another. As to the header H at the head of the received packet, because no data are present in the even-numbered latch unit 200 when the header H is latched by the odd-numbered latch unit 202, the header H is written as one (1)-word writing into the first odd-numbered queue 238-1. As to the words D0 to D7 of the packet following the header H, because the words D0 to D7 are latched two (2) words at a time by the even-numbered latch unit 200 and the odd-numbered latch unit 202, the words D0 to D7 are written into the first even-numbered queue 236-1 and the first odd-numbered queue 238-1 two (2) words at a time using the paths 215 and 225 alternately for each of the latches. However, this writing of received packets into a conventional input queue unit has the following problem. In the case where packets addressed to the same destination, each having a length equal to an odd number of words, are received in succession, the throughput is reduced because, depending on the timing at which the packets are written into the queues, voids may exist between packets when the packets are read and transferred from the input queue unit.
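The two write patterns described for FIG. 2 and FIG. 3 can be reproduced with a small simulation. The sketch below is illustrative only: the function name `write_cycles` and the convention that external clock tick 0 is even-numbered timing are assumptions made for the example. Each returned pair represents one write, with the word for the even-numbered queue first and the word for the odd-numbered queue second; `None` marks a one (1)-word writing.

```python
def write_cycles(words, header_timing):
    """Pair the words of one packet into two-word writes.

    Returns a list of (even_word, odd_word) tuples, one per write.
    None means that no word is written to that queue in that write.
    """
    if header_timing not in ("even", "odd"):
        raise ValueError("header_timing must be 'even' or 'odd'")
    # Assumed convention: external tick 0 is even timing, tick 1 is odd timing.
    start = 0 if header_timing == "even" else 1
    slots = [None] * start + list(words)
    if len(slots) % 2:
        slots.append(None)  # last word has no partner
    return [(slots[i], slots[i + 1]) for i in range(0, len(slots), 2)]


# Nine-word packet: header H followed by data words D0 to D7.
packet = ["H"] + ["D%d" % i for i in range(8)]

fig2_writes = write_cycles(packet, "even")  # header at even-numbered timing
fig3_writes = write_cycles(packet, "odd")   # header at odd-numbered timing

for label, writes in (("even", fig2_writes), ("odd", fig3_writes)):
    print(label, writes)
```

With the header at even-numbered timing, the final write contains only D7 for the even-numbered queue; with the header at odd-numbered timing, the first write contains only H for the odd-numbered queue, and D0 to D7 follow as full two-word writes.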
FIG. 4 shows the state of writing into the first even-numbered queue 236-1 and the first odd-numbered queue 238-1 when packets addressed to the same destination, each having a length of five (5) words, are sequentially received at even-numbered timing of the external clock. As to the first received packet, because a header H and words D1 and D3 of the packet are received at even-numbered timing of the external clock, the header H and the words D1 and D3 are written into the first even-numbered queue 236-1. Because words D0 and D2 of the packet are received at odd-numbered timing of the external clock, the words D0 and D2 are written into the first odd-numbered queue 238-1. The same process is applied also to the next received packet. When the packets are read from the input queue unit that stores the packets received sequentially at even-numbered timing of the external clock as described above, the packets are read by setting read pointers P1 to P3. In this case, as to the read pointers P1 and P2, reading is parallel reading of two (2) words. However, as to the last data word D3 of the packet, reading is one (1)-word reading by the read pointer P3 and, therefore, a void 242 which contains no data is generated between the packets including a read packet 240 when the packets are read sequentially. Therefore, the throughput is reduced.
FIG. 5 shows the case where packets are read from the input queue unit that stores packets which are addressed to the same destination and have been received at odd-numbered timing of the external clock. Though the packets are read by setting the read pointers P1 to P3, the reading of the header H by the read pointer P1 is one (1)-word reading and, therefore, a void 246 which contains no data is generated between the packets including a read packet 244 when the packets are read sequentially. Therefore, the throughput is reduced. Programs executable by processors to control cross bar apparatuses are usually stored on computer readable storage media, such as RAM, ROM, CDs, etc.
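The effect of the voids 242 and 246 on throughput can be estimated with a short calculation. The following sketch is illustrative only and rests on stated assumptions: each read pointer position is taken to occupy one two-word read, each packet is read at the alignment with which it was stored, and the names `read_slots` and `bus_utilization` are hypothetical.

```python
from fractions import Fraction


def read_slots(words, header_timing):
    """Word slots occupied when one packet is read two words at a time.

    None marks a void, i.e. a slot that carries no data.
    """
    if header_timing not in ("even", "odd"):
        raise ValueError("header_timing must be 'even' or 'odd'")
    start = 0 if header_timing == "even" else 1
    slots = [None] * start + list(words)
    if len(slots) % 2:
        slots.append(None)
    return slots


def bus_utilization(packets, header_timing):
    """Fraction of read slots that carry data over a sequence of packets."""
    slots = []
    for words in packets:
        slots.extend(read_slots(words, header_timing))
    used = sum(1 for s in slots if s is not None)
    return Fraction(used, len(slots))


# Five-word packet: header H followed by data words D0 to D3.
packet = ["H", "D0", "D1", "D2", "D3"]

even_slots = read_slots(packet, "even")   # void after D3, as with void 242
odd_slots = read_slots(packet, "odd")     # void before H, as with void 246
even_utilization = bus_utilization([packet, packet], "even")
odd_utilization = bus_utilization([packet, packet], "odd")

print(even_slots, even_utilization)
print(odd_slots, odd_utilization)
```

Under these assumptions, each five (5)-word packet occupies three reads, that is, six word slots, so one slot in six carries no data in either case.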