A computer system can be broken into three basic blocks: a central processing unit (CPU), a memory, and one or more input/output (I/O) interfaces. These blocks are interconnected by a bus. The blocks communicate with one another by transmitting digital data over the bus. An input device such as a keyboard, mouse, disk drive, analog-to-digital converter, etc., is used to input instructions and data to the computer system via the I/O interface. These instructions and data can be stored in memory. The CPU retrieves the data from the memory and processes it as directed by the computer program. The results can then be stored back into memory or output for display via the I/O interface to an output device such as a printer, cathode-ray tube (CRT) display, etc.
In the past, CPUs were rather unsophisticated. They could only process a limited number of bits per clock cycle. For example, early microprocessors tended to be 8-bit machines. Eventually, with rapid advances in semiconductor and computing technology, 16, 32, and even 64-bit microprocessors were developed. These microprocessors were also designed to operate at higher clock frequencies. In addition, the advances in semiconductor technology led to memory chips having greater storage capacities. This combination of increased processing power and storage capacity has led to more powerful, versatile, and faster computer systems.
Ever greater amounts of data needed to be transmitted between the processor and the memory. Consequently, higher data transmission rates were implemented. But eventually, it became infeasible to continue increasing the data rate due to physical limitations. One solution to this problem was to increase the bus width, so that more bits of data can be transmitted over the bus per clock cycle. However, utilizing a wider bus is disadvantageous in some respects. A wider bus consumes more of the silicon area. Hence, a larger die is required. Increasing the die size directly translates into increased production costs because fewer dies (i.e., chips) can be fabricated from any given wafer. Furthermore, a wider bus consumes more power. Minimizing power consumption is of critical importance in laptop, notebook, and other portable computer systems. Moreover, increasing the bus width adds to the overall complexity of the chip.
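As a rough illustration of why widening the bus raises throughput, the idealized peak transfer rate can be sketched as the bus width multiplied by the clock frequency. The function name and the 33 MHz clock below are hypothetical, chosen only for the example, and the model assumes one full-width transfer per clock cycle with no protocol overhead.

```python
def peak_bandwidth_bytes_per_sec(bus_width_bits, clock_hz):
    """Idealized peak rate, assuming one full-width transfer per clock cycle."""
    return bus_width_bits * clock_hz // 8


# Hypothetical 33 MHz bus evaluated at two widths.
narrow = peak_bandwidth_bytes_per_sec(32, 33_000_000)
wide = peak_bandwidth_bytes_per_sec(64, 33_000_000)
print(narrow, wide)  # 132000000 264000000
```

Under these assumptions, doubling the width doubles the peak rate at the same clock frequency, which is the benefit weighed against the die area, power, and complexity costs described above.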
In many instances, the benefits conferred by a wider bus outweigh its attendant disadvantages. However, if a computer program happens to be transmitting many sets of data, one small piece at a time, the overall bus traffic is increased. For certain operations such as string moves, string copies, or bit block transfers in graphics applications, cached memory accesses waste bus bandwidth with read-for-ownership cache line fills when most accesses are writes. In other words, while the individual small pieces of data are being transmitted over the wide bus, other components wishing to transmit data or waiting to receive data must wait until the bus becomes available. Consequently, the data throughput is reduced, notwithstanding the wider bus.
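The cost of sending small pieces over a wide bus can be sketched with a simple model. The following is a minimal illustration, not a description of any particular bus protocol: it assumes each piece is sent as its own transaction and occupies whole bus cycles, and the function names and the 8-byte bus width are hypothetical.

```python
import math


def bus_cycles(total_bytes, piece_bytes, bus_bytes):
    """Bus cycles used when each piece is sent as a separate transaction."""
    pieces = math.ceil(total_bytes / piece_bytes)
    cycles_per_piece = math.ceil(piece_bytes / bus_bytes)
    return pieces * cycles_per_piece


def bus_efficiency(total_bytes, piece_bytes, bus_bytes):
    """Fraction of the bus capacity that carries useful data."""
    cycles = bus_cycles(total_bytes, piece_bytes, bus_bytes)
    return total_bytes / (cycles * bus_bytes)


# Hypothetical 8-byte-wide bus moving 64 bytes in total.
print(bus_cycles(64, 1, 8), bus_efficiency(64, 1, 8))  # 64 0.125
print(bus_cycles(64, 8, 8), bus_efficiency(64, 8, 8))  # 8 1.0
```

In this model, moving 64 bytes one byte at a time holds the bus for eight times as many cycles as moving the same data in full-width pieces, during which other components must wait.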
Therefore, there is a need in the prior art for an apparatus and method for improving the efficiency of bus utilization.