Computer systems have historically included at least one main electronics enclosure which is arranged to have mounted therein a plurality of circuit boards. The circuit boards, in turn, typically include a large number of integrated circuits (ICs) or so-called "chips". With continuing advances in circuit integration and miniaturization, more and more of the individual chips are being integrated into fewer "full function" chips which provide more functionality in less space.
As chip densities continue to increase, more of the computer system functions such as audio, video, and graphics, which have heretofore been normally coupled to a processor at the card level, are now being integrated onto the same IC as the system processor. With this increase in the number of functions being combined in a single IC, the bandwidth requirements of on-chip busses have also increased. As a result, several "on-chip" bus architectures are being developed to address the on-chip communication of processor cores and peripherals.
Most of the on-chip architectures have used the same bus architecture techniques that have been used for off-chip busses. For example, a single data bus is normally used for both read and write operations. Master and slave devices attached to the bus share the common read/write data bus. That technique is the most widely used architecture because of the limited number of I/O pins available at the chip boundary of the bus masters and slaves. Several bus architectures even share one common bus for the address transfer as well as the read and write data transfers. That methodology requires that the address transfer phase be performed prior to the data transfer phase, which, in turn, reduces the bandwidth of the bus.
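The bandwidth cost of sharing one bus for address and data can be illustrated with a simple cycle count. The following is an idealized sketch rather than a description of any particular bus: it assumes that every address phase and every data phase takes exactly one cycle, and the function names are hypothetical.

```python
def cycles_shared_bus(num_transfers: int) -> int:
    """Cycles on a single bus that carries both address and data.

    Each transfer needs an address phase followed by a data phase,
    so the bus is occupied for two cycles per transfer (assumed).
    """
    return 2 * num_transfers


def cycles_separate_address_bus(num_transfers: int) -> int:
    """Cycles when address and data travel on separate busses.

    Assumes the address of transfer k+1 can be issued while the data
    of transfer k is moving, so only the first address phase is exposed.
    """
    if num_transfers == 0:
        return 0
    return num_transfers + 1


def data_bus_utilization(num_transfers: int, total_cycles: int) -> float:
    """Fraction of cycles in which data is actually transferred."""
    return num_transfers / total_cycles if total_cycles else 0.0


if __name__ == "__main__":
    n = 8
    shared = cycles_shared_bus(n)
    separate = cycles_separate_address_bus(n)
    print(shared, data_bus_utilization(n, shared))
    print(separate, data_bus_utilization(n, separate))
```

Under these assumptions, eight transfers occupy 16 cycles on the shared bus and 9 cycles when the address has its own bus.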
An on-chip bus is not limited by the number of I/O signals that an interface may contain. Many of the off-chip architectures have been optimized to reduce the number of I/O pins because of package constraints, and the performance of the bus is degraded as a result. On-chip busses may have many more interface signals without the associated cost of high pin count packages. Separate address, read data and write data busses are also feasible since the penalty for additional I/O is reduced. As a result, new bus architectures are being developed which take advantage of separate read and write data busses. An implementation of such an architecture is the so-called Processor Local Bus or PLB architecture. The PLB design contains a processor, a DMA controller, an on-chip peripheral bus (OPB) bridge, and an external bus interface unit.
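The advantage of separate read and write data busses can be sketched with the same kind of cycle count. This is a simplified model under stated assumptions: one data beat per cycle, reads and writes that are independent of each other and can fully overlap, and no address or arbitration overhead. The function names are hypothetical.

```python
def cycles_shared_data_bus(reads: int, writes: int) -> int:
    """Reads and writes take turns on one common data bus."""
    return reads + writes


def cycles_separate_data_busses(reads: int, writes: int) -> int:
    """Reads and writes proceed concurrently on their own busses.

    Assumes full overlap, so the longer of the two streams
    determines the total time.
    """
    return max(reads, writes)


if __name__ == "__main__":
    print(cycles_shared_data_bus(6, 4))
    print(cycles_separate_data_busses(6, 4))
```

With six reads and four writes, the shared data bus is busy for 10 cycles while the separate busses finish in 6 under these assumptions.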
In the design of an embedded processor, an on-chip bus architecture is required to provide high bandwidth for the processor and for the Direct Memory Access (DMA) controller to access memory as well as internal and external DMA peripherals. The external DMA peripherals reside off-chip on the memory data bus. The internal DMA peripherals reside on an on-chip peripheral bus (OPB). The external bus interface unit (EBIU) connects the external bus to the processor local bus (PLB). The OPB bridge connects the PLB to the OPB.
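The arrangement of busses and bridges described above can be written down as a small data structure. The sketch below is illustrative only: the dictionary and function names are hypothetical, and it assumes that the external bus and the off-chip memory data bus are the same bus.

```python
from collections import deque

# Which units attach to which bus, following the description in the text.
# Assumption: "external bus" and "memory data bus" name the same bus.
BUS_ATTACHMENTS = {
    "PLB": {"processor", "DMA controller", "EBIU", "OPB bridge"},
    "OPB": {"OPB bridge", "internal DMA peripheral"},
    "external bus": {"EBIU", "external DMA peripheral"},
}


def busses_crossed(source: str, destination: str):
    """Return the busses traversed on a shortest path between two units.

    Breadth-first search over units; two units are neighbors when they
    attach to the same bus. Returns None if no path exists.
    """
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        unit, path = queue.popleft()
        if unit == destination:
            return path
        for bus, units in BUS_ATTACHMENTS.items():
            if unit not in units:
                continue
            for neighbor in units:
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append((neighbor, path + [bus]))
    return None


if __name__ == "__main__":
    print(busses_crossed("DMA controller", "internal DMA peripheral"))
    print(busses_crossed("DMA controller", "external DMA peripheral"))
```

In this model, the DMA controller reaches an internal peripheral by crossing the PLB and then the OPB, and reaches an external peripheral by crossing the PLB and then the external bus.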
Typically, data transfers across the external bus or the OPB require internal buffering by the EBIU or the OPB bridge. The data is first read from the source into the buffer and is then written from the buffer to the destination. This sequential transfer process involves two separate transfer cycles on the processor bus. Accordingly, there is a need for an enhanced method and processing apparatus which improve data transfer performance by reducing the processor bus utilization required for a DMA controller to service peripheral devices.
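The cost of the buffered, sequential transfer can be made concrete by counting the cycles it places on the processor bus. The following is a minimal sketch with hypothetical class and function names; it models only the conventional two-cycle transfer described above and assumes one word moves per bus cycle.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorBus:
    """Records every transfer cycle that occupies the processor bus."""
    log: list = field(default_factory=list)

    def transfer(self, source: str, destination: str, word: int) -> int:
        # One log entry per bus cycle (assumed: one word per cycle).
        self.log.append((source, destination, word))
        return word


def buffered_dma_transfer(bus, words, source="memory", destination="peripheral"):
    """Move words from source to destination through an internal buffer."""
    delivered = []
    for word in words:
        # Cycle 1: read the word from the source into the internal buffer.
        buffered = bus.transfer(source, "buffer", word)
        # Cycle 2: write the buffered word out to the destination.
        delivered.append(bus.transfer("buffer", destination, buffered))
    return delivered


if __name__ == "__main__":
    bus = ProcessorBus()
    buffered_dma_transfer(bus, [0x10, 0x20, 0x30])
    print(len(bus.log))
```

In this model, moving three words occupies the processor bus for six cycles, two per word, which is the utilization that the stated need seeks to reduce.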