Recent advances in silicon densities allow for the integration of numerous functions onto a single silicon chip. With this increased density, peripheral devices formerly attached to a processor at the card level are integrated onto the same die as the processor. This type of implementation of a complex circuit on a single die is referred to as a system-on-a-chip (SOC). With a proliferation of highly integrated system-on-a-chip designs, the shared bus architecture that allows major functional units to communicate is commonly utilized. There are many different shared bus designs, which fit into a few distinct topologies. A known approach in shared bus topology is for multiple masters to present requests to an arbiter of the shared bus for accessing an address range of an address space, such as an address space of a given slave device. The arbiter awards bus control to the highest priority request based on a request prioritization algorithm. As an example, a shared bus may include a Processor Local Bus that may be part of a CoreConnect bus architecture of International Business Machines Corporation (IBM).
Thus, a system-on-a-chip or Ultra Large Scale Integration (ULSI) design typically comprises multiple masters and slave devices connected through the Processor Local Bus (PLB). The PLB consists of a PLB core (arbiter, control and gating logic) to which masters and slaves are attached. The PLB architecture typically supports up to 16 masters. A master can perform read and write operations at the same time in an address-pipelined architecture, because the PLB architecture has separate read and write buses. However, the PLB architecture cannot initiate requests for both a read and a write at the same time. In a given system-on-a-chip (SOC) application, PLB bus utilization can be improved using the overlapped read and write transfer feature of the PLB architecture.
As mentioned, one example of a bus utilized by SOC computer systems is the CoreConnect™ PLB. In an SOC with this PLB architecture, each device attaches to a central resource called the “PLB Macro”. The PLB Macro is a block of logic that acts as the bus controller, interconnecting all the devices (masters/slaves) of the SOC. The PLB Macro primarily includes arbitration functions, routing logic, buffering and registering logic. The devices communicate over the bus via a PLB protocol in a synchronous manner. The protocol includes rules that control how transmission processes are to be completed, including, for example, the number of clock cycles taken to perform certain sequences. Among these sequences are (1) the time from a request at the initiating device to a snoop result at the initiating device, and (2) the time from read data at a source device (the target) to read data at the destination device (the initiator), etc.
In a typical architecture that includes a PLB, each master is in electrical communication with the PLB core via at least one dedicated port or line. The multiple slaves, in turn, are connected to the PLB core via a PLB shared data bus and a command bus, allowing each master to communicate with each slave connected to the PLB shared data bus and the command bus. Each slave has an address, which allows a master to select and communicate with a particular slave among the plurality of slaves. When a master wants to communicate with the particular slave, the master sends certain information to the PLB core for distribution to the slaves. Examples of this information are the selected bus command, the write_data command and the address of the slave.
If the slave address sent by the master matches the address of a slave, then that slave has been selected and the action requested by the master is performed. Because each slave has a unique address, multiple slave selections in a single request by one master are prevented and each slave can only be accessed by one master at a time. In the case where multiple masters are making requests to the same targeted slave, the PLB core includes an arbiter circuit which determines request priority based on a predetermined priority level or priority scheme.
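The address match and priority-based arbitration described above can be illustrated with a minimal Python sketch. It is a software model under stated assumptions, not the PLB core's actual logic: the names `Slave`, `Request`, `decode` and `arbitrate` are hypothetical, each slave is assumed to own one contiguous address range, a larger priority value is assumed to be more urgent, and ties are assumed to go to the lowest master number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slave:
    name: str
    base: int  # first address owned by this slave (hypothetical layout)
    size: int  # number of addresses owned by this slave

    def matches(self, address: int) -> bool:
        # A slave is selected when the address falls inside its range.
        return self.base <= address < self.base + self.size


@dataclass(frozen=True)
class Request:
    master_id: int
    priority: int  # assumption: larger value means more urgent
    address: int


def decode(slaves: List[Slave], address: int) -> Optional[Slave]:
    """Return the single slave whose range contains the address, if any."""
    hits = [s for s in slaves if s.matches(address)]
    if len(hits) > 1:
        # Unique addresses are what prevent multiple slave selections.
        raise ValueError("overlapping slave address ranges")
    return hits[0] if hits else None


def arbitrate(requests: List[Request]) -> Optional[Request]:
    """Grant the highest-priority request; ties go to the lowest master id."""
    if not requests:
        return None
    return max(requests, key=lambda r: (r.priority, -r.master_id))
```

In this model, two masters requesting the same slave in the same cycle are resolved by `arbitrate`, and the winning request's address is then passed to `decode` to find the selected slave.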
When a slave is selected by a master, the selected slave will capture the address information sent by the master and the slave will send a status signal back to the PLB core, and hence to the requesting master. In addition, the selected slave will also communicate slave results and other information to the PLB core, and hence, to the master. A status signal from each slave is communicated to the arbiter. These status signals typically include a re-arbitrate request signal, which is a request from a slave to the arbiter to re-arbitrate the bus because the slave was unable to perform the requested function. Status signals also include a wait signal, which informs the arbiter to wait for the latching of the incoming address needed for the current command execution before continuing. Status signals also include a write complete signal, which informs the arbiter that the write operation has been completed.
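The three status signals named above can be summarized as a small Python sketch. The enumeration `SlaveStatus`, the function `arbiter_action` and the returned strings are hypothetical names chosen for illustration; the sketch only restates the reactions described in the text and does not model actual PLB signal timing.

```python
from enum import Enum, auto


class SlaveStatus(Enum):
    REARBITRATE = auto()     # slave could not perform the requested function
    WAIT = auto()            # slave is still latching the incoming address
    WRITE_COMPLETE = auto()  # slave has finished the write operation


def arbiter_action(status: SlaveStatus) -> str:
    """Map a slave status signal to the arbiter's reaction (illustrative)."""
    if status is SlaveStatus.REARBITRATE:
        return "re-arbitrate the bus"
    if status is SlaveStatus.WAIT:
        return "wait for address latching before continuing"
    if status is SlaveStatus.WRITE_COMPLETE:
        return "record that the write has completed"
    raise ValueError(f"unknown status: {status!r}")
```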
Complications can arise when the data at an address in system memory is not as up-to-date as data in a processor's cache. Consider a situation where a first processor issues a request to read a value from memory. It may occur that a second processor has internally updated that value and stored the updated value in its internal cache. This renders the value in memory stale and therefore invalid. Conventionally, the updated value from the second processor is transferred to the first processor in two steps: first, the updated value from the second processor is copied to memory. Then the value is copied from memory to the internal cache of the first processor. This takes a relatively long time. There is a need to reduce this memory latency.
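The conventional two-step transfer can be traced with a short Python sketch. It is a toy model under stated assumptions: `Processor` and `conventional_read` are hypothetical names, caches and memory are plain dictionaries keyed by address, any entry in the second processor's cache is assumed to be a modified copy, and each copy is counted as one step rather than as a number of clock cycles.

```python
class Processor:
    def __init__(self, name):
        self.name = name
        self.cache = {}  # address -> value held in this processor's cache


def conventional_read(reader, owner, memory, address):
    """Return (value, steps) for a read served by the two-step path."""
    steps = 0
    if address in owner.cache:
        # Step 1: the updated value is copied from the owner's cache to memory.
        memory[address] = owner.cache[address]
        steps += 1
    # Step 2: the value is copied from memory into the reader's cache.
    reader.cache[address] = memory[address]
    steps += 1
    return reader.cache[address], steps


# Usage: memory holds a stale 1 while the second processor holds an updated 2.
memory = {0x40: 1}
first, second = Processor("P1"), Processor("P2")
second.cache[0x40] = 2
value, steps = conventional_read(first, second, memory, 0x40)
```

Here the first processor ends up with the updated value only after both copies have been made, which is the latency the passage identifies.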