Typically, system-on-chip, or Ultra Large Scale Integration (ULSI), designs which employ multiple masters and slaves and a traditional processor local bus (PLB) interconnect architecture operate in the following manner. Referring to FIG. 1, a typical ULSI design 500 is provided which includes at least one master 502, a PLB core 504 and a plurality of slaves 506. Each master 502 is connected to the PLB core 504 via at least one dedicated port or line 508. The multiple slaves, in turn, are connected to the PLB core 504 via a PLB shared data bus 510 and a command bus 512, allowing each master to communicate with each slave connected to the PLB shared data bus 510 and the command bus 512. Each slave has a unique slave ID, or identifier code, which allows a master 502 to select and communicate with a particular slave 514 within the plurality of slaves 506. When a master 502 wants to communicate with the particular slave 514, the master 502 is required to send certain information to the PLB core 504 for distribution to the slaves 506. The slaves 506 then take this information and examine it for the slave ID. Examples of this information are the selected bus command (CMD), the write_data command and the address (Addr), which contains the desired slave ID. If the slave ID sent by the master 502 matches the predetermined slave ID of a slave 514, then that slave 514 has been selected and the action requested by the master 502 is performed. Because each slave 514 has a unique slave ID, multiple slave selections by one master 502 are prevented and each of the slaves 506 can only be accessed by one master 502 at one time. In the case where multiple masters 502 are making requests to a targeted slave 514, the PLB core 504 typically includes an arbiter circuit 516 which determines request priority based on a predetermined priority level or priority scheme.
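By way of illustration only, the slave selection and arbitration described above may be modeled in software as follows. This is a minimal sketch and not a description of any actual PLB implementation. The position of the slave ID within Addr (the upper four bits of a 32-bit address), the rule that a lower priority number wins, and the names Slave, Request, arbitrate and distribute are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field

# Assumption (not from the text): the slave ID sits in the top 4 bits of a 32-bit Addr.
SLAVE_ID_SHIFT = 28
SLAVE_ID_MASK = 0xF


@dataclass
class Slave:
    slave_id: int                      # unique, predetermined identifier code
    memory: dict = field(default_factory=dict)

    def snoop(self, cmd: str, addr: int, write_data: int) -> bool:
        """Examine a distributed request; act only if the slave ID matches."""
        requested_id = (addr >> SLAVE_ID_SHIFT) & SLAVE_ID_MASK
        if requested_id != self.slave_id:
            return False               # not selected: ignore the request
        if cmd == "WRITE":
            self.memory[addr] = write_data
        return True


@dataclass
class Request:
    master: int
    priority: int                      # lower number = higher priority (assumed)
    cmd: str
    addr: int
    write_data: int = 0


def arbitrate(requests):
    """Fixed-priority arbitration: grant the pending request with the best priority."""
    return min(requests, key=lambda r: r.priority)


def distribute(slaves, request):
    """Send the granted request to every slave; return the IDs that were selected."""
    return [s.slave_id for s in slaves
            if s.snoop(request.cmd, request.addr, request.write_data)]


# Hypothetical scenario: two masters request the bus at the same time.
slaves = [Slave(1), Slave(2), Slave(3)]
pending = [
    Request(master=0, priority=1, cmd="WRITE", addr=0x2000_0010, write_data=0xAA),
    Request(master=1, priority=0, cmd="WRITE", addr=0x3000_0000, write_data=0xBB),
]
granted = arbitrate(pending)             # the higher-priority master is served first
selected = distribute(slaves, granted)   # only the slave whose ID matches responds
```

Because every slave compares the ID field against its own unique ID, at most one slave can match a given address in this model, which mirrors the single-slave selection of the traditional architecture.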
The selected slave 514 will then gate in the information sent by the master 502. If the slave 514 is ready to process this request, the Addr information will be latched and the slave 514 will send a status signal back to the PLB core 504, and hence to the requesting master 502, via a type one dedicated line 518. In addition, the selected slave 514 will also communicate slave results and other information to the PLB core 504, and hence to the master 502, via a gated OR circuit 524 and a type two shared status bus 520. Lastly, the status signals from all of the slaves 506 will be OR'ed together using the gated OR circuit 524 and this information will be communicated to the arbiter 516 via a type three shared status bus 522. These status signals typically include a re-arbitrate request signal, by which the slave 514 requests the arbiter 516 to re-arbitrate the bus because the slave 514 was unable to perform the requested function; a wait signal, which informs the arbiter 516 to wait for the latching of the incoming address needed for the current command execution before continuing; and a write complete signal, which informs the arbiter 516 that the write operation has been completed.
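The OR'ing of slave status signals onto a shared status bus, as described above, may be sketched as follows. This is an illustrative model only. The field names rearbitrate, wait and wr_complete, and the convention that an unselected slave drives all of its status outputs inactive, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaveStatus:
    rearbitrate: bool = False   # slave could not perform the requested function
    wait: bool = False          # arbiter should wait for the address to be latched
    wr_complete: bool = False   # write operation has finished


# Assumed convention: a slave that is not selected drives every signal low.
IDLE = SlaveStatus()


def or_status(statuses):
    """Combine per-slave status outputs signal by signal, as an OR gate would."""
    return SlaveStatus(
        rearbitrate=any(s.rearbitrate for s in statuses),
        wait=any(s.wait for s in statuses),
        wr_complete=any(s.wr_complete for s in statuses),
    )


# Only the selected slave asserts a signal, so the shared bus reflects its status.
bus = or_status([IDLE, SlaveStatus(wr_complete=True), IDLE])
```

Under this convention the arbiter cannot tell which slave asserted a signal. That is acceptable only as long as a single slave is selected at any one time.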
As indicated by the above discussion, traditional PLB interconnect architecture allows a master 502, such as a microprocessor or a system code server, to write code to only one of many slaves 506, such as main memory, at any one time. In the case where the same data has to be written in multiple places, this sequential write scheme increases processing time and impedes system efficiency. This is because data is typically written to the Level-3 (L3) cache, and not to the Level-2 (L2) cache, in order to condition the system. Consequently, if the processor examines the L2 cache and the desired data is not within the L2 cache, the processor must then obtain the data from the L3 cache and update the L2 cache. This process takes time and impedes system efficiency. Although this is sufficient for most systems that can tolerate sequential write operations, it is not desirable for system-on-chip systems having an embedded processor core with an L2 and L3 cache as one of the PLB masters.
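The additional work caused by conditioning only the L3 cache may be illustrated with the following simplified model. It is a sketch under stated assumptions: the caches are represented as plain dictionaries, the function read and the counter names are hypothetical, and no actual cycle counts are implied.

```python
def read(addr, l2, l3, stats):
    """Processor read: check L2 first; on a miss, fetch from L3 and update L2."""
    stats["l2_lookups"] += 1
    if addr in l2:
        return l2[addr]                # L2 hit: no further work
    stats["l3_lookups"] += 1           # L2 miss: extra access to L3
    data = l3[addr]
    l2[addr] = data                    # extra step: update L2 with the data
    stats["l2_fills"] += 1
    return data


l2, l3 = {}, {}
stats = {"l2_lookups": 0, "l3_lookups": 0, "l2_fills": 0}

# The system code server conditions the system by writing to L3 only.
l3[0x100] = "code"

first = read(0x100, l2, l3, stats)     # misses L2, goes to L3, fills L2
second = read(0x100, l2, l3, stats)    # now hits L2 directly
```

In this model the first read incurs one L3 lookup and one L2 fill that would not have been needed had the same data been present in both caches beforehand.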
In system-on-chip, or ULSI, designs that employ an embedded processor core as one of their PLB masters, wherein the processor has an L2 cache, it is desirable for the system code server to be able to write to multiple slaves, such as the L2 and L3 caches, at the same time. One advantage of a multiple slave write capability is that it allows the processor to obtain the desired data faster than in a traditional design having only a single write capability, thus allowing the processor to reduce its processing time and to use its L2 cache more effectively.
The need remains for a slave design, and a method for using the slave design, which incorporates all of the performance characteristics of current slave designs, yet provides for the capability to select and communicate with multiple slaves, as a group or as individuals, simultaneously.