1. Technical Field
The present invention relates generally to computer systems and in particular to a computer system designed as a system on a chip (SoC). Still more particularly, the present invention relates to a method and system for providing a SoC with a bus architecture that supports sequences with varying latency and/or frequency requirements.
2. Description of the Related Art
The computer industry has made significant developments in integrated circuit (IC) technology in recent years. For example, ASIC (application specific integrated circuit) technology has evolved from a chip-set philosophy to an embedded core based system-on-a-chip (SoC) concept. The system-on-a-chip concept refers to a system in which, ideally, all the necessary integrated circuits are fabricated on a single die or substrate. An SoC IC includes various reusable functional blocks, such as microprocessors, interfaces (e.g., external bus interface), memory arrays, and DSPs (digital signal processors). Such pre-designed functional blocks are commonly called “cores”.
With a SoC, processed requests are sent from a core referred to as an initiator to a target (which may also be a core). An initiator (or master or busmaster as it is sometimes called) is any device capable of generating a request and placing that request on the bus to be transmitted to a target. Thus, for example, either a processor or a DMA controller may be an initiator. Targets (or slaves) are the receiving components that receive the initiator-issued requests and respond according to set protocols.
In order to complete the connections between initiators and targets, the SoC includes an on-chip bus utilized to connect multiple initiators and targets. The system bus consists of an interface to the initiators and a separate interface to the targets and logic between the interfaces. The logic between the interfaces is called a “bus controller”. This configuration is typical among system-on-a-chip (SoC) buses, where all the initiators, targets and the bus controller are on the same chip (die).
One example of the bus utilized by SoC computer systems is the CoreConnect™ processor local bus (PLB). (CoreConnect™ is a registered trademark of International Business Machines). In an SoC with a PLB architecture, each device attaches to a central resource called the “PLB Macro”. The “PLB Macro” is a block of logic that acts as the bus controller, interconnecting all the devices (including initiators and targets) of the SoC. The PLB Macro primarily includes an arbitration function, routing logic, and buffering and registering logic. The devices communicate over the bus via a (PLB) protocol in a synchronous manner. The protocol includes rules that control how transmission processes are to be completed, including, for example, the number of clocks (system clock cycles) taken to perform certain sequences. Among these sequences are (1) the time from request at the initiating device to snoop result at the initiating device, and (2) the time from read data at the source device (the target) to read data at the destination device (the initiator), etc.
SoC fabrication involves various design considerations that enable differentiation among the resulting chips. Each chip is designed/fabricated with a set of devices, which may be different from (or similar to) the devices utilized by another chip. When each chip has a unique set of devices, the resulting chip/die sizes are different. Furthermore, chips may be built from a variety of chip technologies, which have different timing characteristics.
The time for a signal to propagate across a chip depends on the “distance” the signal must travel and the characteristics of the chip technology. As utilized herein, the term “distance” is a generalized term describing the combined effects of actual wire distance, wire dimensions, net capacitance, gate characteristics, etc. As a consequence, the amount of time for a signal to propagate from one device to another (including the time to propagate between a device and the PLB Macro) differs significantly from chip to chip. These inevitable variations in “distance” between devices mean that (1) running the bus at a single frequency and (2) operating the protocol sequences at a single latency are not optimal for a variety of chips.
Currently, the simplest method of addressing the above problem is to define a protocol with a fixed set of latencies and then adjust the frequency based on the distances between devices. In this method the various sequences that make up the protocol are actually run at more than one latency. This method is utilized in CoreConnect™ PLB3 and PLB4. However, several drawbacks are seen with this method, including:
(1) the devices must be capable of operating over a variety of frequencies. This is often problematic, particularly for devices that attach to other off-chip devices that operate at a fixed frequency;
(2) at lower frequencies, bandwidth and latency are degraded, which results in a loss of performance. The latency loss is the result of sequences taking a fixed number of clocks (ticks or cycles), while the clock ticks are becoming longer; and
(3) the system (collection of devices) is “optimized” for the longest (slowest) path among the devices. Therefore, devices cannot operate at a higher frequency.
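The effect described in drawbacks (2) and (3) can be illustrated with a small calculation. The following is a minimal sketch only, assuming that the bus clock period must be at least as long as the slowest single-clock path and that a sequence takes a fixed number of clocks. The function names and the delay values are hypothetical and are not taken from any PLB specification.

```python
def max_frequency_mhz(longest_path_delay_ns: float) -> float:
    """Highest bus frequency at which the longest path still fits in one clock.

    Assumption: one clock period must cover the slowest path on the chip.
    """
    return 1000.0 / longest_path_delay_ns


def sequence_latency_ns(clocks: int, frequency_mhz: float) -> float:
    """Elapsed time of a sequence that takes a fixed number of clocks."""
    return clocks * 1000.0 / frequency_mhz


# Hypothetical chips: the same protocol (3 clocks per sequence), but
# different longest-path delays because of die size and technology.
FIXED_CLOCKS = 3
for chip, longest_ns in (("chip A", 5.0), ("chip B", 8.0)):
    f = max_frequency_mhz(longest_ns)
    t = sequence_latency_ns(FIXED_CLOCKS, f)
    print(f"{chip}: bus at {f:.0f} MHz, sequence takes {t:.0f} ns")
```

In this sketch the chip with the longer path must run the whole bus more slowly, so every sequence takes longer in elapsed time even though the clock count is unchanged, and devices on short paths gain nothing from their proximity.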
A more sophisticated method of addressing the problem involves defining the bus protocol such that protocol sequences are allowed to take a range of number of clock ticks (latencies). During chip integration (i.e., the design process of connecting all the devices on the die), the maximum distances between devices are determined, and the appropriate latencies are set for the corresponding paths.
Often, this technique is utilized such that the latency for all devices is set based on the longest path between any two devices. Thus, even nearby devices utilize the latency associated with the longest path. The CoreConnect PLB3 and PLB4 buses also utilize this technique for the master-request-to-slave-request path. However, this technique is also not optimal for many chips. Paths that are long are set to take multiple clocks for propagation, and this results in the following drawbacks:
(1) bandwidth is degraded because a new sequence cannot begin on each clock;
(2) timing analysis is more difficult to perform when paths require more than one clock for propagation. This is because timing analysis software tools require the operator to identify and specify the number of clocks associated with any path that requires more than one clock, since the default number of clocks is one; and
(3) if all paths are set to a latency based on the single longest path, then devices that are close to one another cannot take advantage of their proximity.
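The second method, and drawback (3) in particular, can be sketched as follows. This is an illustrative example only, assuming a fixed bus clock period and a propagation delay per path that is known at chip integration. The path names, the delays, and the helper functions are hypothetical.

```python
def clocks_for_path(delay_ps: int, period_ps: int) -> int:
    """Smallest whole number of clocks that covers the path delay.

    Integer ceiling division avoids floating-point rounding.
    """
    return -(-delay_ps // period_ps)


def per_path_latencies(delays_ps: dict[str, int], period_ps: int) -> dict[str, int]:
    """Latency in clocks chosen separately for each path."""
    return {path: clocks_for_path(d, period_ps) for path, d in delays_ps.items()}


def uniform_latency(delays_ps: dict[str, int], period_ps: int) -> int:
    """Single latency in clocks, set by the longest path and applied to all paths."""
    return max(per_path_latencies(delays_ps, period_ps).values())


# Hypothetical path delays in picoseconds on a bus with a 2000 ps clock period.
delays = {"cpu_to_bus": 1800, "dma_to_bus": 4200, "bus_to_memory": 2500}
PERIOD_PS = 2000

print("per-path clocks:", per_path_latencies(delays, PERIOD_PS))
print("uniform clocks: ", uniform_latency(delays, PERIOD_PS))
```

With these assumed values, the shortest path would need only one clock, but a uniform setting forces it to use the clock count of the longest path. Any path assigned more than one clock is also a multi-clock path that must be identified to the timing analysis tools, as noted in drawback (2).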
The present invention recognizes the flaws in the two design methods described above and realizes that it would be desirable to provide a SoC designed to optimize the transmission of signals on the bus given the multiplicity of frequencies and latencies of propagation. The invention recognizes that it would be further desirable to provide this feature without requiring degradation in either timing or other parameters of SoC bus operation. These and other benefits are provided by the invention described herein.