Systems on silicon show a continuous increase in complexity due to the ever-increasing need to implement new features and improve existing functions. This is enabled by the increasing density with which components can be integrated on an integrated circuit. At the same time, the clock speed at which circuits are operated tends to increase too. The higher clock speed in combination with the increased density of components has reduced the area which can operate synchronously within the same clock domain. This has created the need for a modular approach. According to such an approach the processing system comprises a plurality of relatively independent, complex modules. In conventional processing systems the system modules usually communicate with each other via a bus. As the number of modules increases, however, this way of communication is no longer practical for the following reasons. On the one hand, the large number of modules places too high a load on the bus. On the other hand, the bus forms a communication bottleneck, as it enables only one device at a time to send data to the bus.
A communication network forms an effective way to overcome these disadvantages. Networks on chip (NoC) have received considerable attention recently as a solution to the interconnect problem in highly-complex chips. The reason is twofold. First, NoCs help resolve the electrical problems in new deep-submicron technologies, as they structure and manage global wires. At the same time they share wires, lowering their number and increasing their utilization. NoCs can also be energy efficient and reliable and are scalable compared to buses. Second, NoCs also decouple computation from communication, which is essential in managing the design of billion-transistor chips. NoCs achieve this decoupling because they are traditionally designed using protocol stacks, which provide well-defined interfaces separating communication service usage from service implementation.
Using networks for on-chip communication when designing systems on chip (SoC), however, raises a number of new issues that must be taken into account. This is because, in contrast to existing on-chip interconnects (e.g., buses, switches, or point-to-point wires), where the communicating modules are directly connected, in a NoC the modules communicate remotely via network nodes. As a result, interconnect arbitration changes from centralized to distributed, and issues like out-of-order transactions, higher latencies, and end-to-end flow control must be handled either by the intellectual property block (IP) or by the network.
Most of these topics have already been the subject of research in the field of local and wide area networks (computer networks) and in the field of interconnect networks for parallel machines. Both are very much related to on-chip networks, and many of the results in those fields are also applicable on chip. However, the premises of NoCs are different from those of off-chip networks, and, therefore, most of the network design choices must be reevaluated. On-chip networks have different properties (e.g., tighter link synchronization) and constraints (e.g., higher memory cost) leading to different design choices, which ultimately affect the network services.
NoCs differ from off-chip networks mainly in their constraints and synchronization. Typically, resource constraints are tighter on chip than off chip. Storage (i.e., memory) and computation resources are relatively more expensive, whereas the number of point-to-point links is larger on chip than off chip. Storage is expensive, because general purpose on-chip memory, such as RAM, occupies a large area. Having the memory distributed in the network components in relatively small sizes is even worse, as the overhead area in the memory then becomes dominant.
For on-chip networks, computation too comes at a relatively high cost compared to off-chip networks. An off-chip network interface usually contains a dedicated processor to implement the protocol stack up to the network layer or even higher, to relieve the host processor from the communication processing. Including a dedicated processor in a network interface is not feasible on chip, as the size of the network interface would become comparable to or larger than the IP to be connected to the network. Moreover, running the protocol stack on the IP itself may also not be feasible, because often these IPs have one dedicated function only, and do not have the capabilities to run a network protocol stack.
Introducing networks as on-chip interconnects radically changes the communication when compared to direct interconnects, such as buses or switches. This is because of the multi-hop nature of a network, where communication modules are not directly connected, but separated by one or more network nodes. This is in contrast with the prevalent existing interconnects (i.e., buses) where modules are directly connected. The implications of this change reside in the arbitration (which must change from centralized to distributed), and in the communication properties (e.g., ordering, or flow control).
Modern on-chip communication protocols (e.g., Device Transaction Level DTL, Open Core Protocol OCP, and AXI-Protocol) operate on a split and pipelined basis with transactions consisting of a request and a response, and the bus is released for use by others after a request issued by a master is accepted by a corresponding slave. Split pipelined communication protocols are used especially in multi-hop interconnects (e.g., networks on chip, or buses with bridges), allowing an efficient utilization of the interconnect. The efficiency of a split bus can be increased in cases where response generation at the slave is time consuming. In a pipelined protocol, a master is allowed to have multiple outstanding requests (i.e., requests for which the response is pending or expected).
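The notion of outstanding requests in a split, pipelined protocol can be illustrated with a minimal sketch. The class name `SplitPipelinedMaster`, its methods, and the limit `max_outstanding` are hypothetical and are not taken from DTL, OCP, or AXI; the sketch also assumes, purely for simplicity, that responses return in the order in which the requests were accepted.

```python
from collections import deque


class SplitPipelinedMaster:
    """Hypothetical master that tracks requests still awaiting a response."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding  # assumed limit, not from any protocol
        self.outstanding = deque()              # accepted requests, oldest first

    def can_issue(self):
        # A new request may be issued while the outstanding limit is not reached.
        return len(self.outstanding) < self.max_outstanding

    def request_accepted(self, tag):
        # Once the slave accepts the request, the interconnect is released;
        # the master only records that a response is still pending.
        if not self.can_issue():
            raise RuntimeError("too many outstanding requests")
        self.outstanding.append(tag)

    def response_received(self):
        # Assumption: responses arrive in request order.
        return self.outstanding.popleft()
```

With a limit of two, the master may have two requests accepted before any response arrives, and regains the ability to issue as soon as the first response is received.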
The above-mentioned protocols are designed to operate at a device level, as opposed to a system or interconnect level. In other words, they are designed to be independent of the actual interconnect implementation (e.g., arbitration signals are not visible), allowing the reuse of intellectual property (IP) blocks and their earlier integration. In addition, these communication protocols are designed to ensure that an IP block can communicate “naturally” (e.g., word width and burst sizes are configurable to suit the device rather than a bus).
Some of these protocols, such as DTL, include a function to retract transactions as an additional protocol feature. An issued transaction can only be retracted without causing any change in the state of a slave when the transaction has not yet been accepted by the slave.
Transaction retraction is usually implemented by invalidating the command signals. However, in order to avoid the slave being left in an incorrect state due to the transaction retraction, the protocol forces the slave to process the transaction in only one cycle. This can be especially difficult in a system with a high clock rate. Furthermore, the retraction of a command may not be possible when address signals (i.e., command, address, and other command parameters, such as burst length) are independent of the write data signals. For example, when write data for a command has already been (partially) accepted by a slave and sent further before the actual write command itself is accepted (as implemented in the AXI protocol), the write command cannot be retracted, since there may be no way to seize and remove the already sent write data. Accordingly, the AXI protocol does not even allow transaction retraction.
It is therefore an object of the invention to provide an improved transaction retraction in a transaction-based communication environment.
This object is achieved by an integrated circuit according to claim 1, a method for transaction retraction according to claim 7, and a data processing system according to claim 8.
Therefore, an integrated circuit having a plurality of processing modules I, T is provided. At least one first processing module I issues at least one transaction towards at least one second processing module T. Said integrated circuit further comprises at least one first transaction retraction unit TRU1 for indicating an allowance to said at least one first of said processing modules I to retract said at least one transaction according to the state of said second processing module T.
The proposed integrated circuit allows transaction retraction in a controlled way, whereby the possibility of inconsistent states in the target is avoided, and unnecessary constraints on targets are eliminated. The ability to actually perform an effective transaction retraction is a desirable property, since the load on the interconnect can be reduced when a transaction is no longer needed (e.g., data to be sent is too late to be processed, or read data is no longer useful because some deadline has passed). Additionally, the time period is reduced during which a process is blocked while it waits for the interconnect to accept, or respond to, a transaction which is no longer needed.
According to an aspect of the invention said integrated circuit comprises at least one first transaction retraction unit, which is associated to said at least one second processing module. Hence, the decision to allow the retraction is performed on the target side and not on the side of an initiator. As the transaction retraction may be implemented by the transaction retraction units, the arrangement of the processing modules does not need to be changed. Therefore, the proposed scheme is simple and easy to implement, and provides backward compatibility with existing protocols.
According to a further aspect of the invention at least one second transaction retraction unit, which is associated to said first processing module, is provided to issue an explicit transaction retraction request to said first transaction retraction unit or said second processing module. Said first transaction retraction unit indicates an allowance of said transaction retraction request. Here, the retraction is explicitly requested and explicitly allowed.
According to still a further aspect of the invention said first transaction retraction unit TRU1 indicates the allowance of said transaction retraction request rt, if the transaction retraction request rt is present.
According to a further aspect of the invention, at least one second transaction retraction unit TRU2 is associated to said first processing module for issuing an explicit transaction retraction request rt to said first transaction retraction unit TRU1 or said second processing module T. Said first transaction retraction unit TRU1 indicates an allowance of said transaction retraction request rt, if a valid command CMD issued from said first processing module I is present, the valid command CMD has not been accepted by the second processing module T, and the transaction retraction request rt is present.
According to a further aspect of the invention said first retraction unit indicates an allowance of a requested transaction retraction if a valid command issued from said first processing module is present, and the valid command has not been accepted yet by the second processing module. Here, the allowance of the retraction is performed implicitly.
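The allowance conditions described in the aspects above can be written as Boolean predicates. This is only an illustrative sketch: the function names and the argument names `cmd_valid`, `cmd_accepted`, and `rt` are hypothetical labels for the valid command CMD, its acceptance by the second processing module T, and the transaction retraction request rt.

```python
def explicit_allowance(cmd_valid: bool, cmd_accepted: bool, rt: bool) -> bool:
    """Explicit scheme: a valid command is present, it has not been
    accepted by the target, and a retraction request rt is present."""
    return cmd_valid and not cmd_accepted and rt


def implicit_allowance(cmd_valid: bool, cmd_accepted: bool) -> bool:
    """Implicit scheme: a valid command is present and has not yet been
    accepted by the target; no separate request signal is consulted."""
    return cmd_valid and not cmd_accepted
```

In both predicates the allowance disappears as soon as the command has been accepted by the target, which reflects the requirement that a retraction must not leave the target in an inconsistent state.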
The invention is also related to a method for transaction retraction in an integrated circuit having a plurality of processing modules I, T. At least one transaction is issued by at least one first processing module I towards at least one second processing module T. The allowance to retract said at least one transaction according to the state of said second processing module T is indicated to said at least one first of said processing modules I.
The invention further relates to a data processing system having a plurality of processing modules I, T. At least one first processing module I issues at least one transaction towards at least one second processing module T. Said data processing system further comprises at least one first transaction retraction unit TRU1 for indicating an allowance to said at least one first of said processing modules I to retract said at least one transaction according to the state of said second processing module T.
Accordingly, the transaction retraction may also be performed in a multi-chip network or a system with several separate integrated circuits.
The invention is based on the idea of extending the handshake process for a transaction retraction by a special signal or a special combination of signals to grant or refuse a transaction retraction. A transaction retraction is granted when the state in the slave permits this; otherwise it is refused (e.g., a write retraction can be granted when no write data from that transaction has been sent further).
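The write example above, in which a retraction is granted only while no write data of the transaction has been sent further, can be sketched as a small target-side model. The class `WriteTarget`, its field `words_forwarded`, and the string results are hypothetical and serve only to illustrate the grant/refuse decision; they do not describe an actual signal encoding.

```python
from dataclasses import dataclass


@dataclass
class WriteTarget:
    """Hypothetical target-side state for one pending write transaction."""

    words_forwarded: int = 0  # write data words already sent further

    def forward_word(self) -> None:
        # Models one word of write data being sent further by the target.
        self.words_forwarded += 1

    def handle_retraction_request(self) -> str:
        # Grant only while the state of the target permits it, i.e. while
        # no write data of this transaction has been sent further.
        return "grant" if self.words_forwarded == 0 else "refuse"
```

A request arriving before any data has been forwarded is granted, whereas a request arriving after the first word has been forwarded is refused.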
Further aspects of the invention are described in the dependent claims.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiment(s) described hereinafter.