Systems on silicon show a continuous increase in complexity due to the ever-increasing need to implement new features and to improve existing functions. This is enabled by the increasing density with which components can be integrated on an integrated circuit. At the same time, the clock speed at which circuits are operated tends to increase too. The higher clock speed in combination with the increased density of components has reduced the area that can operate synchronously within the same clock domain. This has created the need for a modular approach. According to such an approach, the processing system comprises a plurality of relatively independent, complex modules. In conventional processing systems, the system modules usually communicate with each other via a bus. As the number of modules increases, however, this way of communication is no longer practical for the following reasons. On the one hand, the large number of modules imposes too high a bus load. On the other hand, the bus forms a communication bottleneck, as it enables only one device at a time to send data to the bus.
A communication network forms an effective way to overcome these disadvantages. Networks on chip (NoC) have received considerable attention recently as a solution to the interconnect problem in highly-complex chips. The reason is twofold. First, NoCs help resolve the electrical problems in new deep-submicron technologies, as they structure and manage global wires. At the same time they share wires, lowering their number and increasing their utilization. NoCs can also be energy efficient and reliable and are scalable compared to buses. Second, NoCs also decouple computation from communication, which is essential in managing the design of billion-transistor chips. NoCs achieve this decoupling because they are traditionally designed using protocol stacks, which provide well-defined interfaces separating communication service usage from service implementation.
Using networks for on-chip communication when designing systems on chip (SoC), however, raises a number of new issues that must be taken into account. This is because, in contrast to existing on-chip interconnects (e.g., buses, switches, or point-to-point wires), where the communicating modules are directly connected, in a NoC the modules communicate remotely via network nodes. As a result, interconnect arbitration changes from centralized to distributed, and issues like out-of-order transactions, higher latencies, and end-to-end flow control must be handled either by the intellectual property block (IP) or by the network.
Most of these topics have already been the subject of research in the field of local and wide area networks (computer networks) and in the field of interconnect networks for parallel machines. Both are very much related to on-chip networks, and many of the results in those fields are also applicable on chip. However, the premises of NoCs are different from those of off-chip networks, and, therefore, most of the network design choices must be reevaluated. On-chip networks have different properties (e.g., tighter link synchronization) and constraints (e.g., higher memory cost) leading to different design choices, which ultimately affect the network services.
NoCs differ from off-chip networks mainly in their constraints and synchronization. Typically, resource constraints are tighter on chip than off chip. Storage (i.e., memory) and computation resources are relatively more expensive, whereas the number of point-to-point links is larger on chip than off chip. Storage is expensive, because general-purpose on-chip memory, such as RAM, occupies a large area. Having the memory distributed in the network components in relatively small sizes is even worse, as the overhead area in the memory then becomes dominant.
For on-chip networks, computation too comes at a relatively high cost compared to off-chip networks. An off-chip network interface usually contains a dedicated processor to implement the protocol stack up to the network layer or even higher, to relieve the host processor from the communication processing. Including a dedicated processor in a network interface is not feasible on chip, as the size of the network interface would become comparable to or larger than the IP to be connected to the network. Moreover, running the protocol stack on the IP itself may also not be feasible, because often these IPs have one dedicated function only, and do not have the capabilities to run a network protocol stack.
The number of wires and pins to connect network components is an order of magnitude larger on chip than off chip. If they are not used massively for purposes other than NoC communication, they allow wide point-to-point interconnects (e.g., 300-bit links). This is not possible off chip, where links are relatively narrow: 8-16 bits.
Introducing networks as on-chip interconnects radically changes the communication when compared to direct interconnects, such as buses or switches. This is because of the multi-hop nature of a network, where communication modules are not directly connected, but separated by one or more network nodes. This is in contrast with the prevalent existing interconnects (i.e., buses) where modules are directly connected. The implications of this change reside in the arbitration (which must change from centralized to distributed), and in the communication properties (e.g., ordering, or flow control).
Modern on-chip communication protocols (e.g., Device Transaction Level DTL, Open Core Protocol OCP, and AXI-Protocol) operate on a split and pipelined basis. Transactions consist of a request and a response, and the bus is released for use by others after a request issued by a master is accepted by a corresponding slave. Examples of transactions include write plus write data as a request, and read as a request plus read data as a response. Split pipelined communication protocols are used especially in multi-hop interconnects (e.g., networks on chip, or buses with bridges), allowing an efficient utilization of the interconnect. A split bus is particularly efficient in cases where response generation at the slave is time-consuming. In a pipelined protocol, a master is allowed to have multiple outstanding requests (i.e., requests for which the response is pending or expected).
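The notion of outstanding requests in a split, pipelined protocol can be illustrated with a minimal sketch. It does not reproduce DTL, OCP, or AXI; the class `SplitMaster`, its methods, and the limit `max_outstanding` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class SplitMaster:
    """Hypothetical master port for a split, pipelined protocol."""
    max_outstanding: int = 4                      # assumed limit, not from any protocol
    _ids: count = field(default_factory=count)    # generates transaction ids 0, 1, 2, ...
    pending: dict = field(default_factory=dict)   # transaction id -> request

    def issue(self, request):
        """Issue a request; it stays outstanding until its response arrives."""
        if len(self.pending) >= self.max_outstanding:
            raise RuntimeError("too many outstanding requests")
        tid = next(self._ids)
        self.pending[tid] = request
        return tid

    def complete(self, tid, response):
        """Match a response to its request and retire the transaction."""
        request = self.pending.pop(tid)
        return request, response


master = SplitMaster(max_outstanding=2)
r0 = master.issue(("read", 0x100))
r1 = master.issue(("write", 0x200, 0xAB))
# Both requests are outstanding: issued, but no response received yet.
outstanding_before = sorted(master.pending)
master.complete(r0, "read data")
outstanding_after = sorted(master.pending)
```

In this sketch the master does not wait for the response to the first request before issuing the second, which is the property that makes a later abort of an already accepted transaction a meaningful operation.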
The above-mentioned protocols are designed to operate at a device level, as opposed to a system or interconnect level. In other words, they are designed to be independent of the actual interconnect implementation (e.g., arbitration signals are not visible), allowing the reuse of intellectual property (IP) blocks and their earlier integration. In addition, these communication protocols are designed to ensure that an IP block can communicate “naturally” (e.g., word width and burst sizes are configurable to suit the device rather than a bus).
Some of these protocols (e.g., DTL) include the option to abort transactions that have already been accepted by the target. In the most general sense, a transaction that has been aborted is no longer executed and has no effect on the target. In DTL, the semantics are that an abort can be attempted for any outstanding transaction.
However, aborting a transaction in a device-level split pipelined protocol is difficult, because the transaction may pass several intermediate modules (e.g., bridges, adapters) until it reaches its final destination. Accordingly, it may not be possible to stop the transaction. This problem is especially acute in multi-hop interconnects such as networks on chip and buses with bridges.
Known abort techniques can have ambiguous semantics, or may leave targets in one of several possible states (e.g., an attempt to abort a write may or may not succeed, and as a result the location addressed by the write may contain either the old value or the value carried by the write).
It is therefore an object of the invention to provide an improved transaction abortion in a transaction-based communication environment.
Therefore, an integrated circuit having a plurality of processing modules and an interconnect for coupling said plurality of processing modules and for enabling a device-level communication based on transactions between said plurality of processing modules is provided. At least one first processing module issues at least one transaction towards at least one second processing module. Said integrated circuit comprises at least one transaction abortion unit for aborting said at least one transaction issued from said first module by receiving an abort request issued by said first module, by initiating a discard of said at least one transaction to be aborted, and by issuing a response indicating the success/failure of the requested transaction abortion.
Aborting transactions is a desirable property for a communication protocol, as it allows the interconnect and the slave to be offloaded when a transaction is no longer needed (e.g., data to be sent is too late to be processed, or read data is no longer useful because some deadline has passed). The advantage of the abort transaction is that it gives the master insight into the state of the system after an abort operation. This could allow a more extensive use of the abort transaction, resulting in a more efficient use of the interconnect and of the slaves. Here, the transaction abortion unit may be implemented in the interconnect means, in the slave, or in the master module.
According to an aspect of the invention said integrated circuit comprises at least one network interface associated with one of said plurality of processing modules for controlling the communication between said one of said plurality of processing modules and said interconnect. Said at least one transaction abortion unit is arranged in one of said network interfaces. By associating the transaction abortion unit with the network interface, i.e. close to the module issuing the abort, the modules can continue with their dedicated operations without having to deal with the actual abort communication.
According to a further aspect of the invention said at least one transaction abortion unit is adapted to perform the at least one transaction abortion atomically, i.e. either the complete set of transactions is aborted or none of them, or partially, i.e. as many transactions as possible are aborted, however there may be transactions that are not aborted.
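The difference between the atomic and the partial mode can be sketched as follows. This is only an illustration under stated assumptions: the function `abort_transactions` and its parameters are hypothetical, and the set of transactions that can still be discarded is assumed to be known to the transaction abortion unit.

```python
def abort_transactions(requested, still_abortable, atomic):
    """Return the set of transaction ids that are actually aborted.

    requested       -- ids the master asks to abort
    still_abortable -- ids that can still be discarded (assumed to be known)
    atomic          -- True: all or nothing; False: as many as possible
    """
    requested = set(requested)
    abortable = requested & set(still_abortable)
    if atomic and abortable != requested:
        # At least one requested transaction cannot be aborted, so abort none.
        return set()
    return abortable


# Transactions 1 and 2 can still be discarded; 3 is assumed to have executed.
atomic_result = abort_transactions([1, 2, 3], [1, 2], atomic=True)
partial_result = abort_transactions([1, 2, 3], [1, 2], atomic=False)
```

Here `atomic_result` is the empty set, because transaction 3 can no longer be aborted, whereas `partial_result` contains transactions 1 and 2.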
According to a further aspect of the invention said at least one network interface comprises a request buffer for buffering received data and/or a response buffer for buffering outgoing data, and issues a discard for said at least one transaction to be aborted as stored in said request buffer or in said response buffer. Discarding the data in the request/response buffer is an effective way to dispose of the request/response to be aborted.
According to a preferred aspect of the invention said request for said transaction abortion specifies which transactions are to be aborted, and said response issued by said transaction abortion unit specifies which of the requested at least one transaction have been aborted. With such a specific response the master will exactly know the states of all slaves with which it communicates.
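An abort request that names specific transactions, together with a response that reports the outcome per transaction, might look as follows. The sketch models only a request buffer in a network interface; the class `NetworkInterface`, its methods, and the transaction ids are hypothetical, and a transaction that has already left the buffer is simply reported as not aborted.

```python
from collections import OrderedDict


class NetworkInterface:
    """Hypothetical network interface holding a request buffer."""

    def __init__(self):
        # Requests accepted from the master but not yet sent into the network.
        self.request_buffer = OrderedDict()  # transaction id -> payload

    def accept(self, tid, payload):
        """Buffer a request issued by the master."""
        self.request_buffer[tid] = payload

    def send_next(self):
        """Forward the oldest buffered request into the interconnect."""
        return self.request_buffer.popitem(last=False)

    def abort(self, tids):
        """Discard the listed transactions if they are still buffered.

        Returns a response mapping each requested id to True (aborted)
        or False (not aborted, here because it was already forwarded).
        """
        response = {}
        for tid in tids:
            if tid in self.request_buffer:
                del self.request_buffer[tid]
                response[tid] = True
            else:
                response[tid] = False
        return response


ni = NetworkInterface()
ni.accept(10, "write A")
ni.accept(11, "write B")
ni.accept(12, "read C")
sent = ni.send_next()                  # transaction 10 leaves the buffer
abort_response = ni.abort([10, 12])    # the master asks to abort 10 and 12
remaining = list(ni.request_buffer)
```

In this example the response reports that transaction 12 was aborted and transaction 10 was not, so the master knows that the write of transaction 10 may still take effect while the read of transaction 12 will not be executed.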
The invention is also related to a method for transaction abortion in an integrated circuit having a plurality of processing modules and an interconnect means for coupling said plurality of processing modules and for enabling a device-level communication based on transactions between said plurality of processing modules. At least one first processing module issues at least one transaction towards at least one second processing module. Said at least one transaction issued from said first module is aborted by receiving an abort request issued by said first module, by initiating a discard of said at least one transaction to be aborted, and by issuing a response indicating the success of the requested transaction abortion.
The invention further relates to a data processing system having a plurality of processing modules and an interconnect for coupling said plurality of processing modules and for enabling a device-level communication based on transactions between said plurality of processing modules. At least one first processing module issues at least one transaction towards at least one second processing module. Said data processing system comprises at least one transaction abortion unit for aborting at least one transaction issued from said first module by receiving an abort request issued by said first module, by initiating a discard of said at least one transaction to be aborted, and by issuing a response indicating the success/failure of the requested transaction abortion.
Therefore, the transaction abort may also be implemented in a system comprising several separate integrated circuits or multi-chip networks.
The invention is based on the idea of introducing a special abort transaction for attempting to abort transactions. As the abort may or may not succeed, a response is required to describe the success/failure of the abort transaction. When a response on the result of the abort is issued, the master issuing the abort transaction will know precisely the resulting state of the slave or the slave environment. A transaction is either aborted completely or not at all, such that slaves do not end up in an intermediate state or with a partial result.
Further aspects of the invention are described in the dependent claims.
These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiment(s) described hereinafter.