Network-on-chip (NoC) is an increasingly popular technology for connecting the heterogeneous IPs within system-on-chip (SoC) integrated circuits. IPs have one or more master and/or slave bus interfaces, typically using industry-standard protocols such as AMBA and OCP. Masters initiate read and write transactions, also referred to as bursts, through an initiator interface. Transactions issued using those protocols may have their requests responded to in any order unless the requests have the same ID, in which case they must be responded to in order. IDs are known as “tags” in the OCP protocol and “AIDs” in the AMBA AXI protocol.
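The ordering rule above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `order_is_legal` and the request names are hypothetical, not part of AMBA or OCP, and the sketch assumes that every issued request eventually receives exactly one response.

```python
from collections import defaultdict

def order_is_legal(issued, responded):
    """Check the per-ID ordering rule.

    issued:    list of (request_name, txn_id) pairs in issue order.
    responded: list of request_name values in response order.
    Responses with different IDs may interleave freely; responses
    sharing an ID must appear in the order their requests were issued.
    """
    issue_order = defaultdict(list)
    id_of = {}
    for name, txn_id in issued:
        issue_order[txn_id].append(name)
        id_of[name] = txn_id

    response_order = defaultdict(list)
    for name in responded:
        response_order[id_of[name]].append(name)

    # Every ID must see its responses in the same order as its requests.
    return all(response_order[i] == names for i, names in issue_order.items())

# Hypothetical traffic: r0 and r2 share ID 1, r1 uses ID 2.
ISSUED = [("r0", 1), ("r1", 2), ("r2", 1)]
```

With this traffic, the response order r1, r0, r2 is acceptable because r0 still precedes r2, whereas r2, r1, r0 violates the rule for ID 1.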
The NoC transports the transactions to the connected target interfaces, which connect to the slaves. Some NoCs use a transport protocol that is different from the bus interfaces used to request transactions. Transactions are a communication layer above the transport. The transport of a transaction is done with one or more atomic transport packets.
A logic module called a network interface unit (NIU) converts between transactions at IP interfaces and transport protocol packets. An initiator NIU accepts transaction requests from masters and gives transaction responses to masters. A target NIU gives transaction requests to slaves and accepts transaction responses from slaves. The connectivity of initiator NIUs and target NIUs, as well as intermediate modules such as routers, muxes, switches, and buffers, is referred to as the topology of the NoC. The position of each interface or module within the NoC is referred to as its logical location within the topology.
The IPs that comprise the SoC are each constrained to a physical location within the chip by factors such as form factor, the location of IO pads, power supply nets, and power down regions. The set of all location constraints in the chip is referred to as the floorplan. The initiator and target NIUs and all interconnected modules of the logical topology are distributed throughout the floorplan of the chip.
The target of a transaction is selected by the address given by the initiator. Within the chip, it is common that routes of data transfer exist for only some combinations of initiators and targets. Furthermore, the specific range of addresses mapped to each target might be different for different initiators. The set of target address mappings for an initiator is referred to as the address space of the initiator.
The address of the transaction given by the initiator also specifies at which data byte, within the address range of the selected target, the transaction begins. The amount of data to access is also given by the initiator. That determines the range of data accessed from the beginning to the end of the transaction.
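As a rough illustration of address decoding and of the range covered by a transaction, the following Python sketch may help. All names in it (`Mapping`, `decode`, `accessed_range`, the target names, and the address values) are hypothetical and chosen only for the example.

```python
from typing import NamedTuple

class Mapping(NamedTuple):
    target: str  # hypothetical target name
    base: int    # first byte address of the mapping
    size: int    # number of bytes in the mapping

def decode(address_space, address):
    """Return (target, byte offset within the target), or None if unmapped."""
    for m in address_space:
        if m.base <= address < m.base + m.size:
            return m.target, address - m.base
    return None

def accessed_range(address, length):
    """Return the first and last byte addresses touched by the transaction."""
    return address, address + length - 1

# Two initiators with different address spaces (illustrative values only).
# The second initiator has no route to "dram1" and sees "dram0" elsewhere.
CPU_SPACE = [Mapping("dram0", 0x0000, 0x1000), Mapping("dram1", 0x1000, 0x1000)]
DMA_SPACE = [Mapping("dram0", 0x8000, 0x1000)]
```

Here the same target appears at a different address for each initiator, and an address that is valid in one address space may be unmapped in the other.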
To complete the transaction the target interface gives a response. For a read transaction the response carries the data from the target to the initiator. For a write transaction the response simply carries an acknowledgement.
FIG. 1 shows an address space of an initiator. A first target is accessed within the range of mapping A and a second target is accessed within the adjacent range of mapping B. Mapping A and mapping B are adjacent and non-overlapping within the address space. A transaction request to an address shortly before the boundary between mapping A and mapping B with a large range will access data from both the first target and the second target. Because packets are atomic, this transaction will be split into at least two packets with at least one sent to the first target and at least one sent to the second target. The two packets will have the same ID as the original request and therefore must be responded to in order. The responses from the two targets are reassembled to complete the transaction. For a write transaction, reassembly is simply ensuring that an acknowledgement is received for each packet.
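The splitting and reassembly described for FIG. 1 can be sketched in Python as follows. This is a simplified illustration, not the method of any particular NoC: it assumes exactly one packet per mapping touched, and the `seq` field used to restore order, the function names, and the address values are all hypothetical.

```python
def split_transaction(mappings, address, length, txn_id):
    """Split the byte range [address, address + length) at mapping boundaries.

    mappings: list of (target, base, size) tuples, assumed non-overlapping.
    Returns one packet per mapping touched, in address order. Every packet
    carries the ID of the original request.
    """
    packets = []
    end = address + length
    for target, base, size in sorted(mappings, key=lambda m: m[1]):
        lo = max(address, base)
        hi = min(end, base + size)
        if lo < hi:  # the transaction overlaps this mapping
            packets.append({
                "id": txn_id,
                "seq": len(packets),   # hypothetical position marker
                "target": target,
                "offset": lo - base,
                "length": hi - lo,
            })
    return packets

def reassemble(responses):
    """Concatenate read data in packet order, whatever the arrival order."""
    ordered = sorted(responses, key=lambda r: r["seq"])
    return b"".join(r["data"] for r in ordered)

# Mapping A and mapping B are adjacent (illustrative addresses only).
MAPPINGS = [("A", 0x0000, 0x1000), ("B", 0x1000, 0x1000)]
PACKETS = split_transaction(MAPPINGS, 0x0FF0, 0x20, txn_id=7)
```

A 32-byte request starting 16 bytes before the boundary yields two packets of 16 bytes each, one per target. If the response from the second target arrives first, `reassemble` still returns the data of the first target ahead of it.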
Such designs suffer from two problems. First, the storage capacity of a Re-Order Buffer (ROB) determines the number of transactions that can be simultaneously pending from all initiators to all targets of split transactions. DRAM performance is best when a large number of transactions are pending in the scheduler. In a configuration with a single ROB for the entire system, such as ROB 204 shown in FIG. 2, the ROB must have a large capacity and therefore consumes a large area within the chip floorplan. Most advanced systems-on-chip (SoCs) have multiple power islands that can be independently powered down to reduce chip power consumption during low power operating modes. However, since most SoCs use DRAM most of the time, the ROB must remain powered up most of the time.
Second, the entire throughput of all DRAMs passes through the physical location of the ROB. At a reasonable clock speed, a large throughput requires a wide datapath, consisting of a large number of connections. Since all initiators with connectivity to DRAM connect to the ROB, it is a point of physical wire routing congestion. This complicates place and route and reduces the manufacturability of the chip.