1. Technical Field
The present disclosure relates to a communication system for interfacing a plurality of transmission circuits with an interconnection network. Embodiments have been developed with particular attention paid to possible use in communication interfaces that are typically used for transmission of the DMA (Direct Memory Access) type.
2. Description of the Related Art
Systems within an integrated circuit (Systems-on-Chip—SoCs) and systems in a single package (Systems-in-Package—SiPs) typically comprise a plurality of circuits that communicate with one another via a shared communication channel. For instance, the aforesaid communication channel may be a bus or a communication network, for example a Network-on-Chip (NoC) or Network-in-Package (NiP), and is frequently referred to as “interconnection network” (ICN).
For instance, the above SoCs are frequently used for processors designed for mobile or multimedia applications, for example smartphones, set-top boxes, or routers for domestic uses.
FIG. 1 shows an example of a typical SoC 1.
In the example considered, the system comprises a processor 10 and one or more memories 20. For instance, illustrated in the example considered are a small internal memory 20a, for example a RAM (Random-Access Memory), a non-volatile memory 20b, for example a flash memory, and a communication interface 20c for an external memory, for example a DDR memory.
In the example considered, the system also comprises interface circuits 30, for example input and output (I/O) ports, a UART (Universal Asynchronous Receiver-Transmitter) interface, an SPI (Serial Peripheral Interface) interface, a USB (Universal Serial Bus) interface, and/or other digital and/or analog communication interfaces.
In the example considered, the system further comprises peripherals 40, for example comparators, timers, analog-to-digital or digital-to-analog converters, etc.
In the example considered, the aforesaid modules, i.e., blocks 10, 20, 30 and 40, are connected together through an interconnection network 70, for example a bus or preferably a Network-on-Chip (NoC).
The general architecture described previously is frequently used for conventional micro-controllers, which renders any detailed description here superfluous. Basically, this architecture enables interfacing of the processor 10 with the various blocks 20, 30 and 40 via software commands that are executed by means of the processor 10.
In multimedia or mobile processors other blocks 50 are added to the above generic architecture, which will be referred to hereinafter as Intellectual Property (IP) circuits. For instance, the aforesaid IP blocks 50 may comprise an image or video encoder or decoder 50a, an encoder or decoder of audio signals 50b, a WiFi communication interface 50c, or in general blocks, the hardware structure of which is optimized for implementation of functions that depend upon the particular application of the system. The aforesaid blocks may even be autonomous and interface directly with the other blocks of the system, for example the memories 20 and the other peripherals 30 and 40.
Typically, each IP block 50 has an associated communication interface 80 configured for exchanging data between the IP block 50 and the interconnection network 70.
For instance, FIG. 2 shows a block diagram of a typical communication interface 80 for an IP block 50.
In the example considered, the communication interface 80 comprises:
    a transmission memory 802a for temporarily saving output data, i.e., the data coming from the respective IP block 50;
    a reception memory 802b for temporarily saving input data, i.e., the data coming from the interconnection network 70;
    an interface 804 for exchanging data between the memories 802a, 802b and the interconnection network 70, for example for sending the data saved in the transmission memory 802a to the interconnection network 70 and saving the data received from the interconnection network 70 in the reception memory 802b; and
    a control circuit 806, which, for example, controls the flow of data between the IP block 50 and the interconnection network 70, monitors the state of the memories 802a and 802b, and generates the control signals for the IP block 50.
Typically, the reception memory 802b is a FIFO (First-In/First-Out) memory. However, in the case where the data received may be out of order, the reception memory 802b or the interface 804 may also re-order the data before they are written in the reception memory 802b. 
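Purely by way of non-limiting illustration, the re-ordering of out-of-order data ahead of a FIFO reception memory may be sketched in Python as follows. The class ReorderingReceiver and the sequence number attached to each data item are hypothetical and are assumed here only to make the example concrete; the disclosure does not specify how the order of the data is identified.

```python
from collections import deque


class ReorderingReceiver:
    """Hypothetical model of a FIFO reception memory preceded by a re-ordering stage."""

    def __init__(self):
        self.fifo = deque()   # models the FIFO reception memory 802b
        self.pending = {}     # data received out of order, keyed by sequence number
        self.next_seq = 0     # next sequence number expected in order

    def receive(self, seq, data):
        """Accept one data item from the network; 'seq' is an assumed ordering tag."""
        self.pending[seq] = data
        # Move every item that is now in order into the FIFO.
        while self.next_seq in self.pending:
            self.fifo.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

    def read(self):
        """Called by the IP block: return the oldest in-order item, or None if empty."""
        return self.fifo.popleft() if self.fifo else None


rx = ReorderingReceiver()
for seq, data in [(1, "B"), (0, "A"), (3, "D"), (2, "C")]:
    rx.receive(seq, data)
in_order = [rx.read() for _ in range(4)]
print(in_order)
```

In this sketch the IP block only ever sees data in the original order, regardless of the order of arrival from the interconnection network.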
In the example considered, no interface is illustrated for exchanging data between the IP block 50 and the memories 802a and 802b, because typically the IP block 50 is able to exchange the data directly with the memories 802a and 802b, for example by exploiting the control signals generated by the control circuit 806.
For instance, FIGS. 3a and 3b show a scenario of a typical data flow. In particular, FIG. 3a is a block diagram that shows the data flow of a typical transmission of data, and FIG. 3b is a flowchart that shows the respective transmission steps.
After an initial step 1000, the processor 10 sends, in a step 1002, an instruction to the block 50a indicating that the memory 20a contains data for the block 50a. For instance, for this purpose, the processor 10 may send to the block 50a an instruction indicating a start address and an end address within the memory 20a (or else a start address and the length of the transfer). Alternatively, the processor 10 could configure the aforesaid area by writing the start address and the end address directly in a configuration register of the block 50a.
Next, in a step 1004, the block 50a reads the data from the memory 20a by means of the respective communication interface 80a. In particular, the communication interface 80a typically sends a read request to the memory 20a for this purpose, and the memory 20a returns the requested data to the communication interface 80a. Typically, both the read request and the response are sent through the interconnection network 70 as data packets.
Finally, once all the data have been read, the block 50a or the communication interface 80a generates, in a step 1006, an interrupt that signals to the processor 10 the fact that the transmission has been completed.
Next, the processor 10 can allocate, in a step 1008, the respective area of the memory 20a to another process, and the procedure terminates in a step 1010.
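Purely by way of non-limiting illustration, the steps 1002 to 1008 described above may be sketched in Python as follows. The classes Memory, Processor and IPBlock, their methods, and the use of a start address plus a length are hypothetical simplifications; in particular, the sketch is synchronous and omits the interconnection network 70 and the communication interface 80a, whereas an actual transfer would proceed concurrently with the operation of the processor.

```python
class Memory:
    """Hypothetical stand-in for the memory 20a."""

    def __init__(self, contents):
        self.cells = bytearray(contents)

    def read(self, start, length):
        # Models the read request and the response exchanged in step 1004.
        return bytes(self.cells[start:start + length])


class Processor:
    """Hypothetical stand-in for the processor 10."""

    def __init__(self):
        self.busy_regions = set()   # memory areas reserved for a pending transfer
        self.events = []

    def instruct(self, block, start, length):
        # Step 1002: indicate to the block where its data are located.
        self.busy_regions.add((start, length))
        self.events.append("instruction sent")
        block.transfer(start, length)

    def on_interrupt(self, start, length):
        # Step 1008: the area may now be allocated to another process.
        self.busy_regions.discard((start, length))
        self.events.append("area released")


class IPBlock:
    """Hypothetical stand-in for the block 50a."""

    def __init__(self, memory, processor):
        self.memory = memory
        self.processor = processor
        self.received = b""

    def transfer(self, start, length):
        # Step 1004: read the data directly, without the processor moving them.
        self.received = self.memory.read(start, length)
        # Step 1006: signal completion to the processor by means of an interrupt.
        self.processor.events.append("interrupt raised")
        self.processor.on_interrupt(start, length)


memory_20a = Memory(b"0123456789ABCDEF")
processor_10 = Processor()
block_50a = IPBlock(memory_20a, processor_10)
processor_10.instruct(block_50a, start=4, length=6)
print(block_50a.received)
print(processor_10.events)
```

The processor appears only at the start (the instruction) and at the end (the interrupt) of the transfer; the data themselves never pass through it.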
Consequently, typically the blocks 50 access the memory 20 by means of a Direct Memory Access (DMA), i.e., the blocks 50 access the memory directly without any intervention on the part of the processor 10.
Typically, the aforesaid DMAs may be of two types: a data-write request or a data-read request. The read and write DMA transfers are substantially identical except for the data:
    in the case of a write request, the data are sent by the IP block 50 that has requested the DMA; and
    in the case of a read request, the data are sent by the destination block that receives the read request.
Both of the requests are typically characterized either by a start address and an end address from which data is to be read/written or by a start address and a length of the transfer.
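Purely by way of non-limiting illustration, the two equivalent ways of characterizing a request may be sketched in Python as follows. The class DmaRequest and its field names are hypothetical, and the end address is assumed here to be inclusive, which the disclosure does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmaRequest:
    """Hypothetical descriptor of a DMA request, stored as start address plus length."""
    is_write: bool
    start: int
    length: int

    @classmethod
    def from_start_end(cls, is_write, start, end):
        # Assumption: 'end' is the address of the last item transferred (inclusive).
        if end < start:
            raise ValueError("end address precedes start address")
        return cls(is_write, start, end - start + 1)

    @property
    def end(self):
        # Inclusive end address recovered from start and length.
        return self.start + self.length - 1


by_range = DmaRequest.from_start_end(False, 0x1000, 0x10FF)
by_length = DmaRequest(False, 0x1000, 0x100)
print(by_range == by_length, hex(by_length.end))
```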
For instance, the above address can comprise the address of a node of a NoC, the memory address within the destination (for example, in the case of a memory), or a combination of both. Consequently, both the write requests and the read requests are typically accompanied by a start address that identifies the addressee of the request, and the aforesaid address may belong to the memory map of the system. In this case, the interconnection network 70 decodes the address received and identifies the addressee that is to receive or supply the data and conveys appropriately the replies that it receives from the addressee to the source of the communication.
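Purely by way of non-limiting illustration, the decoding of an address belonging to the memory map of the system may be sketched in Python as follows. The address ranges in MEMORY_MAP and the function decode are hypothetical and are chosen only for the example; they do not correspond to any actual system.

```python
# Hypothetical memory map: (first address, last address, addressee).
MEMORY_MAP = [
    (0x0000_0000, 0x0000_FFFF, "memory 20a"),
    (0x0800_0000, 0x080F_FFFF, "memory 20b"),
    (0x4000_0000, 0x4000_0FFF, "interface 30"),
]


def decode(address):
    """Return the addressee and the offset within it for a system address."""
    for first, last, addressee in MEMORY_MAP:
        if first <= address <= last:
            return addressee, address - first
    raise LookupError(f"address {address:#x} is not mapped")


target, offset = decode(0x0800_0010)
print(target, hex(offset))
```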
Furthermore, the various blocks of the system 1 may also simultaneously access the interconnection network 70.
For instance, the blocks 10 and 50 are typically the communication sources (initiators), which request DMA transfers (both writing and reading transfers) in competition with one another, where each could even present a plurality of channels. Instead, the blocks 20, 30 and 40 are typically addressees, which receive or send data in accordance with the requests.
For this reason, there may exist simultaneously a number of DMA communication channels, which, once converted into the protocol of the interconnection network 70, are to be transmitted through the interconnection network 70 itself.
FIG. 4 shows an example of a typical solution that can be used for transmission of a plurality of DMA communications coming from respective circuits designated as a whole by 90. For instance, the circuits 90 may be the processor 10 and/or an IP block that sends a data-read request or a data-write request.
Typically, each transmission circuit 90 has an associated interface circuit 92 that converts the DMA transmission coming from the respective circuit 90 into a communication that uses the protocol of the interconnection network 70; i.e., the interface circuit 92 makes a conversion between the transport layer and the link layer. For instance, the blocks 10-40 are typically optimized for a given architecture, and the interface circuit 92 is directly integrated in the respective block. Instead, the IP blocks 50 are typically not optimized for a specific communication protocol, and consequently an additional interface is frequently required (see, for example, the blocks 80 in FIG. 1 or FIG. 3a). For instance, as mentioned previously, the interface circuit 92 could segment the DMA communication and add respective headers for forming data packets that can be forwarded to the destination through the interconnection network 70.
Frequently, different interface circuits 92 have to transmit data simultaneously. For this reason, the interconnection network 70 typically has an associated circuit 94 that regulates access to the interconnection network 70, which is typically referred to as arbiter, planner, or scheduler. Since access may consequently be delayed, the interface circuit 92 typically comprises a memory (see FIG. 2) for temporarily saving the data coming from the respective circuit 90, so as to render the operation of the respective circuit 90 independent of possible delays in the transmission of the data over the interconnection network 70.
Typically, the arbiter 94 is directly integrated in the interconnection network 70 and could be, for example, a router node of a NoC. In fact, in general, in the solution illustrated in FIG. 4, the arbiter 94 also uses the protocol of the interconnection network 70 and can, for example, analyze the header of the various packets for determining the priority of the transmissions in such a way as to guarantee a certain quality of service (QoS).
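Purely by way of non-limiting illustration, the segmentation of a DMA communication into data packets provided with headers may be sketched in Python as follows. The function segment, the header fields "src", "addr" and "len", and the maximum payload size are hypothetical; the actual packet format depends on the protocol of the interconnection network 70.

```python
def segment(source_id, start_address, payload, max_payload=4):
    """Split a DMA payload into (header, chunk) packets of at most max_payload bytes."""
    packets = []
    for offset in range(0, len(payload), max_payload):
        chunk = payload[offset:offset + max_payload]
        header = {
            "src": source_id,                 # assumed identifier of the initiator
            "addr": start_address + offset,   # destination address of this chunk
            "len": len(chunk),                # number of payload bytes in this packet
        }
        packets.append((header, chunk))
    return packets


packets = segment(source_id=5, start_address=0x100, payload=b"ABCDEFGHIJ")
for header, chunk in packets:
    print(header, chunk)
```

Because each header carries its own destination address, the packets in this sketch can be forwarded and delivered independently of one another.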
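Purely by way of non-limiting illustration, an arbiter that grants access according to a priority read from the packet header may be sketched in Python as follows. The class Arbiter, the header field "priority", the convention that a lower number denotes a higher priority, and the first-come-first-served handling of equal priorities are all assumptions made for the example; the disclosure does not prescribe a particular arbitration policy.

```python
import heapq
from itertools import count


class Arbiter:
    """Hypothetical priority arbiter regulating access to the interconnection network."""

    def __init__(self):
        self._queue = []
        self._arrival = count()   # tie-breaker: order of arrival

    def submit(self, packet):
        # Assumption: a lower 'priority' value denotes a more urgent packet.
        heapq.heappush(self._queue, (packet["priority"], next(self._arrival), packet))

    def grant(self):
        """Return the next packet allowed onto the network, or None if idle."""
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[2]


arbiter = Arbiter()
arbiter.submit({"src": "A", "priority": 2})
arbiter.submit({"src": "B", "priority": 0})
arbiter.submit({"src": "C", "priority": 2})
granted = [arbiter.grant()["src"] for _ in range(3)]
print(granted)
```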
Consequently, in general, numerous DMA channels may exist simultaneously, and hence the performance requirements, for example the efficiency of access to the memories 20, the latency of the communications, and the bandwidth, are frequently difficult to meet. Furthermore, silicon area occupation and energy consumption are further constraints, the reduction of which today represents a fundamental added value for integrated circuits.