I.A. Field of the Invention
This disclosure teaches a methodology for the design of custom system-on-chip communication architectures. Specifically, a novel electronic system and a method of designing a communication architecture are disclosed. This Application is concurrently filed with U.S. Patent Application No. 09/576,956 by Raghunathan et al.
I.B. Background of the Invention
The evolution of the System-on-Chip (SOC) paradigm in electronic system design has the potential to offer the designer several benefits, including improvements in system cost, size, performance, power dissipation, and design turn-around-time. The ability to realize this potential depends on how well the designer exploits the customizability offered by the system-on-chip approach. While one dimension of this customizability is manifested in the diversity and configurability of the components that are used to compose the system (e.g., processor and domain-specific cores, peripherals, etc.), another, equally important, aspect is the customizability of the system communication architecture. In order to support the increasing diversity and volume of on-chip communication requirements, while meeting stringent performance constraints and power budgets, communication architectures need to be customized to the target system or application domain in which they are used.
Related work in the fields of system-level design, HW/SW co-design, and networking protocols has been examined herein to place the disclosed techniques in the context of conventional technologies. A substantial body of work exists in relation to system-level synthesis of application-specific architectures through HW/SW partitioning and mapping of the application tasks onto pre-designed cores and application-specific hardware. For more details, see D. D. Gajski, F. Vahid, S. Narayan and J. Gong, Specification and Design of Embedded Systems. Prentice Hall, 1994; G. De Micheli, Synthesis and Optimization of Digital Circuits. McGraw-Hill, New York, N.Y., 1994; R. Ernst, J. Henkel, and T. Benner, "Hardware-software cosynthesis for microcontrollers," IEEE Design and Test Magazine, pp. 64-75, Dec. 1993; T. B. Ismail, M. Abid, and A. A. Jerraya, "COSMOS: A codesign approach for a communicating system," in Proc. IEEE International Workshop on Hardware/Software Codesign, pp. 17-24, 1994; A. Kalavade and E. Lee, "A globally critical/locally phase driven algorithm for the constrained hardware software partitioning problem," in Proc. IEEE International Workshop on Hardware/Software Codesign, pp. 42-48, 1994; P. H. Chou, R. B. Ortega, and G. B. Borriello, "The CHINOOK hardware/software cosynthesis system," in Proc. Int. Symp. System Level Synthesis, pp. 22-27, 1995; B. Lin, "A system design methodology for software/hardware codevelopment of telecommunication network applications," in Proc. Design Automation Conf., pp. 672-677, 1996; B. P. Dave, G. Lakshminarayana, and N. K. Jha, "COSYN: hardware-software cosynthesis of embedded systems," in Proc. Design Automation Conf., pp. 703-708, 1997; and P. Knudsen and J. Madsen, "Integrating communication protocol selection with partitioning in hardware/software codesign," in Proc. Int. Symp. System Level Synthesis, pp. 111-116, Dec. 1998.
While some of these conventional techniques attempt to consider the impact of communication effects during HW/SW partitioning and mapping, they either assume a fixed communication protocol (e.g., PCI-based buses), or select from a "communication library" of a few alternative protocols. Research on system-level synthesis of communication architectures mostly deals with synthesis of the communication architecture topology, which refers to the manner in which components are structurally connected through dedicated links or shared communication channels (buses). For more details on these architectures, see T. Yen and W. Wolf, "Communication synthesis for distributed embedded systems," in Proc. Int. Conf. Computer-Aided Design, pp. 288-294, Nov. 1995; J. Daveau, T. B. Ismail, and A. A. Jerraya, "Synthesis of system-level communication by an allocation based approach," in Proc. Int. Symp. System Level Synthesis, pp. 150-155, Sept. 1995; M. Gasteier and M. Glesner, "Bus-based communication synthesis on system level," in ACM Trans. Design Automation Electronic Systems, pp. 1-11, Jan. 1999; and R. B. Ortega and G. Borriello, "Communication synthesis for distributed embedded systems," in Proc. Int. Conf. Computer-Aided Design, pp. 437-444, 1998.
While topology selection is a critical step in communication architecture design, equally important is the design of the protocols used by the channels/buses in the selected topology. For example, the nature of communication traffic generated by the system components may favor the use of a time-slice based bus protocol in some cases, and a static priority based protocol in others. For more details, see "Sonics Integration Architecture," Sonics Inc. (http://www.sonicsinc.com/) and On-Chip Bus Development Working Group Specification I Version 1.1.0. VSI Alliance, Aug. 1998. The VSI Alliance on-chip bus working group has recognized that a multitude of bus protocols will be needed to serve the wide range of SOC communication requirements. See On-Chip Bus Development Working Group Specification I Version 1.1.0. VSI Alliance, Aug. 1998. Further, most protocols offer the designer avenues for customization in the form of parameters such as arbitration priorities and transfer block sizes. Choosing appropriate values for these parameters can significantly impact the latency and transfer bandwidth associated with inter-component communication.
Finally, there is a body of work on interface synthesis, which deals with automatically generating efficient hardware implementations for component-to-bus or component-to-component interfaces. For more details, see G. Borriello and R. H. Katz, "Synthesis and optimization of interface transducer logic," in Proc. Int. Conf. Computer Design, Nov. 1987; J. S. Sun and R. W. Brodersen, "Design of system interface modules," in Proc. Int. Conf. Computer-Aided Design, pp. 478-481, Nov. 1992; P. Gutberlet and W. Rosenstiel, "Specification of interface components for synchronous data paths," in Proc. Int. Symp. System Level Synthesis, pp. 134-139, 1994; S. Narayanan and D. D. Gajski, "Interfacing incompatible protocols using interface process generation," in Proc. Design Automation Conf., pp. 468-473, June 1995; P. Chou, R. B. Ortega, and G. Borriello, "Interface co-synthesis techniques for embedded systems," in Proc. Int. Conf. Computer-Aided Design, pp. 280-287, Nov. 1995; J. Oberg, A. Kumar, and A. Hemani, "Grammar-based hardware synthesis of data communication protocols," in Proc. Int. Symp. System Level Synthesis, pp. 14-19, 1996; R. Passerone, J. A. Rowson, and A. Sangiovanni-Vincentelli, "Automatic synthesis of interfaces between incompatible protocols," in Proc. Design Automation Conf., pp. 8-13, June 1998; and J. Smith and G. De Micheli, "Automated composition of hardware components," in Proc. Design Automation Conf., pp. 14-19, June 1998. These techniques address issues in the implementation of specified protocols, and not in the customization of the protocols themselves.
In summary, conventional technologies in the field of system-level design and HW/SW co-design do not adequately address the problem of customizing the protocols used in SOC communication architectures to the needs of the application. Further, in previous research, design of the communication architecture is performed statically using information about the application and its environment (e.g., typical input traces). However, in several applications, the communication bandwidth required by each component, the amount of data it needs to communicate, and the relative "importance" of each communication request may be subject to significant dynamic variations. In such situations, protocols used in conventional communication architectures may not be capable of adapting the underlying communication topology to meet the application's varying needs.
In the field of telecommunications and networking protocol design, a significant body of research has been devoted to the design of protocols to meet diverse quality of service (QoS) parameters such as connection establishment delay and failure probability, throughput, residual error ratio, etc. For details on these parameters, see A. S. Tanenbaum, Computer Networks. Englewood Cliffs, N.J., Prentice Hall, 1989. Sophisticated conventional techniques such as flow and traffic control algorithms have been proposed in that context for adapting the protocol to improve the above-mentioned metrics.
With increasing complexity, system-on-chip communication architectures will need to evolve by drawing upon some of the techniques that have been developed in the context of telecom networks. However, there are significant differences such as, but not limited to, the latency requirements, error tolerance and resilience requirements, which differentiate the problem that is addressed herein and the problems encountered in telecom network protocol design.
In this section, the need for CAT-based communication architectures is demonstrated by showing how the limited flexibility of conventional communication architectures, and their inability to adapt to the varying communication needs of the system components, can lead to significant deterioration in the system's performance.
Consider the example system shown in FIG. 1 that represents part of the TCP/IP communications protocol used in a network interface card (hereinafter this system is referred to as the TCP system). The system shown in FIG. 1 performs checksum-based encoding (for outgoing packets) and error detection (for incoming packets), and interfaces with the Ethernet controller peripheral (which implements the physical and link layer network protocols). Since packets in the TCP protocol do not contain any notion of quality of service (QoS), the packet data structure has been enhanced to contain a field in the header that indicates a deadline for the packet to be processed. The objective during the implementation of the system is to minimize the number of packets with missed deadlines.
FIG. 1(a) shows the behavior of the TCP system as a set of concurrent communicating tasks or processes. The tasks performed by the TCP system for a packet received by the system from the network are explained herein. The process ether_driver, which represents the Ethernet device driver, reads data from the Ethernet controller peripheral and creates a packet in the shared system memory. Process pkt_queue maintains a queue containing selected information from the packet headers. Process ip_check dequeues packet information from the above-mentioned queue, zeroes out some specific fields in the packet header, and co-ordinates the checksum computation. Process checksum retrieves the packet from the shared memory and computes the checksum value for each packet and returns the value to the ip_check process, which flags an error when appropriate.
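The checksum step described above can be illustrated with a short sketch. The disclosure does not state which checksum algorithm is used; the sketch assumes the standard 16-bit one's-complement Internet checksum, and the function names, the checksum field offset of 10 bytes, and the sample header bytes are all hypothetical choices made for illustration only.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over big-endian 16-bit words (assumed algorithm)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for k in range(0, len(data), 2):
        total += (data[k] << 8) | data[k + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def fill_checksum(header: bytes, offset: int = 10) -> bytes:
    """Zero the 16-bit checksum field at `offset` (hypothetical position), then store the value."""
    zeroed = header[:offset] + b"\x00\x00" + header[offset + 2:]
    value = internet_checksum(zeroed)
    return zeroed[:offset] + value.to_bytes(2, "big") + zeroed[offset + 2:]


def has_error(header: bytes) -> bool:
    """A header that carries a correct checksum sums to zero after complementing."""
    return internet_checksum(header) != 0


# Hypothetical 20-byte header with the checksum field (bytes 10-11) left at zero.
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
sent = fill_checksum(header)
corrupted = bytes([sent[0] ^ 0x01]) + sent[1:]  # flip one bit to mimic a transmission error
```

In this sketch, fill_checksum plays the role of the outgoing-packet encoding, while has_error corresponds to the check after which ip_check would flag an error for an incoming packet.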
FIG. 1(b) shows the system architecture used to implement the TCP system. The ether_driver and pkt_queue processes are mapped to embedded software running on a MIPS R3000 processor, while the ip_check and checksum processes are implemented using dedicated hardware. All communications between the system components are implemented using a shared bus. The protocol used in the shared bus supports static priority based arbitration and DMA-mode transfer. Herein, the term DMA-mode transfer is used to refer to the transmission of data in clusters or chunks larger than a single bus word. The granularity of these chunks is governed by the value of the DMA size parameter assigned to each component. In static priority based arbitration, each component connected to the bus is assigned a fixed priority. At any time, the arbiter grants the use of the bus to the requesting component with the highest priority value.
The bus arbiter and the bus interfaces of the components together implement the bus protocol. The bus protocol allows the system designer to specify values for various parameters such as the bus priorities and DMA block size for each component, etc.
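The static priority based arbitration and DMA-mode transfer described above can be sketched as follows. This is a minimal behavioral model under stated assumptions, not the disclosed hardware: the class and field names, the priority values, the DMA sizes, and the pending word counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BusMaster:
    name: str
    priority: int           # fixed at design time; a larger value wins arbitration
    dma_size: int           # maximum bus words transferred per grant (assumed parameter)
    pending_words: int = 0  # words this component still needs to transfer


def arbitrate(masters):
    """Return the requesting master with the highest static priority, or None."""
    requesting = [m for m in masters if m.pending_words > 0]
    if not requesting:
        return None
    return max(requesting, key=lambda m: m.priority)


def run_bus(masters):
    """Serve every pending request; return the sequence of (name, words) grants."""
    grants = []
    while (winner := arbitrate(masters)) is not None:
        burst = min(winner.dma_size, winner.pending_words)
        winner.pending_words -= burst
        grants.append((winner.name, burst))
    return grants


# Hypothetical parameter values for the three bus masters of the TCP system.
masters = [
    BusMaster("ether_driver", priority=1, dma_size=4, pending_words=6),
    BusMaster("ip_check", priority=2, dma_size=2, pending_words=2),
    BusMaster("checksum", priority=3, dma_size=8, pending_words=10),
]
grants = run_bus(masters)
```

With these values the lowest-priority component, ether_driver, is served only after checksum and ip_check have drained their requests, regardless of how urgent its data may be.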
The performance of the TCP system of FIG. 1 for several distinct values of the bus protocol parameters is analyzed. For this experiment, for ease of explanation, only the bus priority values for each component are varied, with fixed values used for the remaining protocol parameters. The system simulation was performed using traces of packets with varying laxities of deadlines. An abstract view of the execution of the TCP system processing four packets (numbered i, i+1, j, j+1) is shown in FIG. 2. The figure indicates the times at which each packet arrives from the network, and the deadline by which it needs to be processed. Note that while the arrival times of the packets are in the order i, i+1, j, j+1, the deadlines are in a different order i+1, i, j, j+1. For the sake of the present illustration, two different bus priority assignments are considered (checksum > ip_check > ether_driver and ether_driver > ip_check > checksum). While other priority assignments are not explicitly considered here, it should be clear to a skilled artisan that the arguments presented for one of the above two cases will hold for every other priority assignment.
The first waveform in FIG. 2 represents the execution of the system when the bus priority assignment checksum > ip_check > ether_driver is used. After the completion of the ether_driver process for packet i, the arbiter receives two conflicting bus access requests: process ip_check requests bus access to process packet i, while ether_driver requests bus access to process packet i+1 (since packet i+1 has already arrived from the network). Based on the priority assignment used, the arbiter gives bus access to process ip_check. This effectively delays the processing of packet i+1 until ip_check and checksum have completed processing packet i. This leads to packet i+1 missing its deadline. Packets j and j+1 do meet their deadlines, as shown in FIG. 2. In general, for any sequence of packets whose deadlines are not in the same order as their arrival times, the priority assignment (checksum > ip_check > ether_driver) may lead to missed deadlines.
An attempt is made to eliminate the problem mentioned above by using a different priority assignment (ether_driver > ip_check > checksum) for the bus protocol. The execution of the system under the new priority assignment is depicted in the second waveform of FIG. 2. As a result of the new priority assignment, when packet i+1 arrives, process ether_driver is able to process it without waiting for packet i to complete. This results in the deadlines for both packets i and i+1 being met. However, consider packets j and j+1 whose deadlines are in the same order as their arrival times. After process ether_driver completes processing packet j, contention for the shared bus occurs between process ether_driver for packet j+1, and process ip_check for packet j. Based on the chosen priority assignment, the arbiter decides in favor of process ether_driver. This delays the execution of processes ip_check and checksum for packet j, leading to the system missing packet j's deadline.
In summary, each of the two bus priority assignments considered for the TCP system led to missed deadlines. Further, the arguments presented in the previous two paragraphs can be applied to show that for every possible priority assignment, either packet i+1 or packet j will miss its deadline.
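The argument above can be checked exhaustively with a small sketch. It models only the two arbitration decisions discussed for FIG. 2, not the bus timing, and treats a grant to the less urgent request as the decision that may lead to a missed deadline. The deadline values are hypothetical and chosen only to reproduce the deadline ordering given in the text (i+1 before i, and j before j+1).

```python
from itertools import permutations

COMPONENTS = ("ether_driver", "ip_check", "checksum")

# Each contention lists the competing requests as (component, packet, deadline).
# Deadline values are hypothetical; only their relative order matters.
CONTENTIONS = [
    # After ether_driver finishes packet i: ip_check wants i, ether_driver wants i+1.
    [("ip_check", "i", 20), ("ether_driver", "i+1", 12)],
    # After ether_driver finishes packet j: ip_check wants j, ether_driver wants j+1.
    [("ip_check", "j", 40), ("ether_driver", "j+1", 48)],
]


def static_winner(requests, order):
    """Grant the request whose component appears earliest in `order` (highest priority first)."""
    rank = {component: position for position, component in enumerate(order)}
    return min(requests, key=lambda request: rank[request[0]])


def most_urgent(requests):
    """The request whose packet has the earliest deadline."""
    return min(requests, key=lambda request: request[2])


def badly_served(order):
    """Indices of contentions in which the static arbiter favors the less urgent request."""
    return [
        index
        for index, requests in enumerate(CONTENTIONS)
        if static_winner(requests, order) != most_urgent(requests)
    ]


# Evaluate all six possible static priority assignments.
results = {order: badly_served(order) for order in permutations(COMPONENTS)}
```

Because the two contentions require opposite relative priorities for ether_driver and ip_check, every one of the six assignments resolves exactly one of them against the more urgent packet in this model.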
The deficiency of the communication architecture that leads to missed deadlines in the TCP example can be summarized as follows. The relative importance of the communication transactions generated by the various system components (ether_driver, ip_check, and checksum) varies depending on the deadlines of the packets they are processing. In general, the importance or criticality of each communication transaction may depend on several factors that together determine whether the communication will be on the system's critical path. The communication architecture needs to be able to discern between more critical and less critical communication requests and serve them accordingly.
As shown in the TCP example, conventional communication architectures suffer from at least the following drawbacks: (i) the degree of customizability offered may be insufficient in systems with stringent performance requirements, and (ii) they are typically not capable of sensing and adapting to the varying communication needs of the system and the varying nature of the data being communicated.
In this disclosure, a general methodology for the design of custom system-on-chip communication architectures, which are flexible and capable of adapting to varying communication needs of the system components, is presented. The disclosed technique can be used to optimize any underlying communication architecture topology by rendering it capable of adapting to the changing communication needs of the components connected to it. For example, more critical data may be handled differently, leading to lower communication latencies. This results in significant improvements in various quality of service (QoS) metrics, including the overall system performance, observed communication bandwidth and bus utilization, and the system's ability to meet critical deadlines.
The present technique is based on the addition of a layer of circuitry, called the Communication Architecture Tuner (CAT), to each component. The CAT monitors and analyzes the internal state of, and communication transactions generated by, a system component and "predicts" the relative importance of communication transactions in terms of their impact on different system-level performance metrics. The results of the analysis are used by the CAT to configure the parameters of the underlying communication architecture to best suit the component's changing communication needs.
To meet the objects of the invention, there is provided an electronic system with a plurality of components interconnected by a plurality of shared communication channels, wherein at least one component comprises a communication architecture tuner, wherein said tuner enables the electronic system to adapt to changing communication needs of the electronic system.
Another aspect of the present invention is an electronic system with a plurality of components interconnected by a plurality of shared communication channels, wherein an underlying communication architecture comprises a layer of circuitry with at least one communication architecture tuner wherein said tuner enables the electronic system to adapt to changing communication needs of the electronic system.
Preferably the communication architecture tuner is partially implemented in software.
Preferably the communication architecture tuner further comprises at least one partition detector, which detects the number of partitions and the conditions that must be satisfied by a communication transaction to classify the transaction under a partition, and at least one parameter generation circuit that computes values for communication protocol parameters based on a partition ID generated by the partition detector.
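The division of the tuner into a partition detector and a parameter generation circuit can be sketched in software as follows. This is an illustrative behavioral model only, under several assumptions not taken from the disclosure: transactions are classified by their slack (deadline minus current time), the slack bounds are fixed thresholds, and the protocol parameters are a bus priority and a DMA size. All class names, field names, and numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    component: str
    words: int
    deadline: int  # time by which the associated packet must be processed


@dataclass(frozen=True)
class ProtocolParams:
    priority: int  # larger value means higher bus priority
    dma_size: int  # bus words per grant


class PartitionDetector:
    """Classifies a transaction into a partition by its slack (assumed criterion)."""

    def __init__(self, slack_bounds):
        # Ascending upper bounds; partition k holds transactions with slack <= bounds[k].
        self.slack_bounds = sorted(slack_bounds)

    @property
    def num_partitions(self):
        return len(self.slack_bounds) + 1

    def partition_id(self, txn, now):
        slack = txn.deadline - now
        for pid, bound in enumerate(self.slack_bounds):
            if slack <= bound:
                return pid
        return len(self.slack_bounds)  # least critical partition


class ParameterGenerator:
    """Maps a partition ID to protocol parameter values through a lookup table."""

    def __init__(self, table):
        self.table = table

    def params_for(self, pid):
        return self.table[pid]


class CommunicationArchitectureTuner:
    def __init__(self, detector, generator):
        self.detector = detector
        self.generator = generator

    def tune(self, txn, now):
        return self.generator.params_for(self.detector.partition_id(txn, now))


# Hypothetical configuration: three partitions, most critical first.
cat = CommunicationArchitectureTuner(
    PartitionDetector([10, 30]),
    ParameterGenerator({
        0: ProtocolParams(priority=3, dma_size=8),
        1: ProtocolParams(priority=2, dma_size=4),
        2: ProtocolParams(priority=1, dma_size=2),
    }),
)
urgent = cat.tune(Transaction("ether_driver", words=16, deadline=12), now=5)
relaxed = cat.tune(Transaction("ip_check", words=4, deadline=60), now=5)
```

In this sketch the same component can receive different priorities at different times, which is the adaptation that a static priority assignment cannot provide.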