1. Field of the Invention
The invention relates generally to packet switches implemented in integrated circuits and more particularly to packet switches that include devices that conform to the RapidIO architecture.
2. Description of Related Technology
Problems in Interconnecting the Subsystems of a Digital System
As the circuits that make up digital devices have gotten smaller and smaller, more and more subsystems have been included in the case that contains the processor. For example, thirty years ago, the case of a minicomputer contained only the processor and the memory. Hard drives and communications gear were separate. The case of a modern laptop PC has perhaps a tenth of the volume of the minicomputer case, but the case typically contains the processor, the memory, a hard drive, a removable drive, a keyboard, a screen, a microphone, a speaker, and various communications systems, including a wireless communications system.
As the number of subsystems in the case has grown, the difficulties of communication between the subsystems have increased. Originally, communication was by means of a single bus that connected the processor and all the peripherals. Driven by the different performance requirements of the peripherals, the single bus quickly became a hierarchy of buses. Subsystems are placed at the appropriate level in the hierarchy according to the performance level they require. Low performance subsystems are placed on lower performance buses, which are bridged to the higher performance buses so as to not burden the higher performance subsystems. Bridging may also be used to deal with legacy interfaces.
The need for higher levels of bus performance is driven by two key factors: first, the need for higher raw data bandwidth to support higher peripheral device performance requirements, and second, the need for more system concurrency. The overall system bandwidth requirements have also increased because of the increasing use of DMA, smart processor-based peripherals, and multiprocessing in systems.
Over the past several years the shared multi-drop bus has been exploited to its full potential. Many techniques have been applied, such as increasing frequency, widening the interface, pipelining transactions, splitting transactions, and allowing out of order completion. Continuing to work with a bus in this manner creates several design issues. Increasing bus width, for example, reduces the maximum achievable frequency due to skew between signals. More signals will also result in more pins on a device, traces on boards and larger connectors, resulting in a higher product cost and a reduction in the number of interfaces a system or device can provide. Worsening the situation is the desire to increase the number of subsystems that can communicate directly with each other. As frequency and width increase, the ability to have more than a few subsystems attached to a shared bus becomes a difficult design challenge. In many cases, system designers have inserted a hierarchy of bridges to reduce the number of loads on a single bus.
Using a High-Speed Switch to Connect Subsystems
A fundamental solution to the problem of using buses to connect subsystems of a digital system is to replace the bus with a very high speed switch. Subsystems of the digital system are connected to the switch and communicate by sending each other packets of information. Each packet contains a destination address, and the switch routes each packet to the destination subsystem. Advantages of using a very high speed switch instead of a bus include the following:
Communication between subsystems is point-to-point. A given subsystem need only deal with packets that have the subsystem as their destination.
Many subsystems can communicate concurrently.
If packets are sent serially, a given subsystem need only have a single bidirectional connection or two unidirectional connections to the switch, greatly reducing the pin count required for the subsystems.
A standard architecture for interconnecting subsystems with switches is RapidIO™, which is described in overview in the white paper, RapidIO: The Interconnect Architecture for High Performance Embedded Systems, copyrighted in 2003 and available at http://www.rapidio.org/zdata/techwhitepaper_rev3.pdf in 2005. There are two broad classes of devices in the RapidIO architecture: endpoints and switches. Endpoints have addresses in a RapidIO network and source and sink transactions to and from a RapidIO network. Switches are responsible for routing packets across the RapidIO network from the source to the destination using the information in the packet header without modifying the logical layer or transport layer contents of the packet. Switches do not have addresses in the RapidIO network. Switches also determine the topology of the RapidIO network and play an important role in the overall performance of RapidIO. Some RapidIO devices can act both as endpoints and switches. Both switches and endpoints support maintenance transactions which give users of the network access to architectural registers in the devices.
FIG. 1 shows a RapidIO network 101 and an example RapidIO packet 115. RapidIO network 101 includes two four-port RapidIO switch devices 113(0 and 1) and 6 RapidIO endpoint devices 107(0 . . . 5). Endpoints 107(0,2, and 4) are connected to switch 113(0), while endpoints 107(1,3, and 5) are connected to switch 113(1). Endpoint 107(3) is further connected to ROM 111 in subsystem 109 and endpoint 107(0) is a part of a CPU 103 which has access to DRAM memory 105. An example of what can be done in network 101 is the following: CPU endpoint 107(0) can make a RapidIO packet whose destination is endpoint 107(3) and which specifies that endpoint 107(3) is to read data from a location in ROM 111 and return the data to endpoint 107(0). When endpoint 107(0) places the packet on a connection to port 2 of switch 113(0), switch 113(0) routes the packet to switch 113(1), which then routes it to endpoint 107(3). Endpoint 107(3) then does the read operation and makes a return packet, which it outputs to switch 113(1), which in turn routes the packet to switch 113(0), which then routes it to endpoint 107(0). At that point, the CPU reads the data from the packet and stores it in DRAM 105.
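The round trip just described can be sketched as a table lookup at each switch. The following is a minimal illustration only, not the RapidIO routing mechanism itself: the names `ATTACHED`, `ROUTES`, and `route` are hypothetical, endpoints are identified simply by their index in 107(0 . . . 5), and port numbers are left out because the text names only one of them.

```python
from typing import Dict, List

# Hypothetical model of network 101: which switch each endpoint 107(n) attaches to.
ATTACHED: Dict[int, str] = {0: "switch0", 2: "switch0", 4: "switch0",
                            1: "switch1", 3: "switch1", 5: "switch1"}

# Per-switch routing table: destination endpoint -> next hop.
# A destination on the same switch is delivered directly; any other
# destination is forwarded to the peer switch.
ROUTES: Dict[str, Dict[int, str]] = {}
for switch, peer in (("switch0", "switch1"), ("switch1", "switch0")):
    ROUTES[switch] = {
        dest: (f"endpoint{dest}" if home == switch else peer)
        for dest, home in ATTACHED.items()
    }


def route(source: int, destination: int) -> List[str]:
    """Return the sequence of devices a packet visits from source to destination."""
    hops = [f"endpoint{source}"]
    node = ATTACHED[source]
    while node.startswith("switch"):
        hops.append(node)
        node = ROUTES[node][destination]  # lookup keyed on the destination only
    hops.append(node)
    return hops


request_path = route(0, 3)   # read request from the CPU endpoint
response_path = route(3, 0)  # return packet carrying the data
```

In this sketch the switches consult only the destination carried in the packet, which mirrors the statement that switches route on header information without altering the logical or transport layer contents.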
The RapidIO architecture has a layered architecture with three layers:
A logical layer that supports a variety of programming models, enabling an implementation to choose a model that is suitable for the implementation;
A transport layer that supports both large and small networks, allowing implementations to have a flexible topology; and
A physical layer that supports latency-tolerant backplane applications and latency-sensitive memory applications.
Components of the packets belong to each of these layers.
A typical RapidIO packet is shown at 115. Certain fields are context dependent and may not appear in all packets. The request packet begins with physical layer fields 119. Included in these fields is an “S” bit that indicates whether this is a packet or a control symbol. An “AckID” indicates an ID to be used when the packet is acknowledged with a control symbol. The “PRIO” field indicates the packet priority used for flow control. The “TT” field 121, “Target Address” field 125, and “Source Address” field 127 indicate, respectively, the type of transport level address mechanism being used, the address of the endpoint device the packet is to be delivered to, and the endpoint device from which the packet originated. The “Ftype” field 123 and the “Transaction” field 129 indicate the kind of transaction that the destination endpoint is to perform in response to the packet. “Size” field 131 is an encoded transaction size. Data payloads 137 in RapidIO packets are optional and range from 1 byte to 256 bytes in size. “srcTID” field 133 contains a transaction ID that the source endpoint has given the transaction. RapidIO devices may have up to 256 outstanding transactions between two endpoints. For memory mapped transactions, the “Device Offset Address” 135 follows. Data payload field 137 is followed by a 16-bit CRC. Then comes the next packet 141.
In terms of the layers, physical fields 119 and the 16-bit CRC field belong to the physical layer; TT field 121, target address field 125, and source address field 127 belong to the transport layer; the remaining fields belong to the logical layer. RapidIO packets are classified by the values of Ftype field 123 and Transaction field 129 according to the kind of transaction they belong to. The kinds of transactions include:
transactions involving coherent access to globally-shared memory;
transactions involving non-coherent reads and writes;
messaging transactions;
system support transactions;
flow control transactions; and
user defined transactions.
For the present discussion, system support transactions are of particular importance.
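The fields of packet 115 and their assignment to layers can be summarized in a small record type. This is a sketch under stated assumptions: the Python names are hypothetical, field widths and encodings are not modeled, and the 0 to 255 range for srcTID is inferred from the limit of 256 outstanding transactions rather than stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Layer assignment as described in the text; keys are hypothetical Python names
# listed in the order the fields appear in packet 115.
LAYER_OF_FIELD = {
    "s": "physical", "ack_id": "physical", "prio": "physical",
    "tt": "transport",
    "ftype": "logical",
    "target_address": "transport", "source_address": "transport",
    "transaction": "logical", "size": "logical", "src_tid": "logical",
    "device_offset_address": "logical", "payload": "logical",
    "crc": "physical",
}


@dataclass
class Packet:
    s: int                  # packet or control symbol
    ack_id: int             # ID used when the packet is acknowledged
    prio: int               # priority used for flow control
    tt: int                 # transport address mechanism
    ftype: int              # with `transaction`, selects the kind of transaction
    target_address: int
    source_address: int
    transaction: int
    size: int               # encoded transaction size
    src_tid: int            # transaction ID chosen by the source endpoint
    device_offset_address: Optional[int] = None  # memory mapped transactions only
    payload: bytes = b""    # optional, at most 256 bytes
    crc: int = 0            # 16-bit CRC

    def __post_init__(self) -> None:
        if len(self.payload) > 256:
            raise ValueError("data payload is limited to 256 bytes")
        if not 0 <= self.src_tid < 256:  # assumed range, see lead-in
            raise ValueError("srcTID out of range")
        if not 0 <= self.crc < (1 << 16):
            raise ValueError("CRC must fit in 16 bits")


def fields_in_layer(layer: str) -> List[str]:
    """List the field names that belong to the given layer, in packet order."""
    return [name for name, owner in LAYER_OF_FIELD.items() if owner == layer]
```

For example, `fields_in_layer("transport")` yields the three fields a switch needs in order to route a packet.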
Basic Approaches to the Design of Switches: FIG. 2
In the digital age, two basic approaches are used in the design of switches: crossbars and shared memory. FIG. 2 gives an example of each. At 201 is shown a switch architecture 201 that employs a crossbar switch. Data comes into the switch via a port 204. The incoming data is stored in a buffer 202 belonging to the port; data that is leaving the switch via the port exits at 204. Data that comes in at one port can be made to leave via another port by using crossbar switch 204 to connect the input port's buffer to the output port. The data is then output to the output port. The advantages of a crossbar switch are the following: once the connection is made between two ports, the switch has the bandwidth of the input and output media connected to the ports, and once the connection is made, it takes no significant time for the data to pass through the switch. The disadvantages are that large amounts of memory are required for the buffers, the routing for the crossbar switch is complex, and the number of connections required is enormous. A 24-port crossbar switch requires 552 connections and 35,328 conductors. Because of the complex routing and the large number of connections and conductors, an implementation of the switch made in an integrated circuit requires a large die area.
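The figures quoted for the 24-port crossbar can be reproduced with simple arithmetic. The sketch below assumes one connection from each port to every other port and 64 conductors per connection; the width of 64 is not stated in the text and is inferred only from the ratio 35,328 / 552 = 64. The function names are hypothetical.

```python
def crossbar_connections(ports: int) -> int:
    """Connections needed so that each port can reach every other port."""
    return ports * (ports - 1)


def crossbar_conductors(ports: int, conductors_per_connection: int = 64) -> int:
    """Total conductors, assuming a fixed number of conductors per connection."""
    return crossbar_connections(ports) * conductors_per_connection


connections = crossbar_connections(24)  # 24 * 23 = 552
conductors = crossbar_conductors(24)    # 552 * 64 = 35,328
```

Because the connection count grows with the square of the port count, doubling the number of ports roughly quadruples the wiring, which is the source of the die area problem noted above.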
At 207 is shown an implementation of a switch made using shared memory. All of the ports 209 share access to shared memory 211. Because the memory is shared, arbiter 217 must determine which of the ports gets access to shared memory at any particular time. Shared memory 211 contains two kinds of information: the packets 213 being switched by the switch and a descriptor table 215, which contains descriptors that describe the location and size of each of the packets stored in packet memory 213. The descriptors are organized into queues belonging to the ports. When a packet comes in at a port 209, the port stores the packet in packet memory 213, makes a descriptor for the packet, and places the descriptor on the queue of the port by which the packet is to leave the switch. When a port's queue contains descriptors, the port outputs the packets described by the descriptors until the queue is empty. The bandwidth and latency of switch 207 are determined by how long it takes to store a packet into shared memory, make and place the descriptor in the proper queue in descriptor table 215, and read the packet from shared memory 211. As long as a port has descriptors on its descriptor queue, the port can output packets. The routing for switch 207 is far less complex than that for switch 201; however, shared memory 211 must be large, and the operation of making a descriptor and placing it on the proper queue is too complex to be easily done in simple hardware.
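The store, describe, enqueue, and output sequence of the shared memory switch can be sketched in software as follows. This is an illustration under simplifying assumptions, not a description of any particular hardware: the class and method names are hypothetical, the arbiter is omitted because the sketch is single-threaded, and packet memory is allocated sequentially and never reclaimed.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass
class Descriptor:
    offset: int  # location of the packet in packet memory
    size: int    # length of the packet in bytes


class SharedMemorySwitch:
    def __init__(self, num_ports: int, memory_size: int) -> None:
        self.packet_memory = bytearray(memory_size)
        self.next_free = 0  # sequential allocation only (simplification)
        # One descriptor queue per output port.
        self.queues: List[Deque[Descriptor]] = [deque() for _ in range(num_ports)]

    def receive(self, out_port: int, packet: bytes) -> None:
        """Store an incoming packet and queue its descriptor on the output port."""
        end = self.next_free + len(packet)
        if end > len(self.packet_memory):
            raise MemoryError("shared memory is full")
        self.packet_memory[self.next_free:end] = packet
        self.queues[out_port].append(Descriptor(self.next_free, len(packet)))
        self.next_free = end

    def transmit(self, port: int) -> List[bytes]:
        """Output the packets described on a port's queue until the queue is empty."""
        sent = []
        queue = self.queues[port]
        while queue:
            d = queue.popleft()
            sent.append(bytes(self.packet_memory[d.offset:d.offset + d.size]))
        return sent
```

Even in this reduced form, every packet costs a memory write, a descriptor allocation, a queue insertion, and a memory read, which is the work that the text describes as difficult to do in simple hardware.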
It is an object of the invention to provide a new switch architecture for switches implemented in integrated circuits that overcomes the disadvantages of the switch architectures of FIG. 2 and thereby provide improved switches for use in interconnecting subsystems of a digital system. It is a further object of the invention to provide a device for the RapidIO and similar architectures that is implemented as an integrated circuit, and includes both an endpoint and one or more switches. Other objects and advantages will be apparent to those skilled in the arts to which the invention pertains upon perusal of the following Detailed Description and drawing, wherein:
Reference numbers in the drawing have three or more digits: the two right-hand digits are reference numbers in the drawing indicated by the remaining digits. Thus, an item with the reference number 203 first appears as item 203 in FIG. 2.