The present invention is related generally to processing architectures, and more particularly, to processing and communication node architectures that process real-time control messages and data transfers to and from memory.
Currently, in large real-time systems, handling time-critical control messages between subsystems or nodes of the system is a problem, particularly when those messages are mixed with large block data transfers. This problem affects the predictability and determinism of latencies for critical messages.
Currently available technology solves the problem of handling time-critical control messages using one of two approaches. The first approach is to implement priority schemes to give priority to time-critical control messages over block data transfers. Alternatively, the second approach uses one type of communication media for control that has a low bandwidth (such as a 1553 bus), and a dedicated link to each processing node.
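By way of illustration only, the first approach (a priority scheme on a single shared link) can be sketched as follows. The class name, the priority encoding, and the message names are hypothetical and are not taken from any particular prior art system; the sketch simply shows queued control messages being sent ahead of queued block transfers.

```python
import heapq
from itertools import count

# Hypothetical priority encoding: a lower value is sent first.
CONTROL, BLOCK = 0, 1


class PriorityLink:
    """A single shared link that always sends the highest-priority pending message."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a priority class

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def drain(self):
        """Return the names of all pending messages in the order they would be sent."""
        order = []
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            order.append(name)
        return order


link = PriorityLink()
link.submit(BLOCK, "block-A")
link.submit(BLOCK, "block-B")
link.submit(CONTROL, "ctrl-1")  # arrives last but is sent first
order = link.drain()
print(order)
```

In this sketch a control message overtakes only transfers that are still queued; a transfer already occupying the link is not interrupted.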
The first approach lacks commercial support and requires non-standard communication protocols and a one-of-a-kind implementation that is expensive. This first approach also adds complexity and unpredictability to the communication network. The second approach requires two different communication protocols for each node, one for moving data among nodes at a reasonable bandwidth (such as a VME bus) and a different protocol and bus for handling control messages (such as a 1553 bus).
Thus, with regard to conventional processing architectures, the two alternative design approaches do not offer high bandwidth I/O at low cost. In the first approach, one must live with the bandwidth limitation of intermediate busses such as PCI, VME, or Raceway busses, for example, or a variety of local busses that provide much less than 500 MB/sec memory bandwidth (and that typically have a peak bandwidth of about 120 to 160 MB/sec). In other words, data from the high bandwidth I/O device passes through, and is limited by, intermediate low bandwidth busses before arriving at the memory. In the second approach, expensive memory controllers and dual port structures are built without taking advantage of low cost commercial processor support chips to provide increased bandwidth to local memory. However, this approach still falls short of a desired half-gigabyte to one gigabyte per second bandwidth access to memory (it typically requires several ASICs per processor type and provides about 200 to 300 MB/sec).
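The bottleneck effect described above can be illustrated with simple arithmetic: the sustained bandwidth of a path is no greater than that of its slowest hop. The following is a minimal sketch under stated assumptions. The 64 MB block size is hypothetical, the 160 MB/sec figure is the upper end of the peak range quoted above, and protocol overhead is ignored.

```python
def path_bandwidth(hops_mb_per_s):
    """Sustained bandwidth of a path is bounded by its slowest hop."""
    return min(hops_mb_per_s)


def transfer_time_s(size_mb, hops_mb_per_s):
    """Idealized transfer time for a block, ignoring protocol overhead."""
    return size_mb / path_bandwidth(hops_mb_per_s)


BLOCK_MB = 64.0           # hypothetical block size
via_bus = [500.0, 160.0]  # 500 MB/sec I/O device behind a 160 MB/sec intermediate bus
direct = [500.0]          # the same I/O device with a direct path to memory

t_bus = transfer_time_s(BLOCK_MB, via_bus)
t_direct = transfer_time_s(BLOCK_MB, direct)
print(f"via intermediate bus: {t_bus:.3f} s, direct: {t_direct:.3f} s")
```

Under these assumptions the intermediate bus, not the I/O device, sets the transfer time.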
There are a great number of applications that need or can take advantage of high bandwidth (500 MB/sec or greater) DMA access to a processor memory at the node level. Examples include radar signal processing, electro-optic image enhancement, automatic target recognition, autotracking and image compression applications. There is also a need for an interconnected cache coherent system, both in symmetric multiprocessing (SMP) architectures and in distributed shared memory architectures.
Accordingly, it is an objective of the present invention to provide for improved processing and communication architectures that efficiently process real-time control messages and data transfers to and from memory and that overcome the limitations of the conventional approaches outlined above. It is an objective of the present invention to provide for processing and communication architectures that segregate the control link from the high speed data link. It is another objective of the present invention to provide for processing and communication architectures wherein multiple input/output nodes share a single control link. It is yet another objective of the present invention to provide for processing and communication architectures wherein the high speed data link is directly interfaced to memory and appears as a processor to the memory.
To meet the above and other objectives, one aspect of the present invention is an architecture that efficiently processes time-critical real-time control messages in a large real-time system. The present invention separates control messages from block data transfers and uses identical but separate links to transfer the block data and control messages. One embodiment of the present invention has a single control link shared by multiple processing nodes (to provide for an economy of scale) while each node has an individual high bandwidth data link.
The present invention separates control messages from block data transfers at a node level and shares a single control link among multiple nodes or subsystems. The present invention allows a single control node to process control messages destined for multiple nodes and deliver them to the respective nodes with low latency. The present invention also provides a low latency message transmission mechanism, picking up control messages from nodes and transferring them by way of the single control link. The present invention, while separating control messages from block data transfers using separate physical links to achieve low latency and predictability, provides a unified protocol for control and data.
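The separation described above may be sketched, for illustration only, as a router that places control messages on one link shared by all nodes and places block data on the individual link of the destination node. All names below (Message, Cluster, send, deliver_control) are hypothetical, and the queues merely stand in for physical links; this is not a description of the actual hardware.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    dest: int       # destination node id
    kind: str       # "control" or "data"
    payload: bytes


class Cluster:
    """Hypothetical model: one shared control link, one data link per node."""

    def __init__(self, node_ids):
        self.control_link = deque()                       # shared by all nodes
        self.data_links = {n: deque() for n in node_ids}  # one per node

    def send(self, msg):
        """Route by message kind so block data never queues ahead of control."""
        if msg.kind == "control":
            self.control_link.append(msg)
        elif msg.kind == "data":
            self.data_links[msg.dest].append(msg)
        else:
            raise ValueError(f"unknown message kind: {msg.kind!r}")

    def deliver_control(self):
        """Drain the shared control link and hand each message to its node."""
        delivered = {n: [] for n in self.data_links}
        while self.control_link:
            msg = self.control_link.popleft()
            delivered[msg.dest].append(msg.payload)
        return delivered


cluster = Cluster([0, 1])
cluster.send(Message(dest=0, kind="data", payload=bytes(1024)))  # block transfer
cluster.send(Message(dest=1, kind="control", payload=b"start"))
cluster.send(Message(dest=0, kind="control", payload=b"stop"))

pending_control = len(cluster.control_link)
delivered = cluster.deliver_control()
print(pending_control, delivered)
```

Because the two message kinds never share a queue in this sketch, the wait seen by a control message depends only on other control traffic and not on the size of any block transfer.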
One specific implementation uses a shared scalable coherent interface (SCI) control link (either parallel or serial SCI) that serves multiple processing nodes to handle time-critical control messages. Each node has its own 500 MB/sec parallel SCI link to handle block data transfers among nodes or from external sensors. The shared SCI control link handles control messages for all nodes. Thus, a high bandwidth node architecture is provided that allows gigabyte per second I/O bandwidth data transfers directly to processor memory.
The present invention uses a high bandwidth I/O device that mimics (interfaces and behaves like) a second processor and uses multiprocessor support features of a commercial microprocessor and support chips to tie directly to memory using minimal “glue” logic. The present invention provides a cache coherent direct memory access link that allows a low cost implementation of a distributed cache coherent system.
The present architecture provides high bandwidth direct memory access and does so with minimal design complexity by using multiprocessing or coprocessing features of existing and planned commercial microprocessors and their support chips. The support for a cache coherent high bandwidth link allows building of low cost symmetric multiprocessing clusters or distributed shared memory architectures.
The present invention improves upon three specific aspects of high bandwidth processing nodes. First, the present invention provides two alternative ways to connect a high bandwidth I/O link to the node. Second, the present invention provides a simple way of making a high bandwidth I/O device interface like a second processor to commercial microprocessors and support chips (using the multiprocessor support features) to simplify the device interface. Third, the present invention specifically provides for the design of a high bandwidth processing node using a PowerPC™ 603/603e/604 processor with an MPC-106 controller and a 500 MB/sec scalable coherent interface (SCI) having direct access to memory. The implementation uses DRAM and a flash EPROM.
The present invention may be adapted for use in radar and electro-optical systems and integrated core processing systems. It may also be used in a wide variety of other signal processing applications. The present invention is well-suited for use with large real-time systems such as a radar system, electro-optical system or an integrated core processing system, for example.