The present invention relates to computer bus systems, and more specifically, to apparatus and methods for identifying violations of bus protocols.
A bus is like a highway on which data travel within a computer. It is simply a channel over which information flows between two or more devices. A bus normally has access points, or places at which a device can be attached to the bus. Devices on the bus send information to, and receive information from, other devices on the bus. For example, a processor bus is the bus that devices and processors use to communicate with each other. Computer systems also typically include at least one input/output (I/O) bus, such as a peripheral component interconnect (PCI) bus, which is generally used for connecting performance-critical peripherals to a memory, other devices, and the processor. For example, video cards, disk storage devices, and high-speed network interfaces generally use a bus of this sort. Personal computers (PCs) typically also include additional I/O buses, such as an industry standard architecture (ISA) bus, for slower peripherals such as a mouse, a modem, a standard sound card, and a low-speed networking interface, and also for compatibility with older devices.
Each transaction initiated on a processor bus typically goes through three general stages: an arbitration phase, an address phase, and a data phase. For a component connected to the bus to initiate a transaction on the bus, the component must obtain "ownership" of the bus. This happens during the arbitration phase. The transaction begins with the component initiating the transaction, known as the requesting component, signaling that it wants to use the bus. Once the requesting component acquires bus ownership from a bus arbiter, the component sends an address out on the bus during the address phase that identifies the target of the transaction, that is, the target component. All components on the bus receive the address and determine which of them is the target component. Finally, during the data phase, the requesting component places data, which may be a command or request, on the bus for the target component to read.
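By way of illustration only, the three general stages described above can be sketched as a toy software model. The names `Component`, `Bus`, `arbitrate`, and `transact` are hypothetical and do not appear in the specification, and the model assumes a single-owner bus with no timing; it is not a description of any particular bus protocol.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Component:
    """Hypothetical bus component, identified by the address it answers to."""
    address: int
    inbox: List[str] = field(default_factory=list)


class Bus:
    """Toy model of the arbitration, address, and data phases (assumed names)."""

    def __init__(self) -> None:
        self.components: Dict[int, Component] = {}
        self.owner: Optional[Component] = None

    def attach(self, component: Component) -> None:
        self.components[component.address] = component

    def arbitrate(self, requester: Component) -> bool:
        # Arbitration phase: grant ownership only if the bus is currently free.
        if self.owner is None:
            self.owner = requester
            return True
        return False

    def transact(self, requester: Component, target_address: int, data: str) -> bool:
        if not self.arbitrate(requester):
            return False  # ownership not granted; nothing is driven on the bus
        try:
            # Address phase: the address selects the target component, if any.
            target = self.components.get(target_address)
            if target is None:
                return False
            # Data phase: the requester places data on the bus for the target.
            target.inbox.append(data)
            return True
        finally:
            # Release ownership so another component can arbitrate.
            self.owner = None
```

In this sketch a transaction fails either because ownership was not granted or because no attached component recognizes the address; a real bus would signal these conditions in hardware.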
These general stages of a bus transaction may be further divided into additional phases on more complex buses. These additional phases may include the following: arbitration phase, request phase, error phase, snoop phase, response phase, and data phase.
Specific bus transactions, such as data reads or writes, may take a relatively long time to complete, or the target component may be busy and therefore not available to complete the request immediately. In such cases, the target component may choose to defer responding to the transaction request until a later time, in which case the target component is called a deferring component. Further, the target component may assert a retry signal, notifying the requesting component that the target component cannot handle the transaction now and that the requesting component should try the transaction again later.
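As a minimal sketch of the retry behavior described above, the following models a requesting component that reissues a transaction while the target asserts retry. The `Response` enumeration, the `issue_with_retry` function, and the `max_attempts` limit are assumptions made for illustration; the specification does not say how many times, or when, a requester retries.

```python
from enum import Enum, auto
from typing import Callable, Tuple


class Response(Enum):
    COMPLETED = auto()
    DEFERRED = auto()  # target will respond in a later transaction
    RETRY = auto()     # requester should reissue the transaction later


def issue_with_retry(
    target: Callable[[int], Response], max_attempts: int = 3
) -> Tuple[Response, int]:
    """Reissue a transaction while the (hypothetical) target asserts retry.

    `target` receives the 1-based attempt number and returns its response.
    Returns the final response and the number of attempts made.
    """
    response = Response.RETRY
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        response = target(attempts)
        if response is not Response.RETRY:
            break  # completed or deferred: no further reissue is needed
    return response, attempts
```

A deferred response ends the loop as well, since the deferring component is expected to supply the result itself at a later time.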
The requesting component keeps track of each transaction it initiates. Typically, the requesting component records transaction information, such as the target component's address, transaction type, transaction phase, etc., in a buffer. In some embodiments the transaction information is stored in a value referred to as a transaction identifier (TRID). A typical transaction proceeds through the various phases and completes, after which the transaction information is removed from the buffer, making room for additional transactions.
If a transaction is deferred, the transaction information is kept in the buffer until the transaction completes. For example, in a memory read transaction issued by a processor, the processor may provide an identification of the request type and the memory address from which to read the data during the request phase. If the target component (e.g., a memory controller) cannot handle the request immediately, or if the transaction will take a relatively long time to complete, it may defer the request. The memory controller may complete the memory read at a later time, and then initiate another transaction to provide the data to the processor. The information regarding the original memory read transaction must be stored in the buffer until the memory controller provides the data in the subsequent bus transaction, so that the processor can determine with which transaction the received data is associated.
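The bookkeeping described above can be sketched as follows. This is an illustrative model only: `TransactionInfo` and `TransactionBuffer` are hypothetical names, the fixed `capacity` is an assumption, and using an integer `trid` as a lookup key is a simplification of the TRID value mentioned in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class TransactionInfo:
    target_address: int
    transaction_type: str
    phase: str
    deferred: bool = False


class TransactionBuffer:
    """Hypothetical buffer of outstanding transactions kept by a requester."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[int, TransactionInfo] = {}
        self._next_trid = 0

    def record(self, info: TransactionInfo) -> int:
        # Each initiated transaction occupies one slot until it completes.
        if len(self.entries) >= self.capacity:
            raise RuntimeError("transaction buffer full")
        trid = self._next_trid
        self._next_trid += 1
        self.entries[trid] = info
        return trid

    def defer(self, trid: int) -> None:
        # A deferred transaction stays in the buffer, marked as deferred.
        self.entries[trid].deferred = True

    def complete(self, trid: int) -> TransactionInfo:
        # Completion removes the entry, making room for new transactions.
        return self.entries.pop(trid)
```

When the deferred reply later arrives carrying the same identifier, `complete` returns the stored information so the requester can associate the received data with the original read.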
In systems employing error detection, detection of the widest variety of errors at the earliest possible time is beneficial. Each instruction cycle that an error remains undetected allows the error to propagate and potentially create greater damage (e.g., corruption of memory contents). In certain mission-critical applications, such as air-traffic control, military systems, financial systems, and other emergency systems (e.g., "911" service), high-availability and high-reliability computing is required. Typically, redundant, fault-tolerant systems are employed because the reliability and availability of individual system components are insufficient to satisfy the overall system requirement. Fault-tolerant systems detect and identify errors to isolate faulty system components, switch over to redundant system components, and preserve system integrity. However, such systems typically check for each kind of error sequentially, thereby reducing the throughput of the system.
The present invention addresses the above-discussed, and other, shortcomings of the prior art.
Systems and processes for an improved protocol error detector are disclosed which are useful in a wide variety of applications including detecting protocol violations in a computer communications bus, but which are not limited to a synchronous communications bus.
According to one aspect of the invention, a protocol error detector comprises a physical error detector detecting physical protocol violations, a sequential protocol error detector detecting sequential protocol violations and a logical protocol error detector detecting logical protocol violations. The protocol error detector signals a bus transaction error when at least one of the physical error detector, the sequential error detector, or the logical error detector detects a protocol violation. In one embodiment, the protocol error detector substantially simultaneously checks a bus transaction, or phases of a bus transaction, for physical, sequential, and logical protocol violations. In another embodiment, the protocol error detector provides an error signal upon the detection of one or more of the physical, sequential, and logical protocol violations, substantially coincident with the bus transaction, or phases of a bus transaction.
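For illustration, the combination of the three detectors can be sketched in software. The specific rules below (a parity check as the physical rule, phase ordering as the sequential rule, and an unmatched deferred reply as the logical rule) are assumptions chosen only to make the example concrete, as are the names `BusSample`, `DETECTORS`, and `check`. In this sketch, evaluating all three detectors on the same sample without short-circuiting stands in for substantially simultaneous checking in hardware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Phase names taken from the text; treating them as a strict order is an assumption.
PHASE_ORDER = ["arbitration", "request", "error", "snoop", "response", "data"]


@dataclass
class BusSample:
    """Hypothetical snapshot of one bus phase."""
    phase: str
    previous_phase: Optional[str]
    data: int
    parity: int
    is_deferred_reply: bool = False
    has_outstanding_request: bool = True


def physical_violation(s: BusSample) -> bool:
    # Assumed rule: even parity over the data bits must match the parity line.
    return bin(s.data).count("1") % 2 != s.parity


def sequential_violation(s: BusSample) -> bool:
    # Assumed rule: phases are known and never move backwards in a transaction.
    if s.phase not in PHASE_ORDER:
        return True
    if s.previous_phase is None:
        return s.phase != PHASE_ORDER[0]
    if s.previous_phase not in PHASE_ORDER:
        return True
    return PHASE_ORDER.index(s.phase) < PHASE_ORDER.index(s.previous_phase)


def logical_violation(s: BusSample) -> bool:
    # Assumed rule: a deferred reply must match an outstanding request.
    return s.is_deferred_reply and not s.has_outstanding_request


DETECTORS: Dict[str, Callable[[BusSample], bool]] = {
    "physical": physical_violation,
    "sequential": sequential_violation,
    "logical": logical_violation,
}


def check(sample: BusSample) -> Dict[str, bool]:
    # Run every detector on the same sample; signal an error if any one fires.
    results = {name: detect(sample) for name, detect in DETECTORS.items()}
    error = any(results.values())
    results["bus_transaction_error"] = error
    return results
```

Reporting the per-detector results alongside the combined error signal is a design choice of this sketch; it lets a caller see which class of violation was detected.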
In another aspect of the invention, the protocol error detector is used within a redundant fault-tolerant bus architecture, checking a bus transaction, or phases of a bus transaction, for protocol violations on each of a plurality of redundant busses. In one embodiment, the busses are synchronous lock-step busses, each carrying the same bus transaction and bus phase information on the same clock cycle. The same bus transactions on each of the redundant busses are then compared for equivalence by a voter. System faults are identified when a miscompare occurs at the voter. The protocol error detector detects a protocol violation and reports the detected protocol violation to the voter to assist in identifying system faults as soon as possible and in identifying and isolating the actual faulty component.
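One possible way a voter might use the reported protocol violations is sketched below. The text states only that the voter compares redundant busses for equivalence and receives protocol violation reports; the strict-majority vote, the exclusion of reporting busses before voting, and the function name `vote` are all assumptions made for this illustration.

```python
from collections import Counter
from typing import Optional, Sequence, Set, Tuple


def vote(
    values: Sequence[int], protocol_errors: Sequence[bool]
) -> Tuple[Optional[int], Set[int]]:
    """Compare same-cycle values from redundant busses (hypothetical voter).

    `values[i]` is what bus i carried on this clock cycle and
    `protocol_errors[i]` is the protocol error detector's report for bus i.
    Returns the agreed value (or None) and the indices of suspect busses.
    """
    # Busses that reported a protocol violation are suspect immediately.
    suspects = {i for i, err in enumerate(protocol_errors) if err}
    trusted = [v for i, v in enumerate(values) if i not in suspects]
    if not trusted:
        return None, suspects
    majority, count = Counter(trusted).most_common(1)[0]
    if count * 2 <= len(trusted):
        # No strict majority among the remaining busses: a miscompare
        # is detected, but this sketch cannot isolate it further.
        return None, suspects
    # Any bus that disagrees with the majority is also marked suspect.
    suspects |= {i for i, v in enumerate(values) if v != majority}
    return majority, suspects
```

In this sketch, a protocol violation report lets the voter identify a faulty bus even when only two busses remain and a plain comparison could not tell which of them is wrong.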