A computer system can be broken into three basic blocks: a central processing unit (CPU), memory, and input/output (I/O) units. These blocks are interconnected by means of a bus. An input device, such as a keyboard, mouse, disk drive, or analog-to-digital converter, is used to input instructions and data to the computer system via the I/O unit. These instructions and data can be stored in memory. The CPU retrieves the data stored in the memory and processes the data as directed by the stored instructions. The results can be stored back into memory or output via the I/O unit to an output device such as a printer, cathode-ray tube (CRT) display, digital-to-analog converter, or LCD.
In one instance, the CPU consisted of a single semiconductor chip known as a microprocessor. This microprocessor executed the programs stored in the main memory by fetching their instructions, examining them, and then executing them one after another. Due to rapid advances in semiconductor technology, faster, more powerful and flexible microprocessors were developed to meet the demands imposed by ever more sophisticated and complex software.
In some applications, multiple agents (e.g., microprocessors, co-processors, digital signal processors, etc.) are utilized. A single complex task can be broken into sub-tasks, each of which is processed individually by a different agent. For example, in a multi-agent computer system, word processing can be performed as follows. One agent can be used to handle the background task of printing a document, while a different agent handles the foreground task of interfacing with a user typing on another document. Both tasks are thereby handled in a fast, efficient manner. This use of multiple agents allows various tasks or functions to be handled by agents other than a single CPU, so that the computing power of the overall system is enhanced. Depending on the complexity of a particular job, additional agents may be added. Furthermore, utilizing multiple agents has the added advantage that two or more agents may share the same data stored within the system.
Typically, agents on a bus initiate transactions by driving valid signals on an address and request signal group, along with a strobe indicating the beginning of a new transaction. However, these signals are sometimes corrupted by "soft" errors. Hence, the address, request, and strobe signals are often protected by one or more parity bits that allow these errors to be detected. If a parity error is detected, the agent observing the parity error asserts an error indication, and the request is then retried. Because soft errors are transient, most of them are eliminated on retry, thereby increasing system availability.
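The detect-and-retry behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a single even-parity bit covers the address, and the function names, the `faults` list (a hypothetical fault model), and the retry limit are invented for the example rather than taken from any particular bus.

```python
def even_parity(value: int) -> int:
    """Return the bit that makes the total count of 1s (value + parity) even."""
    return bin(value).count("1") & 1


def parity_ok(value: int, parity_bit: int) -> bool:
    """Receiver-side check: recompute parity and compare with the bit received."""
    return even_parity(value) == parity_bit


def issue_request(address: int, faults: list, max_retries: int = 3):
    """Drive an address and retry while the observer reports a parity error.

    `faults` is a hypothetical fault model: faults[i] is a bit mask XORed
    onto the address during attempt i (0 or missing means no soft error).
    Returns (received_address, attempts_used).
    """
    parity_bit = even_parity(address)
    for attempt in range(max_retries + 1):
        mask = faults[attempt] if attempt < len(faults) else 0
        received = address ^ mask  # a soft error flips bits in flight
        if parity_ok(received, parity_bit):
            return received, attempt + 1
        # Parity error: the observer asserts an error indication and we retry.
    raise RuntimeError("request still failing after retries")


# A transient single-bit flip on the first attempt is caught and clears on retry.
result = issue_request(0b1011_0010, faults=[0b0000_0100])
```

Note that a single parity bit only detects an odd number of flipped bits, which is one reason a signal group may carry more than one parity bit.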
In addition to addressing errors, there might also be arbitration errors. Typically, any agent wishing to issue a new bus transaction must first successfully complete an arbitration phase. In other words, before an agent is allowed to perform a transaction (e.g., a read or a write operation), it must be granted access to the shared bus (i.e., granted bus ownership). In a distributed arbitration scheme, each requesting agent has an arbitration signal that it uses to arbitrate for ownership of the bus. Given such a distributed arbitration scheme, parity protection on the arbitration signals is signal intensive, because each arbitration signal needs its own parity signal. Arbitration errors are therefore protected under the request error detection and retry mechanism.
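To make the idea of distributed arbitration concrete, the following Python sketch has every agent sample the same request lines and apply the same rule, so that all agents reach the same decision without a central arbiter. The round-robin rule, the four-agent setup, and the name `next_owner` are assumptions chosen for illustration; the text does not specify a particular arbitration policy.

```python
def next_owner(request_lines, last_owner):
    """Round-robin rule (assumed): the first requester after last_owner wins.

    request_lines[i] is True when agent i asserts its arbitration signal.
    Returns the winning agent index, or None if nobody is requesting.
    """
    n = len(request_lines)
    for offset in range(1, n + 1):
        candidate = (last_owner + offset) % n
        if request_lines[candidate]:
            return candidate
    return None


# Agents 1 and 3 are requesting; agent 1 owned the bus last.
requests = [False, True, False, True]

# Every agent samples the same lines and runs the same rule locally.
decisions = [next_owner(requests, last_owner=1) for _agent in range(len(requests))]

# One parity signal per arbitration signal doubles the signal count.
signals_with_parity = 2 * len(requests)
```

Because each agent computes the owner independently, the scheme only works while all agents observe identical values on the arbitration signals.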
However, an arbitration signal failure might result in more than one agent determining itself to be the new bus owner. If these agents issue their new bus transactions at different times, the conflict may be detected by the other agents as a protocol violation. But when the requests are issued at exactly the same time, a problem arises in that a common strobe with differing request or address encodings might cause a request or address parity error. This problem is especially troublesome because it might be repeated on retry. In other words, the retry can recreate the same conditions, leading to the exact same error occurring again.
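The repeated failure can be illustrated with a small Python model. It assumes, purely for the sake of the example, that simultaneously driven lines combine as a wired-OR and that each agent drives one even-parity bit for its own address; the addresses and function names are hypothetical.

```python
def even_parity(value: int) -> int:
    """Parity bit that makes the total number of 1s even."""
    return bin(value).count("1") & 1


def drive_together(addresses):
    """Combine simultaneous drivers as a wired-OR bus would (an assumption)."""
    bus_address = 0
    bus_parity = 0
    for addr in addresses:
        bus_address |= addr
        bus_parity |= even_parity(addr)
    return bus_address, bus_parity


def observer_sees_error(bus_address, bus_parity):
    """An observing agent recomputes parity over what it actually sampled."""
    return even_parity(bus_address) != bus_parity


agent_a = 0b0011  # both agents believe they own the bus
agent_b = 0b0101  # and drive different addresses on the same strobe

# Retrying does not help: the same two agents collide in the same way each time.
attempts = [
    observer_sees_error(*drive_together([agent_a, agent_b])) for _ in range(3)
]

# With a single driver, the same check passes.
single_driver_error = observer_sees_error(*drive_together([agent_a]))
```

Unlike a soft error, the collision is deterministic, so every retry in `attempts` reports the same parity error. Under this model not every pair of colliding addresses produces a parity mismatch, which matches the text's statement that the collision "might" cause one.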
Another problem pertains to how locked sequence atomicity is managed. Lock semantics are often used by multiple agents to determine ownership of a shared bus. For example, a first processor may establish a data structure in a memory device for a second processor to read at some future time. The data structure has a flag, or "lock" variable, which is initially reset by the first processor. The lock variable is then set by the first processor after the data structure is established. By monitoring the lock variable, the second processor is capable of determining whether it may safely access the data structure and avoid reading stale data.
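The flag-based handoff described above can be sketched as follows. The class and function names are hypothetical, and the example is single-threaded for clarity; on real multiprocessor hardware the two writes in `publish` must also become visible to the second processor in that order, which this sketch does not model.

```python
class SharedRecord:
    """A data structure in memory guarded by a flag ("lock" variable)."""

    def __init__(self):
        self.ready = False   # the flag is initially reset
        self.payload = None


def publish(record, data):
    """First processor: establish the data structure, then set the flag."""
    record.payload = data
    record.ready = True


def try_read(record):
    """Second processor: access the data only once the flag is set."""
    return record.payload if record.ready else None


record = SharedRecord()
before = try_read(record)      # flag still reset, so nothing is read
publish(record, {"id": 7})
after = try_read(record)       # flag set, so the data is safe to read
```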
A situation may also arise whereby multiple agents desire access to the same data structure. If the data structure can be read from and written to, a mechanism is needed to ensure that only one of the agents can access the data structure at any given time. This can be achieved by using the lock variable to inform the respective agents as to whether the data structure is currently in use. Hence, an agent must first acquire the lock in order to access the data structure.
The complication is that an arbitration failure might occur in the middle of a lock sequence. After an arbitration retry, the same agent is not guaranteed to immediately regain ownership of the bus. Further complicating matters is the fact that once a lock sequence is initiated, it is necessary to complete the entire lock operation in order to preserve the atomicity of the lock variable. An "atomic" operation is defined as an operation consisting of multiple transactions which must be processed on the bus without interruption by another agent. For example, an acquire lock operation must be allowed to read the lock variable and write the lock variable without a second agent performing a read or write operation in the meantime. Allowing a second agent to interfere with the first agent's lock operation might result in both agents believing that they had access to the data structure, which would defeat the purpose of the lock variable. Furthermore, these problems are even more complicated when applied to agents having a pipelined bus architecture, wherein locked and unlocked transactions are simultaneously progressing through the various pipe stages.
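Why the read and the write of an acquire operation must not be separated can be shown by replaying two bus schedules in Python. The schedule format, the agent names "A" and "B", and the function `run_schedule` are illustrative assumptions; each tuple stands for one bus transaction on the lock variable.

```python
def run_schedule(schedule):
    """Replay (agent, op) bus transactions against one lock variable.

    An acquire is two transactions: "read" samples the lock, and "write"
    sets it only if that agent's earlier read saw the lock free (0).
    Returns a dict mapping each agent to whether it believes it holds the lock.
    """
    lock = 0
    observed = {}
    acquired = {}
    for agent, op in schedule:
        if op == "read":
            observed[agent] = lock
        elif op == "write":
            acquired[agent] = observed[agent] == 0
            if acquired[agent]:
                lock = 1
    return acquired


# Non-atomic: B's read slips in between A's read and A's write.
interleaved = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

# Atomic: A's read and write complete on the bus without interruption.
atomic = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

both_think_they_own = run_schedule(interleaved)
only_one_owns = run_schedule(atomic)
```

In the interleaved schedule both agents read the lock as free and both conclude they acquired it, whereas in the atomic schedule the second agent sees the lock already set and backs off.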
Thus, there is a need for an apparatus and method of handling address and request errors in a multi-processor system. It would be preferable if such an apparatus and method also provided protection for arbitration signals. It would also be highly preferable if such an apparatus and method could maintain lock atomicity.