The technical field relates generally to computer architecture and more particularly, but not by way of limitation, to a system for testing the design of a computer system by making the hardware state of the system repeatable.
In the field of integrated circuit (IC) chip and computer system design, it is necessary to test the chips and systems to identify any bugs that may exist. Computer hardware is tested by running test program code sequences on it. The test programs attempt to cause improper computer operation by creating different time sequences of operations. To achieve sufficient test coverage, many prototype systems execute code at once.
However, debugging becomes more difficult as the level of system complexity increases. The debugging process is expensive in terms of time spent identifying bugs and the equipment that must be used in this process. In complex systems, it is impractical to monitor every signal from a system under test. Instead, when an error is detected during a test, the test is re-executed on an instrumented prototype to isolate the cause of the failure. To reproduce the test failure on the instrumented prototype, it is essential that the test program execute with the test hardware in exactly the same state that existed when it executed on the original system on which it failed. When a system is designed so that an executing test program finds the same hardware state during its execution as it encountered during its previous execution, the system is said to be repeatable. If the system is not repeatable, then the debugging process takes substantially longer because the same error or bug may not appear in a subsequent run of the same test program.
Unfortunately, without special measures, systems are not repeatable. One source of non-repeatability relates to arbitration between various sources in the system. In particular, systems may poll multiple data ports, or perform some other function, over a period of time in order to process data from the various ports. Without special measures, the system hardware may be polling a different port each time a section of test code is executed. An error may be detected only when a particular data port is polled at a certain time. If the system is in a different state on each test run, polling a different data port at that time, the error may not recur when the test is repeated. What is needed is a means of ensuring that the system will be in the same state each time that the test code is executed.
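The polling problem described above can be illustrated with a minimal sketch. The round-robin policy, the class name RoundRobinPoller, the helper ports_seen_by_test, and the cycle counts are all assumptions made for illustration; the source does not specify how the polling block chooses ports.

```python
class RoundRobinPoller:
    """Hypothetical polling block that visits one port per cycle, in order."""

    def __init__(self, num_ports, start_port=0):
        self.num_ports = num_ports
        self.current = start_port % num_ports

    def step(self):
        """Return the port polled this cycle and advance to the next port."""
        port = self.current
        self.current = (self.current + 1) % self.num_ports
        return port


def ports_seen_by_test(poller, idle_cycles, test_cycles):
    """Let the poller free-run for idle_cycles, then record the ports
    it polls during the test_cycles in which the test code executes."""
    for _ in range(idle_cycles):
        poller.step()
    return [poller.step() for _ in range(test_cycles)]


# Two runs of the same test that differ only in how long the
# system ran before the test code started (an assumed source of variation).
run_a = ports_seen_by_test(RoundRobinPoller(4), idle_cycles=5, test_cycles=4)
run_b = ports_seen_by_test(RoundRobinPoller(4), idle_cycles=7, test_cycles=4)
```

In this sketch the same test cycle coincides with a different port in each run, so a bug that appears only when a particular port is polled at a particular test cycle may show up in one run and not the other.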
A method and apparatus are disclosed for improving the repeatability of a system during testing by ensuring that the machine state is the same during every test repetition. In particular, the system ensures that the polling block of a cross-bar chip is reset to poll the same port starting at the same time relative to the start of every repetition of the test. The system uses a global framing clock ("GFC") that operates at a lower frequency than the system clock as a common timing reference. The GFC is designed to have a common rising edge that corresponds to a rising edge on every other clock used in the system and is used to synchronize other system clocks. Before executing test code, the processor executing the test waits for the system to become idle and then waits for a rising edge of the GFC. The processor then sends a message across existing links from itself to a cache controller chip. The cache controller chip waits for the next GFC edge and then sends a reset message to the cross-bar chip across its link to reset the CSR polling block. The cross-bar chip receives the message and resets the CSR polling block on the next GFC edge.
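The three GFC-aligned steps above can be modeled as a simple timing sketch. Time is counted in system clock cycles, GFC rising edges are assumed to fall on multiples of gfc_period, and "next edge" is taken to mean the first edge at or after the given time. The function names, the single link_latency parameter, and the numeric values are hypothetical and are not taken from the source.

```python
def next_gfc_edge(t, gfc_period):
    """First GFC rising edge at or after time t (edges at multiples of gfc_period)."""
    return -(-t // gfc_period) * gfc_period  # ceiling division, then scale


def reset_sequence(idle_time, gfc_period, link_latency):
    """Return (start_edge, reset_edge) for one test repetition.

    idle_time    : time at which the system becomes idle (varies run to run)
    link_latency : assumed message delay on each link, shorter than gfc_period
    """
    # Processor waits for a GFC rising edge, then sends its message.
    start_edge = next_gfc_edge(idle_time, gfc_period)
    # Cache controller receives the message and waits for the next GFC edge.
    cache_edge = next_gfc_edge(start_edge + link_latency, gfc_period)
    # Cross-bar chip receives the reset message and resets on the next GFC edge.
    reset_edge = next_gfc_edge(cache_edge + link_latency, gfc_period)
    return start_edge, reset_edge


# Two repetitions that become idle at unrelated times.
start_1, reset_1 = reset_sequence(idle_time=13, gfc_period=8, link_latency=3)
start_2, reset_2 = reset_sequence(idle_time=42, gfc_period=8, link_latency=3)
offset_1 = reset_1 - start_1
offset_2 = reset_2 - start_2
```

Under these assumptions the polling block is reset a fixed number of cycles after the processor's starting GFC edge in both repetitions, even though the system became idle at different times. Because each step waits for a GFC edge, any link delay shorter than one GFC period gives the same result in this model.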
In a system using multiple cross-bar chips with multiple cache controller chips connected thereto, the CSR polling blocks in each of the cross-bar chips, or a subset thereof, may be reset using the method. The controlling processor sends a reset message through the cross-bar chips to one of the cache controller chips associated with each cross-bar chip, beginning with the cross-bar chip furthest away. Each of the cache controller chips sends a reset CSR polling command to its associated cross-bar chip, which causes the CSR polling block of that chip to be reset. Each time that the test is executed, the method and apparatus ensure that the polling is reset at the same time relative to test execution.
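The furthest-first ordering for multiple cross-bar chips can be sketched as follows. The hop-count measure of distance, the chip names, the choice of the first listed cache controller, and the function plan_reset_messages are all assumptions for illustration; the source states only that the processor begins with the cross-bar chip furthest away and targets one associated cache controller chip per cross-bar chip.

```python
def plan_reset_messages(hops_from_processor, controllers_by_xbar):
    """Return (cross-bar, cache controller) pairs in the order the
    controlling processor would send its reset messages.

    hops_from_processor : dict mapping cross-bar name to assumed hop distance
    controllers_by_xbar : dict mapping cross-bar name to its cache controllers
    """
    plan = []
    # Visit cross-bar chips from the furthest to the nearest.
    for xbar in sorted(hops_from_processor,
                       key=hops_from_processor.get, reverse=True):
        # Assumption: any one associated controller will do; take the first.
        plan.append((xbar, controllers_by_xbar[xbar][0]))
    return plan


# Hypothetical three-chip topology.
hops = {"xbar0": 0, "xbar1": 1, "xbar2": 2}
controllers = {
    "xbar0": ["cc0a", "cc0b"],
    "xbar1": ["cc1a", "cc1b"],
    "xbar2": ["cc2a", "cc2b"],
}
plan = plan_reset_messages(hops, controllers)
```

Each cache controller chip in the resulting plan would then issue the reset CSR polling command to its own cross-bar chip, as described above.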