The present disclosure relates generally to a method and apparatus for testing the validity of shared data in a multiprocessing system, and particularly to a method and apparatus for random testing of data block concurrence, execution sequence and serialization in multiprocessing systems in response to pseudo-random instruction streams.
In a typical multiprocessing environment, a number of central processing units (CPUs) share data residing in a particular location of the main/cache memory. These CPUs may be required to obey certain rules such as block concurrence, execution order and serialization. The block concurrence rule states that while a given CPU is fetching from or storing into a block of shared memory, the operation must be done in indivisible units. For example, if a fetch unit is four bytes, then the CPU is not allowed to fetch the data in one- or two-byte increments. The execution order rule requires that all stores and fetches are executed in order. The serialization rule refers to the ability of the CPU to use pre-fetched data until the CPU serializes, at which point it discards all pre-fetched data and completes all pending stores before it executes new fetches.
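The block concurrence rule can be illustrated with a minimal Python sketch. It assumes a four-byte fetch unit and assumes the checker already knows every whole value that could legitimately occupy the block (the initial contents plus each stored value). The function names `is_block_concurrent` and `torn_sources` are hypothetical and are not part of the disclosure.

```python
def is_block_concurrent(fetched: bytes, candidates: list[bytes]) -> bool:
    """Return True if the fetched block equals one whole candidate value.

    A fetch that mixes bytes from two different stores (a "torn" fetch)
    matches no single candidate and therefore fails the check.
    """
    return any(fetched == value for value in candidates)


def torn_sources(fetched: bytes, candidates: list[bytes]) -> list[list[int]]:
    """For each byte position, list the candidate indices that could have supplied it."""
    return [
        [index for index, value in enumerate(candidates) if value[pos] == byte]
        for pos, byte in enumerate(fetched)
    ]


# Hypothetical data: initial contents and two four-byte stores by different CPUs.
initial = bytes([0x00] * 4)
store_a = bytes([0xAA] * 4)
store_b = bytes([0xBB] * 4)
candidates = [initial, store_a, store_b]

whole_fetch = bytes([0xAA, 0xAA, 0xAA, 0xAA])
torn_fetch = bytes([0xAA, 0xAA, 0xBB, 0xBB])
```

Under these assumptions, `is_block_concurrent(whole_fetch, candidates)` is true, while `torn_fetch` fails the check because its first two bytes can only come from one store and its last two bytes only from the other.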
An access by an instruction to a memory location may be a fetch, a store or an update. In the case of a fetch access, the instruction does not alter the contents of the memory location. In contrast, a store access modifies the contents of the memory location. An instruction that fetches from and then stores into the same memory location performs an update access. An update access can be viewed as a fetch operation followed by a store operation.
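The three access types can be modeled with a short Python sketch. This is an illustration only: memory is a plain dictionary, each location holds one byte, and the update is assumed to be an add-to-memory operation. The names `Access` and `apply_access` are hypothetical.

```python
from enum import Enum


class Access(Enum):
    FETCH = "fetch"
    STORE = "store"
    UPDATE = "update"


def apply_access(memory: dict, addr: int, access: Access, operand: int = 0):
    """Apply one access to `memory` and return the fetched value, if any.

    FETCH reads without modifying, STORE writes without reading, and
    UPDATE is modeled as a fetch followed by a store to the same location.
    """
    fetched = None
    if access in (Access.FETCH, Access.UPDATE):
        fetched = memory[addr]
    if access is Access.STORE:
        memory[addr] = operand & 0xFF
    elif access is Access.UPDATE:
        # Assumed update semantics: add the operand to the fetched byte.
        memory[addr] = (fetched + operand) & 0xFF
    return fetched
```

In this model a fetch leaves the dictionary untouched, a store returns no value, and an update both returns the old contents and leaves the new contents in memory.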
Several random test methodologies regarding cache coherence of weakly-ordered architectures (e.g., Reduced Instruction Set Computer (RISC) architectures such as PowerPC) have been reported in the literature. In the data-coloring technique, each store operation writes unique data into the common memory location. In this method, if instructions storing into the shared locations are allowed to access a storage location consisting of a combination of shared and non-shared locations, one has to account for the worst case and thus limit store operations affecting these shared locations to 256 (the maximum number of distinct data colors that can be represented with one byte). Therefore, the number of instructions of a given CPU that could write into the shared location becomes severely limited as the number of CPUs increases. The use of monotonic input data for multiprocessors and analyzing the terminal results is a special case of the data-coloring technique. The false sharing method partitions the shared memory location such that different CPUs store into and fetch from non-overlapping locations. Other approaches force synchronization instructions before each store operation that stores into the shared location. However, these techniques concentrate on weakly-ordered architectures such as PowerPC and do not consider cases in which arithmetic and logic operations engage in register-to-memory, memory-to-register or memory-to-memory data transfer operations. Furthermore, the fact that an instruction stream and translation tables can be modified by the actions of some CPUs is not taken into account.
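The one-byte limit of the data-coloring technique can be made concrete with a small Python sketch. It assumes that every store to the shared byte draws a fresh color from a single pool of 256 values and that the pool is divided evenly among the CPUs. The class `ColorAllocator` and the function `stores_per_cpu` are hypothetical names introduced for illustration.

```python
class ColorAllocator:
    """Hand out a unique one-byte color for each store to a shared location."""

    MAX_COLORS = 256  # distinct values representable in one byte

    def __init__(self):
        self._owner = {}  # color -> (cpu, store sequence number)

    def next_color(self, cpu: int, seq: int) -> int:
        color = len(self._owner)
        if color >= self.MAX_COLORS:
            raise OverflowError("no unused one-byte colors remain")
        self._owner[color] = (cpu, seq)
        return color

    def identify(self, color: int):
        """Map a fetched color back to the (cpu, seq) store that wrote it."""
        return self._owner.get(color)


def stores_per_cpu(n_cpus: int) -> int:
    """Upper bound on stores per CPU if the colors are split evenly (an assumption)."""
    return ColorAllocator.MAX_COLORS // n_cpus
```

With this even split, eight CPUs would each be limited to 32 stores into the shared byte, which shows how quickly the per-CPU budget shrinks as CPUs are added.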
In the case of a firmly-ordered architecture (e.g., a Complex Instruction Set Computer (CISC) architecture such as IBM's System/390), a store operation executed by an instruction can take place (as seen by other CPUs) only after all the preceding store operations are completed. Furthermore, both logical and arithmetic operations of the IBM System/390 architecture can involve register-to-memory, memory-to-register and memory-to-memory operations. However, unless another CPU observes the stored data, one cannot conclude that any of the store operations of a given CPU are completed. Hence, determining the validity of the fetched data from a memory location is complicated when more than one CPU may have stored data into that location. Data fetched by a CPUi from a memory location M which has been subject to a number of store operations by different CPUs may not adhere to certain rules such as the block concurrence, execution order and serialization rules.
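One consequence of the execution order rule can be sketched in Python. The sketch makes a strong simplifying assumption: each value fetched from location M by a single observing CPU can be attributed to exactly one store, identified by a (CPU, sequence number) pair. As the paragraph above notes, making that attribution is the hard part in practice. The function name `violates_store_order` is hypothetical.

```python
def violates_store_order(observations: list[tuple[int, int]]) -> bool:
    """Check successive fetches of one location by one observing CPU.

    `observations` lists, in fetch order, the (cpu, seq) store that each
    fetched value is attributed to. Because stores become visible in
    program order, the observer must never see an earlier store of a CPU
    after it has already seen a later store of that same CPU.
    """
    last_seen = {}  # cpu -> highest sequence number observed so far
    for cpu, seq in observations:
        if seq < last_seen.get(cpu, -1):
            return True
        last_seen[cpu] = seq
    return False


# Hypothetical traces: stores of CPU 1 seen in order, then out of order.
in_order = [(1, 0), (2, 0), (1, 1)]
out_of_order = [(1, 3), (2, 0), (1, 2)]
```

Here `violates_store_order(in_order)` is false and `violates_store_order(out_of_order)` is true, because store 2 of CPU 1 is observed after store 3 of the same CPU.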