1. Field of the Invention
Embodiments of the present invention generally relate to an emulation engine for emulating a system composed of logic gates, and more particularly, to a method and apparatus for improving the efficiency of the emulation engine.
2. Description of the Related Art
Hardware emulators are programmable devices used in the verification of hardware design. A common method of hardware design verification is to use processor-based hardware emulators to emulate the design. These processor-based emulators sequentially evaluate combinatorial logic levels, starting at the inputs and proceeding to the outputs. Each pass through the entire set of logic levels is known as a cycle; the evaluation of each individual logic level is known as an emulation step.
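By way of illustration only, the relationship between a cycle and an emulation step can be sketched in software. The sketch below is a simplified model and not the emulator's implementation; the names `run_cycle`, `Gate`, and the example signal names are hypothetical, and the design is assumed to be already sorted into logic levels.

```python
from typing import Callable, Dict, List, Tuple

# A gate is (output signal, gate function, input signals). Hypothetical layout.
Gate = Tuple[str, Callable[..., int], Tuple[str, ...]]


def run_cycle(levels: List[List[Gate]], inputs: Dict[str, int]) -> Tuple[Dict[str, int], int]:
    """Evaluate every logic level once, from the inputs toward the outputs.

    Returns the final signal values and the number of emulation steps taken.
    """
    signals = dict(inputs)
    steps = 0
    for level in levels:
        # One emulation step: all gates in a level read the signal values
        # that existed before the step began, then commit their results.
        results = {
            out: fn(*(signals[name] for name in ins))
            for out, fn, ins in level
        }
        signals.update(results)
        steps += 1
    return signals, steps


def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def XOR(a: int, b: int) -> int:
    return a ^ b


# Two logic levels: level 0 computes s and c, level 1 combines them into y.
levels = [
    [("s", XOR, ("a", "b")), ("c", AND, ("a", "b"))],
    [("y", OR, ("s", "c"))],
]
signals, steps = run_cycle(levels, {"a": 1, "b": 1})
```

In this model, one call to `run_cycle` corresponds to one cycle, and each iteration of the loop over `levels` corresponds to one emulation step.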
An exemplary hardware emulator is described in commonly assigned U.S. Pat. No. 6,618,698 titled “Clustered Processors In An Emulation Engine”, which is hereby incorporated by reference in its entirety. Hardware emulators allow engineers and hardware designers to test and verify the operation of an integrated circuit, an entire board of integrated circuits, or an entire system without having to first physically fabricate the hardware.
The complexity of integrated circuits and the number of logic gates present on them have increased significantly in the past several years. Hardware emulators need to improve in efficiency to keep pace with the increased complexity of integrated circuits. The speed with which a hardware emulator can emulate an integrated circuit is one of the most important benchmarks of the emulator's efficiency, and also one of the emulator's most important selling factors in the emulator market.
A hardware emulator is comprised of multiple processors. The processors are arranged into groups of processors called clusters, and the clusters of processors collectively comprise the emulation engine. During each process cycle, each processor is capable of emulating a logic gate, mimicking the function of a logic gate in an integrated circuit. The processors are arranged to compute results in parallel, in the same way logic gates present in an integrated circuit compute many results in parallel. This creates a chain of logic similar to what occurs in an integrated circuit. In the chain of logic, efficient communication between processors is crucial.
To facilitate data transfer within an emulator, processors within a cluster can receive data directly from the other processors. The output of processors within a cluster is generally stored for a number of cycles within a data array to enable the processors to utilize previous output data in a current computation.
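As a rough illustration, a data array that retains processor outputs for a fixed number of cycles may be modeled as a bounded history buffer. This is a sketch under stated assumptions rather than the actual hardware organization: the class name `DataArray`, its methods, and the `depth` parameter are hypothetical.

```python
from collections import deque
from typing import Deque, Sequence, Tuple


class DataArray:
    """Hypothetical model: keeps the cluster's outputs for the last `depth` cycles."""

    def __init__(self, depth: int) -> None:
        if depth < 1:
            raise ValueError("depth must be at least 1")
        # A bounded deque drops the oldest entry once `depth` is exceeded.
        self._history: Deque[Tuple[int, ...]] = deque(maxlen=depth)

    def store(self, outputs: Sequence[int]) -> None:
        """Record one cycle's outputs, one value per processor."""
        self._history.append(tuple(outputs))

    def read(self, cycles_ago: int, processor: int) -> int:
        """Fetch a stored output; cycles_ago=1 is the most recently stored cycle."""
        if cycles_ago < 1:
            raise ValueError("cycles_ago must be at least 1")
        # Raises IndexError if that cycle is no longer (or not yet) stored.
        return self._history[-cycles_ago][processor]


array = DataArray(depth=2)
array.store([0, 1])
array.store([1, 1])
array.store([1, 0])  # the first stored cycle, [0, 1], is now discarded
```

A processor that needs a value from an earlier cycle would call `read` rather than receive the value directly from another processor.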
Communication between clusters of processors is generally less efficient than communication within a cluster. A cluster can obtain N inputs (where N is the number of processors in the cluster) from any other cluster in the emulation engine. Similarly, each cluster can send N outputs to the other clusters. A cluster can receive outputs from signals available during the current cycle inside another cluster. These signals include the current processor outputs, processor inputs, cluster inputs, and memory inputs. Outputs that were produced during a previous cycle must first be fetched from the data array before becoming available to another cluster.
The speed of communication between processors, and between clusters of processors, is directly related to the availability of data to the processors and the clusters of processors. A processor has to use one of its inputs to retrieve data from a data array if the data is unavailable during the current cycle. This reduces the efficiency of the processor. Communication between clusters of processors may also be impeded by lack of an available communication path between clusters. A cluster may have to wait extra cycles for the needed data to be communicated. The extra cycles include a cycle for the data to be retrieved from the data array, and the cycles until a communication path becomes available. This results in slower hardware emulation.
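The delay described above can be expressed as a simple count. The following is only an illustrative cost model based on the description in this section; the function name `extra_wait_cycles` and its parameters are hypothetical.

```python
def extra_wait_cycles(available_this_cycle: bool, path_wait_cycles: int) -> int:
    """Extra cycles a cluster waits before needed data arrives.

    One cycle is added when the data must first be retrieved from the data
    array, plus however many cycles pass before a communication path is free.
    """
    if path_wait_cycles < 0:
        raise ValueError("path_wait_cycles must be non-negative")
    fetch_cycles = 0 if available_this_cycle else 1
    return fetch_cycles + path_wait_cycles
```

For example, data that must be fetched from the data array and then waits two cycles for a free path arrives three cycles late under this model, whereas data available during the current cycle with a free path incurs no extra cycles.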
Thus, there is a need in the art for a method and apparatus that improves communication between processors and clusters of processors, and improves the overall efficiency of a multiprocessor-based emulation engine.