Sophisticated System on Chip (SoC) designs are rapidly evolving. The SoC designs being developed today contain millions of gates. The complexity of these designs, combined with the use of devices based on them in industrial products of very high importance, has made design verification an essential element of the semiconductor development cycle. Thus, prior to manufacture, hardware designers frequently employ simulators and/or emulators to verify the functional behavior of the electronic devices and systems fabricated in accordance with their designs. One type of verification for a hardware device under test (DUT) is a hardware emulation process, which generally provides an effective way to increase verification productivity, speed up time-to-market, and deliver greater confidence in the final SoC product. The hardware emulation process uses a processor-based simulation acceleration/emulation system that is in communication with a workstation, which sends data to and receives data from the DUT. Such data can include digital test vectors or real signals from a logic system for which the DUT is intended.
Various mechanisms/models have been employed in hardware emulation systems to transfer data between the hardware emulator and the host workstation. One of the models present in the art is a primary input/output (PIO) based data transfer model. FIG. 1A is a schematic diagram showing a prior art PIO-based data transfer model employed in an emulation system 100. The schematic diagram illustrates the emulation system 100 employed in electronic design automation for verifying that a logic design conforms to its specification before the logic design is manufactured as integrated circuits (ICs). In the emulation system 100, a test bench 102 is established to perform verification processes on the logic design. Typically, the logic designs and test designs may be described using various languages, such as hardware description languages (HDLs) or other more abstract languages. The functional verification is performed using an emulation process. In the emulation process, the logic design is mapped into a hardware emulator 104 to provide a design under test (DUT) 106. The test bench 102 is executed by a simulator on a host workstation 108. As shown in FIG. 1A, data 110 is present in the DUT 106 and has to be transferred to a memory buffer 112 of the test bench 102 running on the host workstation 108. In one example, the host workstation 108 may request the hardware emulator 104 to transfer the data for executing one or more tasks that require the use of the data. In order to transfer the data to the host workstation 108, the HDL process makes a call with the data that is generated in the same emulation/user clock cycle. As understood in the art, several approaches, such as signal-level connections, high-level abstract message passing, and function-call-based interaction, have been employed to make the call.
Function-call-based interaction is a commonly employed approach. In the function-call-based approach, the data transfer is performed using function call arguments, and the approach is known as the Direct Programming Interface (DPI).
In FIG. 1A, the HDL process makes a blocking DPI call with input data that is generated in the same user clock cycle. The hardware emulator 104 then stops the clock cycle and transfers the input data to the host workstation 108. The hardware emulator 104 then waits for output data from the host workstation 108 and resumes the execution process once the output data is available. In order to transfer the input data in the same user clock cycle, the hardware emulator 104 uses PIO pins 114. A compiler of the hardware emulator 104 schedules all of the PIO pins 114 to be transferred in the same user clock cycle, because all of the input data needs to be available on the host workstation 108 at the same time. Thus, depending on the size (in bits) of the input/output data to be transferred, a similar number of PIO pins 114 is utilized. As the size of the input/output data set becomes larger, it becomes a challenge for the compiler of the hardware emulator 104 to schedule all of the input/output data in a single timestamp. The challenge arises because the PIO pins 114 are hardware resources, and there is a pre-defined fixed number of PIO pins 114 present in the hardware emulator 104 of the emulation system 100. Thus, when the input/output data becomes larger than the pre-defined fixed number of PIO pins 114 connected to the hardware emulator 104, the compiler of the hardware emulator 104 fails to compile all of the input/output data in a single timestamp. Conversely, even when the input data fits within the fixed number of PIO pins 114 and the compiler of the hardware emulator 104 is able to successfully compile the input data using the PIO pins 114 for transfer to the host workstation 108, performance is inefficient because the data on that large number of pins has to be transferred and optimal scheduling is not achieved.
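The scheduling constraint of the PIO-based model can be sketched as a simple pin-budget check. This is a toy model under stated assumptions, not the behavior of any particular emulator compiler: the function name `schedule_pio_transfer`, the exception type `PioScheduleError`, the one-bit-per-pin mapping, and the pin count of 512 are all hypothetical and chosen only for illustration.

```python
class PioScheduleError(Exception):
    """Raised when a payload cannot fit on the available PIO pins."""


def schedule_pio_transfer(payload_bits, available_pio_pins):
    """Return the number of PIO pins used to move the payload in one cycle.

    Toy model: one pin carries one bit, and every bit of the payload
    must be scheduled in the same user clock cycle.
    """
    if payload_bits < 0 or available_pio_pins < 0:
        raise ValueError("sizes must be non-negative")
    if payload_bits > available_pio_pins:
        # The pin count is a fixed hardware resource, so the schedule fails.
        raise PioScheduleError(
            f"{payload_bits} bits need {payload_bits} pins, "
            f"but only {available_pio_pins} are available"
        )
    return payload_bits


PIO_PINS = 512  # assumed pin count, for illustration only

print("pins used:", schedule_pio_transfer(64, PIO_PINS))
try:
    schedule_pio_transfer(4096, PIO_PINS)
except PioScheduleError as err:
    print("compile fails:", err)
```

In this model a 64-bit payload fits, while a 4096-bit payload cannot be scheduled at all. The model deliberately ignores the second drawback noted above, namely that a payload that fits can still be slow to transfer.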
FIG. 1B is a schematic diagram showing a prior art memory-based data transfer model that is employed in the emulation system 100 to address the limited-size data transfer drawback of the PIO-based data transfer model described in FIG. 1A. As illustrated in FIG. 1B, data 110 may be present in the hardware emulator 104 or the host workstation 108. In one example, the data 110 is present in the hardware emulator 104 and has to be transferred to the host workstation 108. In order to transfer the data 110 from the hardware emulator 104 to the host workstation 108, the hardware emulator 104 may facilitate the transfer of the data 110 into a memory 116 (positioned in the hardware emulator 104) using a plurality of memory ports 118. The host workstation 108 can then read the data 110 from, or write the data 110 to, the memory 116. The compiler of the hardware emulator 104 schedules all of the data 110 from the memory ports 118 to be transferred in the same user clock cycle, because all of the data 110 needs to be available at the memory 116 at the same time. Thus, depending on the size (in bits) of the input/output data to be transferred, a large number of memory ports 118 may have to be utilized. This nevertheless solves the limited-size data transfer drawback of the PIO-based data transfer model, as a large amount of the data 110 can be transferred using the memory ports 118. The current memory-based transfer model uses a large number of memory ports 118 to copy the large amount of data 110 to the memory 116 in the same user clock cycle, rather than spreading the copy over other available cycles, since the use of other cycles to copy the data 110 affects the schedule and performance of the emulation system 100.
However, it has been observed that, for large data transfers using the memory-based transfer model, the large number of memory ports 118 required in the same clock cycle becomes the bottleneck: optimal scheduling is not achieved, and performance becomes extremely slow at run-time.
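The port pressure of the memory-based model can be made concrete with a rough count. Assuming, purely for illustration, that each memory port writes a fixed number of bits per user clock cycle, the number of ports needed to land the whole payload in a single cycle grows linearly with the payload size. The function `memory_ports_needed` and the 64-bit port width are hypothetical and do not describe any specific emulator.

```python
def memory_ports_needed(payload_bits, port_width_bits):
    """Ports required to write the whole payload in one user clock cycle.

    Toy model: every port writes exactly port_width_bits per cycle.
    """
    if payload_bits < 0:
        raise ValueError("payload_bits must be non-negative")
    if port_width_bits <= 0:
        raise ValueError("port_width_bits must be positive")
    # Ceiling division: a partially filled port still counts as one port.
    return -(-payload_bits // port_width_bits)


PORT_WIDTH = 64  # assumed port width in bits, for illustration only

for payload_bits in (64, 1024, 65536):
    ports = memory_ports_needed(payload_bits, PORT_WIDTH)
    print(f"{payload_bits} bits -> {ports} ports in one cycle")
```

Under these assumptions a 64-bit payload needs one port, while a 65536-bit payload needs 1024 ports active in the same cycle, which is the kind of demand that makes single-cycle scheduling the bottleneck.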
Therefore, there is a need in the art for methods and systems that address the above-mentioned drawbacks of the conventional techniques employed for data transfer in an emulation system, and that are thereby able to achieve optimal performance at compile time as well as at run-time when a large amount of data has to be transferred between the hardware emulator and the host workstation of the emulation system.