Sophisticated System on Chip (SoC) designs are rapidly evolving. The SoC designs being developed today have many millions of gates. The complexity of these designs, combined with the use of devices incorporating them in industrial products of very high importance, has made design verification an essential element of the semiconductor development cycle. Thus, prior to manufacture, hardware designers frequently employ simulators and/or emulators to verify the functional behavior of the electronic devices and systems fabricated in accordance with their designs. One type of verification system for a hardware device under test (DUT) is a transaction-based acceleration verification process, which generally provides an effective way to increase verification productivity, shorten time-to-market, and deliver greater confidence in the final SoC product. It provides a processor-based simulation acceleration/emulation system in communication with a workstation that sends data to, and receives data from, the DUT. Such data can include digital test vectors or real signals from a logic system for which the DUT is intended.
Various mechanisms/models have been employed in the art to transfer data comprising channel packets between the components of the emulation system. FIG. 1 depicts a conventional model for streaming data between components of an emulation system 100. As illustrated, the emulation system 100 includes a hardware accelerator 102 and a host workstation 104. The hardware accelerator 102 comprises one or more producers 106, which are configured to produce one or more channel packets. The host workstation 104 comprises one or more consumers 108, which are configured to receive the one or more channel packets produced by the one or more producers 106. In order to stream the one or more channel packets from the one or more producers 106 to the one or more consumers 108, a dedicated accelerator memory 110 is allocated for each of the one or more channels through which the one or more channel packets are routed. Each dedicated memory 110 is configured to handle the traffic of its corresponding channel. Thus, when synchronization is requested by one or more channels, each corresponding memory 110, holding the channel packets of its channel, is uploaded separately. Because each read of a memory 110 incurs a PC latency, this model incurs multiple PC latencies, as the multiple memories 110 are accessed/read separately. Due to these multiple PC latencies, the operation becomes very slow.
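By way of illustration only, the cost of the conventional model of FIG. 1 can be sketched as a small simulation in which every dedicated memory 110 is read separately and each read is charged one PC latency. The class name `DedicatedChannelMemory`, the function `synchronize`, and the latency value used in the demonstration are hypothetical and are not taken from any actual emulator interface; the sketch only assumes that each separate read costs one fixed latency.

```python
from dataclasses import dataclass, field


@dataclass
class DedicatedChannelMemory:
    """Hypothetical stand-in for one per-channel accelerator memory (110)."""
    channel_id: int
    packets: list = field(default_factory=list)

    def write(self, packet):
        # A producer (106) deposits a channel packet into this channel's memory.
        self.packets.append(packet)

    def upload(self):
        # One separate read by the host workstation (104); drains the memory.
        drained, self.packets = self.packets, []
        return drained


def synchronize(requesting_memories, pc_latency_us):
    """Upload each requesting channel's memory separately.

    Assumption: every separate memory read costs one fixed PC latency,
    so the total grows linearly with the number of channels read.
    Returns (packets delivered per channel, total latency in microseconds).
    """
    delivered = {}
    total_latency_us = 0
    for memory in requesting_memories:
        delivered[memory.channel_id] = memory.upload()
        total_latency_us += pc_latency_us  # one PC latency per memory read
    return delivered, total_latency_us


if __name__ == "__main__":
    memories = [DedicatedChannelMemory(i) for i in range(4)]
    for memory in memories:
        memory.write(f"pkt-{memory.channel_id}")
    # 50 us is an arbitrary illustrative figure, not a measured value.
    packets, latency = synchronize(memories, pc_latency_us=50)
    print(f"{len(packets)} separate reads, total latency {latency} us")
```

Under this assumption, the total latency equals the number of synchronizing channels multiplied by the per-read PC latency, which is the linear growth that makes the conventional model slow as the channel count rises.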
Therefore, there is a need for methods and systems that address the above-mentioned drawbacks of the conventional technique employed for data streaming in the emulation system, and that thereby achieve optimal performance and higher speed when streaming data between the hardware accelerator and the host workstation in the emulation system.