1. Field of the Invention
The present invention relates to electronic design automation (EDA), and to application of multi-core processing systems to EDA.
2. Description of Related Art
Leading general purpose microprocessors and graphic processors are being implemented using multi-core architectures in single integrated circuits. As a result, multi-core systems are becoming widely available.
Multi-core processing systems, on one or more integrated circuits, are characterized by having from two (2) to many processor cores arranged for concurrently executing threads in symmetric or asymmetric multi-threading programs. The multi-threading programs partition work among the threads on the processor cores for concurrent operation, and can provide significant performance improvements. Multiple concurrently operating threads need access to a common data set and can need access to the results of operations in other threads. Shared access to data among the processor cores is provided using a combination of shared memory and message passing protocols.
Shared memory architectures can vary widely from platform to platform. A common architecture involves the use of shared cache memory. Shared cache memory allows high speed signaling for cache coherency and data access to the executing cores. The cost of implementing cache memory space is relatively high, and so such designs have relied sometimes upon smaller cache sizes. Thus, the shared cache can become a bottleneck in operations requiring large data sets.
In EDA systems, some processes involve convolution operations over large data sets, and can take a very long time to execute. One such convolution operation is referred to as aerial image simulation, used for EDA processes like optical proximity correction in lithographic imaging systems. See, Rieger, et al., U.S. Pat. No. 6,081,658, entitled “Proximity Correction System for Wafer Lithography,” issued 27 Jun. 2000. Rieger et al. is incorporated by reference as if fully set forth herein. For aerial image simulation, layout data that defines a pattern on a photolithographic mask, for example, is convolved with a kernel that determines point-by-point intensity of an image produced by an exposure using a light source represented by the kernel. Often, many kernels are used in the convolution process over a single layout to produce a usable result like an aerial image simulation. Rieger et al. describes an optimization of the convolution process referred to as flash-based convolution, in which the layout data is decomposed into unit shapes called “flashes”, and the intensity results for kernels used in the simulation are pre-computed for the flashes, and stored as basis data in lookup tables. The simulation is simplified in flash-based convolution to a series of table lookup and accumulation operations, and can provide improved performance in many circumstances.
It has been proposed to apply multi-core processing for reducing the computation times for convolution operations used for aerial image simulation and other EDA procedures. See, Wang, et al., U.S. Patent Application Publication No. 2006/0242618 A1, entitled “Lithographic Simulations Using Graphical Processing Units”, published 26 Oct. 2006; and Cong, et al., “Lithographic Aerial Image Simulation with FPGA-Based Hardware Acceleration”, FPGA '08, Feb. 24-26, 2008, Monterey, Calif. Cong, et al. is incorporated by reference as if fully set forth herein. However, the layout data, the basis and the resulting image data can be very large files, so that it would not be practical to place them in memory shared by multiple concurrently executing threads. Therefore, memory operations can become a significant limit to the performance improvements available in prior art systems.
Multi-core and many core architectures such as encountered in graphics processing units can be characterized by strong computing power and relatively weak memory accessibility. Thus convolution operations using these architectures must trade-off computation with memory access.
Problems remain therefore with optimizing convolution algorithms used in EDA systems to take full advantage of multi-core processing, including problems with managing shared memory.