The Electronic Design Automation (EDA) industry has seen major advances in design processes over the past several decades, delivering improved and more sophisticated design tools. Today, these tools provide the critical platform for modern IC designs composed of billions of transistors. However, most of these processes, despite tremendous improvements in their capabilities, are based on a sequential von Neumann machine model, with limited or no ability to exploit concurrency. While this limitation had little practical consequence in the past, the advent of commodity multicores created a need to embrace concurrency in many fields, including EDA applications. This need is fast gaining urgency with the trends in Graphics Processing Unit (GPU) development.
GPUs are inherently concurrent designs, with several hundred processing units within a single GPU. They not only offer tremendous computation bandwidth (orders of magnitude beyond that of commodity multicores), but also allow non-graphics applications to harness their computing power. It is the latter development that will have a significant impact on EDA algorithms, as algorithms designed for GPUs in the next decade are poised to bear little resemblance to the existing body of EDA tools. We disclose a novel floorplanning process designed to exploit GPUs.
Exploiting this computation bandwidth in a sequential process such as floorplanning is non-trivial. The computation bandwidth in GPUs is geared towards single-instruction, multiple-thread (SIMT) style data-parallel code. The sequential floorplanning process progresses by repeatedly applying a random move to the floorplan and modifying the floorplan if the move is accepted. A typical run evaluates thousands of moves, creating a long chain of dependencies (both control and data): each move is applied to the floorplan left behind by all previously accepted moves. This dependency chain must be broken, and the process completely restructured, for efficient mapping onto a GPU while preserving the solution quality.
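The dependency chain in the sequential process can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the disclosed process: the toy floorplan (blocks placed side by side in one row), the single move type (rotating one block), the cost (bounding-box area), the greedy acceptance rule, and the names `bounding_area` and `sequential_floorplan`.

```python
import random


def bounding_area(blocks, rotated):
    """Toy cost (assumed): blocks sit side by side in one row; return the bounding-box area."""
    total_width = 0
    max_height = 0
    for (w, h), rot in zip(blocks, rotated):
        if rot:
            w, h = h, w  # a rotated block swaps width and height
        total_width += w
        max_height = max(max_height, h)
    return total_width * max_height


def sequential_floorplan(blocks, num_moves, seed=0):
    """Apply random moves one at a time; each iteration depends on the previous one."""
    rng = random.Random(seed)
    rotated = [False] * len(blocks)          # current floorplan state
    current = bounding_area(blocks, rotated)
    history = [current]
    for _ in range(num_moves):
        i = rng.randrange(len(blocks))       # random move: rotate one block
        rotated[i] = not rotated[i]
        candidate = bounding_area(blocks, rotated)
        if candidate <= current:             # assumed greedy acceptance rule
            current = candidate              # accept: the floorplan stays modified
        else:
            rotated[i] = not rotated[i]      # reject: undo the move
        history.append(current)
    return rotated, current, history
```

Iteration k reads the `rotated` state written by iteration k-1 (a data dependency), and whether that state was modified depends on the accept/reject branch (a control dependency). This is why the loop cannot simply be split across SIMT threads as written.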
Compute-intensive processes such as fault simulation, power grid simulation, and event-driven logic simulation have been successfully mapped to GPU platforms to obtain significant speedups. One key similarity across these prior works is the presence of a fixed, common topology or data structure shared by the parallel threads, each of which is fed separate attributes for concurrent evaluation (e.g., distinct gate sizes and threshold voltages for a single circuit topology, or distinct input patterns for a single circuit). The unmodified topology is highly amenable to the SIMT style of GPUs, as it does not require frequent data modification and reuse. In contrast, the floorplanning process that we target poses a severe challenge to SIMT-style platforms, as it involves a chain of dependent modifications to the data structures.
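The fixed-topology pattern described above can be sketched in a few lines of Python. The three-gate netlist, the signal names, and the functions `evaluate` and `evaluate_all` are hypothetical and chosen only for illustration; they do not come from any of the cited prior works.

```python
# Fixed topology shared by every evaluation: (output, operation, input_a, input_b).
# The netlist below is a hypothetical three-gate circuit.
NETLIST = (
    ("n1", "AND", "a", "b"),
    ("n2", "OR", "b", "c"),
    ("out", "XOR", "n1", "n2"),
)

OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
    "XOR": lambda x, y: x ^ y,
}


def evaluate(pattern):
    """Evaluate the fixed circuit for one input pattern; NETLIST is only read."""
    values = dict(pattern)  # private copy, so evaluations share no mutable state
    for out, op, a, b in NETLIST:
        values[out] = OPS[op](values[a], values[b])
    return values["out"]


def evaluate_all(patterns):
    """Each pattern is independent; on a GPU each could map to its own thread."""
    return [evaluate(p) for p in patterns]
```

Because every evaluation runs the same instructions over the same read-only topology and differs only in its input attributes, the evaluations can proceed in any order or concurrently. Floorplanning lacks this property, since each move rewrites the shared structure that the next move reads.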