Modern system-on-a-chip (SoC) designs built in deeply scaled process nodes present extraordinary design challenges. Slow wires and process, voltage, and temperature (PVT) variation make the synchronous abstraction increasingly untenable over large chip areas, requiring immense effort to achieve timing closure. The globally asynchronous, locally synchronous (GALS) design methodology is one means of mitigating the difficulty of global timing closure. GALS design flows delimit “synchronous islands” of logic that operate on local clocks and communicate with each other asynchronously.
However, individual clock domains in large commercial designs still span many square millimeters, so many of the challenges posed by a fully synchronous design persist in GALS systems. The full advantages of GALS design can be realized only if large SoCs are partitioned into myriad small synchronous blocks rather than a handful of large areas, an approach referred to as fine-grained GALS. Industry has been reluctant to adopt the fine-grained GALS approach due to three main issues: the difficulty of generating many local clocks, the latency incurred by asynchronous boundary crossings, and the challenge of integrating GALS methodology into standard application-specific integrated circuit (ASIC) design tool flows. There is thus a need for addressing these and/or other issues associated with the prior art.