Wide data channels can make timing closure difficult, because such channels may span long distances across an IC and/or between multiple ICs (“Super Logic Regions” or “SLRs”) mounted on an interposer. In programmable logic devices, such as FPGAs, timing may be improved by adding stages of register slices to a circuit design. However, adding such stages of register slices incurs a significant overhead penalty in terms of circuit resources and semiconductor area.