Integrated-circuit interconnects are typically classified by length into three categories: (1) local interconnects that connect gates within a block, (2) semi-global interconnects that connect blocks (e.g., datapath blocks to a register file), and (3) global interconnects that connect functional units, such as a processor core to the L2 cache, the bus interface, or another processor core in a chip multiprocessor. As CMOS processes scale, local interconnect length, and to a lesser extent semi-global interconnect length, shrinks as well. Global interconnects, however, tend to become longer because maximum die sizes are increasing. Because capacitance per unit length remains constant or decreases only slightly as processes scale, the total capacitance of a global interconnect grows with its length. Combined with the increase in resistance per unit length, the result is slower global data transfer in absolute terms, and much slower transfer relative to rising system clock frequencies. Scaling therefore increases both the fCV² dynamic power and the latency of global interconnects. A solution that addresses both problems is needed.
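The scaling argument above can be illustrated with a rough first-order sketch. It assumes an unrepeated wire modeled as a distributed RC line, for which the 50% step-response delay is approximately 0.38·R·C, and a dynamic power of a·f·C·V² with activity factor a. The function name `wire_metrics` and every numeric value below are hypothetical and chosen only for illustration; they are not taken from any particular process.

```python
def wire_metrics(length_mm, r_per_mm, c_per_mm, freq_hz, vdd, activity=1.0):
    """First-order estimate for an unrepeated wire (illustrative model only).

    Returns (total_capacitance_F, delay_s, dynamic_power_W).
    """
    total_r = r_per_mm * length_mm          # ohms, grows linearly with length
    total_c = c_per_mm * length_mm          # farads, grows linearly with length
    # Distributed RC line: 50% delay is roughly 0.38 * R * C,
    # so delay grows with the square of the wire length.
    delay = 0.38 * total_r * total_c
    # Switching power: activity * f * C * V^2.
    power = activity * freq_hz * total_c * vdd ** 2
    return total_c, delay, power


if __name__ == "__main__":
    # Hypothetical "older" and "newer" process points (assumed values):
    # the newer one has a longer global wire, higher resistance per unit
    # length, unchanged capacitance per unit length, and a faster clock.
    older = wire_metrics(length_mm=10, r_per_mm=100.0, c_per_mm=0.2e-12,
                         freq_hz=1e9, vdd=1.2)
    newer = wire_metrics(length_mm=15, r_per_mm=200.0, c_per_mm=0.2e-12,
                         freq_hz=2e9, vdd=1.0)
    for name, (c, d, p) in (("older", older), ("newer", newer)):
        print(f"{name}: C = {c:.3e} F, delay = {d:.3e} s, power = {p:.3e} W")
```

Under this model, total capacitance and switching power grow linearly with wire length, while delay grows quadratically, which is why longer and more resistive global wires fall behind a rising clock frequency. Repeater insertion and other mitigations are deliberately left out of the sketch.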