Advances in the computing, telecommunications, and other electrical arts continue to demand increased data throughput and decreased data latency from their constituent subsystems. As designs for these subsystems tend towards a modular approach, however, data transfer and the associated data control between modules tend not only to decrease data throughput, but also to increase data latency. One such modular system is exemplified by the Cellular MultiProcessing (CMP) architecture, which is used in today's high-end computing platforms. CMP employs a Symmetric Multiprocessor (SMP) design, which further employs multiple modular components, such as high throughput memory, Input/Output (I/O) systems, and supporting hardware elements, to bring about the manageability and resilience required by these computing architectures.
As with any modular design, however, SMP systems are faced with the daunting task of managing large quantities of asynchronous data transfer between their associated processors, cache, and system memory. Data transfer from one processor to another, for example, generally requires the use of a data cache and an associated data interface. The amount of data transferred between the data interfaces, however, is not constant, but rather depends upon the particular type of data transfer taking place.
For example, a continuous data transfer may pertain to a data block, e.g., a cache line, where each data byte of the cache line is transferred in contiguous order using sequential address clocking. Conversely, a data interface may also transfer partial cache lines, whereby a variable number of data bytes are transferred for each cache line. Prior art data interfaces, however, add delay to the partial cache line transfer, due to the additional addressing clock cycles that are needed to synchronize the data interface to the beginning of the next cache line. In other words, the non-transferred data bytes of each cache line are “skipped over” by executing a No Operation (NoP) for each address clock cycle associated with the non-transferred data bytes. Thus, each NoP necessarily decreases data throughput and increases data latency by adding the delay necessary to synchronize the data interfaces.
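The cost of the NoP cycles described above may be illustrated with a short sketch. The sketch is not taken from any particular prior art interface; it assumes, purely for illustration, that one data byte is addressed per address clock cycle and that a cache line holds 16 bytes. The names `prior_art_cycles` and `utilization` are hypothetical.

```python
def prior_art_cycles(line_size, bytes_transferred):
    """Return (transfer_cycles, nop_cycles) for one cache line.

    Assumption: one data byte per address clock cycle, so every byte
    position of the line consumes a cycle, either as a transfer or as a NoP.
    """
    if not 0 <= bytes_transferred <= line_size:
        raise ValueError("bytes_transferred must be within the cache line")
    nop_cycles = line_size - bytes_transferred  # skipped bytes become NoPs
    return bytes_transferred, nop_cycles


def utilization(line_size, transfers):
    """Fraction of address clock cycles that actually move data."""
    if not transfers:
        return 0.0
    total_cycles = line_size * len(transfers)  # full line clocked every time
    useful_cycles = sum(prior_art_cycles(line_size, n)[0] for n in transfers)
    return useful_cycles / total_cycles


if __name__ == "__main__":
    LINE_SIZE = 16  # hypothetical cache line size in bytes
    print(prior_art_cycles(LINE_SIZE, 4))
    print(utilization(LINE_SIZE, [16, 4, 8]))
```

Under these assumptions, a partial transfer of 4 bytes from a 16-byte cache line spends 12 of its 16 address clock cycles on NoPs, and the fraction of useful cycles falls as the transferred portion of each cache line shrinks.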
A need exists, therefore, for a method and apparatus that provide a variable delay data interface capable of supplying on-demand output data without adding latency or dead cycles.