In modern computer systems, relatively high-speed processors manipulate data sourced from memories and other system components that generally have slower and different operating characteristics than the processor. For example, in a system with hierarchical memories, the data can be persistently stored in relatively slow storage devices such as disk and tape. Alternatively, the data can be sourced externally from other processors, networks, or input/output devices via I/O interfaces.
Data which are immediately manipulated by the processor are typically stored in faster, but smaller and volatile, semiconductor random access memory (RAM). One or more small, high-speed cache memories are usually arranged between the processor and the RAM. The caches, relying on spatial and temporal relationships between data and addresses, store data which have a high likelihood of being used by the processor.
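The reliance of a cache on spatial and temporal relationships can be illustrated with a minimal sketch. The direct-mapped organization, the function name `simulate_direct_mapped`, and the parameters `num_lines` and `line_size` are assumptions chosen for illustration only; they are not taken from the system described here.

```python
def simulate_direct_mapped(addresses, num_lines=8, line_size=4):
    """Count (hits, misses) for a toy direct-mapped cache.

    Hypothetical model: each address maps to exactly one cache line,
    selected by its block number modulo the number of lines.
    """
    tags = [None] * num_lines          # one stored tag per cache line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size      # neighboring addresses share a block
        index = block % num_lines      # cache line the block maps to
        tag = block // num_lines       # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            misses += 1
            tags[index] = tag          # fill the line on a miss
    return hits, misses


# Spatial relationship: consecutive addresses fall in the same block.
sequential = list(range(32))
# Temporal relationship: the same few addresses are reused repeatedly.
looped = list(range(8)) * 4
# Neither: every access maps to the same line with a different tag.
strided = [0, 32, 64, 96] * 2

for name, trace in (("sequential", sequential),
                    ("looped", looped),
                    ("strided", strided)):
    print(name, simulate_direct_mapped(trace))
```

In this toy model the sequential and looped traces hit on most accesses, while the strided trace misses on every access, since no spatial or temporal relationship remains for the cache to exploit.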
Cache memories can be configured to be physically separate from the processor, e.g., "off-chip." Additional cache memories can be arranged to be co-resident with the processor on the same semiconductor die, e.g., "on-chip." In the latter case, the cache memories can be highly specialized. For example, data and instructions for manipulating the data may be stored in separate on-chip caches.
Typically, the processor, memories, and I/O components are interconnected by communication buses that transport timing, control, address, and data signals. The processor, memories, and other system components that share the data can have distinctly different electrical operating requirements and characteristics which may require multiple bus architectures.
For example, the processor and the on-chip cache memories are usually operated by control and address, or "index," signals synchronized to timing signals derived from a high-speed processor clock. The off-chip memory and system components are usually operated by signals synchronized to a slower system clock. The signals used to operate the on-chip and off-chip components, respectively, may have different frequencies, shapes (e.g., length and height), latencies, and protocols. For example, it is not unusual to run the processor clock orders of magnitude faster than the system clock. On-chip components generally run synchronously with respect to timing signals forwarded with the control and address signals. Off-chip components can run asynchronously with respect to skew-controlled and radially distributed timing signals.
For these reasons, the electrical environments of the system can be partitioned into separate operating regions or "domains." The processor and other on-chip components process digital signals in a processor or "private" domain, and the off-chip components process the digital signals in a system or "external" domain.
Processing digital signals in a computer system having multiple operating domains presents a throughput problem. For example, should the processor require access to data that are not accessible in the private domain, e.g., data processed by on-chip high-speed digital signals, then the data need to be accessed in the external domain using slower signaling environments.
In traditional computer systems, switching operations from one domain to another generally increases access latencies. This is a particular problem for a clock-sensitive device such as the off-chip cache that is immediately adjacent and external to the processor chip. In traditional computer systems, the first level of off-chip cache is usually restricted to operate only in the external domain, thus drastically decreasing throughput.
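The effect of a domain crossing on access latency can be sketched with the conventional average-access-time estimate (hit time plus miss rate times miss penalty). The function name, the parameter names, and all cycle counts below are hypothetical values chosen only for illustration; none of them is a measurement of any particular system.

```python
def average_access_cycles(on_chip_hit_rate, on_chip_cycles,
                          off_chip_cycles, crossing_cycles):
    """Estimate average access time in processor clock cycles.

    Hypothetical model: every access pays the on-chip lookup; an
    on-chip miss additionally pays the off-chip cache access plus a
    penalty for switching from the private to the external domain.
    """
    miss_rate = 1.0 - on_chip_hit_rate
    miss_penalty = off_chip_cycles + crossing_cycles
    return on_chip_cycles + miss_rate * miss_penalty


# Illustrative numbers only (assumptions, not measurements).
without_crossing = average_access_cycles(0.75, 1, 10, 0)
with_crossing = average_access_cycles(0.75, 1, 10, 4)
print(without_crossing, with_crossing)
```

Under these assumed values, the crossing penalty is paid on every on-chip miss, so its cost grows in proportion to the miss rate; this is why a first-level off-chip cache confined to the external domain weighs on throughput.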
Therefore, there is a need for an apparatus and method that can improve the throughput of computer systems having multiple operating domains and clock-sensitive components.