A computer processor can be connected to other devices through a device interface, which provides temporary storage for incoming and outgoing messages between the processor and the device and which handles communication protocols. Protocol handling may include, for example, converting data from parallel to serial form, adjusting the timing of the data, and adding headers or footers to the data for transmission via the device.
Data arrives at the device interface at arbitrary times and is stored there in a buffer memory until it can be read out by the processor. Typically, the processor will be notified of the arrival of data either by an interrupt process or by a polling process. With an interrupt, a signal line to the processor is strobed, causing the processor to stop its current execution of a program and to jump to a new routine for reading the device interface buffer memory. Prior to jumping to the new routine, critical data of the routine being interrupted must be saved.
With polling, the processor in the course of executing its other programs periodically checks the buffer of the device interface to see if a new message has arrived. The polling is executed on a periodic basis regardless of whether new data has arrived at the device interface, either as part of the programs being executed or as a task in a multi-tasking operating system.
For high-speed device interfaces, the interrupt process is a disadvantage because of the time-consuming interrupt routines necessary to service the interrupt and to read the device interface buffers. Accordingly, polling techniques are sometimes used in which the processor periodically initiates a reading of the device interface.
The device interface will have a control register which will have one or more flag bits "set" when a message has arrived from the device. In each polling operation, the processor reads the flag bits of the control register, and if warranted, reads a message from a buffer in the device interface. Once the message has been read, the flag bits are "reset". This polling process must be repeated frequently to ensure that the limited-sized buffers of the device interface do not overflow with incoming messages.
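The polling sequence described above can be sketched as a small simulation. This is only an illustrative model, not an actual device interface: the names `DeviceInterface`, `MSG_READY`, `device_delivers`, and `poll` are hypothetical, and the sketch assumes the flag is reset once the buffer has been drained.

```python
from collections import deque

MSG_READY = 0x01  # hypothetical flag bit: at least one message is waiting


class DeviceInterface:
    """Toy model of a device interface: a control register plus a bounded buffer."""

    def __init__(self, capacity):
        self.control = 0        # control register holding the flag bits
        self.buffer = deque()   # limited-size message buffer
        self.capacity = capacity
        self.dropped = 0        # messages lost because the buffer was full

    def device_delivers(self, message):
        """Device side: a message arrives at an arbitrary time."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1   # overflow: polling was not frequent enough
            return
        self.buffer.append(message)
        self.control |= MSG_READY   # "set" the flag

    def read_message(self):
        """Processor side: read one buffered message."""
        message = self.buffer.popleft()
        if not self.buffer:
            self.control &= ~MSG_READY  # "reset" the flag (assumed: when drained)
        return message


def poll(dev):
    """One polling operation: read the flag bits, then any waiting messages."""
    received = []
    while dev.control & MSG_READY:
        received.append(dev.read_message())
    return received
```

With a capacity of two, three messages delivered between polls leave one message dropped, which is the overflow risk that forces frequent polling.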
Reading the device interface buffers may be slowed by the fact that the device interface is often connected to a specialized I/O interconnect separate from the memory interconnect of the computer. The processor is connected directly to the memory interconnect but communicates with the I/O interconnect through a bridge circuit. The bridge converts a range of physical addresses reserved for the I/O interconnect to physical addresses on the I/O interconnect, allowing the processor to communicate with a single physical address space, some of which is on the memory interconnect and some of which is on the I/O interconnect. This process of transferring data through the bridge is a potential bottleneck to high-speed communication with a device.
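The address conversion performed by the bridge can be illustrated with a minimal sketch. It assumes a single contiguous reserved window mapped linearly onto the I/O interconnect; the `Bridge` class, its field names, and the `route` helper are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    window_base: int  # first processor physical address reserved for I/O (assumed)
    window_size: int  # size of the reserved range in bytes (assumed)
    io_base: int      # first corresponding address on the I/O interconnect (assumed)

    def claims(self, addr):
        """True if the address falls in the range reserved for the I/O interconnect."""
        return self.window_base <= addr < self.window_base + self.window_size

    def translate(self, addr):
        """Convert a reserved processor address to an I/O interconnect address."""
        if not self.claims(addr):
            raise ValueError(f"address {addr:#x} is not in the I/O window")
        return self.io_base + (addr - self.window_base)


def route(bridge, addr):
    """Decide which interconnect services an access in the single address space."""
    if bridge.claims(addr):
        return ("io", bridge.translate(addr))
    return ("memory", addr)
```

Every access that lands in the reserved window takes the extra translation step through the bridge, which is where the delay described above arises.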
In order to speed the processor's writing and reading of data to and from memory, it is known to use a local cache memory associated with the processor and attached to the memory interconnect. Before a block of memory is written to or read from by the processor, it is loaded into the cache memory where it can be written to or read from at high speed without the delay associated with transferring the data over the memory interconnect.
Multiple processors, each having a cache memory, may be attached to a single memory interconnect and coordinated through a standard cache protocol, for example, the MOESI coherence protocol well understood by those of ordinary skill in the art. In the MOESI protocol, multiple processors can have cache copies of the data of particular memory locations and may read that data without "coherence" problems, that is, without the possibility that different caches will have different copies of the data. A processor may also write to its cache provided it obtains "ownership" of the cache data. This is done by sending out an invalidation message on the memory interconnect, which lets the memory and other caches having that memory block know that their data is no longer valid.
A cache that contains data that has been invalidated in this manner may then obtain updated data by sending a message for that updated data to the memory interconnect in a broadcast fashion. The cache having ownership status in the data will then reply. A cache that has ownership of data must write that data to memory before the data is emptied from the cache.
Generally, other caches having invalid copies of the data do not request updated copies unless their associated processors need to read or write that data.
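The invalidate, refetch, and write-back behavior described above can be sketched as follows. This is a deliberately simplified model and not the full MOESI protocol: it tracks only three hypothetical states ("shared", "owned", "invalid"), treats each address as one whole block, and broadcasts an invalidation on every write, whereas a real protocol can skip the broadcast when a block is held exclusively. The `Cache` and `Bus` classes are illustrative names.

```python
class Bus:
    """Toy memory interconnect with main memory and the attached caches."""

    def __init__(self):
        self.memory = {}
        self.caches = []

    def invalidate(self, addr, requester):
        """Broadcast: every other cache holding the block marks it invalid."""
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                _, value = cache.lines[addr]
                cache.lines[addr] = ("invalid", value)

    def fetch(self, addr, requester):
        """Broadcast request: the owning cache replies, otherwise memory does."""
        for cache in self.caches:
            if cache is not requester:
                state, value = cache.lines.get(addr, ("invalid", None))
                if state == "owned":
                    return value
        return self.memory.get(addr, 0)


class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}  # address -> (state, value)
        bus.caches.append(self)

    def read(self, addr):
        state, value = self.lines.get(addr, ("invalid", None))
        if state == "invalid":  # refetch only when the processor needs the data
            value = self.bus.fetch(addr, requester=self)
            self.lines[addr] = ("shared", value)
        return value

    def write(self, addr, value):
        # Simplification: always invalidate other copies before taking ownership.
        self.bus.invalidate(addr, requester=self)
        self.lines[addr] = ("owned", value)

    def evict(self, addr):
        state, value = self.lines.pop(addr, ("invalid", None))
        if state == "owned":
            self.bus.memory[addr] = value  # write back before the block is emptied
```

In this model, a cache whose copy was invalidated does nothing until its processor next reads the block, at which point the owner supplies the current value.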
Cache architectures provide for extremely rapid data transfer both by undertaking block transfers of data, in which the overhead of the transfer can be shared among many bits of data, and by having the transfer performed by the cache circuitry in parallel with, or independently of, the operation of the processor.
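The benefit of sharing a fixed overhead across a block can be made concrete with a short calculation. The cost model (one fixed overhead per transfer plus a constant cost per unit of data) and the function name `cost_per_unit` are assumptions made for illustration; the numbers in the usage note are arbitrary and not measurements.

```python
def cost_per_unit(overhead, unit_cost, block_size):
    """Average cost per unit of data when one transfer moves block_size units.

    overhead   -- fixed cost paid once per transfer (assumed model)
    unit_cost  -- cost of moving each unit of data (assumed model)
    block_size -- number of units moved in a single transfer
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return overhead / block_size + unit_cost
```

For example, with an arbitrary overhead of 100 and a unit cost of 1, moving one unit per transfer costs 101 per unit, while moving 64 units per transfer costs 2.5625 per unit.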
Cache transfer techniques cannot currently be used for transferring data from a device interface because the registers indicating that a message has arrived are normally demarcated as uncacheable by the operating system. Making these registers uncacheable is necessary because an external occurrence, such as the arrival of a message, may change the state of the control registers outside of the normal cache coherence protocols, rendering the cache copies invalid but providing no indication of this. Because the contents of the control registers of the device interface are primarily of interest only when they change state (e.g., to indicate a new message), caching these registers would seem to provide no benefit.
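The staleness problem described above can be shown with a minimal sketch. It assumes a cache that fills on its first read and receives no invalidation when the device changes the register; `ControlRegister` and `NaiveCache` are hypothetical names used only for this illustration.

```python
class ControlRegister:
    """Toy device control register that the device can change at any time."""

    def __init__(self):
        self.value = 0


class NaiveCache:
    """Caches the register with no coherence message from the device (assumed)."""

    def __init__(self, register):
        self.register = register
        self.copy = None

    def read(self):
        if self.copy is None:
            self.copy = self.register.value  # miss: fill from the device
        return self.copy                     # hit: the device is never consulted


reg = ControlRegister()
cache = NaiveCache(reg)
first = cache.read()   # fills the cache while no message is pending
reg.value = 1          # a message arrives; no invalidation reaches the cache
second = cache.read()  # still returns the old copy
```

Here `second` equals `first` even though the register has changed, so a processor polling through the cache would never see the new message.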