The present invention relates to a single-chip integrated microprocessor that interfaces directly with external memories and external input/output units. More specifically, the present invention relates to an apparatus and method for ensuring cache coherency in such a microprocessor.
Microprocessors that share and exchange data stored in external memories, such as cache and main memories, with one or more external input/output units are commonplace in today's computing environment. Such a microprocessor system 5 is shown in FIG. 1 and may include a processor 10, a cache memory 12 and a main memory (e.g., a DRAM) 14 communicatively coupled through a bus 16. A cache controller 18 interfaces cache 12 to bus 16 and a main memory controller 20 interfaces memory 14 to bus 16.
Most conventional processors, such as processor 10, have some on-chip input/output (I/O) support logic but rely on one or more external components to complete an I/O system that provides the complex bus interfaces to external I/O devices, such as I/O devices 22, 24 and 26, which also communicate with memories 12 and 14 through bus 16. Such external components may include an input/output bridge unit 30. Bridge unit 30 controls access of such external I/O devices to bus 16, and a second bus system 28 connects the I/O devices to input/output bridge 30. Second bus system 28 includes an arbitration system that prioritizes access of I/O devices 22, 24 and 26 to the bus. Second bus system 28 may be an industry-standard bus system such as a PCI bus. External I/O devices 22, 24 and 26 may be, for example, a graphics/multimedia card, a communication device and an audio card.
Bus 16 preserves coherency between cache 12 and memory 14 through the manipulation of various status bits, such as modified, shared, exclusive and invalid bits, associated with each memory location. Bus 16 may be a proprietary bus, such as Sun Microsystems' S-Bus. It is common for the functionality of each of the above units in computer system 5 to be implemented on a different chip. For example, processor 10, cache 12, main memory 14, cache controller 18, main memory controller 20 and input/output bridge 30 may all be on separate chips. Thus, such systems are referred to as multichip computer systems.
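By way of illustration only, the modified, shared, exclusive and invalid status bits mentioned above may be pictured as the four states of a single cache line. The following is a minimal sketch that assumes the commonly described MESI transition rules; the text above names only the status bits, so the rules, the event names and the function next_state are hypothetical and are not a description of how bus 16 necessarily operates.

```python
from enum import Enum


class State(Enum):
    """The four status values named in the text, for one cache line."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class Event(Enum):
    """Hypothetical events seen by one cache (names are assumptions)."""
    LOCAL_READ = 1   # this cache's processor reads the line
    LOCAL_WRITE = 2  # this cache's processor writes the line
    BUS_READ = 3     # another agent is observed reading the line
    BUS_WRITE = 4    # another agent is observed writing the line


def next_state(state, event, others_have_copy=False):
    """Return (new_state, writeback_needed) under textbook MESI rules.

    writeback_needed is True when a modified line must be written back
    to main memory before another agent may use the data.
    """
    if event is Event.LOCAL_READ:
        if state is State.INVALID:
            # A miss fills the line; it is exclusive only if no one else holds it.
            new = State.SHARED if others_have_copy else State.EXCLUSIVE
            return new, False
        return state, False  # a hit leaves the state unchanged
    if event is Event.LOCAL_WRITE:
        # Other copies (if any) are assumed to be invalidated by the write.
        return State.MODIFIED, False
    if event is Event.BUS_READ:
        if state is State.INVALID:
            return State.INVALID, False
        # A dirty line is written back, then the line is held as shared.
        return State.SHARED, state is State.MODIFIED
    if event is Event.BUS_WRITE:
        # The local copy becomes stale; dirty data is written back first.
        return State.INVALID, state is State.MODIFIED
    raise ValueError(f"unknown event: {event!r}")
```

In this sketch, a line that is modified in the cache and then read by another agent yields (State.SHARED, True), indicating that the dirty data must first be written back to main memory.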
Such multichip solutions are typically connected with standard buses (e.g., a PCI bus as discussed above), which impose a de facto ordering of coherency events. Also, in such multichip solutions, direct memory access (DMA) requests from external I/O devices, such as devices 22-26, do not generally access cache 12. Instead, such requests generally go directly to main memory 14.
Because of market and technology demands, chip designers are putting more and more of the functionality of microprocessor system 5 shown in FIG. 1 on a single chip as an integrated processor. A single-chip processor has reduced latency because signals are not shuttled between as many components, is less expensive to manufacture, and saves considerable space because area is not required for interconnecting as many components. In such an integrated processor, however, data coherency between the various external cache and main memories must still be maintained in order to run existing software.
There is no de facto ordering when I/O, memory and cache interfaces are all on a single chip. Instead, a design based around chip-level issues rather than system interconnect issues becomes optimal in terms of chip area used and overall performance. It is possible to incorporate a bus similar to bus 16 on such an integrated processor to preserve data integrity. Chip size and performance issues make such a tactic undesirable for many applications, however. For example, the chip area required by such a bus system, with its handshaking logic and the like, is substantial. Also, the latency delays and the upkeep required with such a data coherency bus result in undesirable speed penalties. Accordingly, other methods of maintaining data coherency in such integrated processors are desirable.