Current computing systems may provide multiple processors partitioned across different chips to reach an economical distribution of transistors and to potentially take advantage of different semiconductor process technology types. Where different process technology types are applied, a high performance process type allows a processor to toggle at a high clock rate at the expense of a high power requirement, while a low power process type offers a lower performance level, usually with substantial power savings over the high performance process type. To permit all processors to work on a common workload under a standard operating system, all processors need to be able to access a common pool of memory.
A memory device cannot easily be connected to two different chips at the same time. One previously implemented method of handling this problem was to supply a discrete memory controller, with the processors located outside of the memory controller and connected to it as peer-level devices.
With higher performance systems, it is desirable to have the processors and the memory controller integrated in the same chip and the memory directly attached. This construction lowers the access latency to memory for the high performance processor, providing a performance improvement. It also lowers the cost by reducing the number of components in the system. But this construction makes it difficult for a second processor chip, such as the low power processor, to access memory, since that processor is not directly connected to the memory. Existing solutions use a memory mapped interconnect, for example, Peripheral Component Interconnect (PCI) or HyperTransport.
But for power managed applications, these solutions create another problem: the memory controller and the bus logic within the high performance processor must remain powered on to allow memory access by the low power processor even when the high performance processor is inactive. This leads to a waste of power. It would be desirable to power off as much of the high performance processor as possible, to maximize power savings when only the low power processor is active.
In addition, if the low power processor were to be used as a standalone processor, it would need its own memory controller, thus requiring additional signal pins on the chip for the memory interface. To reduce the cost of the low power processor, it is desirable to keep the pin count and package size to a minimum.
It is desirable to provide a single interconnect between the low power processor and the high performance processor, such that the low power processor can access memory through the high performance processor while the high performance processor can be mostly powered off.
In a scenario where both the low power processor and the high performance processor are active at the same time, are accessing a shared memory, and each has a cache, these accesses need to be cache coherent. Any transaction that the low power processor issues needs to go through the high performance chip and snoop its cache. (Snooping is a mechanism to implement cache coherence, in which each cache observes the memory transactions of the other and writes back or invalidates its own copy as needed.) The high performance processor also needs a path to snoop the caches in the low power chip. One solution to this problem is to use the coherent HyperTransport (HT) protocol or another bidirectional communication protocol. But using such a protocol requires a separate signal pin interface between the chips, in addition to the signal pins required for the dynamic random access memory (DRAM) bus.
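The coherence requirement described above can be illustrated with a minimal software sketch of two caches that snoop each other before touching shared memory. This is only an illustrative model under simplifying assumptions (a two-state, write-back, invalidate-on-write scheme with one value per address). It is not the HyperTransport protocol and not the mechanism of the present application, and the names `Cache`, `link`, `snoop_read`, and `snoop_invalidate` are hypothetical.

```python
class Cache:
    """Toy write-back cache that snoops a single peer cache (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.lines = {}    # address -> (value, dirty flag)
        self.peer = None   # the other chip's cache, set by link()

    def snoop_read(self, addr, memory):
        # The peer is about to read addr: write back dirty data, keep a clean copy.
        if addr in self.lines:
            value, dirty = self.lines[addr]
            if dirty:
                memory[addr] = value
                self.lines[addr] = (value, False)

    def snoop_invalidate(self, addr, memory):
        # The peer is about to write addr: write back if dirty, then drop the copy.
        if addr in self.lines:
            value, dirty = self.lines.pop(addr)
            if dirty:
                memory[addr] = value

    def read(self, addr, memory):
        if addr not in self.lines:
            # Miss: snoop the peer first so memory holds the latest value.
            self.peer.snoop_read(addr, memory)
            self.lines[addr] = (memory.get(addr, 0), False)
        return self.lines[addr][0]

    def write(self, addr, value, memory):
        # Invalidate any stale copy in the peer before modifying the line locally.
        self.peer.snoop_invalidate(addr, memory)
        self.lines[addr] = (value, True)


def link(a, b):
    """Give each cache a snoop path to the other (both directions are needed)."""
    a.peer = b
    b.peer = a
```

In this sketch, each processor's read or write first consults the other chip's cache, which is why a snoop path is needed in both directions between the two chips.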
The present application proposes a solution to morph the unidirectional DRAM bus into a bidirectional communication bus.