1. Technical Field
The present invention relates in general to data processing and, in particular, to the storage subsystem of a data processing system. Still more particularly, the present invention relates to a data processing system having asymmetrically partitioned load-store hardware.
2. Description of the Related Art
In order to capitalize on the high performance processing capability of a state-of-the-art processor core, the storage subsystem of a data processing system must efficiently supply the processor core with large amounts of instructions and data. Conventional data processing systems attempt to satisfy the processor core""s demand for instructions and data by implementing deep cache hierarchies and wide buses capable of operating at high frequency. Although heretofore such strategies have been somewhat effective in staying apace of the demands of the core as processing frequency has increased, such strategies, because of their limited scalability, are by themselves inadequate to meet the data and instruction consumption demands of state-of-the-art and future processor technologies operating at 1 GHz and beyond.
To address the above and other shortcomings of conventional processor and data processing system architectures, the present invention introduces a processor having a hashed and partitioned storage subsystem. The processor includes at least one execution unit, an instruction sequencing unit coupled to the execution unit, and a cache subsystem including a plurality of caches that store data utilized by the execution unit. Each cache among the plurality of caches stores only data having associated addresses within a respective one of a plurality of subsets of an address space.
In one preferred embodiment, the execution units of the processor include a number of load-store units (LSUs) that each process only instructions that access data having associated addresses within a respective one of the plurality of address subsets. The load-store units can have diverse hardware such that a maximum number of instructions that can be concurrently executed is different for different load-store units or such that some of the load-store units are restricted to executing certain classes of instructions.
The processor may further be incorporated within a data processing system having a number of interconnects and a number of sets of system memory hardware that each have affinity to a respective one of the plurality of address subsets. All objects, features, and advantages of the present invention will become apparent in the following detailed written description.