1. Technical Field
The present invention generally relates to superscalar processor systems. More specifically, the present invention relates to selecting data in a cache and outputting the data from the cache to be used by single instruction multiple data (SIMD) instructions.
2. Description of the Related Art
FIG. 1A illustrates a portion of a typical prior art processor system. As shown, the processor system includes a memory 105 coupled to a cache 110 that is coupled to floating point registers (e.g., a register file) 115 through a bus 130A. Floating point registers 115 are coupled to a floating point unit 120 through a bus 130B. The portion of the processor system shown in FIG. 1A can process only one floating point instruction in a time period.
FIG. 1B illustrates a portion of a typical prior art superscalar processor system that can process two floating point instructions in a time period. As shown, the processor system includes a memory 105 coupled to a cache 110 that is coupled to floating point registers (e.g., a register file) 115 through buses 130A and 130C. Floating point registers 115 are coupled to floating point units 120B and 120A through respective buses 130B and 130D. To support processing two floating point operations in the time period, e.g., for single instruction multiple data (SIMD) instructions, floating point registers 115 must receive two floating point numbers from cache 110 in that time period. Accordingly, bus 130C is used to provide the second floating point number to floating point registers 115. This additional bus increases the number of output ports for cache 110, which can increase the physical size of cache 110 as implemented in a substrate, among other technical issues.
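The trade-off between the arrangements of FIG. 1A and FIG. 1B can be illustrated with a minimal sketch. The model below is not taken from the figures: it assumes, for illustration only, that each cache output port (i.e., each bus between cache 110 and floating point registers 115) delivers exactly one floating point number per time period. The function name `periods_to_load` and its parameters are hypothetical.

```python
def periods_to_load(num_operands: int, output_ports: int) -> int:
    """Time periods needed to move operands from the cache to the register file.

    Assumption (illustrative only): each output port delivers one
    floating point number per time period.
    """
    if num_operands < 0:
        raise ValueError("operand count cannot be negative")
    if output_ports < 1:
        raise ValueError("the cache needs at least one output port")
    # Ceiling division: a partially used time period still costs a full one.
    return -(-num_operands // output_ports)


# FIG. 1A style: a single output port (bus 130A only).
single_port = periods_to_load(num_operands=2, output_ports=1)

# FIG. 1B style: two output ports (buses 130A and 130C).
dual_port = periods_to_load(num_operands=2, output_ports=2)

print(single_port, dual_port)  # prints "2 1"
```

Under this assumption, two floating point numbers reach the register file in a single time period only when the cache has a second output port, which is the cost in ports and physical size noted above.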