1. Field of the Invention
This invention relates to computer systems and, more particularly, to providing load and store units for vector processors within computer systems.
2. Description of the Related Art
A vector processor, or single instruction multiple data (SIMD) processor, performs parallel calculations on multiple data elements that are grouped together to form data structures called "vectors." Such vector processors are well suited to multimedia processing, which often requires a large number of identical calculations. For example, for video coding, a pixel map representing a video image often contains thousands of pixel values, each of which must be processed in the same manner. A single vector can contain multiple pixel values that the vector processor processes in parallel in a single clock cycle. Accordingly, the parallel processing power of a vector processor can reduce processing time for such repetitive tasks by a factor equal to the number of data elements in a vector.
All data elements in a vector typically have the same data type, but vector processors may accommodate data elements of different sizes in different vectors. For example, all of the data elements in one vector may be 16-bit integers while the data elements in another vector are 32-bit floating point values. Conventionally, data elements and processing circuits in vector processors accommodate data widths that are multiples of 8 bits because conventional memories use addresses that each correspond to 8 bits of storage. However, some processing may be more efficiently done using a data width that is not a multiple of eight. Video processing according to the MPEG standards, for example, includes a 9-bit data type. A vector processor that accommodates 16-bit data elements can process 9-bit values as 16-bit values, but that leaves 7 of every 16 bits unused and so wastes much of the data width and processing power of the vector processor.
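The cost of widening 9-bit values to 16-bit elements can be illustrated with a short calculation. The following is only a sketch: the function names and the 288-bit vector width are hypothetical, chosen because 288 is divisible by both 9 and 16, and are not taken from any particular processor.

```python
def wasted_fraction(value_bits: int, container_bits: int) -> float:
    """Fraction of each container left unused by a narrower value."""
    if value_bits > container_bits:
        raise ValueError("value does not fit in container")
    return (container_bits - value_bits) / container_bits


def elements_per_vector(vector_bits: int, element_bits: int) -> int:
    """Number of whole elements that fit in one vector."""
    return vector_bits // element_bits


# Hypothetical vector width, assumed only for this illustration.
VECTOR_BITS = 288

if __name__ == "__main__":
    print(wasted_fraction(9, 16))                 # share of a 16-bit element left unused
    print(elements_per_vector(VECTOR_BITS, 16))   # 9-bit values widened to 16 bits
    print(elements_per_vector(VECTOR_BITS, 9))    # native 9-bit elements
```

Under these assumptions, a vector of native 9-bit elements holds noticeably more values than the same vector holding 9-bit values widened to 16 bits, which is the inefficiency described above.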
A vector processor can be adapted to process odd-size data elements, such as a 9-bit data type, if the internal data path and execution units of the processor have the proper data widths. However, if conventional memory is to be used, functional units that access memory, such as a load/store unit in the processor, may need to convert vectors having odd-size data elements that do not have a simple match to 8-bit storage locations. A load/store unit that efficiently handles vectors with odd data element sizes is sought.
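To make the mismatch between 9-bit elements and 8-bit storage locations concrete, the following software sketch packs 9-bit values contiguously into bytes and unpacks them again. It is an illustration only: the function names and the little-endian bit ordering are assumptions made for this example, and the sketch does not describe how any particular load/store unit performs the conversion.

```python
def pack_elements(values, element_bits=9):
    """Pack unsigned values of element_bits each into contiguous bytes.

    Bit ordering (least significant bits first) is an assumption of this sketch.
    """
    mask = (1 << element_bits) - 1
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for v in values:
        if v < 0 or v > mask:
            raise ValueError(f"{v} does not fit in {element_bits} bits")
        acc |= v << nbits
        nbits += element_bits
        while nbits >= 8:          # emit every complete byte
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                      # emit a final, partially filled byte
        out.append(acc & 0xFF)
    return bytes(out)


def unpack_elements(data, count, element_bits=9):
    """Recover count values of element_bits each from packed bytes."""
    mask = (1 << element_bits) - 1
    acc = 0
    nbits = 0
    pos = 0
    values = []
    for _ in range(count):
        while nbits < element_bits:    # refill the accumulator byte by byte
            if pos >= len(data):
                raise ValueError("not enough packed data")
            acc |= data[pos] << nbits
            pos += 1
            nbits += 8
        values.append(acc & mask)
        acc >>= element_bits
        nbits -= element_bits
    return values


if __name__ == "__main__":
    pixels = [0x1FF, 0x000, 0x155, 0x001, 0x0AA, 0x100, 0x07F, 0x1FE]
    packed = pack_elements(pixels)
    assert unpack_elements(packed, len(pixels)) == pixels
    print(len(packed))  # eight 9-bit values occupy 72 bits, i.e. 9 bytes
```

Note that element boundaries align with byte boundaries only once every eight elements (72 bits), so most elements straddle two storage locations. This is the conversion burden that falls on functional units that access conventional memory.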