This invention relates in general to computer systems, and in particular to an arrangement for a cache memory system rotator.
Computer systems may employ a multi-level hierarchy of memory, with relatively fast, expensive but limited-capacity memory at the highest level of the hierarchy and proceeding to relatively slower, lower cost but higher-capacity memory at the lowest level of the hierarchy. The hierarchy may include a small fast memory called a cache, either physically integrated within a processor or mounted physically close to the processor for speed. The computer system may employ separate instruction caches and data caches. In addition, the computer system may use multiple levels of caches. The use of a cache is generally transparent to a computer program at the instruction level and can thus be added to a computer architecture without changing the instruction set or requiring modification to existing programs.
Computer processors typically include cache for storing data. When executing an instruction that requires access to memory (e.g., read from or write to memory), a processor typically accesses cache in an attempt to satisfy the instruction. It is desirable to implement the cache so that the processor can access it (i.e., read from or write to it) quickly, and can therefore execute instructions quickly. Caches have been configured in both on-chip and off-chip arrangements. On-processor-chip caches have less latency, since they are closer to the processor, but since on-chip area is expensive, such caches are typically smaller than off-chip caches. Off-processor-chip caches have longer latencies since they are remotely located from the processor, but such caches are typically larger than on-chip caches.
A rotator is used in a processor for rotating and transferring a data word, which contains one or more bytes of information, from an origination register to a destination register. In conventional multi-pipeline processors, these registers correspond to pipeline stage latches. Data is "piped" from one pipestage (origination register) to another pipestage (destination register) in preparation for writing it later into the cache.
Based on a control signal (or signals), the register bytes are rotated from 0 to n places in a predefined direction (e.g. to the left) as they are stored in the destination register. In some cases, the rotator may also have the capability to store the rotated word in either a little or big endian format.
FIG. 1 shows first and second 8-byte (64-bit) registers 40 and 60, respectively, for representing an 8-byte register and illustrating the difference between little and big endian formats. Each register includes bytes A through H, where A corresponds to the lowest byte address of the register, and H corresponds to the highest byte address therein. In general, little or big endian refers to which bytes are most significant in multi-byte data types. In big-endian architectures, the leftmost bytes (those with a lower address) are most significant. Conversely, in little-endian architectures, the higher address bytes are most significant.
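The difference between the two formats can be illustrated with a short Python sketch that stores one 64-bit value in both byte orders. The example value and the variable names are illustrative assumptions and are not taken from FIG. 1; list index 0 plays the role of byte A (lowest address) and index 7 the role of byte H.

```python
# Illustrative sketch: one 64-bit value laid out in both byte orders.
# Index 0 stands for byte A (lowest address), index 7 for byte H.
value = 0x0102030405060708  # hypothetical example value

# Little endian: least significant byte at the lowest address.
little = list(value.to_bytes(8, "little"))

# Big endian: most significant byte at the lowest address.
big = list(value.to_bytes(8, "big"))

print(little)  # [8, 7, 6, 5, 4, 3, 2, 1]
print(big)     # [1, 2, 3, 4, 5, 6, 7, 8]
```

In the little-endian list the most significant byte (0x01) sits at the highest address, while in the big-endian list it sits at the lowest address.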
FIG. 2 shows a block representation of a single-port rotator 70 for rotating and transferring bytes A through H from 8-byte origination register 62 to a corresponding 8-byte destination register 64. Rotator 70 receives an endian mode signal (le/be) at 72 and a rotate control signal (sel[0:7]) at 74. The endian mode signal indicates the appropriate endian (little, big) format for the rotated word. In the depicted scheme it is assumed that the origination register is in a little endian format, i.e., byte A is the least significant byte and byte H is the most significant byte. Thus, if the endian signal is active, the rotated word is also translated into a big endian format before being stored in the destination register. The rotation control signal (which includes sel 0 to sel 7) forms a one-hot byte (or vector) indicating how many places the bytes are to be rotated in a wrap-around rotational scheme. In other words, only one of the 8 bits of the rotation control signal is active. This active bit corresponds to the desired destination location of the least significant byte (which hereafter will be referred to as byte A or byte 0) after the rotation occurs. For example, a rotation control byte value of 00000001 indicates that origination byte A needs to end up in destination byte A (i.e., no rotation). Another example is a rotation control value of 10000000 indicating that origination byte A needs to end up in destination byte H, which corresponds to a seven place leftward rotation. In this example, origination byte H would end up in destination byte G, origination byte G would end up in destination byte F and so on with origination byte B ending up in destination byte A. Notice how this is equivalent to rotating one place to the right. The direction of rotation is an arbitrary conceptual convention.
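The wrap-around rotation selected by the one-hot control value can be modeled behaviorally as follows. This is a minimal sketch, not the circuit of FIG. 2: the function name `rotate` is hypothetical, the control value is passed as a bit string written as in the text (rightmost bit corresponds to destination byte A), and the endian translation is left out because the text does not spell out its byte mapping.

```python
def rotate(origin, sel):
    """Behavioral model of a one-hot controlled wrap-around byte rotation.

    origin: list of bytes, index 0 = byte A ... index 7 = byte H.
    sel:    one-hot bit string as written in the text, e.g. "10000000";
            the position of the single '1', counted from the right, is the
            destination index of origination byte A.
    """
    n = len(origin)
    if len(sel) != n or sel.count("1") != 1 or set(sel) - {"0", "1"}:
        raise ValueError("sel must be a one-hot bit string as wide as origin")
    shift = n - 1 - sel.index("1")  # destination index of byte A
    dest = [None] * n
    for i, byte in enumerate(origin):
        dest[(i + shift) % n] = byte  # wrap around at the register boundary
    return dest


origin = list("ABCDEFGH")
print("".join(rotate(origin, "00000001")))  # ABCDEFGH (no rotation)
print("".join(rotate(origin, "10000000")))  # BCDEFGHA (A lands in position H)
```

With the control value 10000000, byte A lands in the last position and byte B lands in the first, which is the seven-place leftward (one-place rightward) rotation described above.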
FIG. 3 shows a conventional single-port, 8-byte rotator 115 for transferring and rotating data from an origination register 103 to a destination register 133. The origination register 103 has eight origination byte latches ("O-latches") 103A-103H, and the destination register has eight destination byte latches ("D-latches") 133A-133H. Rotator 115 includes eight rotator multiplexers 120A-120H and control logic 125. It should be recognized that a byte latch comprises eight individual bit latches for latching a byte of data. However, the latches are represented and addressed in terms of bytes since each of the bits within a given byte (e.g., byte 0) is treated the same with the bytes being manipulated to effectuate the rotation. The same concept applies to the rotator multiplexers 120A-H. That is, they are treated as 8:1 multiplexers for muxing whole bytes of information. However, each multiplexer actually includes 64 inputs with 8 outputs. Each rotator multiplexer 120 has 8 one-hot select inputs for selection of the desired multiplexer path.
All of the 8 outputs from O-latches 103A-H are connected to the inputs of each of the rotator multiplexers 120A-H. In this way, each of the multiplexers can pass through any one of bytes 0 through 7. Each rotator multiplexer output is connected to a corresponding D-latch input. That is, the output of rotator mux. 120A is connected to the input of D-latch 133A, the output of rotator mux. 120B is connected to the input of D-latch 133B and so on with the output of rotator mux. 120H being connected to the input of D-latch 133H. An endian mode select line (le/be) and the rotation control signal (sel[0:7]) are connected as inputs to the control logic 125. Finally, the control logic 125 is connected to each of the rotator multiplexers 120A-H to provide rotation select signals to their select line inputs for determining which byte gets passed through which rotator multiplexer. That is, the control logic, in response to the endian mode and rotation control inputs, controls each rotator multiplexer so that the appropriate byte will be selected to effectuate the desired rotation in the appropriate endian mode. The control logic 125 provides eight output signals that make up a one-hot rotator select signal that serves as a mux. select for the rotator multiplexers.
Each of these rotator multiplexers gets exactly the same select lines from the rotator select signal. In contrast, the data path inputs to each of these multiplexers are configured differently. They each receive all eight origination bytes, byte 0 to byte 7, but the bytes are arranged in a slightly different order at each multiplexer to effectuate the various rotation combinations. In operation within the control logic 125, the endian select signal combines with the one-hot select lines to generate the one-hot, 8 mux select outputs. The activated select line corresponds to the amount of rotation. For example, if the first mux select line is activated with little endian format, then no rotation occurs and the bytes are simply passed through to the destination register. If the second line is activated, then with the scheme in the depicted embodiment, a left 7 (or right 1) shift would occur and so on.
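The arrangement of shared select lines and differently ordered data inputs can be sketched in Python. The wiring table below is an assumption chosen so that the first select line passes the bytes straight through and the second line gives the left 7 (right 1) shift mentioned above; the actual input order in FIG. 3 is not given in the text, and the names `build_mux_wiring` and `conventional_rotate` are hypothetical.

```python
def build_mux_wiring(n=8):
    """Return wiring[k][j]: the origination byte index on data input j of mux k.

    Assumed ordering: mux k sees the origination bytes starting at its own
    position and wrapping around, so every mux has the same bytes in a
    different order.
    """
    return [[(k + j) % n for j in range(n)] for k in range(n)]


def conventional_rotate(origin, mux_select):
    """Apply one shared one-hot select vector to every rotator multiplexer."""
    n = len(origin)
    if len(mux_select) != n or sum(mux_select) != 1:
        raise ValueError("mux_select must be one-hot and as wide as origin")
    wiring = build_mux_wiring(n)
    line = mux_select.index(1)  # the same active line for all muxes
    # Mux k drives destination latch k with whichever byte sits on that line.
    return [origin[wiring[k][line]] for k in range(n)]


origin = list("ABCDEFGH")
print("".join(conventional_rotate(origin, [1, 0, 0, 0, 0, 0, 0, 0])))  # ABCDEFGH
print("".join(conventional_rotate(origin, [0, 1, 0, 0, 0, 0, 0, 0])))  # BCDEFGHA
```

Because the rotation is encoded in the fixed input order of each multiplexer, a single select vector is enough to steer all eight destination bytes.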
These and other objects, features and technical advantages are achieved by a system and method which uses rotator MUX control logic that allows for unequal transactions to be processed. For example, a register-to-memory transfer may require an output that holds more or less data than the input.
The inventive control logic determines the starting point for the data transfer by determining which input register byte is going to Byte 0 of the output register. For big endian, the control logic adds the desired location and the size of the data transfer minus one. For little endian, the control logic adds one to the negative of the desired location. This is the 2's complement of the desired location. The control logic passes the starting point to a single decoder. The decoder converts the starting point into a decoded value, or bit stream, of 0s and a single one. The place of the one indicates the starting point of the output bytes. The decoded value is then sent to a plurality of MUXs, one for each of the output register bytes. Each of the MUXs is prewired to receive a portion of the bits of the decoded value, and the portion is arranged in a particular order. The size of the portion is based on the number of input register bytes. The order effects a shifting of the decoded value. Thus, large numbers of decoders and shifters are not required. The MUXs then send their respective outputs to the rotator MUX as selection control signals.
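The starting-point arithmetic and the single decode can be sketched as follows. This is a hedged illustration under several assumptions that the text does not state: the arithmetic is taken to wrap at the number of input register bytes (assumed to be a power of two), "the negative of the desired location" is modeled as a bitwise inversion so that adding one yields the 2's complement, and the rotated tap order in `prewired_view` is only one hypothetical wiring. All function and parameter names are illustrative.

```python
def starting_point(location, size, big_endian, n_bytes=8):
    """Compute which input byte goes to Byte 0 of the output register.

    Assumes n_bytes is a power of two so the result wraps with a bit mask.
    """
    mask = n_bytes - 1
    if big_endian:
        # Desired location plus transfer size minus one.
        return (location + size - 1) & mask
    # Invert the desired location and add one (2's complement).
    return (~location + 1) & mask


def decode_one_hot(value, width=8):
    """Single decoder: a vector of 0s with a single 1 at position `value`."""
    return [1 if i == value else 0 for i in range(width)]


def prewired_view(decoded, k):
    """Hypothetical wiring for output byte k: the same decoded bits, tapped
    in a rotated order, so no extra decoder or shifter is needed."""
    n = len(decoded)
    return [decoded[(j - k) % n] for j in range(n)]


start = starting_point(location=3, size=1, big_endian=False)
decoded = decode_one_hot(start)
print(start)                      # 5
print(decoded)                    # [0, 0, 0, 0, 0, 1, 0, 0]
print(prewired_view(decoded, 2))  # [0, 0, 0, 0, 0, 0, 0, 1]
```

In this sketch, the rotated view for output byte 2 equals what a second decoder would produce for the value 7 (that is, 5 + 2), which illustrates how a fixed wiring order can stand in for a per-byte decode and shift.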
It is a technical advantage of the invention to provide control logic for a rotator MUX that permits rotation as well as mapping of different sized input and output registers.
It is another technical advantage to perform the mapping and rotation without large numbers of decoders and shifters.
It is a further technical advantage of the invention to determine the starting point for the transaction by using the 2's complement of the desired location for little endian rotations, and using the sum of the desired location and the size of the transaction minus one for big endian rotations.
It is a still further technical advantage of the invention to hardwire the inputs to the MUXs to effect shifting of values.