1. Field of the Invention
Embodiments of the present invention generally relate to performing read operations and write operations. More particularly, embodiments of the present invention relate to performing read operations and write operations utilizing an architecture for a compact multi-ported register file.
2. Related Art
Register files are utilized in a variety of electrical applications. Typically, the register files are used to store the operands for an instruction to be executed. Register files are incorporated in some hardware devices. An example of a hardware device having register files is a graphics processing unit (GPU). The GPU is a semiconductor device that specializes in processing graphical data more rapidly than a typical central processing unit (CPU).
Within the GPU, there is a graphics shader that performs numerous operations on graphical data to obtain desired optical features and that interfaces with a texture unit. The texture unit further modifies the graphical data to have desired texture and optical features. In some implementations, the texture unit is implemented as part of the graphics shader. Generally, the fabricated GPU utilizes most of the semiconductor area available on the semiconductor chip die. In particular, the graphics shader uses a significant portion of the semiconductor area. Moreover, the processing speed of the GPU is measured by the amount of graphical data that is processed in a given time period. Further, the amount of graphical data that can be processed by the graphics shader substantially affects the processing speed of the GPU. Hence, improvements in processing by the graphics shader lead to performance enhancements for the GPU.
Typically, the graphics shader processes data groups of graphical data. The size of these data groups depends on various factors. These data groups can have one or more fragments. A fragment includes a variety of information such as an associated pixel location, a depth value, and a set of interpolated parameters such as a color, a secondary color, and one or more texture coordinate sets. If the fragment passes through the various stages of the graphical pipeline of the GPU, the fragment updates a pixel in the frame buffer. That is, the fragment can be thought of as a “potential pixel”. In processing these data groups of fragments, the graphics shader needs to perform read operations and write operations. Typically, the graphics shader includes one or more register files that perform the read operations and the write operations. Usually, each register file is designed to handle the number of read and write operations requested during the processing of the data group of fragments. As an example, the register file may be designed to handle two read (2R) operations and two write (2W) operations.
FIG. 1A illustrates a first conventional register file 100A that handles two read (2R) operations and two write (2W) operations. As depicted in FIG. 1A, the first conventional register file 100A includes a 4-port RAM (Random Access Memory) 50. Typically, the first conventional register file 100A also has one or more registers. The 4-port RAM 50 has a first read port 10, a second read port 12, a first write port 14, and a second write port 16. These ports 10-16 receive the address of the memory location to be read from or to be written to. Moreover, the 4-port RAM 50 has data inputs for the write operation, data outputs for the read operation, and control inputs for control signals.
Although the first conventional register file 100A can simultaneously handle two read (2R) operations and two write (2W) operations, the use of multiple ports increases the size of the 4-port RAM 50 (and of the register file 100A) and reduces the memory capacity of the 4-port RAM 50 (and of the register file 100A).
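The behavior of a 2R2W memory such as the 4-port RAM 50 can be illustrated with a small behavioral model. The following is only a sketch under stated assumptions: the class name `FourPortRegisterFile`, the `cycle` method, and the rule that reads observe the values stored before the same cycle's writes are all hypothetical choices made for illustration and are not taken from FIG. 1A.

```python
class FourPortRegisterFile:
    """Behavioral model of a 2R2W memory (illustrative names and semantics)."""

    def __init__(self, depth):
        # One storage cell per address, all initialized to zero.
        self.cells = [0] * depth

    def cycle(self, read_addrs, writes):
        """Service up to two reads and two writes in a single cycle.

        read_addrs: sequence of up to two addresses to read.
        writes: sequence of up to two (address, data) pairs to write.
        Assumption: reads return the values stored before this cycle's writes.
        """
        if len(read_addrs) > 2 or len(writes) > 2:
            raise ValueError("a 2R2W memory has two read ports and two write ports")
        # Perform the reads first so they see the pre-write contents.
        results = [self.cells[addr] for addr in read_addrs]
        # Then commit the writes.
        for addr, data in writes:
            self.cells[addr] = data
        return results
```

In this model any two addresses may be read and any two may be written in the same cycle, which is the flexibility that the additional ports provide at the cost of area.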
FIG. 1B illustrates a second conventional register file 100B that handles two read (2R) operations and two write (2W) operations. As depicted in FIG. 1B, the second conventional register file 100B includes a first dual-port RAM 51 and a second dual-port RAM 52. Typically, the second conventional register file 100B also has one or more registers. The first dual-port RAM 51 includes a first read port 40 and a first write port 42. The second dual-port RAM 52 includes a first read port 44 and a first write port 46. These ports 40-46 receive the address of the memory location to be read from or to be written to. Moreover, the dual-port RAMs 51 and 52 have data inputs for the write operation, data outputs for the read operation, and control inputs for control signals. An even/odd bank implementation enables the dual-port RAMs 51 and 52 to handle two read (2R) operations and two write (2W) operations, wherein each dual-port RAM performs one of the read operations and one of the write operations.
Use of dual-port RAMs 51 and 52 decreases the size of the register file 100B and enables an increase in memory capacity compared to the register file 100A. Unfortunately, the even/odd bank implementation requires additional complicated logic circuitry and software to avoid bank conflicts, which occur when two read or two write operations request access to the same bank. Also, increases in the memory capacity of the register file 100B require the addition of two dual-port RAMs even if a single dual-port RAM would be sufficient, leading to unnecessary resources being included in the register file 100B.
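The bank conflict condition described above can be sketched in a few lines. This is a minimal illustration, assuming that even addresses map to one bank and odd addresses map to the other; the function names `bank_of`, `has_bank_conflict`, and `check_cycle` are hypothetical and do not describe the actual logic circuitry of the register file 100B.

```python
def bank_of(address):
    """Assumed mapping: even addresses use bank 0, odd addresses use bank 1."""
    return address % 2


def has_bank_conflict(addresses):
    """Return True if two accesses of the same type target the same bank.

    Each dual-port RAM has a single read port and a single write port, so
    two reads (or two writes) to one bank cannot be served in one cycle.
    """
    banks = [bank_of(addr) for addr in addresses]
    return len(banks) != len(set(banks))


def check_cycle(read_addrs, write_addrs):
    """Check the reads and the writes of one cycle separately for conflicts."""
    return {
        "read_conflict": has_bank_conflict(read_addrs),
        "write_conflict": has_bank_conflict(write_addrs),
    }
```

For example, under the assumed mapping, reads of addresses 2 and 5 fall in different banks and can proceed together, whereas reads of addresses 2 and 4 both fall in the even bank and would have to be serialized or rescheduled.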