One or more aspects of the invention generally relate to data processing, and more particularly to register allocation in multiple-bank, single-port memories.
The demand for increased realism in computer graphics for games and other applications has been steady for some time and shows no sign of abating. This demand has placed stringent performance requirements on computer system components, particularly graphics processors. For example, to generate improved images, an ever-increasing amount of data must be processed by a graphics processing unit. Conventional processing techniques are not up to this task and need to be replaced by improved processing techniques.
One such improved technique employs multiple single-instruction, multiple-data processors, which together can execute hundreds of threads simultaneously.
Current data processing includes systems and methods developed to execute program instructions, including instructions with no operands and instructions with one or more operands. The operands are stored in register files within the processor for access during the execution of a program. Some program instructions, such as multiply and multiply-accumulate, specify two or more operands. Conventionally, a register file is implemented using a multiported memory so that two or more locations, each location storing an operand, may be read in a single clock cycle.
Compared with a multiported memory, a single-ported memory consumes less die area and power. However, unlike a multiported memory, a single-ported memory allows only a single location to be read in each clock cycle. Therefore, two or more clock cycles are needed to acquire the operands of a program instruction that specifies two or more operands, reducing performance compared with a multiported memory.
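The cycle-count tradeoff described above can be illustrated with a minimal sketch. It is an illustrative model only, not a description of any particular hardware: the function names, the assumption that a multiported memory reads up to `num_read_ports` locations per cycle, and the banked model (register `r` placed in bank `r % num_banks`, each bank single-ported, all banks readable in parallel) are hypothetical choices made for this example.

```python
from collections import Counter


def read_cycles_single_port(num_operands: int) -> int:
    """Single-ported memory: one location per clock cycle."""
    return num_operands


def read_cycles_multiported(num_operands: int, num_read_ports: int) -> int:
    """Multiported memory: up to num_read_ports locations per clock cycle."""
    # Ceiling division without importing math.
    return -(-num_operands // num_read_ports)


def read_cycles_banked(operand_registers, num_banks: int) -> int:
    """Several single-ported banks read in parallel (hypothetical model).

    Assumes register r is stored in bank r % num_banks. Operands that
    fall in the same bank must be read in successive cycles, so the cost
    is the largest number of operands mapped to any one bank.
    """
    if not operand_registers:
        return 0
    per_bank = Counter(r % num_banks for r in operand_registers)
    return max(per_bank.values())


if __name__ == "__main__":
    # A multiply-accumulate style instruction with three source operands.
    print(read_cycles_single_port(3))         # one cycle per operand
    print(read_cycles_multiported(3, 3))      # all operands in one cycle
    print(read_cycles_banked([0, 1, 2], 4))   # operands in distinct banks
    print(read_cycles_banked([0, 4, 8], 4))   # operands all in bank 0
```

Under these assumptions, the banked arrangement matches the multiported cycle count only when the operands of an instruction fall in different banks, which is why the placement of registers across banks matters.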
Accordingly, it would be desirable to provide memory structures and register allocation methods that provide the die area and power savings of a single-ported memory while retaining the performance advantages of a multiported memory.