1. Field of the Invention
The present invention generally relates to program execution and, more specifically, to a unified stream multiprocessor memory.
2. Description of the Related Art
Conventional graphics processing units (GPUs) use a large number of hardware execution threads to hide both function unit pipeline latency and memory access latency. Local memories that hold operands and provide operand bandwidth are a major consumer of area and power in modern processors of all kinds. Typically, separate memories are used to hold registers, cached data, explicitly local data, constants, and the like. Providing separate memories keeps these functions distinct, but it increases overhead and decreases utilization of both capacity and bandwidth, because unused capacity or bandwidth in one memory cannot be used for the classes of data held in another.
Accordingly, what is needed in the art is an improved system and method for providing storage for the execution threads.