1. Field of the Invention
The present invention generally relates to parallel processing and, more specifically, to a parallel architecture that supports dynamic bank address mapping for multi-bank memory accesses.
2. Description of the Related Art
In a single-instruction multiple-thread (SIMT) processing environment, threads are organized in groups of P parallel threads, called warps, that execute the same program. Although the P threads of a thread group execute each instruction of the program in parallel, each thread of a thread group independently executes the instruction using its own data and registers. Each thread in the thread group is configured to access a multi-bank memory using a fixed mapping of per-thread addresses to the banks of the multi-bank memory. If multiple threads need to access more locations in the same bank of memory than can be accessed in a single clock cycle, a bank conflict exists.
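By way of illustration only, a fixed mapping of per-thread addresses to banks and the resulting bank conflicts may be sketched as follows. The sketch assumes 32 banks, 4-byte words, word-interleaved banks, and one location accessible per bank per clock cycle; these values, and the names NUM_BANKS, WORD_BYTES, bank_of, and cycles_needed, are hypothetical and are not taken from any particular architecture.

```python
from collections import defaultdict

NUM_BANKS = 32   # assumed number of banks (illustrative)
WORD_BYTES = 4   # assumed bank word width in bytes (illustrative)

def bank_of(byte_address):
    """Fixed mapping: consecutive words fall in consecutive banks."""
    return (byte_address // WORD_BYTES) % NUM_BANKS

def cycles_needed(byte_addresses):
    """Clock cycles needed to service one access per thread, assuming each
    bank can deliver one distinct word per cycle."""
    words_per_bank = defaultdict(set)
    for addr in byte_addresses:
        words_per_bank[bank_of(addr)].add(addr // WORD_BYTES)
    # The busiest bank determines how many cycles the whole access takes.
    return max((len(words) for words in words_per_bank.values()), default=0)

# 32 threads reading consecutive words: every thread hits a different bank.
unit_stride = [t * WORD_BYTES for t in range(32)]
# 32 threads reading words 32 apart: every thread hits bank 0.
stride_32 = [t * 32 * WORD_BYTES for t in range(32)]

print(cycles_needed(unit_stride))  # 1
print(cycles_needed(stride_32))    # 32
```

Under these assumptions, the unit-stride access completes in a single clock cycle, while the stride-32 access maps every address to the same bank and is serialized over 32 clock cycles.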
Application programs are typically authored to reduce bank conflicts when the parallel threads of a thread group read and write the multi-bank memory so that data for all of the parallel threads in the thread group may be read or written in a single clock cycle. For example, a program may be authored so that either a row or a column of an array of data may be accessed by a thread group without any bank conflicts. When bank conflicts do occur, the accesses for addresses mapped to the same bank must be completed in separate clock cycles, thereby reducing performance.
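One common way in which a program may be authored so that both rows and columns of an array are accessed without bank conflicts is to pad each row of the array by one word. The following sketch is illustrative only; it assumes 32 banks with word-interleaved addressing and a 32x32 array of words, and the names access_cycles, row_words, and column_words are hypothetical. Padding is shown as an example of such authoring and is not asserted to be the only approach.

```python
NUM_BANKS = 32  # assumed number of banks (illustrative)

def access_cycles(word_indices):
    """Cycles needed when each bank serves one distinct word per cycle."""
    per_bank = {}
    for w in word_indices:
        per_bank.setdefault(w % NUM_BANKS, set()).add(w)
    return max(len(words) for words in per_bank.values())

def row_words(row, pitch, width=32):
    """Word indices touched when thread `col` reads element (row, col)."""
    return [row * pitch + col for col in range(width)]

def column_words(col, pitch, height=32):
    """Word indices touched when thread `row` reads element (row, col)."""
    return [row * pitch + col for row in range(height)]

# Row pitch equal to the bank count: rows are fine, columns collide.
print(access_cycles(row_words(3, pitch=32)))     # 1
print(access_cycles(column_words(3, pitch=32)))  # 32

# Row pitch padded by one word: rows and columns are both conflict-free.
print(access_cycles(row_words(3, pitch=33)))     # 1
print(access_cycles(column_words(3, pitch=33)))  # 1
```

With a row pitch of 32 words, every element of a column falls in the same bank, so a column access is serialized over 32 clock cycles. With a row pitch of 33 words, element (row, col) falls in bank (row + col) mod 32, so the elements of any single row or any single column occupy 32 different banks.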
Accordingly, what is needed in the art is a method for avoiding bank conflicts when parallel threads of a thread group access a multi-bank memory.