Field of the Invention
This invention relates generally to the field of computer processors. More particularly, the invention relates to an apparatus and method for efficient execution of nested branches on a graphics processing unit.
Description of the Related Art
Managing control flow in single instruction, multiple data (SIMD) programs is a complex problem. Traditionally, graphics processing units (GPUs) use scalar code and program routines to control the instruction pointer (IP) address of each SIMD channel. This approach is inefficient in terms of both performance and power usage.
Control flow is managed on some architectures by maintaining a unique IP address for each channel. For example, when a control flow instruction is encountered, the IP of each channel is updated with a particular IP based on the predicate mask of the instruction. For every instruction, the execution IP is compared to each channel's IP to determine whether that channel is enabled for the instruction at the current IP.
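The per-channel IP scheme described above may be illustrated with a minimal Python sketch. The function names, the four-channel width, the fall-through rule (execution IP plus one), and the choice of the smallest channel IP as the next execution IP are assumptions made for illustration only; they are not taken from any particular architecture.

```python
def enabled_channels(exec_ip, channel_ips):
    """Return the enable mask: a channel is enabled only when its own IP
    matches the execution IP (the per-instruction comparison)."""
    return [ip == exec_ip for ip in channel_ips]


def apply_branch(exec_ip, channel_ips, predicate, target_ip):
    """Update per-channel IPs for a branch instruction located at exec_ip.

    Enabled channels whose predicate bit is set jump to target_ip.
    Enabled channels whose predicate bit is clear fall through to
    exec_ip + 1 (assumed fall-through rule).
    Disabled channels keep their stored IP unchanged.
    """
    new_ips = []
    for ip, taken in zip(channel_ips, predicate):
        if ip != exec_ip:
            new_ips.append(ip)            # channel was not enabled here
        elif taken:
            new_ips.append(target_ip)     # predicate set: take the branch
        else:
            new_ips.append(exec_ip + 1)   # predicate clear: fall through
    return new_ips


def advance(exec_ip, channel_ips):
    """Step enabled channels past a non-branch instruction at exec_ip."""
    return [ip + 1 if ip == exec_ip else ip for ip in channel_ips]


def next_exec_ip(channel_ips):
    """Pick the next execution IP. Using the smallest channel IP is an
    assumed policy for this sketch, not a documented hardware rule."""
    return min(channel_ips)


if __name__ == "__main__":
    ips = [0, 0, 0, 0]                    # four channels, all at IP 0
    ips = apply_branch(0, ips, [True, False, True, False], target_ip=5)
    exec_ip = next_exec_ip(ips)
    print(ips, exec_ip, enabled_channels(exec_ip, ips))
```

In this sketch the comparison in `enabled_channels` is repeated for every instruction and every channel, which is the recurring cost associated with maintaining a separate IP per channel.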