Modern three-dimensional (3D) graphics processors are typically simple reduced instruction set computing (RISC) designs equipped with very large register files whose registers are 256, 512 or more bits wide. Such RISC processors are connected in so-called vector or single instruction multiple data (SIMD) rows that share an instruction cache and sometimes also a data cache. Such architectures can efficiently process the simple shader programs used in 3D rendering techniques. However, they handle function calls poorly: implementing a function call stack would require transporting very large register values (at least 256 bits wide) to and from memory, because the local data cache is typically very limited. Moreover, current state-of-the-art SIMD architectures are not equipped with any kind of function call stack. Current SIMD architectures therefore do not perform function calls; instead, they inline functions into the calling code, and consequently support only a limited depth of function calling. With inlining, the code size grows very quickly, which worsens utilization of the instruction cache.
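The two costs described above can be illustrated with a rough back-of-the-envelope sketch. It is not taken from any particular processor. The register count, body size, number of call sites and nesting depth are hypothetical values chosen only for illustration, and the function names are likewise invented here. The sketch also assumes that every nesting level is a single distinct function and that every function body contains the same number of call sites.

```python
def spill_round_trip_bytes(live_registers: int, register_bits: int) -> int:
    """Bytes moved to memory and back to save and restore registers for one call."""
    bytes_per_register = register_bits // 8
    # Factor 2: each register is written on the call and read back on the return.
    return 2 * live_registers * bytes_per_register


def inlined_instruction_count(body_instructions: int, call_sites: int, depth: int) -> int:
    """Instructions emitted when every call is inlined down to the given nesting depth.

    Nesting level n contributes call_sites**n copies of a function body.
    """
    return sum(body_instructions * call_sites ** level for level in range(depth + 1))


def called_instruction_count(body_instructions: int, depth: int) -> int:
    """Instructions emitted when each nesting level is one function stored once."""
    return body_instructions * (depth + 1)


if __name__ == "__main__":
    # All figures below are assumed, illustrative values.
    print(spill_round_trip_bytes(live_registers=16, register_bits=256))
    print(inlined_instruction_count(body_instructions=20, call_sites=2, depth=4))
    print(called_instruction_count(body_instructions=20, depth=4))
```

Under these assumed figures, saving and restoring 16 registers of 256 bits moves 1024 bytes per call for a single processor in the SIMD row. Fully inlining four levels of calls with two call sites per body yields 620 instructions, against 100 if each function were stored once and actually called. The growth is geometric in the nesting depth, which is why inlining limits the practical call depth and puts pressure on the instruction cache.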