Modern microprocessors may support multi-threaded operation in their architectures. In some cases the multi-threaded operation may be sequential multi-threading, and in other cases it may be parallel multi-threading. In either case there are situations where a new thread may need to be spawned, or where an existing thread may need to be merged back into the thread that originally spawned it. The process of spawning a new thread may be called a fork operation, and the process of merging a thread back may be called a join operation. Fork and join operations may be coded in an operating system, or alternatively may be placed in executable code by the use of hardcoded fork and join instructions. The rationale for using fork and join operations is to increase performance by the use of the forked-off threads. In some cases the forked thread may be part of non-speculative execution, but in other cases the forked thread may be speculative.
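As a software-level illustration of the fork and join operations described above, the following minimal sketch spawns one thread and later merges it back into the thread that spawned it. It uses operating-system threads from Python's standard threading module rather than hardcoded fork and join instructions, and the function names sum_range and fork_join_sum are hypothetical. It is meant to show the control flow only, not a performance gain.

```python
import threading

def sum_range(values, out, slot):
    # Worker body: store the partial sum where the spawning thread can read it.
    out[slot] = sum(values)

def fork_join_sum(values):
    mid = len(values) // 2
    results = [0, 0]
    # Fork: spawn a new thread to handle the second half of the work.
    forked = threading.Thread(target=sum_range, args=(values[mid:], results, 1))
    forked.start()
    # The spawning (main) thread continues with the first half.
    sum_range(values[:mid], results, 0)
    # Join: merge the forked-off thread back into the thread that spawned it.
    forked.join()
    return results[0] + results[1]
```

Calling fork_join_sum(list(range(10))) returns the same total as a single-threaded sum; the forked-off thread contributes its half only after the join completes.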
The use of hardcoded fork and join instructions may impact performance in several ways. If the instruction execution in the forked-off thread is correct and if the processor resources are not adversely impacted by the forked-off thread, then the performance may be improved. However, if the instruction execution in the forked-off thread is incorrect, or if the processor resources are adversely impacted by the forked-off thread, then the performance may be reduced. The execution of a forked-off thread may be considered “desirable” in several different ways. It may be considered desirable if the forked-off thread executed successfully, or if the overall processor execution throughput was enhanced. Desirability could also be a combination of these two, or it could take into account other measures.
Software execution could be used to determine whether it would be advantageous to take the fork or not. However, this determination would need to be accomplished prior to the fork, essentially occupying the resources available for both the main thread and the forked-off thread. Such a software determination may itself use sufficient resources to impact processor performance.
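The trade-off described above can be illustrated with a small arithmetic sketch. Everything in it is an assumption made for illustration: the cost model, the parameter names, and the idealization that the software check always decides correctly. It compares the expected cycles saved by always taking the fork against those saved by first running a software check that itself costs cycles.

```python
def expected_gain_always_fork(p_correct, benefit, penalty):
    # Hypothetical model: a correct fork saves `benefit` cycles,
    # an incorrect fork wastes `penalty` cycles.
    return p_correct * benefit - (1 - p_correct) * penalty

def expected_gain_software_gated(p_correct, benefit, decision_cost):
    # Idealized gate (assumption): the software check is always right, so
    # only correct forks are taken, but every fork point pays `decision_cost`.
    return p_correct * benefit - decision_cost

def gate_is_worthwhile(p_correct, benefit, penalty, decision_cost):
    # The gate helps only if its cost is below the penalty it avoids.
    return (expected_gain_software_gated(p_correct, benefit, decision_cost)
            > expected_gain_always_fork(p_correct, benefit, penalty))
```

Under this model the software check pays off only when decision_cost is smaller than (1 - p_correct) * penalty. With p_correct = 0.8 and penalty = 50, for example, a check costing 15 cycles makes matters worse while one costing 5 cycles helps, which is the sense in which the determination may impact performance by itself.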