The present invention relates to executions in a processor and more specifically to a method and system to detect a stall in a thread of instructions in a processor.
Modern computer systems typically contain several integrated circuits (ICs), including a processor which may be used to process information in the computer system. The information processed by a processor may include computer instructions that are executed by the processor as well as data, which is manipulated by the processor using the computer instructions. The computer instructions and data are typically stored in a main memory in the computer system.
Processors typically run programs or processes by breaking them down into instructions and executing the instructions in a series of small steps. The instructions of these processes may form one or more threads. A thread is a sequence or collection of program instructions that together perform a specific task. A thread may also be referred to as a stream of instructions. The threads may be instruction streams from different parts of the same program executing on the processor, or may be from different programs executing on the processor, or combinations thereof.
In some cases, to increase the number of instructions being processed by the processor (and therefore increase the speed of the processor), the processor may be pipelined. Pipelining refers to providing separate stages in a processor, where each stage performs one or more of the small steps necessary to execute an instruction, so that the execution of several instructions overlaps. In some cases, the pipeline (in addition to other circuitry) may be placed in a portion of the processor referred to as the processor core. Some processors may have multiple processor cores, and in some cases, each processor core may have multiple pipelines. Where a processor core has multiple pipelines, groups of instructions may issue to the multiple pipelines in parallel and be executed by each of the pipelines in parallel.
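By way of illustration only, the benefit of overlapping instructions can be sketched with a simple cycle count. The sketch assumes an idealized pipeline in which every stage takes one cycle and no instruction ever stalls; the function names `unpipelined_cycles` and `pipelined_cycles` are hypothetical and do not describe any particular processor.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed when each instruction passes through every stage
    before the next instruction starts (no overlap)."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Cycles needed in an idealized pipeline (assumption: one cycle per
    stage, no stalls). The first instruction takes num_stages cycles to
    drain; each later instruction completes one cycle after the previous."""
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


if __name__ == "__main__":
    # Four instructions through a five-stage pipeline.
    print(unpipelined_cycles(4, 5))  # 20
    print(pipelined_cycles(4, 5))    # 8
```

Under these assumptions, any cycle in which an instruction cannot advance adds directly to the pipelined total, which is why stalls erode the gain from pipelining.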
Processor designs commonly have more than one hardware thread. The hardware threads, while being architecturally independent, often share resources in the processor. For example, registers, execution units, buses and pipelines may be commonly shared. Sharing resources fairly is a difficult challenge. Logic must be developed to arbitrate priority between the threads for access to the shared resources. A difficult problem is determining when one thread's activity is causing another thread to be starved from accessing shared resources. It is relatively easy to detect if a thread is stalled forever, as the thread will not make any forward progress and will hang. Cases where a thread is stalled for a significant number of cycles, but then eventually gains access to the resource, are more difficult to detect. For example, in processors, the Arithmetic and Logic Unit (ALU) may take a number of cycles, which may include gaining access to data in memory, to perform its operations. In such a case it may be difficult to determine whether the thread is stalled in the pipeline or is simply performing a lengthy operation. Detecting and fixing stall conditions will improve the performance of the threads, and the performance of the processor.
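By way of illustration only, one simple way to model the detection problem described above is a per-thread counter of consecutive cycles without forward progress, compared against a threshold. The following is a minimal software sketch under stated assumptions and is not presented as the method of the invention: the class `StallDetector`, its `threshold` parameter, and the per-cycle `completed` flags are all hypothetical names introduced for this example.

```python
class StallDetector:
    """Hypothetical model: flags a thread once it has gone `threshold`
    consecutive cycles without completing an instruction."""

    def __init__(self, num_threads: int, threshold: int) -> None:
        self.threshold = threshold
        # Consecutive cycles without forward progress, one entry per thread.
        self.idle_cycles = [0] * num_threads

    def tick(self, completed: list[bool]) -> list[int]:
        """Advance one cycle. completed[t] is True if thread t completed
        an instruction this cycle. Returns the ids of threads whose idle
        count has reached the threshold."""
        stalled = []
        for tid, made_progress in enumerate(completed):
            if made_progress:
                # Forward progress resets the counter for this thread.
                self.idle_cycles[tid] = 0
            else:
                self.idle_cycles[tid] += 1
                if self.idle_cycles[tid] >= self.threshold:
                    stalled.append(tid)
        return stalled


if __name__ == "__main__":
    detector = StallDetector(num_threads=2, threshold=3)
    # Thread 0 progresses every cycle; thread 1 is starved for three cycles.
    for _ in range(3):
        flagged = detector.tick([True, False])
    print(flagged)                        # [1]
    # Thread 1 eventually gains access, so it is no longer flagged.
    print(detector.tick([True, True]))    # []
```

The sketch also shows why the problem is difficult: a threshold that is too low would flag a thread that is merely waiting on a legitimate multi-cycle operation, such as an ALU operation that includes a memory access, while a threshold that is too high would miss a thread that is starved for a significant number of cycles before eventually gaining access.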