A computer basically consists of a central processing unit (CPU) and a primary storage, or memory. The function of the CPU is to execute programs stored in the memory. Each program includes sequences of instructions, and each instruction has an associated latency, that is, the time (typically counted in cycles) needed to complete it. The CPU executes a program by fetching an instruction stored in the memory, executing the fetched instruction within the CPU, and then proceeding to fetch the next instruction from the memory.
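The fetch-and-execute loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not a model of any real processor: the instruction format (an opcode paired with one operand), the opcodes LOAD, ADD, and HALT, and the single accumulator register are all assumptions made for the example.

```python
# Toy fetch-execute loop. The instruction format and opcodes are hypothetical.
def run(memory):
    """Execute instructions from `memory` one at a time; return the accumulator."""
    pc = 0   # program counter: address of the next instruction to fetch
    acc = 0  # a single accumulator register (assumed for this sketch)
    while pc < len(memory):
        opcode, operand = memory[pc]  # fetch the instruction from memory
        pc += 1                       # point at the next instruction
        if opcode == "LOAD":          # execute the fetched instruction
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc


program = [("LOAD", 2), ("ADD", 3), ("HALT", 0)]
print(run(program))  # prints 5
```

Each pass through the loop handles exactly one instruction from start to finish before the next one is fetched.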
A traditional CPU treats the instructions in a program equally, such that each instruction is fully executed before the next instruction in the logical sequence begins. For example, the CPU fetches a first instruction in a first cycle, decodes it in a second cycle, and executes it in a third cycle, before fetching a second instruction in a fourth cycle and repeating the decoding and execution process. Processors of this type are referred to as non-pipelined processors.
Modern processors, on the other hand, use what are called pipelines. Pipelining is the most common implementation technique used in CPUs today to increase system performance. The idea behind the pipeline is that while a first instruction is being executed, a second instruction can be decoded and a third instruction can be fetched. Thus, the processing of multiple instructions can be overlapped, improving overall performance. For example, in a simple three-stage pipeline, the fetch, decode, and execute phases of successive instructions overlap, so that once the pipeline is full the CPU can complete one instruction every cycle, as opposed to requiring three cycles per instruction. One problem with pipelining is data dependency. Data dependency refers to a situation in which an instruction cannot be executed because data it needs, such as the result of an earlier instruction, is not yet ready.
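The cycle counts above, and the notion of a data dependency, can be made concrete with a short Python sketch. It assumes an idealized three-stage pipeline in which every stage takes exactly one cycle and nothing stalls. The representation of an instruction as a destination register plus a tuple of source registers, and all function and register names, are hypothetical and chosen only for illustration.

```python
STAGES = 3  # fetch, decode, execute: one cycle each (idealized assumption)


def non_pipelined_cycles(n):
    """Cycles to run n instructions when each finishes before the next starts."""
    return STAGES * n


def pipelined_cycles(n):
    """Cycles to run n instructions in an ideal pipeline with no stalls.

    The first instruction needs STAGES cycles to pass through the pipeline;
    after that, one instruction completes every cycle.
    """
    return 0 if n == 0 else STAGES + (n - 1)


def has_data_dependency(earlier, later):
    """Return True if `later` reads the register that `earlier` writes.

    Each instruction is a (destination, sources) pair, a hypothetical format.
    """
    dest, _ = earlier
    _, sources = later
    return dest in sources


print(non_pipelined_cycles(4), pipelined_cycles(4))  # prints 12 6

i1 = ("r1", ("r2", "r3"))  # r1 = r2 op r3
i2 = ("r4", ("r1", "r5"))  # r4 = r1 op r5, so it needs the result of i1
print(has_data_dependency(i1, i2))  # prints True
```

In this sketch the second instruction depends on the first, so a real pipeline could not execute it until the value of r1 is available. The idealized count from pipelined_cycles ignores that delay.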