1. Field of the Invention
The present invention relates to a processor and method thereof, and more particularly, to a processor with cache way prediction and method thereof.
2. Description of the Related Art
Lower power consumption may be a desirable characteristic in electronic devices (e.g., portable electronic devices). Such electronic devices may include embedded processors. Power consumption and an operating speed of an embedded processor in an electronic device may be factors in determining the performance of the electronic device.
FIG. 1 illustrates a conventional processor 100 capable of performing branch prediction. Referring to FIG. 1, the processor 100 may include a fetch unit 110, an instruction cache 120, an instruction decoder 130, an execution unit 140 and a branch prediction unit 150.
For each clock cycle of the processor 100, the fetch unit 110 may fetch a fetch address FADR from a memory (not shown) in response to a program counter (PC) signal. The fetch unit 110 may determine an address to be fetched from the memory for a subsequent clock cycle (e.g., a next clock cycle). The branch prediction unit 150 may determine whether an instruction associated with the fetch address FADR is a branch instruction. If the branch prediction unit 150 determines that the fetch address FADR corresponds to a branch instruction, the branch prediction unit 150 may further determine whether the branch instruction is predicted to branch. If the fetch address FADR corresponds to a branch instruction that is predicted to branch, the branch prediction unit 150 may output a branch target address TADR. The fetch unit 110 may select, from among the branch target address TADR and a plurality of candidate addresses, an address to be fetched from the memory (not shown) in a subsequent clock cycle (e.g., a next clock cycle) and may transmit the selected address to the instruction cache 120.
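The address selection described above may be illustrated with a minimal sketch. It is not taken from the conventional processor 100 itself: the class `BranchPredictor`, the function `select_next_fetch_address`, the lookup-table form of the predictor, and the 4-byte instruction width are all assumptions made for illustration, and the "plurality of candidate addresses" is reduced here to a single sequential address.

```python
from typing import Dict, Optional, Tuple

INSTRUCTION_BYTES = 4  # assumed fixed instruction width; not stated in the source


class BranchPredictor:
    """Hypothetical stand-in for the branch prediction unit 150."""

    def __init__(self) -> None:
        # Maps a fetch address to (predicted_taken, branch_target_address).
        self.table: Dict[int, Tuple[bool, int]] = {}

    def record(self, fadr: int, predicted_taken: bool, tadr: int) -> None:
        """Register a known branch instruction at fetch address fadr."""
        self.table[fadr] = (predicted_taken, tadr)

    def predict(self, fadr: int) -> Optional[int]:
        """Return TADR if fadr holds a branch predicted to branch, else None."""
        entry = self.table.get(fadr)
        if entry is None:
            return None  # not known to be a branch instruction
        predicted_taken, tadr = entry
        return tadr if predicted_taken else None


def select_next_fetch_address(fadr: int, predictor: BranchPredictor) -> int:
    """Choose the address to fetch in the next clock cycle."""
    tadr = predictor.predict(fadr)
    if tadr is not None:
        return tadr  # branch predicted to branch: use the target address
    # Assumption: the only other candidate is the sequential address.
    return fadr + INSTRUCTION_BYTES
```

In this sketch, a fetch address that is not a branch, or that is a branch predicted not to branch, falls through to the sequential address. A real fetch unit may choose among further candidates (e.g., return addresses or exception vectors).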
The instruction cache 120 may access data (e.g., tags, instructions) stored at the selected address received from the fetch unit 110 and may output the stored data (e.g., from a data block which matches the selected address received from the fetch unit 110). The data output from the instruction cache 120 may be received by the instruction decoder 130 as a fetch instruction. The instruction decoder 130 may decode the fetch instruction and may determine an instruction type (e.g., a branch instruction) specified by the fetch instruction. The execution unit 140 may execute the fetch instruction.
If the instruction cache 120 is a set associative cache, the instruction cache 120 may include a plurality of cache ways, where each of the plurality of cache ways includes a tag memory and a data memory. The instruction cache 120 may perform a tag matching operation by accessing each of the plurality of cache ways in response to the selected address (e.g., received from the fetch unit 110) and may output data stored in a block which matches the tag associated with the selected address. For example, if the instruction cache 120 is a four-way set associative cache, the instruction cache 120 may read data from four locations by accessing all four cache ways to determine a tag match and may then select the data from the cache way including the matching tag. In the above-described example conventional method, the instruction cache 120 may consume a higher amount of power because each of the cache ways may be accessed for every fetch, even though at most one cache way holds the matching tag.
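The conventional lookup described above may be sketched as follows, with a counter showing that every fetch touches all four cache ways. The names `CacheWay` and `SetAssociativeCache`, the set count, the block size, and the address split are assumptions chosen for illustration. The sketch models access counts only, not actual power.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

NUM_WAYS = 4       # four-way set associative, as in the example above
NUM_SETS = 8       # assumed number of sets; not stated in the source
BLOCK_BYTES = 16   # assumed block size; not stated in the source


@dataclass
class CacheWay:
    """One cache way: a tag memory and a data memory, indexed by set."""
    tags: List[Optional[int]] = field(default_factory=lambda: [None] * NUM_SETS)
    data: List[Optional[str]] = field(default_factory=lambda: [None] * NUM_SETS)


class SetAssociativeCache:
    """Hypothetical model of a conventional set associative instruction cache."""

    def __init__(self) -> None:
        self.ways = [CacheWay() for _ in range(NUM_WAYS)]
        self.way_accesses = 0  # counts how many way reads have occurred

    @staticmethod
    def split(address: int) -> Tuple[int, int]:
        """Split an address into (tag, set index) under the assumed geometry."""
        block = address // BLOCK_BYTES
        return block // NUM_SETS, block % NUM_SETS

    def fill(self, way: int, address: int, payload: str) -> None:
        """Place a block for the given address into the chosen way."""
        tag, index = self.split(address)
        self.ways[way].tags[index] = tag
        self.ways[way].data[index] = payload

    def lookup(self, address: int) -> Optional[str]:
        """Read tag and data from every way, then keep the matching one."""
        tag, index = self.split(address)
        candidates = []
        for way in self.ways:
            self.way_accesses += 1  # each way is accessed on every lookup
            candidates.append((way.tags[index], way.data[index]))
        for stored_tag, payload in candidates:
            if stored_tag == tag:
                return payload  # hit: data from the way with the matching tag
        return None  # miss: no way held the tag
```

Under these assumptions, each call to `lookup` increases `way_accesses` by four whether the lookup hits or misses. This is the per-fetch cost that the passage identifies as the source of higher power consumption.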