A processor is a device that reads an instruction stored in an external storage device, decodes the instruction, performs an arithmetic operation on the operand designated by the instruction, and stores the result back in the external storage device, thereby performing a specific function according to a stored program.
Processors are used in many fields and perform varied and complicated functions, in applications such as video encoding/decoding, audio encoding/decoding, network packet routing, and system control.
The processor basically consists of a core, a translation lookaside buffer (TLB), and a cache.
Work performed by the processor is defined as a combination of a plurality of instructions, which are stored in a memory. The instructions are sequentially input to the processor, which performs an arithmetic operation at every clock cycle.
The TLB is an element that converts a virtual address into a physical address in order to run an application based on an operating system (OS).
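The address conversion performed by the TLB can be illustrated with a minimal sketch. The page size, the dictionary-based table, and the names `PAGE_SIZE`, `translate`, and `example_tlb` are assumptions made for illustration only and are not taken from the description above.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the text)


def translate(tlb, virtual_address):
    """Return the physical address for virtual_address, or None on a TLB miss.

    tlb is a hypothetical mapping: virtual page number -> physical frame number.
    """
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame_number = tlb.get(page_number)
    if frame_number is None:
        # Miss: a real system would consult the OS page table at this point.
        return None
    # The offset within the page is unchanged by translation.
    return frame_number * PAGE_SIZE + offset


# Hypothetical TLB contents: virtual page 2 maps to physical frame 7.
example_tlb = {2: 7}
```

In this sketch only the page number is replaced; the offset within the page is carried over unchanged, and a miss is simply reported to the caller.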
The cache is an element for enhancing system performance. It is a high-speed buffer memory that stores instructions or programs read from a main memory unit. By temporarily holding in the chip an instruction that is stored in an external memory, the cache increases the speed of the processor.
The external memory stores a large volume of instructions, several Gbytes or more (e.g., 256 Gbytes or more), whereas a memory implemented in a chip has a capacity of only several Mbytes. The cache temporarily holds part of the contents of the large-capacity external memory in the chip.
The core takes a long time, 10 to 100 cycles, to read data from the external memory, and for this reason the core remains for a long time in an idle state in which it performs no work.
The cache considerably affects the performance of the processor. When the core requires a specific instruction but the cache does not hold it, the instruction must be read from the external memory, and the core remains in the idle state while the instruction is being read from the external memory. The processor transfers an address to the cache upon each request. The cache stores, in a tag memory, the corresponding address (i.e., a tag) as an index for each internally stored instruction code, and whenever the processor requests an instruction code, the cache accesses the tag memory and compares the requested address with the tag.
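The tag comparison described above can be sketched as follows. This is a minimal illustration that assumes a direct-mapped organization with four lines and word-granular addresses; the text does not specify the cache organization, and the names `InstructionCache`, `lookup`, `fill`, and `fetch` are hypothetical.

```python
class InstructionCache:
    """Direct-mapped instruction cache with a separate tag memory (assumed layout)."""

    def __init__(self, num_lines=4):
        self.num_lines = num_lines
        self.tags = [None] * num_lines   # tag memory: one tag per cache line
        self.codes = [None] * num_lines  # instruction codes held in the chip

    def lookup(self, address):
        """Compare the requested address with the stored tag; return (hit, code)."""
        index = address % self.num_lines
        tag = address // self.num_lines
        if self.tags[index] == tag:
            return True, self.codes[index]
        return False, None

    def fill(self, address, code):
        """Store an instruction code together with its tag."""
        index = address % self.num_lines
        self.tags[index] = address // self.num_lines
        self.codes[index] = code


def fetch(cache, external_memory, address):
    """Return (code, hit). On a miss the code is read from external memory."""
    hit, code = cache.lookup(address)
    if hit:
        return code, True
    # Miss: the core would be idle during this slow external read.
    code = external_memory[address]
    cache.fill(address, code)
    return code, False
```

With a four-line cache, addresses 0 and 4 map to the same line in this sketch, so fetching one evicts the other and the next request for the evicted address misses again.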
In this case, the cache stores a tag corresponding to each stored instruction code. However, when a processor performs a write operation on an address whose tag is stored in another cache, the written value must be read from the entry with the corresponding tag in the other cache. Therefore, when a plurality of processors are integrated, a tag is stored in a cache memory for each of the processors, and it is required to determine whether a corresponding tag is stored in another cache.
That is, when a plurality of processors are provided, coherency must be secured among the plurality of caches.
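One common way to keep several caches coherent is a write-invalidate scheme, sketched below for illustration. The text does not name a particular protocol, so the write-invalidate and write-through policies, as well as the names `Cache`, `read`, and `write`, are assumptions rather than the method described here.

```python
class Cache:
    """Per-processor cache; the full address is used as the tag (assumed)."""

    def __init__(self):
        self.lines = {}  # tag (address) -> cached value


def read(caches, memory, cpu, address):
    """Read through the cache of processor `cpu`, filling it on a miss."""
    own = caches[cpu]
    if address not in own.lines:
        own.lines[address] = memory[address]
    return own.lines[address]


def write(caches, memory, cpu, address, value):
    """Write a value and invalidate matching tags held in the other caches."""
    for other_cpu, other in enumerate(caches):
        if other_cpu != cpu:
            # Check whether the tag is stored in another cache; drop it if so.
            other.lines.pop(address, None)
    caches[cpu].lines[address] = value
    memory[address] = value  # write-through, assumed for simplicity
```

After a write by one processor, every other cache that held the same tag no longer does, so a later read by another processor misses and picks up the new value instead of a stale copy.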