Since the advent of computer systems, there has been a constant race for faster processing. Processor clock speeds have grown exponentially, and the volume of data and instructions to be processed has grown rapidly as well. A computer system contains storage devices such as ROM (read-only memory) and burst-based storage devices, e.g. DRAM, which store data and instructions at ever higher capacities. Structurally, large memory spaces are deep, and this depth can slow the processor's access to the data and instructions they hold. This problem has created the need for more efficient memory management and has led to the creation of cache memory and cache memory structures. A cache memory is generally a shallow and wide storage device, located inside or close to a processor, that facilitates the processor's access to data and its modification of that data. The philosophy of cache memory management is to retain, inside the fastest accessible storage device, copies of the data and instructions that are used often or are most likely to be used by the processor in the near future. This makes the processor's access to data and instructions many times faster than accessing them in an external memory. However, care must be taken in such operations, because changes to content in the cache memory and in the external memory must be kept consistent. These issues, with their hardware and software aspects, have created the art of cache memory structure and management.
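The retention philosophy and the need to harmonize cache and external memory contents can be illustrated with a minimal sketch. The text above does not name a replacement or write policy, so the least-recently-used (LRU) eviction and write-back behavior shown here are assumptions chosen for illustration only; the class name `TinyCache` and the dictionary standing in for external memory are likewise hypothetical.

```python
from collections import OrderedDict


class TinyCache:
    """Illustrative cache in front of a slower 'external memory' (a dict).

    Assumptions (not from the source text): LRU replacement, write-back,
    write-allocate, one value per address.
    """

    def __init__(self, backing, capacity):
        self.backing = backing      # stand-in for external memory: address -> value
        self.capacity = capacity    # number of entries the cache can hold
        self.lines = OrderedDict()  # address -> (value, dirty flag), oldest first
        self.hits = 0
        self.misses = 0

    def _fill(self, addr):
        # Make room by evicting the least recently used entry.
        if len(self.lines) >= self.capacity:
            old_addr, (old_value, dirty) = self.lines.popitem(last=False)
            if dirty:
                # Write the modified copy back so external memory stays consistent.
                self.backing[old_addr] = old_value
        self.lines[addr] = (self.backing.get(addr, 0), False)

    def _lookup(self, addr):
        if addr in self.lines:
            self.hits += 1
        else:
            self.misses += 1
            self._fill(addr)
        self.lines.move_to_end(addr)  # mark as most recently used

    def read(self, addr):
        self._lookup(addr)
        return self.lines[addr][0]

    def write(self, addr, value):
        self._lookup(addr)
        self.lines[addr] = (value, True)  # cached copy now differs from memory

    def flush(self):
        # Write every modified entry back to external memory.
        for addr, (value, dirty) in list(self.lines.items()):
            if dirty:
                self.backing[addr] = value
                self.lines[addr] = (value, False)
```

In this sketch a written value lives only in the cache until its entry is evicted or `flush()` is called, which is exactly the window in which cache and external memory disagree and coherency must be managed.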
As mentioned, a cache memory keeps copies of the data and address pointers that are most likely to be accessed next by the processor. An external memory typically holds data in capacitors and needs refresh cycles to replenish the charge on the capacitors and prevent the loss of data. A typical cache memory, however, uses several transistors (typically six) to represent one bit and, as such, does not need refresh cycles. A cache memory therefore provides much less storage capacity per unit area than an external memory and, accordingly, can hold much less data. As a result, the data and instructions kept in the cache must be selected carefully to optimize cache operations.
Different policies and protocols are used to optimize cache memory operation. The best known among these are direct mapping, fully associative mapping, and set-associative mapping. These protocols are known to people skilled in the art. They serve the general purposes of computing, including data processing, web-based applications, etc. U.S. Pat. No. 4,295,193 to Pomerene presents a computing machine for concurrently executing instructions compiled into multi-instruction words. It is one of the earliest patents alluding to cache memory, address generators, instruction registers, and pipelining. U.S. Pat. No. 4,796,175 to Matsuo presents a microprocessor with an instruction queue for pre-fetching instructions from a main memory and an instruction cache. U.S. Pat. No. 6,067,616 to Stiles presents a branch prediction cache (BPC) scheme with a hybrid cache structure: a fully associative, wide and shallow first-level BPC, and a deep and narrow, direct-mapped second-level BPC holding partial prediction information. U.S. Pat. No. 6,654,856 to Frank presents a cache management system in a computer system, wherein an address-wise circular structure of the cache memory is emphasized.
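The three placement policies named above (direct mapping, fully associative, and set-associative) differ in how many cache locations a given memory address may occupy. The following is a minimal sketch of the conventional address decomposition; the function name `placement`, its parameters, and the example sizes are hypothetical and are not taken from any of the cited patents.

```python
def placement(addr, line_bytes, total_lines, ways):
    """Split a byte address into (tag, set_index, offset).

    ways == 1            -> direct mapped (one candidate line per address)
    ways == total_lines  -> fully associative (any line may hold the address)
    otherwise            -> set-associative (any of `ways` lines in one set)
    """
    if total_lines % ways != 0:
        raise ValueError("total_lines must be a multiple of ways")
    num_sets = total_lines // ways
    offset = addr % line_bytes          # byte position within the cache line
    line_number = addr // line_bytes    # which memory line the address falls in
    set_index = line_number % num_sets  # which set that line must be placed in
    tag = line_number // num_sets       # identifies the line within its set
    return tag, set_index, offset


# Hypothetical cache: 64 lines of 16 bytes each.
for ways in (1, 4, 64):
    print(ways, placement(0x1234, line_bytes=16, total_lines=64, ways=ways))
```

With a single set (the fully associative case) the set index is always zero and the whole line number serves as the tag, so lookup must compare against every line; with one way per set (direct mapping) lookup needs a single comparison, but addresses that share an index evict one another.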
U.S. Pat. No. 6,681,296 to Liao presents a microprocessor with a control unit and a cache, which is selectively configurable as a single cache or as a partitioned cache with locked and normal portions. U.S. Pat. No. 6,721,856 to Arimilli presents a cache that holds coherency state and system controller information for each line, with different subentries for different processors containing a processor access sequence. U.S. Pat. No. 6,629,188 discloses a cache memory with a first and a second plurality of storage spaces. U.S. Pat. No. 6,295,582 discloses a cache system that maintains data coherency and avoids deadlock with substantially sequential read and write commands. U.S. Pat. No. 6,339,428 discloses a cache apparatus in video graphics where compressed texture information is received and decompressed for texture operations. U.S. Pat. No. 6,353,438 discloses a cache organization with multiple tiles of texture image data and direct mapping of data into the cache.
Each of the above inventions offers certain advantages. An efficient cache structure and policy depends strongly on the specific application at hand. In digital video applications, real-time, high-quality digital image processing is one of the great challenges of the field. Specifically, one needs to perform detailed two-dimensional image processing with simultaneous nonlinear coordinate transformations. A dedicated, specialized system is therefore needed, one that provides fast access while maintaining data coherency. Accordingly, it is necessary to optimize the cache structure and cache management policy for this application.