In computer architectures using mass storage devices, such as disk drives, delays in memory access are imposed by factors such as disk rotation speed. It has been a challenge for system designers to find ways to reduce these access delays. A commonly used technique has been to provide one or more regions of high-speed random access memory, called cache memory. Portions of the contents of the mass storage are copied into the cache memory as required by the processor, modified, and written back to the mass storage. Cache memories continue to be among the most pervasive structures found in microprocessors. Effective use of a cache memory can result in substantial performance improvements, which is why many microprocessors now include one or more cache memories in their architecture.
Cache memories are generally organized in “lines”, and a cache can include hundreds of cache lines. Each line holds a selected block of memory, which may be many bytes in length. During a cache load, a split cache line access occurs when a data or instruction access crosses a cache line boundary, which means that part of the desired data resides in one cache line and the remainder resides in another cache line. The existing techniques generally require three or more cycles to complete a split cache line access. In a first cycle, the first part of the data is fetched from the first cache line and stored in an intermediate buffer, often called a split buffer. In a second cycle, the rest of the data is fetched from the other cache line and also stored in the split buffer. In a third cycle, the split buffer is accessed to fetch the complete data. The number of cycles required to complete a split cache line access can have a significant impact on the performance of the microprocessor. To achieve higher performance from the microprocessor, it is necessary to reduce the time required to access data during a split cache line access.
Therefore, there is a need to reduce the number of cycles the microprocessor requires to complete a split cache line access, and thereby to improve the overall performance of the microprocessor.
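By way of illustration only, the conventional three-cycle split cache line access described above can be sketched in software as follows. This is a simplified behavioral model, not a description of any particular hardware. The 64-byte line size, the function names `is_split_access` and `split_load`, the single-cycle cost assigned to a non-split access, and the restriction that an access spans at most two lines are all assumptions introduced for the example.

```python
LINE_SIZE = 64  # assumed cache line size in bytes (hypothetical; not given in the text)


def is_split_access(addr: int, size: int, line_size: int = LINE_SIZE) -> bool:
    """Return True when the access [addr, addr + size) crosses a line boundary."""
    first_line = addr // line_size
    last_line = (addr + size - 1) // line_size
    return first_line != last_line


def split_load(memory: bytes, addr: int, size: int, line_size: int = LINE_SIZE):
    """Model a load and return (data, cycles).

    The cycle counts are illustrative: 1 for an access within one line
    (an assumption), 3 for the conventional split access via a split buffer.
    """
    if size < 1 or size > line_size:
        # Assumption: an access never spans more than two cache lines.
        raise ValueError("size must be between 1 and line_size")

    if not is_split_access(addr, size, line_size):
        return memory[addr:addr + size], 1

    # Address of the first byte of the second cache line.
    boundary = (addr // line_size + 1) * line_size

    split_buffer = bytearray()
    # Cycle 1: fetch the first part from the first cache line into the split buffer.
    split_buffer += memory[addr:boundary]
    # Cycle 2: fetch the remainder from the other cache line into the split buffer.
    split_buffer += memory[boundary:addr + size]
    # Cycle 3: read the complete data out of the split buffer.
    data = bytes(split_buffer)
    return data, 3
```

For example, under these assumptions an 8-byte load at address 60 touches bytes 60 through 67, crosses the boundary at address 64, and is modeled as taking three cycles, whereas the same load at address 56 stays within one line.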