1. Field of the Invention
The present invention relates to methods and apparatus for processing memory access instructions. More particularly, one aspect of the invention relates to methods and apparatus for combining data from a plurality of memory access transactions, such as store pair transactions or instructions, and writing the combined data to memory in a single memory access transaction.
2. Description of the Related Art
Modern computers are typically equipped with several basic components: one or more processors, main memory, cache memory, and a memory controller. In one conventional configuration of such a computer, the processor connects to the cache memory and to the memory controller. The cache memory is also connected to the memory controller. The memory controller is connected to the main memory through one or more memory buses (e.g., a memory data bus and a memory address bus).
Because cache memory is characteristically higher in performance than main memory, the processor accesses data from the cache memory, rather than main memory, whenever possible during normal operation. This may require, from time to time, transferring data between the cache memory and the main memory. Such data transfers often occur in bursts where blocks of data are transferred at a time. For example, the cache memory may transfer data from a plurality of cache lines to the memory controller to be written in main memory. A "cache line" refers to a unit by which data is organized in the cache memory and is typically thirty-two bytes (four 8-byte words or "beats") in length.
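The cache-line geometry described above can be sketched in a few lines; this is an illustrative model only, and the function names are hypothetical, assuming the thirty-two byte line of four 8-byte beats described in the preceding paragraph:

```python
CACHE_LINE_BYTES = 32  # four 8-byte words ("beats"), per the description above
BEAT_BYTES = 8

def cache_line_of(address: int) -> int:
    """Return the cache-line-aligned base address containing `address`."""
    return address & ~(CACHE_LINE_BYTES - 1)

def same_cache_line(addr_a: int, addr_b: int) -> bool:
    """True when both byte addresses fall within one cache line."""
    return cache_line_of(addr_a) == cache_line_of(addr_b)

def beat_index(address: int) -> int:
    """Which of the four beats within its cache line this address selects."""
    return (address % CACHE_LINE_BYTES) // BEAT_BYTES
```

For example, addresses 0x1000 and 0x1018 share a line, while 0x1018 and 0x1020 do not.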
Although the processor accesses data from the cache memory, the processor may also access data from the main memory. One specific example of an instruction to write data to main memory is referred to as a "store pair instruction." In executing a store pair instruction, the processor (or a component of the processor) fetches one beat of data to be written in main memory and the address at which the data is to be written in main memory. The processor then translates the instruction, the fetched data, and the fetched address into a memory store instruction. The processor transmits the memory store instruction to the memory controller for execution.
To facilitate the processing of memory store instructions by the memory controller, memory store instructions normally follow a fixed format. Although the exact format of the memory store instruction may vary depending upon the memory controller and the processor used, a typical memory store instruction contains the following information: (1) a write command; (2) a fixed number of beats of data (usually representing a full cache line); (3) byte enable information specifying which or how many of the bytes in the fixed number of beats are to be actually written in memory; and (4) the address in main memory at which the specified bytes are to be written. Where a memory store instruction is generated directly from a store pair instruction, the byte enable information specifies that bytes from only one of the fixed number of beats contained in the write transaction are to be written in memory.
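The four-part format above, and the translation of a single store pair instruction into it, might be modeled as follows. This is a sketch only; the patent describes the contents of the instruction, not a concrete layout, so the field and function names here are hypothetical:

```python
from dataclasses import dataclass
from typing import List

BEATS_PER_LINE = 4   # fixed number of beats, per the format described above
BEAT_BYTES = 8

@dataclass
class MemoryStoreInstruction:
    # Hypothetical field names for items (2)-(4) of the format above;
    # the write command, item (1), is implied by the instruction type.
    address: int            # cache-line-aligned target address in main memory
    beats: List[int]        # fixed number of data beats
    byte_enables: List[int] # one 8-bit mask per beat: which bytes to write

def store_from_pair(pair_address: int, pair_data: int) -> MemoryStoreInstruction:
    """Translate one store pair instruction directly into a memory store
    instruction.  Only one beat's byte enables are set, which is the
    wasted-bandwidth case discussed below."""
    line_base = pair_address & ~(BEATS_PER_LINE * BEAT_BYTES - 1)
    beat = (pair_address - line_base) // BEAT_BYTES
    beats = [0] * BEATS_PER_LINE
    enables = [0x00] * BEATS_PER_LINE
    beats[beat] = pair_data
    enables[beat] = 0xFF  # enable all eight bytes of this single beat
    return MemoryStoreInstruction(line_base, beats, enables)
```

Note that only one of the four byte-enable masks is nonzero, so three of the four beats carried by the transaction are unused.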
Generating a memory store instruction for each store pair instruction, however, results in wasted bandwidth. Such a memory store instruction causes only a single beat of data to be written in memory, even though the format of a memory store instruction allows up to four beats of data to be written. Accordingly, there is a need to reduce or eliminate wasted memory data bus bandwidth caused by execution of memory store instructions generated directly from store pair instructions.
3. Summary of the Invention
Methods and apparatus consistent with the present invention reduce or eliminate wasted memory data bus bandwidth by combining certain memory store instructions before writing data into main memory.
In accordance with the invention, as embodied and broadly described, a method consistent with this invention comprises the steps of: receiving a first instruction to write a first data word at a first address of memory; receiving a second instruction to write a second data word at a second address of memory; determining whether the first instruction and the second instruction include data from the same cache line; and, if the first instruction and the second instruction are determined to include data from the same cache line, generating a combined instruction to write the first and second data words at the first and second addresses of memory, respectively.
In another aspect, the invention comprises an apparatus comprising a device for receiving a first instruction to write a first data word at a first address of memory; a device for receiving a second instruction to write a second data word at a second address of memory; a device for determining whether the first instruction and the second instruction include data from the same cache line; and a device for generating a combined instruction to write the first and second data words in the first and second addresses of memory, respectively, if the first instruction and the second instruction are determined to include data from the same cache line.
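The combining step recited above can be sketched as follows. The `Store` type and `try_combine` name are hypothetical stand-ins, assuming a fixed four-beat transaction in which matching cache-line-aligned addresses indicate data from the same cache line:

```python
from dataclasses import dataclass
from typing import List, Optional

BEATS_PER_LINE = 4  # four beats per cache line, per the format described earlier

@dataclass
class Store:
    # Hypothetical minimal model of a memory store instruction.
    address: int        # cache-line-aligned target address
    beats: List[int]    # fixed number of data beats
    enables: List[int]  # per-beat byte-enable masks (0 = beat not written)

def try_combine(first: Store, second: Store) -> Optional[Store]:
    """Merge two stores whose data comes from the same cache line into one
    combined instruction; return None if they cannot be combined."""
    if first.address != second.address:  # same aligned base => same cache line
        return None
    beats = list(first.beats)
    enables = list(first.enables)
    for i in range(BEATS_PER_LINE):
        if second.enables[i]:            # second store supplies this beat
            beats[i] = second.beats[i]
            enables[i] |= second.enables[i]
    return Store(first.address, beats, enables)
```

When the merge succeeds, a single transaction carries both data words, using beats that would otherwise have been wasted; when it fails, the caller issues the two stores separately.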
Both the foregoing general description and the following detailed description are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.