1. Technical Field
The present invention relates generally to data processing systems and in particular to data operations within data processing systems. Still more particularly, the present invention relates to operations that move memory data during processing on a data processing system.
2. Description of the Related Art
Standard operation of data processing systems requires access to and movement and/or manipulation of data by the processing components. Application data are typically stored in memory and are read/retrieved, manipulated, and stored/written from one memory location to another. The processor may also perform a simple move (relocation) of data using a series of load and store commands issued by the processor when executing the application code.
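As an illustration only, a simple move performed as a series of load and store commands can be sketched as follows. The memory is modeled as a Python bytearray, and the names `WORD`, `load`, `store`, and `simple_move` are hypothetical; the 8-byte word width and the non-overlapping source and destination ranges are assumptions of the sketch, not details taken from the description above.

```python
WORD = 8  # bytes moved per load/store pair (assumed width)


def load(memory, addr):
    """Load up to one word from memory into a 'register' (a bytes object)."""
    return bytes(memory[addr:addr + WORD])


def store(memory, addr, reg):
    """Store the register contents to memory starting at addr."""
    memory[addr:addr + len(reg)] = reg


def simple_move(memory, src, dst, length):
    """Relocate `length` bytes from src to dst with load/store pairs.

    Assumes the source and destination ranges do not overlap.
    Returns the number of load and store operations issued.
    """
    ops = 0
    for off in range(0, length, WORD):
        reg = load(memory, src + off)[:length - off]  # trim a final partial word
        store(memory, dst + off, reg)
        ops += 2  # one load plus one store
    return ops


memory = bytearray(64)
memory[0:16] = b"ABCDEFGHIJKLMNOP"
ops = simple_move(memory, src=0, dst=32, length=16)
```

Every byte passes through the processor twice in this model, once on the load and once on the store, so the operation count grows with the amount of data moved.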
With conventional data move operations, the processor transfers data from one memory location having a first physical (real) address to another location with a different physical (real) address. Completing the data move operation typically involves a number of steps, including: (1) the processor issues a particular sequence of load and store instructions, which results in: (a) a TLB performing an address translation to translate the effective addresses of the processor-issued operations into corresponding real addresses associated with the real/physical memory; and (b) a memory or cache controller performing a cache line read or memory read of the data; (2) the TLB passes the real address of the processor store instruction to the memory controller (via a switch/interconnect when the controller is off-chip); (3) the memory controller acquires a lock on the destination memory location (identified with a real address); (4) the memory controller assigns the lock to the processor; (5) the processor receives the data from the source memory location (identified with a real address); (6) the processor sends the data to the memory controller; (7) the memory controller writes the data to the destination location; (8) the memory controller releases the lock on the destination memory location; and (9) a SYNC completes on the system fabric to inform the processor that the data move has finally completed.
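The ordering of the steps listed above can be sketched in software as follows. This is a simplified model under stated assumptions, not a description of any actual hardware: the classes `TLB` and `MemoryController`, the function `conventional_move`, the 4096-byte page size, and the dictionary-based lock table are all hypothetical, and the SYNC is represented only as a final trace entry.

```python
PAGE = 4096  # assumed page size


class TLB:
    """Maps effective page numbers to real page numbers (hypothetical model)."""

    def __init__(self, mapping):
        self.mapping = mapping

    def translate(self, ea):
        return self.mapping[ea // PAGE] * PAGE + ea % PAGE


class MemoryController:
    """Owns real memory and a per-address lock table (hypothetical model)."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.locks = {}  # real address -> owning processor

    def acquire_lock(self, ra, owner):
        if ra in self.locks:
            raise RuntimeError("destination already locked")
        self.locks[ra] = owner

    def release_lock(self, ra):
        del self.locks[ra]

    def read(self, ra, n):
        return bytes(self.mem[ra:ra + n])

    def write(self, ra, data):
        self.mem[ra:ra + len(data)] = data


def conventional_move(tlb, mc, src_ea, dst_ea, n, cpu="cpu0"):
    """Walk the nine steps in order and return a trace of what happened."""
    trace = ["1: processor issues load/store sequence"]
    src_ra = tlb.translate(src_ea)           # (1a) EA -> RA for the load
    dst_ra = tlb.translate(dst_ea)           # (1a) EA -> RA for the store
    line = mc.read(src_ra, n)                # (1b) controller reads the data
    trace.append("2: store RA passed to memory controller")
    mc.acquire_lock(dst_ra, cpu)             # (3) and (4): lock taken, assigned to cpu
    trace.append("3: lock acquired on destination")
    trace.append("4: lock assigned to processor")
    register = line                          # (5) processor receives the data
    trace.append("5: processor receives data")
    outbound = register                      # (6) processor sends the data back
    trace.append("6: processor sends data to controller")
    mc.write(dst_ra, outbound)               # (7) controller writes destination
    trace.append("7: controller writes destination")
    mc.release_lock(dst_ra)                  # (8) lock released
    trace.append("8: lock released")
    trace.append("9: SYNC completes")        # (9) modeled as a trace entry only
    return trace


tlb = TLB({0: 2, 1: 5})
mc = MemoryController(8 * PAGE)
mc.write(2 * PAGE + 16, b"payload")
trace = conventional_move(tlb, mc, src_ea=16, dst_ea=PAGE + 32, n=7)
```

In this model the processor cannot proceed past any step until the previous one returns, which mirrors the serialized waiting that the conventional sequence imposes.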
Inherent in the above process are several built-in latencies, which force the processor to wait until the end of most of the above steps before the processor may resume processing subsequently received instructions. Examples of these built-in latencies include: (a) the conversion of the effective address (EA) of the operation to the corresponding real address via the TLB or ERAT to determine which physical memory location that EA is pinned to; (b) the memory controller retrieving the data from the source memory location, directing the sourced data to the processor chip, and then forwarding the data from the processor chip to the destination memory location; and (c) the lock acquisition process.
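The translation latency in item (a) can be illustrated with a minimal sketch. It assumes, for illustration only, that the ERAT behaves as a small first-level cache consulted before the TLB, that pages are 4096 bytes, and that every page requested is present in the TLB; the class `Translator` and its counters are hypothetical names.

```python
PAGE_SIZE = 4096  # assumed page size


class Translator:
    """EA -> RA lookup with an ERAT in front of a TLB (hypothetical model)."""

    def __init__(self, tlb_entries):
        self.erat = {}                 # effective page -> real page, filled on use
        self.tlb = dict(tlb_entries)   # effective page -> real page
        self.erat_hits = 0
        self.tlb_lookups = 0

    def translate(self, ea):
        page, offset = divmod(ea, PAGE_SIZE)
        if page in self.erat:
            self.erat_hits += 1
            real_page = self.erat[page]
        else:
            # Slower path: consult the TLB, then cache the result in the ERAT.
            # A page missing from the TLB raises KeyError; the sketch does not
            # model page-table walks.
            self.tlb_lookups += 1
            real_page = self.tlb[page]
            self.erat[page] = real_page
        return real_page * PAGE_SIZE + offset


t = Translator({3: 7})
ra_first = t.translate(3 * PAGE_SIZE + 0x10)   # goes to the TLB
ra_second = t.translate(3 * PAGE_SIZE + 0x20)  # served from the ERAT
```

Both the load and the store of a move need a translation of this kind before the memory access can be issued, so the lookup sits on the critical path of each operation.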
The lock acquisition process and issuance of the SYNC prevent overwrite of the data during the data move operation. The SYNC instruction at the end of the data move process ensures that the memory subsystem retains data coherency among the various processing units.
However, a large portion of the latency in performing data operations, such as with memory moves, involves the actual movement of the data from the first real address location (the source location) to the second real address location (the destination location). During such movement, the data is pinned to a specific real address to prevent the occurrence of a manage exception. The processor has to wait on completion of the address translation by the TLB and acquisition of the lock before proceeding with completing the operation and subsequent operations. Developers are continually seeking ways to improve the speed (reduce the latency) of such memory access data operations.