In media transcoder software (e.g., decoding and encoding software), a graphics processing unit (GPU) may be used to “load-balance” the encoding process by performing a task such as motion estimation (ME). The GPU performs a search (e.g., a “best search”) that compares two blocks of memory that are already decoded and resident in application memory: a reference frame and a current frame. When a best match is found, the GPU returns motion vectors that describe (i.e., point to) the best match in the reference frame.
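By way of illustration only, the search described above may be sketched as an exhaustive block-matching search. The sketch assumes frames stored as lists of rows of luma samples and a sum of absolute differences (SAD) cost. The function names `sad` and `full_search`, the cost metric, and the search strategy are hypothetical choices made for this example, not the search the GPU necessarily performs.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (cx, cy) and the n x n block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(n):
        for dx in range(n):
            total += abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, cx, cy, n, search_range):
    """Exhaustively search `ref` around (cx, cy) for the block that best
    matches the n x n block of `cur` at (cx, cy).

    Returns ((mvx, mvy), cost), where the motion vector points from the
    current block position to the best match in the reference frame.
    Returns (None, None) if no candidate lies inside the frame.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = cx + mvx, cy + mvy
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            # Keep the first candidate with the strictly lowest cost.
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

In this sketch, a call such as `full_search(cur, ref, 16, 16, 8, 4)` would examine every candidate displacement of up to four samples in each direction for the 8x8 block at column 16, row 16. A practical encoder would typically use a faster search pattern and sub-sample refinement, which are omitted here.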
Historically, ME was performed on discrete GPUs that were resident on a peripheral component interconnect express (PCIe) or similar system bus. To perform the motion estimation, the blocks of memory previously described were copied from the application memory to the video memory local to the GPU (e.g., a GPU on the PCIe bus), and the GPU operated on this video memory. In systems where the GPU resides on a video card that is only accessible over an external bus, such as an accelerated graphics port (AGP), peripheral component interconnect (PCI), or PCIe bus, the GPU is separate from the motherboard and the central processing unit (CPU) and does not have direct access to the system memory. In a system having an integrated graphics processor (IGP) (e.g., a system including an Advanced Micro Devices, Inc. (AMD) RS880 chipset), the GPU may be integrated into the motherboard and may have direct access to the system memory via a physical connection to the CPU, which may include the system's main memory controller. Accordingly, the IGP may access either the system memory, via the CPU, or the video memory that is attached directly to the IGP.
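As a loose host-side analogy only, the difference between the two data paths may be sketched as follows. A `bytearray` stands in for a decoded frame in application memory. The function names `copy_path` and `shared_path` are hypothetical, and no actual driver or GPU interface is shown or implied.

```python
def copy_path(app_frame):
    """Discrete-GPU style: duplicate the frame into a separate buffer
    that stands in for video memory local to the GPU."""
    video_memory = bytearray(app_frame)  # full copy; cost grows with frame size
    return video_memory


def shared_path(app_frame):
    """IGP style: expose the same underlying buffer without duplicating
    any bytes, standing in for direct access to system memory."""
    return memoryview(app_frame)
```

In this analogy, a later change to the application frame is visible through the object returned by `shared_path` but not through the buffer returned by `copy_path`, because only the latter holds a second copy of the data.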
Accordingly, it would be beneficial to provide a method and apparatus capable of eliminating this data copy by using the system memory, for example where an IGP uses the same dynamic random access memory (DRAM) as the CPU.