1. Field of the Invention
Embodiments of the present invention generally relate to data transfer within a computing environment and, more particularly, to peer-to-peer data transfer within such a computing environment.
2. Description of the Related Art
In modern computing environments, a multitude of devices are generally interconnected to provide processing speed and flexibility within the computing environment. To create such a computing environment, various devices are connected to one another via an interconnectivity fabric such as a network or bus structure. The devices connected to the interconnectivity fabric generally contain local memory that is used by a device during a computation.
One example of such a computing environment is used for graphics processing, where a plurality of graphics processing units (GPUs) are connected to one another by an interconnectivity fabric and each GPU is coupled to a frame buffer (i.e., local memory). The frame buffer stores graphics data being processed by the individual GPU. Generally, large amounts of data need to be processed by the GPUs to render textures and create other graphics information for display. To achieve rapid processing, the processing task is divided among the GPUs such that components of the task are performed in parallel.
At times, in such a computing environment, a GPU may be required to utilize information that is stored in the frame buffer of a peer GPU, or may be required to write information to the frame buffer of a peer GPU such that the peer GPU may locally utilize that information. Presently, implementations of many interconnectivity fabric standards such as AGP, PCI, PCI-Express™, Advanced Switching and the like enable a peer to write information to another peer's address space, but do not enable a peer to read information stored in another peer's address space. Consequently, the GPUs duplicate effort to create data that their peers have already created, because they do not have access to the peer's frame buffer in peer address space where that information is stored. Alternatively, when a GPU completes processing of certain information that is needed by a peer, the GPU that created the information may write it to a commonly available system memory. The system memory is accessible by any of the GPUs connected to the interconnectivity fabric. However, using common system memory for such data transfers is time-consuming and increases processing overhead. Invariably, the use of common system memory slows the graphics processing.
Therefore, there is a need in the art for an improved method and apparatus for transferring information from peer to peer within a computing environment.