The present invention is directed to a method and apparatus for memory management in a processing system. More particularly, the present invention is directed to a method and apparatus for maintaining shadow copies of data in a multi-processor, multi-memory system or in a single processor system where there is a desire to enhance processor performance.
As processor systems have been designed to perform more complex operations at a faster speed, it has been more common to introduce more processing capabilities to a system by way of employing multiple processing units, some of which have specialized functionality. One arena in which this architecture has been adopted has been in the processing of graphics information.
A schematic diagram providing a high-level view of a multi-processor arrangement and a graphics processing system is provided in FIG. 1. A central processing unit 101 is responsible for overseeing the operations of the system as a whole, including the running of various applications on the graphics processing system. A graphics processing unit (GPU) 107 is also provided. This GPU is a specialty processor adapted to perform functionality particularly germane to certain graphics requirements for the system. Graphics processing unit 107 has associated therewith a local memory 109. By this arrangement the graphics processing unit can more quickly access information and data necessary for graphics processing without having to access information from the main memory. This improves overall system performance. Furthermore, in the arrangement of FIG. 1, a bridge processing unit 105 acts as a bridge or mediator between the CPU, which controls the overall operation of the system and the applications run by the system, and the special processing operations being performed by the GPU. In a further system enhancement, developers have taken to including a co-processor such as co-proc 1051 in bridge 105, thereby allowing certain additional processing operations typically associated with the GPU to be implemented in the bridge and thereby further improving system performance capabilities. A main or system memory 103 is also provided.
In the arrangement of FIG. 1, it is appropriate to have the coprocessor 1051 interoperate with data stored in main memory 103 rather than have it operate with local memory 109. To do the latter would be to decrease system performance by loading additional responsibilities for data transfer onto the GPU 107, thereby diverting time from data processing operations to transaction processing. Therefore, it is more advantageous for the co-proc 1051 to rely on and/or interoperate with the main memory 103 while the GPU 107 is interoperating with data in local memory 109.
When the coprocessor and GPU reference different memories, issues can arise with regard to assuring that the coprocessor and GPU are operating on the same data. That is, where the co-proc and GPU are intended to operate on the same information, it is significant that the system assure that the information to be used by the co-proc, which resides in the main memory, is consistent with the data used by the GPU, which is stored in local memory.
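By way of illustration only, the consistency concern described above can be modeled with a minimal sketch. It assumes, purely for the example, that each memory is represented as a Python dictionary keyed by address; the names `cpu_write` and `is_consistent`, the address, and the vertex values are hypothetical and do not appear in the figures.

```python
def cpu_write(address, value, memories):
    """Write one value to the same address in each of the given memories."""
    for memory in memories:
        memory[address] = value


def is_consistent(address, local_memory, main_memory):
    """Return True when both memories hold the same value at the address."""
    return local_memory.get(address) == main_memory.get(address)


# Hypothetical stand-ins for local memory 109 and main memory 103.
local_memory, main_memory = {}, {}

# The CPU writes a vertex to both memories: the two copies agree.
cpu_write(0x10, (0.0, 1.0, 0.0), [local_memory, main_memory])
first_check = is_consistent(0x10, local_memory, main_memory)

# The CPU later updates only the local memory: the main memory copy is stale.
cpu_write(0x10, (0.5, 1.0, 0.0), [local_memory])
second_check = is_consistent(0x10, local_memory, main_memory)
```

In this sketch the first check succeeds and the second fails, which corresponds to the coprocessor reading outdated data from the main memory while the GPU operates on the updated data in local memory.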
The potential for differences between the data sets operated on by the coprocessor and the GPU rises significantly where data is produced by or provided by the CPU. An example of this situation is illustrated in schematic form in FIG. 2, which refers to data processed in a graphics processing context. More specifically, FIG. 2 illustrates a CPU 201 that is responsible for producing information that populates a vertex buffer 202, which is normally utilized in a graphics processing operation. The content of the vertex buffer 202 may be of interest not only to the graphics processing unit 207 but also to the coprocessor. One solution would be to provide a complete copy of the vertex buffer from the CPU to both the coprocessor and the graphics processing unit. However, it has been determined that such an operation has a negative impact on overall system performance because a substantial amount of transaction processing is involved in writing the information twice, to two separate locations. The negative impact is exacerbated when one considers that some of that additional transaction processing time is expended to write data that is simply not used by the coprocessor at all. Thus the negative effect of copying all of the data is amplified by a factor dependent on the extent to which the coprocessor uses that entire set of data. It would therefore be beneficial if a technique were provided that would assure data consistency for the operation of the specialty processor, in this graphics environment the GPU, and a coprocessor, while at the same time reducing the negative impact on system performance.
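The cost of the full-copy approach described above can be made concrete with a short arithmetic sketch. The function name `full_duplication_cost` and the byte counts are hypothetical and chosen only for illustration; the sketch assumes that the cost of a write is proportional to the number of bytes written and ignores any per-transaction overhead.

```python
def full_duplication_cost(buffer_bytes, coproc_used_bytes):
    """Estimate the write traffic when the whole buffer is copied to both memories.

    buffer_bytes: size of the vertex buffer produced by the CPU.
    coproc_used_bytes: portion of that buffer the coprocessor actually reads.
    """
    if not 0 <= coproc_used_bytes <= buffer_bytes:
        raise ValueError("coproc_used_bytes must lie between 0 and buffer_bytes")
    # One full copy goes to the GPU's local memory, one to the main memory.
    total_written = 2 * buffer_bytes
    # Bytes written for the coprocessor's benefit that it never reads.
    wasted = buffer_bytes - coproc_used_bytes
    wasted_fraction = wasted / total_written if total_written else 0.0
    return total_written, wasted, wasted_fraction


# Hypothetical example: a 1,000,000-byte buffer of which the coprocessor uses a quarter.
total_written, wasted, wasted_fraction = full_duplication_cost(1_000_000, 250_000)
```

Under these assumed figures, 2,000,000 bytes are written in total and 750,000 of them (37.5 percent) are never used by the coprocessor, which illustrates why the overhead grows as the coprocessor uses less of the buffer.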