Multi-processor computer systems include a number of processing nodes connected by an interconnection network. Typically, a processing node includes one or more processors, a local memory, a cache memory, and an interface circuit connecting the node to the interconnection network. The interconnection network is used for transmitting packets of information between processing nodes.
In computer systems it is important to minimize the time necessary for processors to access data. In a distributed memory system, the communication cost of reading data from remote memory locations can be excessive. To reduce this cost, computer memory systems generally use a memory hierarchy in which smaller, faster memories are located within a few machine cycles of the processors and larger, slower memories are located a greater number of machine cycles away. Cache memories are smaller, faster memories that hold copies of the memory data used more frequently by the processors. Data in a cache memory is stored in memory blocks, each of which contains both the data and a tag that identifies the data. If the desired data is not located in cache memory, a cache miss occurs and the data is fetched from local memory.
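The tag lookup and miss handling described above can be illustrated with a minimal sketch. It assumes a direct-mapped organization, and the block size, line count, and all names (DirectMappedCache, CacheLine, BLOCK_SIZE, NUM_LINES) are hypothetical choices made for illustration; the text does not specify any particular cache organization.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BLOCK_SIZE = 4  # words per memory block (assumed for illustration)
NUM_LINES = 8   # number of cache lines (assumed for illustration)


@dataclass
class CacheLine:
    """One cache entry: a tag identifying the block plus the block's data."""
    tag: Optional[int] = None          # None means the line holds no block yet
    data: List[int] = field(default_factory=list)


class DirectMappedCache:
    """Hypothetical direct-mapped cache in front of a node's local memory."""

    def __init__(self, local_memory: List[int]) -> None:
        # local_memory length is assumed to be a multiple of BLOCK_SIZE.
        self.memory = local_memory
        self.lines = [CacheLine() for _ in range(NUM_LINES)]
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        block_number = address // BLOCK_SIZE
        offset = address % BLOCK_SIZE
        index = block_number % NUM_LINES   # which cache line the block maps to
        tag = block_number // NUM_LINES    # distinguishes blocks sharing a line
        line = self.lines[index]

        if line.tag == tag:
            self.hits += 1
        else:
            # Cache miss: fetch the whole block from local memory and
            # record its tag so later reads of the block can hit.
            self.misses += 1
            start = block_number * BLOCK_SIZE
            line.data = self.memory[start:start + BLOCK_SIZE]
            line.tag = tag
        return line.data[offset]


memory = [i * 10 for i in range(256)]
cache = DirectMappedCache(memory)
first = cache.read(5)      # miss: block 1 is loaded into line 1
second = cache.read(6)     # hit: same block, different offset
third = cache.read(37)     # miss: block 9 also maps to line 1 and evicts block 1
fourth = cache.read(5)     # miss again: block 1 must be re-fetched
```

In this sketch the second read is served from the cache because its tag matches the stored block, while the third read maps to the same line with a different tag and therefore forces a fetch from local memory.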
However, there is not always enough room in each local memory to store all global variables. Higher communication costs are incurred when a global variable used on a first processing node resides in the local memory of a second processing node. What is needed is an improved method of data access in a multi-processor system to optimize system performance.