In parallel computing, low memory access latency is both important and difficult to achieve because multiple processor cores need to access memory simultaneously. Traditional multi-core processors use caches to reduce memory access latency. However, a cache coherence protocol must then be implemented so that programs are presented with a correct memory model. At large scales, a cache coherence protocol that manages multiple caches may become cumbersome and inefficient. It is therefore desirable to improve the way caches are managed.