1. Field
This patent application relates generally to data caching and more specifically to managing write commands in a cache of a virtual machine.
2. Description of Related Art
In computing systems, a cache is a memory system or subsystem which transparently stores data so that future requests for that data can be served faster. As an example, many modern microprocessors incorporate an instruction cache holding a number of instructions; when the microprocessor executes a program loop where the same set of instructions is executed repeatedly, these instructions are fetched from the instruction cache rather than from an external memory device, which would incur a performance penalty of an order of magnitude or more.
In other environments, such as where a computing system hosts multiple virtual machines with each virtual machine running one or more applications, computing system-side caching of objects stored on a network attached storage system can provide significant performance improvements. In some instances, records are simultaneously cached and written to a network attached storage system according to a “write-through” algorithm. In other instances, records are cached and then written to the network attached storage system according to a “write-back” algorithm. In the “write-back” algorithm, the received record is written to the cache before being written to the network attached storage system. The cache system can then direct the writing of the record to the network attached storage system.
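By way of illustration only, the difference between the two algorithms described above can be sketched as follows. This is a minimal sketch and not a description of any particular system: the class names `WriteThroughCache` and `WriteBackCache`, the `flush` method, and the use of a plain dictionary to stand in for the network attached storage system are all assumptions made for the example.

```python
class WriteThroughCache:
    """Hypothetical cache: each record goes to the cache and to storage together."""

    def __init__(self, backing_store):
        self.cache = {}
        self.backing_store = backing_store  # dict standing in for network attached storage

    def write(self, location, record):
        self.cache[location] = record
        self.backing_store[location] = record  # written as part of the same operation


class WriteBackCache:
    """Hypothetical cache: each record goes to the cache first, storage later."""

    def __init__(self, backing_store):
        self.cache = {}
        self.dirty = set()  # locations cached but not yet written to storage
        self.backing_store = backing_store

    def write(self, location, record):
        self.cache[location] = record
        self.dirty.add(location)  # the write to storage is deferred

    def flush(self):
        # The cache system later directs the deferred writes to storage.
        for location in self.dirty:
            self.backing_store[location] = self.cache[location]
        self.dirty.clear()


write_through_storage = {}
write_through = WriteThroughCache(write_through_storage)
write_through.write("loc1", "A")

write_back_storage = {}
write_back = WriteBackCache(write_back_storage)
write_back.write("loc1", "A")
storage_before_flush = dict(write_back_storage)  # still empty at this point
write_back.flush()
```

In this sketch, the write-through storage holds the record as soon as `write` returns, whereas the write-back storage holds it only after `flush` is called.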
Caching can create issues when the order of an application's incoming write data is not preserved at the time the “write-back” algorithm writes the cached data out to longer-term storage, such as a storage area network (SAN). When the writes are not ordered correctly, records can become shuffled, leading to application failures or inconsistencies.
Some prior art systems use an in-place caching system. FIG. 1 depicts an example process for caching records using in-place caching. The shading of each record corresponds to the SAN memory location where the record is to be stored. As depicted in FIG. 1, four records (A 102, B 104, C 106, and D 108) are received in chronological order (as also indicated by the alphabetical labelling of the records). As indicated by the shading of the records, the first record A 102 and the fourth record D 108 are both to be stored in the same SAN memory location. According to existing in-place caching systems, each memory location in a cache 110 is assigned to a corresponding SAN memory location. As such, when record A 102 is received, it is written into a first cache memory location, and when record D 108 is received, record A 102 is overwritten with record D 108. If record A 102 had not yet been written to a SAN 112 before being overwritten, it is as though the write of record A 102 never occurred.
This can become a problem if record B 104 and record C 106 (which follow record A 102) are not written to the SAN 112 before record D 108 is written to the SAN 112. In this case, previous records X 114 and Y 116 remain in the SAN memory locations corresponding to records B 104 and C 106 when record D 108 is written to the SAN 112. As a result, instead of the SAN 112 containing records B 104, C 106, and D 108 reflecting the order in which they were received, the SAN 112 contains records X 114, Y 116, and D 108, even though the concurrent storage of these three records is not consistent with their chronology.
When records are shuffled in this way, the data is unreliable, operations may be lost, and applications may fail.
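The in-place caching scenario described above with reference to FIG. 1 can be sketched, for illustration only, as follows. The location names `loc1`, `loc2`, and `loc3`, the helper functions `receive` and `flush_one`, and the dictionaries standing in for the cache 110 and the SAN 112 are assumptions made for the example; the prior content of the location shared by records A and D is not specified in the text and is left as `None`.

```python
def receive(cache, location, record):
    # In-place caching: one cache slot per SAN location, so a later record
    # for the same location overwrites the earlier one.
    cache[location] = record


def flush_one(cache, san, location):
    # Write a single cached record out to its SAN location.
    san[location] = cache[location]


# SAN 112 before any cached record is written out: X and Y are the
# previous records in the locations destined for B and C.
san = {"loc1": None, "loc2": "X", "loc3": "Y"}
cache = {}

# Records arrive in chronological order; A and D share a SAN location.
incoming = [("loc1", "A"), ("loc2", "B"), ("loc3", "C"), ("loc1", "D")]
for location, record in incoming:
    receive(cache, location, record)

# Record A was overwritten in the cache by record D before reaching the SAN.
# If the slot holding D happens to be written out before B and C:
flush_one(cache, san, "loc1")
snapshot = dict(san)
print(snapshot)  # {'loc1': 'D', 'loc2': 'X', 'loc3': 'Y'}
```

The printed snapshot shows the SAN holding records X, Y, and D at the same time, a combination that never existed in the order the writes were received.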
While other write-back caching algorithms exist, caching and retrieving data quickly and accurately remains a challenge.