When displayed for a user of a computer system, data representing a file or other collection of related information is presented with an organization that is easily understood by the user. For example, a text file is presented such that the user can read the text. However, the data in the memory representing the text file has a different organization. In order to optimize memory usage, the data representing the text file is fragmented such that a single text file may be represented by many data fragments, where the data fragments may be dispersed throughout the memory. Accordingly, each data fragment has a physical memory address, representing the physical location of the fragment. In addition, each fragment has a logical memory address, representing the logical position of the data fragment within the file. When data is accessed, the physical memory address may be determined based on the logical address using metadata, which includes a mapping tree. Accordingly, for each file or related group of files there are two sets of data. The first set is the data representing the file, and the second set is the metadata, which includes the mapping tree and is used to preserve the logical order of the data fragments.
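The translation from logical to physical addresses described above can be illustrated with a minimal sketch. It assumes a sorted list searched with binary search as a stand-in for the mapping tree; the names `MappingTable`, `add_fragment`, and `resolve`, and the example addresses, are hypothetical and are not taken from any particular system.

```python
import bisect


class MappingTable:
    """Illustrative stand-in for a mapping tree (assumption: a sorted list
    of logical start offsets, searched with binary search)."""

    def __init__(self):
        self._starts = []   # sorted logical start offsets of the fragments
        self._extents = []  # parallel list of (physical_address, length)

    def add_fragment(self, logical_start, physical_address, length):
        # Keep both lists ordered by logical start offset.
        i = bisect.bisect_left(self._starts, logical_start)
        self._starts.insert(i, logical_start)
        self._extents.insert(i, (physical_address, length))

    def resolve(self, logical_address):
        # Find the last fragment that starts at or before the logical address.
        i = bisect.bisect_right(self._starts, logical_address) - 1
        if i < 0:
            raise KeyError(logical_address)
        physical_address, length = self._extents[i]
        offset = logical_address - self._starts[i]
        if offset >= length:
            raise KeyError(logical_address)  # address falls outside the fragment
        return physical_address + offset


# A file of 250 logical units stored as three fragments dispersed in memory.
table = MappingTable()
table.add_fragment(0, 9000, 100)
table.add_fragment(100, 2000, 50)
table.add_fragment(150, 5000, 100)
```

In this sketch the logical order of the fragments (0, 100, 150) is preserved by the metadata alone, even though the physical addresses (9000, 2000, 5000) are not in order.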
In some cases, each file or related group of files may be large and therefore require large amounts of memory space. If a copy of such a file is desired, a duplicate of both sets of data may be generated. If the file to be copied is large, generating a duplicate by copying both sets of data is time-consuming and requires large amounts of memory.
To address this problem, clones are sometimes used. In general, a clone is similar to a copy, in that it may be edited and used like a copy. A clone, however, may not have a duplicate of the first set of data, representing the information in the file. When a clone is generated, the data representing the file of the clone and the parent are identical. Therefore, to avoid redundant data within the memory, the first set of data need not be duplicated. The clone has a duplicate only of the second set of data, or metadata. The metadata of the clone points to the first set of data of the parent. As the clone is edited, data representing the modified clone is generated, and the metadata of the clone is modified so as to point to both the first set of data and the generated data of the modified clone. As a result, portions of the metadata corresponding with modified portions of the clone point to the generated data of the modified clone, and portions of the metadata corresponding with unmodified portions of the clone point to the first set of data of the parent. Accordingly, using a clone eliminates the need to duplicate the first set of data. Therefore, redundant data is avoided, and significant time and memory storage are saved.
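The clone behavior described above can be sketched as follows. This is a simplified illustration under stated assumptions: the metadata is modeled as a plain dictionary from logical index to physical address rather than a mapping tree, and the names `Store`, `File`, `write`, `read`, and `clone` are hypothetical.

```python
class Store:
    """Hypothetical memory: physical address -> data fragment."""

    def __init__(self):
        self.blocks = {}
        self._next_address = 0

    def write(self, data):
        # Allocate a new physical address for each written fragment.
        address = self._next_address
        self.blocks[address] = data
        self._next_address += 1
        return address


class File:
    """A file is only metadata: logical index -> physical address."""

    def __init__(self, store, mapping=None):
        self.store = store
        self.mapping = dict(mapping or {})

    def write(self, logical_index, data):
        # New data is generated and the metadata is updated to point to it;
        # the previously referenced fragment is left untouched.
        self.mapping[logical_index] = self.store.write(data)

    def read(self):
        return "".join(
            self.store.blocks[self.mapping[i]] for i in sorted(self.mapping)
        )

    def clone(self):
        # Duplicate only the metadata (second set of data); the fragments
        # (first set of data) remain shared with the parent.
        return File(self.store, self.mapping)


store = Store()
parent = File(store)
for index, fragment in enumerate(["alpha ", "beta ", "gamma"]):
    parent.write(index, fragment)

clone = parent.clone()
clone.write(1, "BETA ")  # edit one portion of the clone
```

After the edit, the store holds four fragments rather than six: the clone's metadata points to the newly generated fragment for the modified portion and to the parent's fragments for the unmodified portions. Note that `clone()` in this sketch still copies the entire mapping, which is the metadata duplication this kind of clone incurs.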
Clones are particularly useful in data backup and protection systems. Such systems often require large amounts of memory, and efficient and optimal use of the memory results in numerous benefits to the systems. Clones allow for efficient and optimal memory utilization, resulting in low cost, low power usage, high speed, and good reliability.
Such clones, however, generate redundant metadata by copying the second set of data of the parent. Accordingly, such clones waste significant time and memory storage.