1. Technical Field
This invention relates to updating content stored in a storage device. More specifically, this invention relates to in-place updating of an original version of content in a non-volatile storage to an updated version.
2. Discussion of Related Art
It is sometimes required to update content stored in a storage device. For example, if the content is software, or a program (such as an executable file), it is sometimes required to fix a bug existing therein or introduce new features thereto. Yet, the latter example is non-limiting and other types of content may also require updates, such as text, data stored in a database, etc. The terms “old version” or “original version” refer to a version of content before update, and the terms “new version” or “updated version” refer to a version that includes already updated content. In other words, an original version includes “original content” while an updated version includes “updated content”. It should be noted that updated content can be further updated. In case of a second update, for example, the updated content of the first update becomes the original content of the second update, while new updated content is generated by the second update, and so on.
A process during which original content is updated, yielding updated content, is referred to as an “update process”. The update process usually requires instructions on how to perform the update. Such instructions together constitute an “update package”, wherein each instruction included therein constitutes an “update command”. That is, an update package is obtained as input, and during the update process, original content is updated to updated content in accordance therewith. This is non-limiting though, and sometimes more than one update package can be obtained, which together allow the updating of content. Alternatively, instead of an update package being obtained, an update package (or a set of update commands) may be retrieved from a storage or from a database, etc. Hence, hereinafter, when referring to the term “obtaining an update package”, it should be appreciated that the update package can be passively obtained or actively retrieved, or sometimes an embedded package (e.g., a hard coded set of update commands) can be activated.
One way to update an original version to an updated version is storing the updated version in the storage in addition to the original version. For example, a computer program “prog.exe” is activated whenever a user presses a certain icon on the PC (Personal Computer) windows desktop. In order to update prog.exe it is possible to store the updated version of this file in a different location than the present (original) version, and then reset the path associated with the icon so as to activate the updated version instead of the original version. Later, when it is ascertained that the update process completed successfully, the original version can be deleted safely, releasing the space occupied thereby. In addition to increasing storage consumption, this latter update method requires that the complete updated version be provided to the update process, e.g., in the update package. Such an update package easily becomes huge in size, and if it is required to transmit it to the updatable device via bandwidth-limited communication channels, transmission may become cumbersome and sometimes even impossible. Therefore, it is preferable that the size of the update package be reduced along with reducing the device's storage consumption.
Another update method, which is preferable storage-wise to the method mentioned above, requires transmitting the complete updated version in the update package and simply overwriting original content with updated content. This update method may turn out to be risky and unreliable: if the update process fails in the middle of operating, when part of the original version is already overwritten while only part of the updated version is written to the storage, the version stored in the storage at the time of interruption may be invalid or inoperable. In this case, provided that the update package is still accessible, the update process may be restarted from the beginning. It is noted that updating content by overwriting the original content with the updated content is commonly referred to in the art as “in-place update” or the like.
One way of reducing the size of an update package is by including in it information representing the differences between the original and updated content. Such an update package is sometimes also referred to as a “difference”, a “difference result” or a “delta”. The update process, upon operating in accordance with a delta, applies it to the original content, hence producing the updated content. Deltas may be produced in a naive manner using differencing algorithms known in the art (such as “GNU diff”), though such deltas tend to be rather large.
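By way of non-limiting illustration only, applying a delta to original content can be sketched as follows. The command encoding (`"copy"` and `"add"` tuples) and the function name `apply_delta` are assumptions made for this sketch and are not taken from any of the publications discussed herein. This sketch builds the updated content in a separate buffer, i.e., it is not an in-place update.

```python
from typing import List, Tuple, Union

# Hypothetical command encoding, assumed for this sketch only:
#   ("copy", src_offset, length)  copies bytes from the original content
#   ("add", data)                 inserts literal bytes carried in the delta
Command = Union[Tuple[str, int, int], Tuple[str, bytes]]


def apply_delta(original: bytes, delta: List[Command]) -> bytes:
    """Build the updated content from the original content and a delta."""
    updated = bytearray()
    for command in delta:
        if command[0] == "copy":
            _, offset, length = command
            updated += original[offset:offset + length]
        elif command[0] == "add":
            updated += command[1]
        else:
            raise ValueError(f"unknown update command: {command[0]!r}")
    return bytes(updated)


original = b"Hello, old world!"
# Only the three changed bytes travel in the delta as literal data;
# everything else is referenced from the original content.
delta = [("copy", 0, 7), ("add", b"new"), ("copy", 10, 7)]
updated = apply_delta(original, delta)
```

In this example the delta carries three literal bytes, whereas an update package including the entire updated version would carry all seventeen.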
Since the size of the delta is a consideration, there are methods that try to reduce it. For example, U.S. Pat. No. 6,546,552 (“Difference extraction between two versions of data-tables containing intra-references”, published 2003), which is incorporated herein by reference in its entirety, discloses a method for generating a compact difference result between an old program and a new program. Each program includes reference entries that contain references that refer to other entries in the program. According to the method of U.S. Pat. No. 6,546,552, the old program is scanned and for each reference entry, the reference is replaced by a distinct label mark, whereby a modified old program is generated. In addition, according to U.S. Pat. No. 6,546,552, the new program is scanned and for each reference entry the reference is replaced by a distinct label mark, whereby a modified new program is generated. Thus, utilizing directly or indirectly the modified old program and modified new program, the difference result is generated.
WIPO Publication No. WO 2004/114130 (“Method and system for updating versions of content stored in a storage device”, published 2004), which is incorporated herein by reference in its entirety, discloses another system and method for generating a compact update package between an old version of content and a new version of content. The system of WIPO Publication No. WO 2004/114130 includes a conversion element generator for generating a conversion element associated with the old version and new version. It also includes a modified version generator for generating a modified version, and an update package generator for generating the compact update package. The compact update package includes the conversion element and a modified delta based on the modified version and the new version.
WIPO Publication No. WO 2005/003963 (“Method and system for updating versions of content stored in a storage device”, published 2005), which is incorporated herein by reference in its entirety, discloses a system and method for updating versions of content stored in a storage. The system of WIPO Publication No. WO 2005/003963 includes an update module for obtaining a conversion element and a small delta. It also includes a converted old items generator for generating converted old items by applying the conversion element to items of an old version, a data entries generator for generating data entries based on the modified data entries and on the converted old item, and a new version generator for generating a new version of content by applying the commands and the data entries to the old version.
It was noted before that a certain type of update package is sometimes referred to as a delta, however, this is non-limiting, and as it appears from WIPO Publication No. WO 2004/114130 and WIPO Publication No. WO 2005/003963, an update package may sometimes include a delta therewith, or as another example the update package may include the entire updated version.
Other methods exist in the art which take care of additional considerations involved in the update. Prior to elaborating on other methods these considerations should be pointed out.
It is appreciated that content is normally stored in a storage. A storage can include volatile memory, i.e., volatile storage (such as Random Access Memory (RAM), etc.) and/or non-volatile memory, i.e., non-volatile storage (such as a hard disk, flash memory, EPROM (Erasable Programmable Read-Only Memory) and/or EEPROM (Electrically EPROM), etc.).
There are storages that are organized in discrete areas, referred to, e.g., as blocks or sectors, wherein one block can include content belonging to more than one file. Hence, if there are, for example, two files stored in a storage, a single block can include several (‘x’) bytes belonging to a first of the two files, as well as several (‘y’) bytes belonging to a second of the two files. If the size of a block is ‘z’ bytes, it is clear that z>=x+y. Yet, those versed in the art would appreciate that writing content into a block affects other content stored therein. That is, if it is required to re-write the content stored in the x bytes of the first file (e.g., during update thereof), due to storage limitations it may be impossible to write only those x bytes, and it may be necessary to write the content of all the z bytes to the storage. This can be done, for example, by reading the content stored in the z bytes from the non-volatile storage to a volatile storage not including blocks, such as RAM, updating only the content stored in the x bytes in the volatile storage (that is, the content of the other z-x bytes is left unaffected therein) and then writing the content of the z bytes back to the non-volatile storage. This limitation characterizes flash memory, for example, wherein it is required to completely delete the present content of a block before new content (including updated content) can be written thereto. It also characterizes hard disks, where it is not obligatory to delete the complete sector before writing data thereto, but it is required to write the complete content of a block in one writing operation. For example, it is impossible to write only x bytes while leaving the content stored in the z-x bytes unaffected; in order to leave the z-x bytes unaffected, it is required to store the content thereof in the volatile memory and write them back into the block together with the x bytes.
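By way of non-limiting illustration only, the read, modify and write-back sequence described above can be sketched as follows. The class `BlockStorage`, the function `update_bytes` and the block size of eight bytes are hypothetical and are chosen solely to keep the example small; a real storage device would expose a different interface.

```python
BLOCK_SIZE = 8  # 'z' in the text; a tiny hypothetical value for illustration


class BlockStorage:
    """Toy non-volatile storage that can only be written one full block at a time."""

    def __init__(self, num_blocks: int) -> None:
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.block_writes = 0  # counts full-block write operations

    def read_block(self, index: int) -> bytes:
        return self._blocks[index]

    def write_block(self, index: int, content: bytes) -> None:
        if len(content) != BLOCK_SIZE:
            raise ValueError("a block must be written in full")
        self._blocks[index] = bytes(content)
        self.block_writes += 1


def update_bytes(storage: BlockStorage, index: int, offset: int, new_bytes: bytes) -> None:
    """Rewrite x bytes of a block while preserving the other z - x bytes."""
    if offset < 0 or offset + len(new_bytes) > BLOCK_SIZE:
        raise ValueError("update does not fit inside the block")
    ram_copy = bytearray(storage.read_block(index))        # read all z bytes into RAM
    ram_copy[offset:offset + len(new_bytes)] = new_bytes   # modify only the x bytes
    storage.write_block(index, bytes(ram_copy))            # write all z bytes back


storage = BlockStorage(num_blocks=1)
storage.write_block(0, b"AAAABBBB")    # x = 4 bytes of a first file, y = 4 bytes of a second
writes_before = storage.block_writes
update_bytes(storage, 0, 0, b"aaaa")   # update the first file's bytes in one operation
result = storage.read_block(0)
writes_used = storage.block_writes - writes_before
```

Here the four bytes of the first file are updated with a single block write, and the bytes of the second file are preserved. Updating the same four bytes one at a time through `update_bytes` would cost four block writes.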
Hence, the update procedure may require many write operations to the storage including blocks, and it is appreciated that if an efficient update is desired, the update should be optimized. For example, if x equals two bytes, then these two bytes should be updated together, instead of updating the first byte and then the second byte, which would write these two bytes separately into the block.
Furthermore, when in-place updating an original version (including original content) to an updated version (including updated content), there are sometimes update commands that use original content in order to generate updated content. For example, it is possible to copy original content from one place to a different place in the storage, wherein this copied content, in its destination place, forms part of the updated version. When copying content to a destination place it should be appreciated that this destination place could have been used before for storing other content (possibly also being part of the original version). Hence, the copied content can overwrite the original content at the destination place. Still further, it is possible that there is another update command that uses the destination place's original content in order to generate updated content. If this other update command is called further to operating in accordance with the first copy command, the destination place's original content can be already overwritten. This situation constitutes a “write before read conflict”. Herein below unless otherwise noted the term “conflict” is used for short for “write before read conflict”.
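By way of non-limiting illustration only, a write before read conflict can be reproduced with two copy commands that are intended to swap the two halves of the original content. The `(src, dst, length)` encoding and the function names are assumptions made for this sketch.

```python
from typing import List, Tuple

Copy = Tuple[int, int, int]  # (src, dst, length): hypothetical copy-command encoding


def apply_from_original(original: bytes, copies: List[Copy]) -> bytes:
    """Reference result: every copy reads from an untouched original version."""
    updated = bytearray(original)
    for src, dst, length in copies:
        updated[dst:dst + length] = original[src:src + length]
    return bytes(updated)


def apply_in_place(original: bytes, copies: List[Copy]) -> bytes:
    """In-place result: every copy reads from the storage being overwritten."""
    storage = bytearray(original)
    for src, dst, length in copies:
        storage[dst:dst + length] = storage[src:src + length]
    return bytes(storage)


def find_conflicts(copies: List[Copy]) -> List[Tuple[int, int]]:
    """Return pairs (i, j), i < j, where command i overwrites what command j reads."""
    conflicts = []
    for i, (_, dst_i, len_i) in enumerate(copies):
        for j in range(i + 1, len(copies)):
            src_j, _, len_j = copies[j]
            if dst_i < src_j + len_j and src_j < dst_i + len_i:
                conflicts.append((i, j))
    return conflicts


original = b"AAAABBBB"
copies = [(4, 0, 4), (0, 4, 4)]   # intended effect: swap the two halves

intended = apply_from_original(original, copies)
in_place = apply_in_place(original, copies)
conflicts = find_conflicts(copies)
```

The first command overwrites bytes 0 to 3 before the second command reads them, so the in-place result differs from the intended updated content, and `find_conflicts` reports the pair of commands involved.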
Write before read conflicts are a known problem in the art and U.S. Pat. No. 6,018,747 tries to cope therewith. U.S. Pat. No. 6,018,747 (“Method for generating and reconstructing in-place delta files”, published 2000), which is incorporated herein by reference in its entirety, discloses a method, apparatus, and article of manufacture for generating, transmitting, replicating, and rebuilding in-place reconstruct software updates to a file from a source computer to a target computer. U.S. Pat. No. 6,018,747 stores the first version of the file and the updates to the first version of the file in the memory of the source computer. The first version is also stored in the memory of the target computer. The updates are then transmitted from the memory of the source computer to the memory of the target computer. These updates are used at the target computer to build the second version of the file in-place.
According to U.S. Pat. No. 6,018,747, when a delta file attempts to read from a memory offset that has already been overwritten, this will result in an incorrect reconstruction since the prior version data has been overwritten. This is termed a write before read conflict. U.S. Pat. No. 6,018,747 teaches how to post-process a delta file so as to minimize the number of write before read conflicts, and then to replace copy commands with add commands in order to eliminate the remaining conflicts, thus converting the delta file to an equivalent but larger delta file. A digraph is generated for representing the write before read conflicts between copy commands. A schedule is generated that eliminates write before read conflicts by converting this digraph into an acyclic digraph.
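By way of non-limiting illustration only, the general idea of scheduling copy commands over a conflict digraph, and of breaking cycles by converting copy commands into add commands, can be sketched as follows. This is a simplified greedy sketch and not the algorithm of U.S. Pat. No. 6,018,747. The `(src, dst, length)` encoding, the function names, and the rule of converting the shortest blocked copy are all assumptions, and the sketch further assumes that destination regions of different commands do not overlap.

```python
from typing import Dict, List, Set, Tuple

Copy = Tuple[int, int, int]  # (src, dst, length): hypothetical copy-command encoding


def overlaps(a_start: int, a_len: int, b_start: int, b_len: int) -> bool:
    return a_start < b_start + b_len and b_start < a_start + a_len


def schedule_in_place(copies: List[Copy]) -> Tuple[List[int], Set[int]]:
    """Return (order, converted): an execution order of command indices and
    the indices of copy commands that must be shipped as add commands."""
    n = len(copies)
    # Edge i -> j: command i reads a region that command j overwrites,
    # so command i must be executed before command j.
    successors: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i, (src_i, _, len_i) in enumerate(copies):
        for j, (_, dst_j, len_j) in enumerate(copies):
            if i != j and overlaps(src_i, len_i, dst_j, len_j):
                successors[i].add(j)

    order: List[int] = []
    converted: Set[int] = set()
    remaining = set(range(n))
    while remaining:
        blocked = {j for i in remaining for j in successors[i]}
        ready = sorted(remaining - blocked)
        if ready:
            order.append(ready[0])
            remaining.remove(ready[0])
        else:
            # No command can run safely, so a cycle exists. Convert the shortest
            # copy that still blocks another command into an add command: it no
            # longer reads from storage, so its outgoing edges disappear.
            candidates = [i for i in remaining if successors[i] & remaining]
            victim = min(candidates, key=lambda i: (copies[i][2], i))
            converted.add(victim)
            successors[victim].clear()
    return order, converted


def apply_schedule(original: bytes, copies: List[Copy],
                   order: List[int], converted: Set[int]) -> bytes:
    # Literal data for converted commands is taken from the original version
    # when the delta is generated, which is why the delta grows.
    literals = {i: original[copies[i][0]:copies[i][0] + copies[i][2]] for i in converted}
    storage = bytearray(original)
    for i in order:
        src, dst, length = copies[i]
        data = literals[i] if i in converted else storage[src:src + length]
        storage[dst:dst + length] = data
    return bytes(storage)


# A cycle (swapping two halves) needs one copy converted to an add.
swap = [(4, 0, 4), (0, 4, 4)]
swap_order, swap_converted = schedule_in_place(swap)
swapped = apply_schedule(b"AAAABBBB", swap, swap_order, swap_converted)

# A chain (shifting content right) only needs reordering.
shift = [(0, 4, 4), (4, 8, 4)]
shift_order, shift_converted = schedule_in_place(shift)
shifted = apply_schedule(b"AAAABBBBCCCC", shift, shift_order, shift_converted)
```

In the swap example one copy command is converted and four literal bytes are added to the delta; in the shift example reordering alone suffices and the delta does not grow.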
Another known problem in the art is reliability of the update process, or fail safe update. This problem occurs, for example, when a process of updating an original version is interrupted before its normal termination, such as in a power failure. In such a case, there is a possibility that the content of the block which was being updated during the interruption may become corrupted and contain unexpected content.
It was already mentioned before that when in-place updating blocks of content, an original content of a block sometimes forms part of the input used by the update process. In such a case, if the original block (which is corrupted due to interruption) is required, the update process may be unable to resume. It can be impossible to re-update the corrupted block.
U.S. Pat. No. 6,832,373 (“System and method for updating and distributing information”, published 2004), which is incorporated herein by reference in its entirety, for example, tries to provide a fail safe update. It discloses devices, systems and methods for updating digital information sequences that are comprised by software, devices, and data. In addition, these digital information sequences may be stored and used in various forms, including, but not limited to files, memory locations, and/or embedded storage locations. Furthermore, the devices, systems, and methods described in U.S. Pat. No. 6,832,373 provide a developer skilled in the art with an ability to generate update information as needed and, additionally, allow users to proceed through a simplified update path, which is not error-prone, and according to U.S. Pat. No. 6,832,373's inventors, may be performed more quickly than through the use of technologies existing when U.S. Pat. No. 6,832,373 was filed.
That is, U.S. Pat. No. 6,832,373 describes using a backup block, wherein all block update operations are performed in two phases (a ‘two-phase protocol’ or ‘two-phase commit’). According to U.S. Pat. No. 6,832,373, in a first phase of updating a block, the update process writes the updated content to the backup block and verifies that the content is correctly stored. In a second phase, the update process writes the updated content into its target block to form the updated content of the updated target block (thereby overwriting the original content of the target block). Yet, variations of the same method exist, such as copying the original content of the target block into the backup block in the first phase, and in the second phase in-place updating the target block to store the updated content.
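By way of non-limiting illustration only, a two-phase block update using a single backup block, together with recovery after a simulated interruption, can be sketched as follows. This is not the implementation of U.S. Pat. No. 6,832,373. The `Device` class, the checksum-based verification, the `fail_in_phase_two` flag and the `recover` function are assumptions made for this sketch.

```python
import zlib
from typing import List, Optional, Tuple


class Device:
    """Toy device: a list of blocks plus one backup block (hypothetical layout)."""

    def __init__(self, blocks: List[bytes]) -> None:
        self.blocks = list(blocks)
        # (target index, content, checksum) while an update is in flight, else None
        self.backup: Optional[Tuple[int, bytes, int]] = None


def two_phase_write(device: Device, index: int, updated: bytes,
                    fail_in_phase_two: bool = False) -> None:
    # Phase 1: store the updated content in the backup block and verify it.
    device.backup = (index, updated, zlib.crc32(updated))
    _, stored, checksum = device.backup
    if zlib.crc32(stored) != checksum:
        raise IOError("backup block verification failed")

    # Phase 2: overwrite the target block with the updated content.
    if fail_in_phase_two:
        # Simulate a power failure that leaves the target block corrupted.
        device.blocks[index] = b"\xff" * len(updated)
        raise RuntimeError("simulated power failure")
    device.blocks[index] = updated
    device.backup = None  # the update of this block is complete


def recover(device: Device) -> None:
    """After a restart, finish an interrupted update from the backup block."""
    if device.backup is None:
        return
    index, content, checksum = device.backup
    if zlib.crc32(content) == checksum:
        device.blocks[index] = content
    device.backup = None


device = Device([b"old0", b"old1"])
try:
    two_phase_write(device, 1, b"new1", fail_in_phase_two=True)
except RuntimeError:
    pass
corrupted = device.blocks[1]   # the target block is unusable at this point
recover(device)
recovered = list(device.blocks)
```

Each updated block costs two write operations in this sketch, one to the backup block and one to the target block.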
The two phase commit (whether the backed up content is the original content or the updated content) can use only one additional backup block, yet, it is time consuming, since every write operation requires performing two operations (for the two phases). In addition, according to U.S. Pat. No. 6,832,373 every backup operation backs up the complete (original or updated) content of a block in the backup block, and hence if the number of blocks whose content is updated by the update process is n, the total number of operations required for the update process (including update operations and write operations into the backup block) cannot be smaller than 2n. If there are blocks in which content is stored in more than one write operation, the number of operations that the update process is required to perform will be even larger than 2n.
WIPO Publication No. WO 2007/023497 (“Method and system for in-place updating content stored in a storage device”, published 2007), which is incorporated herein by reference in its entirety, discloses a system and method for reliable in-place update, performing m block storage operations, including write operations and backup operations, wherein 2<=m<2n. WIPO Publication No. WO 2007/023497 protects, before updating, all the original content requiring protection, using a protection buffer (also known as a backup buffer) and the delta file. Thus, WIPO Publication No. WO 2007/023497 resolves write before read conflicts while also maintaining a reliable update.
Another known problem is the difficulty of reversing an in-place update in order to restore original content which has already been overwritten by updated content. U.S. Patent Publication No. 2006/0004756 tries to cope with this problem. U.S. Patent Publication No. 2006/0004756 (“Method and system for in-place updating content stored in a storage device”, published 2006), which is incorporated herein by reference in its entirety, describes a method and system for updating a stored version of content stored in a storage using an update package. The update package that includes update commands is adapted for updating an original version of content to an updated version. The updating is carried out in accordance with an update sequence. The method includes determining direction of the updating. If the direction is indicative of forward then the method forward-updates the stored version to the updated version in accordance with the update sequence. If the direction is indicative of roll-back, the method generates a roll-back update sequence opposite to the update sequence and rolls-back the stored version to the original version in accordance with the roll-back update sequence.
Typically in the prior art, not all operations unrelated to the update process which utilize the original or updated content are allowable during the update process. Therefore, there is a loss of efficiency because the duration of the update process constitutes downtime for operations unrelated to the update process.
There is a need in the art, thus, for efficient mechanisms for updating original content of an original version, generating an updated version.