In the vast world of digital information storage, there is a need to preserve certain digital files. The files either originate in digital format or have been converted to digital format from original content. To achieve digital preservation of the files, means are needed for long term error-free digital storage of the files, for error-free retrieval of the files from the storage, and for interpretation of the retrieved files, for the entire time span over which the storage continues. Files retrieved from the storage must be error-free, because the loss of one byte in a critical place in a digital file can render the entire file effectively corrupt.
Digital storage systems have been developed for storing digital files for extended periods of time. These storage systems have been used, for instance, to store corporate documents and emails in compliance with the five year storage requirements of the Sarbanes-Oxley Act. However, there are no systems of prior art that will store digital files error-free for 100 years and more, with error-free retrieval of user-requested files from the storage, and with interpretation of the retrieved files, for all the time span that the storage continues on.
Digital storage media for utilization in the long term error-free digital storage of this invention are true Write-Once, Read-Many (WORM) media. True WORM media, once written to, cannot be overwritten. True WORM media are invulnerable to corruption by hacker and virus attacks, which can occur when the media are connected to the outside of the storage for the retrieval of user-requested files.
In all digital storage media, even in media new from the factory, there is an error rate, expressed as, e.g., error bytes per 100,000 bytes. Media errors increase with the passage of time, as small defects present at manufacturing grow in size and in number, and as additional defects emerge. Digital storage media use error correcting codes, and the software for these codes, termed “firmware,” is incorporated as part of the drive for the media. A version of the Reed-Solomon (RS) error correction code is programmed into the firmware of the drive. The RS encoder takes a block of data from the digital files being ingested and adds redundant bytes, to create an RS codeword that is stored on the media. On retrieval, the RS decoder processes each block and corrects media-induced errors so as to recover the original files. The firmware also incorporates software recovery retry algorithms that supplement the RS decoder in recovery of the original files on retrieval. Each RS codeword contains 255 codeword bytes, including bytes for error correction. The RS decoder, with added software recovery, can correct, in real time, 32 error bytes in one codeword. The employment of error correction codes allows for practical manufacturing of digital storage media.
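The relationship between redundant bytes and correction capability can be sketched with the standard Reed-Solomon bounds. The text above gives only the codeword length (255 bytes) and the corrected-byte figure (32); the parity-byte counts used below are illustrative assumptions, not the parameters of any particular drive firmware, and the function name is hypothetical.

```python
def rs_capability(codeword_bytes: int, parity_bytes: int) -> dict:
    """Standard bounds for a Reed-Solomon code with the given parity count.

    A code with p parity bytes can correct up to p // 2 bytes whose
    positions are unknown (errors), or up to p bytes whose positions
    are already known to the decoder (erasures).
    """
    if not 0 < parity_bytes < codeword_bytes:
        raise ValueError("parity_bytes must be between 1 and codeword_bytes - 1")
    return {
        "data_bytes": codeword_bytes - parity_bytes,
        "max_errors": parity_bytes // 2,
        "max_erasures": parity_bytes,
    }


CODEWORD_BYTES = 255  # from the text above

# Assumed parity counts, chosen only to show how a 32-byte figure can arise.
for parity in (32, 64):
    cap = rs_capability(CODEWORD_BYTES, parity)
    print(f"parity={parity}: data={cap['data_bytes']}, "
          f"errors<={cap['max_errors']}, erasures<={cap['max_erasures']}")
```

Under these bounds, correcting 32 bytes in a 255-byte codeword takes 64 parity bytes when the positions of the bad bytes are unknown, or 32 parity bytes when their positions are already known. Supplementary recovery logic that locates suspect bytes may be one way a decoder reaches the higher figure, though the text above does not say how the 32-byte figure is achieved.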
The digital storage media industry's standard method for estimating the lifetime of digital storage media is an Arrhenius equation lifetime estimation model that uses data obtained from accelerated lifetime testing of the media (Ref. 1, p. 4). Samples of the media are tested at elevated temperatures that are beyond those experienced under normal usage, in order to accelerate the rate of growth of errors in the media. The result of an individual heating test is the lifetime, i.e., the hours-time-to-failure, of the sample medium under test at a given temperature. The tests are conducted at three different elevated temperatures, and the test results are entered on a plot of hours vs. temperature, where the ordinate is a logarithmic scale of hours, that is, a logarithmic-linear, or “log-lin,” plot. An Arrhenius model plot of an estimated lifetime test of a Panasonic DVD medium is shown in FIG. 1. The hours-time-to-failure of a medium under a heating test is determined as the hours of testing to the point when the error rate of the individual medium under test is observed to increase to a specified multiple of the initial error rate of the medium. For example, as stated in a note alongside the ordinate in FIG. 1, the “Archival Lifetime” point (hours-time-to-failure) occurs when the error rate of the medium under test is observed to increase to twice the initial error rate.
In FIG. 1, three test data points are plotted for the hours-time-to-failure of media lifetime tests done at 90° C., 80° C., and 70° C., and the temperature of normal usage is shown as 30° C. For discussion purposes of FIG. 1, the hours-time-to-failure for each test data point has been scaled from the logarithmic ordinate scale of hours, as follows, using the 90° C. test data point as an example: the test data point is scaled from the ordinate as lying at 0.38 of the interval between 100 hours and 1,000 hours. The logarithm of the hours-time-to-failure of the test is thus formed of the characteristic “2” (between 100 and 999 hours) and the mantissa “0.38,” giving 2.38, the anti-logarithm of which is 240, the number of hours-time-to-failure of the medium under test. For the 80° C. test, the hours-time-to-failure of the medium under test is 870, and for the 70° C. test, the hours-time-to-failure of the medium under test is 1,445. It is to be noted that as the test conditions become less harsh, i.e., as the testing temperature is lowered, the hours-time-to-failure of the medium under test increases.
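The scaling of a reading from the logarithmic ordinate can be sketched as follows. The function names are hypothetical. The only inputs are the values quoted above: the characteristic (the decade in which the point lies) and the mantissa (the fraction of that decade read from the plot).

```python
import math


def hours_from_log_reading(characteristic: int, mantissa: float) -> float:
    """Convert a position on a log10 hours scale to hours.

    characteristic: exponent of the decade's lower bound (2 for 100..1,000 h).
    mantissa: fraction of the way through that decade, read from the plot.
    """
    return 10.0 ** (characteristic + mantissa)


def log_reading_from_hours(hours: float) -> tuple:
    """Inverse operation: return (characteristic, mantissa) for a given hours value."""
    log_value = math.log10(hours)
    characteristic = math.floor(log_value)
    return characteristic, log_value - characteristic


# The 90 deg C example from the text: characteristic 2, mantissa 0.38.
print(round(hours_from_log_reading(2, 0.38)))

# Where the other two quoted values would sit on the same scale.
for hours in (870, 1445):
    characteristic, mantissa = log_reading_from_hours(hours)
    print(hours, characteristic, round(mantissa, 2))
```

Because one decade of the ordinate spans a tenfold range of hours, a small error in reading the mantissa becomes a large error in hours, which is worth keeping in mind when values are scaled from a printed plot.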
It is a premise of media estimated lifetime testing by the Arrhenius model that the failure mechanisms of the media remain the same for all testing temperatures, so that a straight line of constant slope can be drawn through all the test data points. In FIG. 1, the straight line, as drawn by the Arrhenius testing facility for Panasonic, proceeds through the 90° C., 240 hours test data point and the 80° C., 870 hours test data point. In drawing the straight line, the 70° C., 1,445 hours test data point was ignored. The straight line was then extended, beyond the 80° C. test data point, to intersect a vertical line drawn by the Arrhenius testing facility for Panasonic, a line that indicates the 30° C. normal usage temperature point of the abscissa scale.
Extension of a line or curve into the future, beyond the last test data point, done by assuming the variables will continue to behave as they have in the past, is known as extrapolation. In FIG. 1, the far reach of the extrapolated line intersects the 30° C. temperature line at an ordinate scale reading of about 788,000 hours-time-to-failure, corresponding to about 90 years estimated lifetime. Thus the straight line was extrapolated in time, beyond the 80° C. test data point, through about 90 years for which there is no test data. (A notation of the 90 years, 30° C. temperature point has been added here, for purposes of clarity, onto the Panasonic plot.) In FIG. 1, the Arrhenius model estimated lifetime of the Panasonic DVD is claimed to be 90 years, when the DVD is always operated at 30° C.
At media testing temperatures that are closer to the normal usage temperatures of the media, there may be failure mechanisms in the media that are different from those prevailing at higher testing temperatures. These different temperature-dependent failure mechanisms may produce observed hours-time-to-failure results that are shorter than the expected hours-time-to-failure results. This is evidenced in FIG. 1, where the 70° C. test data point is far below, in time, the extrapolated line drawn through the 90° C. and 80° C. test data points. The 70° C. test data point should not have been ignored, as the 70° C. test was the longest of the three tests, and it was the test of the media conducted at the temperature closest to the normal usage temperature of the media. A revised extrapolative line can be approximately fitted, with the aid of a transparent straight-edge, to all three test data points of FIG. 1, and that revised line would intersect the 30° C. line lower down, at about the location where a notation of an asterisk and dash has been added here onto the Panasonic plot. This location corresponds to a reading on the logarithmic ordinate scale of about 14 years estimated lifetime, an estimated lifetime for the DVD that is many decades shorter. Such huge shifts in estimated lifetime are inherent in the use of the Arrhenius model for estimated lifetime testing, because the model combines extreme extrapolations in time from the last test data point with the plotting of hours-time-to-failure test data points against a logarithmic coordinate scale, which is a highly compressive scale of time.
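The sensitivity of the extrapolated lifetime to the choice of fitted points can be illustrated with a minimal sketch. It rests on several assumptions that are not in the text above: that the fit is made in the usual Arrhenius coordinates (natural logarithm of hours against reciprocal absolute temperature), that an ordinary least-squares line stands in for the line fitted by straight-edge, and that a year is 8,766 hours. The inputs are the three readings quoted above; because those readings were scaled by eye, the outputs need not match the values read from FIG. 1.

```python
import math

HOURS_PER_YEAR = 8766.0  # assumption: 365.25 days per year


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimated_lifetime_years(points, use_temp_c=30.0):
    """Extrapolate (temp_C, hours_to_failure) points to the usage temperature.

    Fits ln(hours) against 1 / T(kelvin), then evaluates the line at use_temp_c.
    """
    xs = [1.0 / (temp_c + 273.15) for temp_c, _ in points]
    ys = [math.log(hours) for _, hours in points]
    slope, intercept = fit_line(xs, ys)
    ln_hours = intercept + slope / (use_temp_c + 273.15)
    return math.exp(ln_hours) / HOURS_PER_YEAR


# Readings quoted in the text: (test temperature in deg C, hours-time-to-failure).
readings = [(90.0, 240.0), (80.0, 870.0), (70.0, 1445.0)]

two_point_years = estimated_lifetime_years(readings[:2])   # 70 deg C point ignored
three_point_years = estimated_lifetime_years(readings)     # all three points used

print(f"two-point estimate:   {two_point_years:.1f} years")
print(f"three-point estimate: {three_point_years:.1f} years")
```

Under these assumptions, including the 70° C. point shortens the estimate by more than an order of magnitude, and the three-point result falls in the low tens of years. The exact figures matter less than the size of the shift caused by one additional data point.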
In the Arrhenius model for lifetime estimation of media, the hours-time-to-failure tests of the media are conducted within a period of from 3 to 12 months. Deriving a lifetime estimation for the media under test by extrapolating a straight line, drawn through a set of test data points gathered in a relatively short time period, across time spans for which there is no test data, can be inaccurate. It would be uneconomical in the utilization of storage media and media equipment, and unsafe for the retrieval of error-free files from the media, to base long term error-free digital storage solely on media lifetime claims that are derived from Arrhenius model extrapolative testing.
Means for the interpretation of user-requested retrieved files are required when, in the future, the programs and operating systems that originally were needed for interpretation of the retrieved files have become obsolescent. There are efforts for interpretation of files retrieved from long term digital storage that use virtual computers and emulation. One such effort utilizes a Universal Virtual Computer (Ref. 2). Another effort that resulted in a working method for interpretation is the technique of Migration-on-Request (MoR). MoR was developed at Leeds University, UK (References 3, 4). The Migration Tool (MT) of the MoR decodes and transforms retrieved files to usable formats, allowing the user to reuse the digital objects, to repurpose them, and to use them elsewhere. Repurposing is exploiting the retrieved files in new ways, e.g., editing, correcting, extracting from them.
MoR interpretation requires that the files of the long term error-free storage remain in the format as originally ingested (Ref. 3, Section 1) (Ref. 4, p. 1, p. 4). Migration of files means copying the files while changing the format of the files. In contrast, forward copy of files, wherein the term “forward copy” is a new term created for the purposes of this patent, means that files are copied, unchanged, from one medium to another medium. It is important to note that migration of files and forward copy of files are entirely different operations. Each sequential migration of files introduces losses in the file, and these losses are cumulatively propagated forward in time (Ref. 4, pp. 2-3). In the MoR approach, files are singly migrated, upon user request for the files, always from the original ingestion format to the future, then-current, usable format, by the then-current MT. Thus the MoR approach prevents migration losses from being propagated forward, because with MoR there is never more than a single migration.
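The distinction can be made concrete with a minimal sketch of a forward copy, in which the bytes are transferred unchanged. The check that the copy is bit-identical, done here with a SHA-256 digest, is an assumption added for illustration and is not a step described in the text; the function names are hypothetical, and in-memory streams stand in for the source and destination media.

```python
import hashlib
import io

CHUNK_SIZE = 64 * 1024  # bytes read per step; an arbitrary illustrative value


def sha256_of(stream, chunk_size=CHUNK_SIZE) -> str:
    """Return the SHA-256 hex digest of everything remaining in a binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def forward_copy(src, dst, chunk_size=CHUNK_SIZE) -> str:
    """Copy bytes unchanged from src to dst; return the digest of the bytes written.

    No format conversion takes place, which is what separates a forward
    copy from a migration.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: src.read(chunk_size), b""):
        dst.write(chunk)
        digest.update(chunk)
    return digest.hexdigest()


# In-memory stand-ins for the old medium and the new medium (hypothetical data).
original = io.BytesIO(b"file as originally ingested\n" * 10_000)
copy = io.BytesIO()

written_digest = forward_copy(original, copy)

# Re-read the destination and confirm it matches what was written.
copy.seek(0)
verified = sha256_of(copy) == written_digest
print("forward copy verified:", verified)
```

A migration, by contrast, would pass the bytes through a format converter between the read and the write, so the destination digest would differ from the source digest by design.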