The present invention relates to the field of storage techniques generally, and in particular, to the field of storage of very large amounts of data.
In many applications there is a need to store extremely large amounts of data containing, for example, gigabits (one billion bits) or terabits (one trillion bits). This need could arise when maintaining archives or storing image data. Technological advancements in magnetic memories and optical disks now provide media and drive systems upon which such large amounts of data can be stored. Techniques for storing such large amounts of data easily, however, have generally not kept pace with technological advancements in the storage devices.
A recent article addressing the problems of mass storage systems discusses several different techniques for storing large amounts of data. In that article, entitled "Toward a Reference Model of Mass Storage Systems," by Steven W. Miller and M. William Collins, Computer, vol. 18, No. 7, pp. 9-22 (July 1985), the authors describe several prior art approaches to mass storage and discuss the reasons why those approaches have led to a declining vendor interest in mass storage systems.
The authors define mass storage systems (also called "MSS") as secure places to back up or retain files which are capable of serving multiple dissimilar host processors. To counter the declining interest in such systems, the authors propose a reference model MSS. Perhaps the major design goal of the authors was a system which could store named bitstreams free of constraints imposed by the size or structure of the file management systems of the computer generating the bitstreams. To meet that and other design goals, the authors determined that the optimum method of storage for mass storage systems would be a single bit file or bitstream with no internal file structure. The reasons for selecting such a bitstream are explained in detail in the article as are the methods of implementing the model.
The mass storage industry has generally followed the recommendations in the article, and the recent trend is toward the use of bitstreams with no internal structure, rather than file management systems. The use of unformatted bitstreams, however, has laid bare their limitations. Although unformatted bitstreams do provide a degree of computer system autonomy, they also contain several inherent disadvantages. One major disadvantage is data loss from errors. In the reference model, any uncorrectable error in the bitstream renders the entire bitstream unusable. Loss of data from errors is an even bigger problem for long term storage, for example, on optical disks. Optical disks can provide data storage for several years, and the chances of data loss are higher than with magnetic disks.
Furthermore, in order to use a portion of the bitstream, a host computer system must read the entire bitstream into its memory. For large bitstreams, host computer systems must devote large portions of their memory to this task, even to access only a small portion of the bit file.
The disadvantages of this proposed reference model increase as the size of the bitstreams increases. Such a size increase is inevitable in most applications.
It is therefore an object of the present invention to provide a method for formatting large quantities of data which allows the data to be transported easily between different host computers, but which reduces the amount of data lost due to error.
It is a further object of the invention to provide higher data integrity for storing large quantities of data.
It is yet another object of the present invention to provide a method for storing large quantities of data in a bitstream which allows recovery of valid portions of the bitstream when only part of the bitstream has been corrupted.
Additional objects and advantages of the present invention will be set forth in part in the description which follows and in part will be obvious from that description or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by the methods and apparatus particularly pointed out in the appended claims.