The present invention relates generally to backup of data from computer non-volatile storage and more particularly to backup of data by a program without access to the filing system associated with the computer non-volatile storage.
Personal computer systems are well known in the art. Personal computer systems in general, and IBM Personal Computers in particular, have attained widespread use for providing computer power to many segments of today's modern society. These systems are designed primarily to give independent computer power to a single user and are inexpensively priced for purchase by individuals or small businesses. Personal computers can typically be defined as desktop, floor standing, or portable computers that consist of a system unit having a single central processing unit (CPU) and associated volatile and non-volatile memory, including RAM and BIOS ROM.
Conventionally, data backup and restore programs fall into two categories: those that are file-system dependent and those that are file-system independent. In a file-system dependent backup, the backup program, with access to the file system, copies one file at a time to a backup medium. The program collects together the pieces of each file, which may not be stored contiguously on the disk, and stores the entirety of each file in one contiguous portion of the backup medium. In a file-system independent backup, the backup program copies the complete contents of the physical medium regardless of whether or how it is used by the file system. The complete contents of the medium to be backed up are read sequentially.
A problem with the file-system dependent method is that the backup program must have access to a driver program, in order to interpret the file system structure, for each and every file system from which it is expected to back up files. It has to know all the file systems from which it will have to back up files, including any security or other access restrictions, and in some cases, such as Microsoft's NTFS, must have access to proprietary and unpublished specifications. For this reason, file-system dependent backup processes are generally only written to run under a ‘production’ operating system that has the correct file access built in. If the medium being backed up is the boot drive of a system, it cannot be restored in the event of failure unless an alternative boot source for the same operating system is available. The file-system independent method is therefore more generally used for boot drive backups.
A problem with the file-system independent method is that it has no way of knowing which sectors of the physical medium are unused by the file system, so it must back up and restore the whole of the medium. This is slow and wasteful of backup space if the original medium is only partly used, as is the normal case.
It is possible to determine cluster usage in order to identify portions of the original medium which are not used, but this requires operating system file knowledge.
The Drive Image product from Powerquest Inc. does perform a backup of the major operating systems. However, this product relies on knowledge of the file system and requires modification to support any new file system.
It would be advantageous if data could be backed up without access to the filing system, but in such a way that areas of the original medium not used by the filing system were not backed up.
It is well known that a complete operating system and required applications can be quickly installed on a personal computer by copying the hard drive partition sector-by-sector from a known good installed system to a new system. This process, known as ‘cloning’, is independent of the specific operating system or file system in use (the ‘target’ operating system), so can be carried out by a program running under a different operating system (a ‘service’ operating system). For example, Microsoft Windows NT or Novell Netware can be copied by a program running under DOS booted from a network or diskette, even though DOS cannot access the individual files.
A problem arises when it is necessary to personalise some files before the target operating system boots. Typically, some parameters such as the system name or network address must be changed to make the system unique on the network. Since DOS cannot access the file system, it cannot modify the specific files required to personalise the system.
Accordingly, the present invention provides a method for backing up data stored using a filing system on a computer non-volatile storage device, the method comprising the steps of: writing, using the filing system, pre-defined first signature data to substantially all of the unused portion of the computer non-volatile storage device; backing up, independent of the filing system, the data stored on the computer non-volatile storage device, data consisting of the pre-defined first signature not being backed up.
The present invention solves the problems of the prior art by splitting the backup process into two phases, a file-system dependent data preparation phase and a file-system independent binary image backup and restore process. The advantage of this is that the space taken by the backup is much less than if every sector is backed up, and the time taken to restore is much less. The time taken to perform both phases of backup is not much different from a complete backup of the medium.
Space saving can be achieved in the prior art by compression only, but without the preparation phase the unused areas of the medium can be expected to contain old data that is not particularly compressible, so the space saving will be less.
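By way of illustration only, the file-system independent phase may be sketched in Python as follows. The sketch treats the storage device as a byte string whose length is a whole number of sectors. The sector size, the signature value and the function names backup_image and restore_image are assumptions made for the example and are not taken from the specification. On restore, the sketch fills the skipped sectors with the signature again; that choice is also an assumption.

```python
SECTOR_SIZE = 512                      # assumed sector size
SIGNATURE = b"\xA5" * SECTOR_SIZE      # hypothetical first signature, one sector long


def backup_image(image: bytes, signature_sector: bytes = SIGNATURE):
    """Read the image sector by sector and keep only non-signature sectors.

    Returns a list of (sector_index, sector_data) pairs. No file-system
    knowledge is used: a sector is skipped purely because its contents
    equal the pre-defined signature.
    """
    size = len(signature_sector)
    kept = []
    for offset in range(0, len(image), size):
        sector = image[offset:offset + size]
        if sector != signature_sector:
            kept.append((offset // size, sector))
    return kept


def restore_image(kept, total_sectors: int, signature_sector: bytes = SIGNATURE) -> bytes:
    """Rebuild an image from the kept sectors.

    Sectors that were not backed up are filled with the signature here
    (an assumption of this sketch).
    """
    size = len(signature_sector)
    image = bytearray(signature_sector * total_sectors)
    for index, sector in kept:
        image[index * size:(index + 1) * size] = sector
    return bytes(image)
```

In this sketch the size of the backup grows with the number of sectors actually holding data rather than with the capacity of the medium.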
In a preferred embodiment, the writing step comprises the steps of: creating a file on the computer non-volatile storage device; writing pre-defined first signature data until the unused portion of the computer non-volatile storage device is full; closing the file; and deleting the file.
Deletion of the file does not delete the data; it merely marks the space used by the data as available for reuse. Unless files have been updated or new files added to the computer system non-volatile storage, the data contained within the area previously occupied by the deleted file will still be present.
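The create, fill, close and delete sequence of the preferred embodiment may be sketched as follows, again for illustration only. The function name fill_unused_space, the temporary file name and the max_bytes cap are hypothetical; the cap is included only so that the sketch can be exercised without actually filling a disk. With max_bytes left as None the loop runs until the filing system reports that no space remains.

```python
import errno
import os


def fill_unused_space(directory: str, block: bytes, max_bytes=None) -> int:
    """Write `block` repeatedly to a temporary file, then delete the file.

    Returns the number of bytes written. Writing stops when the filing
    system reports "no space left" or, if max_bytes is given, before the
    total would exceed that cap.
    """
    path = os.path.join(directory, "signature_fill.tmp")  # hypothetical name
    written = 0
    try:
        # Unbuffered binary file so each write goes straight to the OS.
        with open(path, "wb", buffering=0) as handle:
            try:
                while max_bytes is None or written + len(block) <= max_bytes:
                    written += handle.write(block)
            except OSError as exc:
                if exc.errno != errno.ENOSPC:
                    raise  # only a full device is an expected stopping point
            # Ask the OS to flush the signature data to the device
            # before the file is closed and deleted.
            os.fsync(handle.fileno())
    finally:
        # Deleting the file releases the space but leaves the data in place.
        if os.path.exists(path):
            os.remove(path)
    return written
```

A capped call such as fill_unused_space("/mnt/target", b"\xA5" * 512, max_bytes=4096) writes eight signature blocks and removes the file again; the path shown is a placeholder.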
Preferably, the step of backing up includes a step of compression of the data to be backed up.
In a variation of the preferred embodiment, data consisting of the pre-defined first signature is backed up; and the pre-defined first signature and the compression algorithm are chosen such that the pre-defined first signature compresses to a high degree.
The use of the pre-defined signature means that unused areas of the non-volatile storage medium can be compressed to a higher degree than is possible with the prior art.
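The effect of the signature on compression can be illustrated with a short sketch. The specification does not name a compression algorithm; zlib is used here purely as a stand-in. The all-zero signature, the area size and the use of random bytes to represent old, poorly compressible data are likewise assumptions of the example.

```python
import os
import zlib

AREA_SIZE = 512 * 1024  # assumed size of an unused area, in bytes


def compressed_size(data: bytes) -> int:
    """Return the length of `data` after zlib compression."""
    return len(zlib.compress(data))


# Unused area after the preparation phase: filled with a repeating signature.
signature_area = b"\x00" * AREA_SIZE

# Unused area without preparation: random bytes stand in for stale data.
stale_area = os.urandom(AREA_SIZE)

if __name__ == "__main__":
    print("signature-filled area:", compressed_size(signature_area), "bytes")
    print("stale-data area:      ", compressed_size(stale_area), "bytes")
```

Under these assumptions the signature-filled area compresses to a very small fraction of its original size, whereas the stand-in for stale data hardly compresses at all.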
In another embodiment, the writing of the pre-defined first signature data is done prior to installation of the operating system.
This has the advantage that areas of the non-volatile storage medium claimed by the operating system for use as, for example, a swap file, but not actually used by the operating system, are not backed up.
In a further embodiment, the method further comprises the steps, prior to said backing up step, of: identifying, using the filing system, files on a first computer which need to be personalised for a particular computer; backing up, using the filing system, the files which need to be personalised for a particular computer; personalising the backed up copy of the files which need to be personalised for a particular computer such that the personalisation is for another computer; writing, using the filing system, of second pre-defined signature data to the files on the first system which need to be personalised; and further comprising the steps, after said backing up step, of: scanning each portion of the backed up data for the presence of the second pre-defined signature; responsive to the second pre-defined signature not being found in a portion, restoring that portion of the backed up data to a second computer; and responsive to the second pre-defined signature being found in a portion, restoring the previously personalised files to that portion of the second computer.
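The scanning and restoring steps of this further embodiment may be sketched as follows, for illustration only. The function name restore_personalised is hypothetical. The sketch assumes that the backed up data is held as a list of portions, that the personalised file contents are supplied as portion-sized chunks in the order in which the marked portions occur, and that a portion containing the second signature is replaced as a whole. The specification does not prescribe any of these details.

```python
def restore_personalised(portions, second_signature: bytes, personalised_chunks):
    """Return the list of portions to be written to the second computer.

    Portions in which the second signature is found are replaced by the
    next personalised chunk; all other portions are restored unchanged.
    """
    chunks = iter(personalised_chunks)
    restored = []
    for portion in portions:
        if second_signature in portion:
            # Marked portion: substitute the previously personalised data.
            restored.append(next(chunks))
        else:
            # Ordinary portion: restore exactly as backed up.
            restored.append(portion)
    return restored


if __name__ == "__main__":
    backed_up = [b"boot", b"SIG2SIG2", b"data"]
    print(restore_personalised(backed_up, b"SIG2", [b"HOSTNAME"]))
```

Because the marked portions are located by scanning for the signature alone, the restoring program does not need access to the filing system of the target operating system.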
The invention also provides a data processing system comprising: a non-volatile storage device; a filing system associated with the non-volatile storage device; means for writing, using the filing system, pre-defined first signature data to substantially all of the unused portion of the computer non-volatile storage device; means for backing up, independent of the filing system, the data stored on the computer non-volatile storage device, data consisting of the pre-defined first signature not being backed up.
The invention further provides a computer program product for use in a data processing system having a non-volatile storage medium, the computer program product comprising: a computer usable medium having computer readable program code means embodied in said medium for backing up data stored using a filing system, said computer program product having: computer readable program code means for writing, using the filing system, pre-defined first signature data to substantially all of the unused portion of the computer non-volatile storage device; and computer readable program code means for backing up, independent of the filing system, the data stored on the computer non-volatile storage device, data consisting of the pre-defined first signature not being backed up.