1. Field of the Invention
The present invention relates generally to computer file systems and, more particularly, to computer file systems that locate extents of a file stored on a storage device in a manner that is independent of the implemented operating system.
2. Related Art
An application server is a computer that executes application programs such as order entry systems, banking systems and employee databases. Typically, client computers or workstations, by which users interact with the application programs, are connected to the application server over a local area network (LAN), or a wide area network such as the Internet. In some cases, storage devices such as disks are directly connected to the application servers to store application programs and application data (hereinafter, collectively called “files”). These disks are referred to as local disks. In other cases, disk arrays (also commonly referred to as storage servers) are used to store the files. A disk array is a computer, separate from an application server, which is dedicated to storing files. Application servers are typically connected to disk arrays by a storage area network (SAN). Software executing in the application servers and the disk arrays makes the disks of the disk arrays appear, from the perspective of the application programs, as though they are directly connected to the application servers.
Each application server and each disk array is under the control of an operating system, such as Windows NT, Sun Solaris or HP-UX. Each operating system stores files on disks and other storage devices using a “file system,” such as the HFS file system from Hewlett-Packard Company, the NTFS file system from Microsoft, and the Sun file system from Sun Microsystems, Inc. A file system is a set of routines that allocates space on the storage devices and keeps track of storage-related information such as where each file is stored on the storage device, the name of each file, the folder or directory structure in which each file is organized, and the owner, access rights and other attributes of each file. A file system stores this storage-related information on the storage device. This storage-related information is commonly referred to as “file data structures,” “on-disk structures” or a “file structure” (collectively and generally referred to herein as a file data structure).
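By way of illustration only, the kind of storage-related information described above can be sketched as a small in-memory model. The names `Extent`, `FileRecord` and `ToyFileSystem` are hypothetical and do not correspond to HFS, NTFS or any other actual file system; the sketch assumes each file occupies a single contiguous run of blocks and that space is allocated sequentially.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Extent:
    """A contiguous run of blocks on the storage device (hypothetical layout)."""
    start_block: int
    block_count: int


@dataclass
class FileRecord:
    """Per-file information a file system keeps track of."""
    name: str
    directory: str
    owner: str
    access_rights: str
    extents: List[Extent] = field(default_factory=list)


class ToyFileSystem:
    """Illustrative stand-in for a file data structure; not a real file system."""

    def __init__(self, total_blocks: int) -> None:
        self.total_blocks = total_blocks
        self.next_free = 0  # simple sequential allocator (an assumption)
        self.records: Dict[str, FileRecord] = {}

    def create(self, directory: str, name: str, owner: str,
               access_rights: str, blocks: int) -> FileRecord:
        """Allocate space for a new file and record where it is stored."""
        if self.next_free + blocks > self.total_blocks:
            raise OSError("device full")
        extent = Extent(self.next_free, blocks)
        self.next_free += blocks
        record = FileRecord(name, directory, owner, access_rights, [extent])
        self.records[f"{directory}/{name}"] = record
        return record

    def locate(self, path: str) -> List[Extent]:
        """Return the extents that hold the named file."""
        return self.records[path].extents


fs = ToyFileSystem(total_blocks=100)
fs.create("/data", "orders.db", "alice", "rw-r-----", blocks=10)
fs.create("/data", "staff.db", "bob", "rw-------", blocks=5)
staff_extents = fs.locate("/data/staff.db")
```

In this sketch, locating a file means looking up its record and reading its list of extents. An actual file system persists equivalent information on the storage device in its own format.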
An operating system uses its file system to interpret this file data structure whenever an application program, or the operating system itself, reads from or writes to a file on the storage device. File systems from disparate operating systems are not compatible with each other, because file data structures created by one operating system's file system typically cannot be interpreted by another operating system's file system. Consequently, a file stored in accordance with one operating system typically cannot be read by a different operating system.
To enable recovery from catastrophic loss of data in case of hardware failure, sabotage, fire or other disaster, data centers routinely make backup copies of their files. These copies are typically made on removable media, such as magnetic tape or optical disk, and are then stored off-site.
Data centers typically back up files periodically. Oftentimes, backup operations are performed daily, although in some circumstances, backup operations are performed hourly or even continuously. Backup operations are becoming increasingly problematic due to the increase in computer resources consumed by such operations. Specifically, backup operations generate memory and computational demands on the application servers and disk arrays, reducing the computers' capability to execute application programs and/or quickly access files. Backup operations also consume network (LAN and SAN) resources; that is, they generate network traffic, which decreases the network's capacity to handle application-generated traffic between application servers and disk arrays. Furthermore, it is impractical to back up open files, because application programs that access these files are likely to change data in these files while the backup operation is in progress, rendering the backup copy internally inconsistent.
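The inconsistency that arises when an open file is backed up can be illustrated with a minimal simulation. The function `backup_while_open`, the two-block "file" and the rule that the two balances always total 100 are all hypothetical and chosen only to make the effect visible; the sketch assumes the backup copies one block at a time and that the application writes between two of those copies.

```python
def backup_while_open(blocks, write_between):
    """Copy blocks one at a time; write_between(i, blocks) runs after block i is copied."""
    copy = []
    for i, value in enumerate(blocks):
        copy.append(value)
        write_between(i, blocks)  # the application may modify the live file here
    return copy


# Hypothetical application rule: the two balances always total 100.
accounts = [70, 30]


def transfer_after_first_block(i, blocks):
    """Move 20 from block 0 to block 1 after the backup has copied block 0."""
    if i == 0:
        blocks[0] -= 20
        blocks[1] += 20


snapshot = backup_while_open(accounts, transfer_after_first_block)
```

Here the live file remains consistent after the transfer, but the backup copy holds the old value of the first block and the new value of the second, so the copy matches no state the file was ever in.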
In an attempt to avoid these problems, data centers sometimes schedule backup operations for evenings, weekends, or other times that the application programs are not being utilized by many users. Oftentimes, during the backup operations, the application programs are shut down to prevent the data from being manipulated during the backup operation. However, this commonly used approach to backing up data is flawed, because it leaves the files vulnerable to data loss for long periods of time and during times of rapid change, that is, during times of peak usage. Furthermore, in some cases the time it takes to back up the files is significantly greater than the time during which the application program execution can be halted. In addition, each backup program is typically designed to run under only one operating system and can create backup copies of files stored under only that operating system, so a data center might have to employ several backup programs, one for each operating system, which increases costs to acquire the backup software and train data center personnel.