Conventional computer file systems are used to create, edit, and store files. Typically, such conventional computer systems include a virtual storage or memory and a storage device, such as a disk. In order to access the files created, such conventional computer systems associate metadata with each file. The metadata for the file indicates the physical location of the file on the disk. For example, the metadata typically includes a map indicating the blocks on the disk in which the file is stored. The metadata may also indicate other attributes of the file and may be used for other purposes. During operation, the conventional system typically keeps a copy of a portion of the file and its metadata in the virtual storage in order to have fast access to the file when performing operations on the file.
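By way of illustration only, the relationship between a file and its metadata may be sketched as follows. The record layout, the names `FileMetadata` and `read_file`, and the toy `disk` dictionary are assumptions made for this sketch and are not taken from any particular conventional system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata record."""
    name: str
    # Map of the disk blocks in which the file is stored, in file order.
    blocks: List[int] = field(default_factory=list)
    # Example of another attribute the metadata may carry.
    size: int = 0


def read_file(meta: FileMetadata, disk: Dict[int, bytes]) -> bytes:
    """Reassemble the file by following the block map in its metadata."""
    return b"".join(disk[b] for b in meta.blocks)[:meta.size]


# Toy disk: block number -> block contents (blocks need not be contiguous).
disk = {7: b"hell", 3: b"o wo", 9: b"rld!"}
meta = FileMetadata("greeting.txt", blocks=[7, 3, 9], size=11)
contents = read_file(meta, disk)
```

In this sketch the file can be located only through the block map, which is why stale metadata on the disk is significant.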
When operations are performed on the files, the metadata associated with the files should also be updated on the storage device. For example, if additional information is written to the file or some information is deleted from the file, the metadata on the disk should be updated to indicate that the file is stored on different blocks on the disk. However, updating the metadata stored on the disk may be very slow. This is because accessing the disk may be much more time-consuming than accessing the virtual storage. Consequently, the metadata typically is changed only in virtual storage when the operation is performed. The changes to metadata are saved to disk, or hardened, at predetermined intervals. As a result, the metadata on the disk may be periodically updated without substantially slowing the performance of the conventional system.
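The deferred hardening described above may be sketched as follows. The class `MetadataCache`, its fields, and the choice to count one disk access per hardening pass are assumptions made for illustration; an actual system may batch its writes differently.

```python
class MetadataCache:
    """Hypothetical cache: changes land in virtual storage, then are hardened."""

    def __init__(self):
        self.on_disk = {}      # metadata as last hardened to the storage device
        self.in_memory = {}    # current metadata held in virtual storage
        self.dirty = set()     # files whose metadata changed since the last harden
        self.disk_accesses = 0

    def update(self, name, blocks):
        """Record a metadata change in virtual storage only (fast path)."""
        self.in_memory[name] = list(blocks)
        self.dirty.add(name)

    def harden(self):
        """Invoked at each predetermined interval to save pending changes."""
        if not self.dirty:
            return
        for name in self.dirty:
            self.on_disk[name] = list(self.in_memory[name])
        self.dirty.clear()
        self.disk_accesses += 1  # assumption: one batched access per interval


cache = MetadataCache()
cache.update("a.txt", [1, 2])
cache.update("a.txt", [1, 2, 5])   # file grew; still no disk access
cache.update("b.txt", [8])
pending = dict(cache.on_disk)      # nothing has been hardened yet
cache.harden()
```

Between intervals the metadata on the disk lags behind the metadata in virtual storage, which is the window in which a crash loses the pending changes.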
When a file is removed, or deleted, several operations are typically performed. As with other operations, the metadata associated with a file is updated after removal of the file. In addition, the locations on the disk where the file was stored are freed. This allows other data to be stored in the locations. Moreover, a directory which lists the file is updated to delete the file from the directory.
In one conventional computer system, updating metadata for removal of a file is treated the same as for other operations. Changes to the directory are treated in the same manner as changes to the metadata. Thus, the metadata for the removed file is changed in virtual storage and these changes in the metadata are hardened to disk at the next predetermined interval. Similarly, the directory in virtual storage is rewritten without the removed file and these changes are hardened to disk at the next predetermined interval. However, the locations in which the removed file was stored are freed immediately. In certain operating systems, such as UNIX, this is done because the operating systems require that locations for a removed file be immediately available for use by another data file. As a result, additional time is not taken to access the disk when the file is removed and the locations for the removed file can be used to store another file.
Although this conventional system functions in most cases, data integrity exposures may occur. A data integrity exposure occurs when data for a file is inadvertently made accessible through a different file. It is possible to create a new file before the changes to the metadata for the removed file are hardened to disk. Because the locations in which the removed file was stored were freed upon removal, the new file could be written to those locations. The system may crash after the new file is written to these locations but before the changes to the metadata and the directory are hardened to disk. When the system is rebooted, the data for the new file remains written in those locations on the disk. However, because the changes to the metadata for the removed file were not hardened to disk, the metadata for the removed file will still be present. This metadata indicates that data for the removed file can be found in the same locations, which now hold data for the new file. Because the metadata for the removed file was not changed, the data for the new file can be accessed by accessing the removed file. A data integrity exposure has, therefore, occurred. This is considered an illegal situation by certain operating systems, such as UNIX, and can result in undesirable occurrences. For example, users who are authorized to access the removed file but not the new file can view and edit the data for the new file residing in the locations.
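The sequence of events leading to a data integrity exposure may be reproduced in a small simulation. The class `ToyFileSystem` and all of its methods are hypothetical and greatly simplified (for example, the directory is folded into the metadata table); the sketch assumes only the behavior described above, namely that file data is written to disk immediately, metadata changes are hardened later, and freed locations are reusable at once.

```python
class ToyFileSystem:
    """Hypothetical model: data hits disk at once, metadata is hardened later."""

    def __init__(self, num_blocks):
        self.disk_data = [b""] * num_blocks   # block contents on the disk
        self.disk_meta = {}                   # hardened metadata: name -> blocks
        self.mem_meta = {}                    # metadata in virtual storage
        self.free = list(range(num_blocks))   # free block numbers

    def create(self, name, chunks):
        blocks = [self.free.pop(0) for _ in chunks]
        for block, chunk in zip(blocks, chunks):
            self.disk_data[block] = chunk     # file data is written to disk
        self.mem_meta[name] = blocks          # metadata changes only in memory

    def remove(self, name):
        blocks = self.mem_meta.pop(name)      # metadata changed only in memory
        self.free = blocks + self.free        # locations are freed immediately

    def harden(self):
        self.disk_meta = {n: list(b) for n, b in self.mem_meta.items()}

    def crash_and_reboot(self):
        # Virtual storage is lost; only hardened metadata survives.
        self.mem_meta = {n: list(b) for n, b in self.disk_meta.items()}
        used = {b for blocks in self.mem_meta.values() for b in blocks}
        self.free = [b for b in range(len(self.disk_data)) if b not in used]

    def read(self, name):
        return b"".join(self.disk_data[b] for b in self.mem_meta[name])


fs = ToyFileSystem(num_blocks=4)
fs.create("old.txt", [b"public ", b"notes"])
fs.harden()                                   # old.txt metadata reaches the disk
fs.remove("old.txt")                          # removal not yet hardened
fs.create("new.txt", [b"secret ", b"data"])   # reuses the freed locations
fs.crash_and_reboot()                         # crash before the next harden
leaked = fs.read("old.txt")                   # returns the data of new.txt
```

After the simulated reboot, reading the removed file through its stale metadata returns the data written for the new file, which is the exposure described above.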
Another conventional system prevents data integrity exposures by treating a removal of a file differently from other operations. Such a conventional system immediately hardens changes to the metadata for a removed file. Thus, the data integrity exposure does not occur.
However, because the metadata is hardened separately for each removal, performance of the system suffers. For each removal, the disk is accessed to update the metadata for the removed file. This reduction in performance is particularly large when multiple files are removed concurrently. Multiple files may be removed by certain user-initiated commands or when an application clears a cache or temporary files. Consequently, as the number of files being removed grows, the loss in performance grows.
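The growth in disk accesses may be illustrated with a simple count. Both function names and the parameter `removals_per_interval` are hypothetical, and the deferred case assumes, for comparison only, that all changes pending at an interval are hardened in a single disk access.

```python
import math


def accesses_immediate(num_removed: int) -> int:
    """Immediate hardening: one disk access for each removed file."""
    return num_removed


def accesses_deferred(num_removed: int, removals_per_interval: int) -> int:
    """Deferred hardening: one assumed batched access per interval."""
    return math.ceil(num_removed / removals_per_interval)


# Removing 1000 temporary files, with an assumed 250 removals per interval.
immediate = accesses_immediate(1000)
deferred = accesses_deferred(1000, 250)
```

Under these assumptions the immediate approach grows linearly with the number of files removed, while the deferred approach grows only with the number of intervals spanned.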
Accordingly, what is needed is a system and method for removing data files without data integrity exposures and without drastic reductions in performance. The present invention addresses such a need.