1. Technical Field
The present invention relates generally to an improved data processing system and in particular to a method and apparatus for storing data. Still more particularly, the present invention relates to a method, apparatus, and computer instructions for storing data in a file system.
2. Description of Related Art
Data is stored in a data processing system using a file system. The file system is a mechanism used to store and retrieve files on a storage device, such as a disc. The file system defines the directory structure for keeping track of files and the path syntax required to access those files.
Further, a file system also defines the way that files are named as well as the maximum size of a file or volume. Examples of file systems are the journaled file system (JFS) and the NT file system (NTFS). File systems may divide a hard disc into small units called blocks. A block is the smallest unit of storage that may be allocated. Each block in a file system is either in an allocated state or a free state. The block size may differ depending on the particular implementation. Block sizes may be, for example, 512 bytes, 1024 bytes, 2048 bytes, 4096 bytes, or even in some cases 64 K bytes. Most modern file systems support block sizes such as these. The block size is selected at the time the partition for an operating system is formatted.
As machine architectures, such as processor architectures, grow from 32 bits to 64 bits and beyond, the block size that may be supported by a file system increases. For example, with a 32 bit architecture in an Intel processor from Intel Corporation, handling of input/output is efficient using 4 K pages. A 64 bit architecture is efficient with pages of 4 K and greater size. As file systems increase the block sizes that are supported, disc space may be wasted by the storage of files whose sizes are not evenly divisible by the block size.
The present invention recognizes that file systems with many small files result in a large amount of wasted disc space. For example, with a block size of 512 bytes, storing a 1 byte file will result in 511 bytes being wasted. In another example, a file having a size of 513 bytes would result in one block being filled with data, while a second block contains only 1 byte of data. As a result, 511 bytes of space in the second block are wasted. As the block size grows and the number of small files grows, the amount of disc space wasted increases.
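The arithmetic in the examples above may be illustrated with a minimal sketch. The function names `blocks_needed` and `wasted_bytes` are hypothetical and are used only for illustration. The sketch assumes that a file occupies whole blocks only and that an empty file is allocated no blocks. It does not represent any particular file system.

```python
def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of whole blocks allocated to hold file_size bytes (illustrative)."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Ceiling division: a partially filled last block still counts as a full block.
    return -(-file_size // block_size)


def wasted_bytes(file_size: int, block_size: int) -> int:
    """Bytes allocated to the file but not used by its data (illustrative)."""
    return blocks_needed(file_size, block_size) * block_size - file_size


# The two examples from the text, with a block size of 512 bytes.
print(wasted_bytes(1, 512))    # 511
print(wasted_bytes(513, 512))  # 511

# The same 1 byte file with larger block sizes.
print(wasted_bytes(1, 4096))       # 4095
print(wasted_bytes(1, 64 * 1024))  # 65535
```

Under these assumptions, the waste for a single file is always less than one block, so the total waste across a file system grows with both the block size and the number of files whose sizes are not multiples of the block size.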
Therefore, it would be advantageous to have an improved method, apparatus, and computer instructions for managing the storage of data in file systems.