A block-oriented storage device, such as a disk drive, may store a file in fixed-size fragments known as blocks. Each block may have a physical block address that never changes, a logical block address that remains constant while the block is allocated but may change between allocations of the block, and a virtual block address that identifies the block across multiple storage devices.
A storage device has a finite storage capacity defined by a quantity of physical blocks and identified by a range of physical block addresses. A storage device also has a range of logical block addresses, which may exceed the physical capacity of the storage device. As such, a logical block address may or may not identify a physical block.
A materialized block is a block that contains actual data. An ordinary file is composed solely of materialized blocks. A non-materialized block is a block that has a logical block address but no corresponding physical storage. As such, non-materialized blocks may be allocated and used without consuming physical storage.
A sparse file may be composed of materialized blocks and/or non-materialized blocks. As such, the virtual size of a sparse file may exceed the physical capacity of the storage device on which it resides. Creation of a sparse file having only non-materialized blocks may be fast because no physical storage activity is needed, except perhaps for storage of some minimal metadata.
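The relationship between logical addresses, physical storage, and sparse files can be illustrated with a minimal sketch. The `SparseFile` class, its field names, and the block counts below are hypothetical and chosen only for illustration; the 64-kilobyte block size is an assumed example value.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

BLOCK_SIZE = 64 * 1024  # assumed example block size, in bytes


@dataclass
class SparseFile:
    """Hypothetical model of a sparse file as a logical-to-physical block map."""
    logical_blocks: int  # number of allocated logical block addresses
    block_map: Dict[int, int] = field(default_factory=dict)  # logical -> physical

    def _check(self, logical: int) -> None:
        if not 0 <= logical < self.logical_blocks:
            raise IndexError("logical block address out of range")

    def physical_address(self, logical: int) -> Optional[int]:
        # None means the block is allocated but non-materialized.
        self._check(logical)
        return self.block_map.get(logical)

    def materialize(self, logical: int, physical: int) -> None:
        # Back a logical block with a physical block.
        self._check(logical)
        self.block_map[logical] = physical

    @property
    def virtual_size(self) -> int:
        # Size implied by the logical address range.
        return self.logical_blocks * BLOCK_SIZE

    @property
    def physical_size(self) -> int:
        # Storage actually consumed by materialized blocks.
        return len(self.block_map) * BLOCK_SIZE


# Creating the file touches no physical blocks; only metadata exists.
sparse = SparseFile(logical_blocks=1_000_000)
sparse.materialize(logical=42, physical=7)
```

In this sketch the file's virtual size is one million blocks while only one block consumes physical storage, which is how a sparse file can exceed the physical capacity of the device that holds it.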
A storage system is composed of storage devices and manages an external interface for exchanging blocks between the storage devices and involved clients according to storage commands, such as read requests and write requests. This external interface may pass non-materialized blocks in the same way as materialized blocks.
Because all blocks of a storage device share a fixed size, the external interface sends or receives a large, fixed amount of data, such as 64 kilobytes, for each block transferred. This full-size transfer occurs even if the block is not materialized. External transfer of a non-materialized block involves sending a full-sized data block filled with a bit pattern reserved to indicate non-materialization, such as a repeating magic number.
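As a rough sketch of the traditional approach described above, a non-materialized block can be represented on the wire as a full-sized payload filled with a repeating magic number. The value `0xDEADBEEF`, the 4-byte word width, and the function names are assumptions made for illustration; the text does not name a specific pattern.

```python
import struct

BLOCK_SIZE = 64 * 1024  # assumed example block size, in bytes
MAGIC = 0xDEADBEEF      # hypothetical reserved value, not from the source


def non_materialized_payload() -> bytes:
    """Build a full-sized block filled with the repeating magic number."""
    word = struct.pack(">I", MAGIC)  # 4-byte big-endian word
    return word * (BLOCK_SIZE // len(word))


def is_non_materialized(payload: bytes) -> bool:
    """Return True if the payload is exactly the reserved bit pattern."""
    return payload == non_materialized_payload()


payload = non_materialized_payload()
```

Even though the payload conveys only that the block is non-materialized, `len(payload)` is the full block size, so the sender and receiver handle it like any other block.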
Beyond indicating that a block is non-materialized, the reserved bit pattern carries no information. Whether a non-materialized block is read or written, the whole 64 kilobytes filled with the reserved bit pattern must traditionally be sent, possibly over a computer network that has limited capacity and is prone to contention that erodes aggregate throughput.
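The cost of sending the reserved pattern can be estimated with simple arithmetic. The sketch below is illustrative only: the block count of one million and the link rate of 1 gigabit per second are assumed figures, and the function names are hypothetical.

```python
BLOCK_SIZE = 64 * 1024  # assumed example block size, in bytes


def pattern_bytes_transferred(non_materialized_blocks: int) -> int:
    """Bytes of reserved pattern sent when each block travels at full size."""
    return non_materialized_blocks * BLOCK_SIZE


def transfer_seconds(num_bytes: int, link_bits_per_second: float) -> float:
    """Idealized transfer time, ignoring protocol overhead and contention."""
    return num_bytes * 8 / link_bits_per_second


# Assumed workload: one million non-materialized blocks over a 1 Gbit/s link.
wasted = pattern_bytes_transferred(1_000_000)
seconds = transfer_seconds(wasted, 1e9)
```

Under these assumptions, roughly 65.5 gigabytes of pattern data cross the network, taking on the order of 524 seconds on an otherwise idle link.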