The present invention is in the field of storing data to nonvolatile memory. In particular, the present invention comprises a method, apparatus, system, and machine-readable medium to pre-allocate a space for data.
Many devices today have data storage, such as cell phones, personal digital systems, and computers. Data storage is typically divided into two categories, volatile data storage and nonvolatile data storage. Volatile data storage, such as random access memory (RAM), has the advantage of a fast access time. However, power is necessary to maintain data in volatile memory. Nonvolatile data storage, such as flash memory, does not require power to maintain the data. Nonvolatile memory, on the other hand, often has the disadvantage of a slower access time, particularly when writing data to or erasing data from the nonvolatile memory. For example, flash memory may take several cycles to write data. In a flash memory device, data can be maintained by storing a charge on a capacitance in a transistor via hot electron injection. The amount of charge on the capacitance in the transistor determines the data stored in that transistor. To store new data in a transistor in flash memory, the algorithms within the flash memory device must first erase the transistor by reducing the charge to a minimum threshold charge. The minimum threshold charge may represent the logical bits "11", for instance. If logical bits "11" do not represent the data to be written into the transistor, then an additional 100 cycles or more may be required to write the data. Unlike flash memory, the time involved with programming data into RAM may be based on the slew rate of changing the state of a transistor or a series of transistors. Changing the voltage or current supply to the base or gate of the transistor can change the transistor state.
As a result of the differences between nonvolatile memory and volatile memory, one may be more suitable than the other for a particular application. A cell phone, for example, may use both volatile memory and nonvolatile memory. Specifically, a cell phone will allow a user to store a phone number. If the phone number were stored in volatile memory, the number would be lost when the voltage or current of the battery is insufficient to maintain a transistor state or when the battery is replaced. Since it is more useful to the user to retain the number when the battery charge runs out or when the battery is replaced, phone numbers are typically stored in nonvolatile memory. However, it is undesirable to make a user wait over two hundred cycles for each digit the user enters for a phone number. Therefore, volatile memory may be used as a buffer to hold the telephone number while it is being stored in nonvolatile memory, allowing the user to enter the phone number quickly and use the phone for other purposes while the phone number is being stored.
Data may be organized in blocks to reduce hardware complexity and silicon cost, such as in flash memories. Within each block, data may be organized in a data structure that may comprise, for example, a section for block information, a section for header information, a section for unallocated space, and a section for data. The block information may comprise, for example, an offset indicating the end of the block. The header section can comprise one or more headers starting immediately after the block information section. A header, also called a unit header, can be a data structure within the memory to describe a unit, e.g., one or more granules of memory for data. A granule may comprise a fixed-size, movable data block and can be the smallest amount of memory manipulated by a memory manager. Generally, the space represented by a granule varies with the application for the memory. The header can comprise a fixed-location data pointer and may include an offset indicating the start and an offset indicating the end of the data associated with the header, as well as a reference indicating ownership of the data by a file or directory. Further, a sequence of bits may indicate the boundary between the header section and the unallocated space section to allow the number of headers in the header section to vary. The sequence of bits may be located immediately between the header section and the unallocated space section or may comprise a bit in each header.
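The block organization described above may be sketched, for illustration only, as follows. The sketch is a simplified in-memory model and not an actual flash layout. The names UnitHeader, Block, unallocated_bytes, and allocate, the sizes BLOCK_INFO_SIZE, HEADER_SIZE, and GRANULE_SIZE, and the choice to place data downward from the end of the block are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# All sizes below are illustrative assumptions, not values from the text.
BLOCK_INFO_SIZE = 4   # bytes reserved for the block information section
HEADER_SIZE = 8       # bytes per unit header
GRANULE_SIZE = 16     # bytes per granule; varies with the application


@dataclass
class UnitHeader:
    start_offset: int   # offset of the first byte of the unit's data
    end_offset: int     # offset just past the last byte of the unit's data
    owner_id: int       # reference to the owning file or directory
    valid: bool = True  # cleared when the data is no longer valid


@dataclass
class Block:
    end_offset: int     # block information: offset of the end of the block
    headers: List[UnitHeader] = field(default_factory=list)

    def _data_start(self) -> int:
        # Assumption: data is placed downward from the end of the block.
        return min((h.start_offset for h in self.headers),
                   default=self.end_offset)

    def unallocated_bytes(self) -> int:
        # Headers grow upward, starting right after the block information.
        headers_end = BLOCK_INFO_SIZE + len(self.headers) * HEADER_SIZE
        return self._data_start() - headers_end

    def allocate(self, size_bytes: int, owner_id: int) -> Optional[UnitHeader]:
        # Round the request up to a whole number of granules.
        granules = -(-size_bytes // GRANULE_SIZE)
        size = granules * GRANULE_SIZE
        # The new unit needs room for its data and for one more header.
        if size + HEADER_SIZE > self.unallocated_bytes():
            return None
        data_start = self._data_start()
        header = UnitHeader(data_start - size, data_start, owner_id)
        self.headers.append(header)
        return header


block = Block(end_offset=256)
header = block.allocate(20, owner_id=1)  # 20 bytes round up to two granules
```

In this sketch, each allocation consumes unallocated space from both sides: one header at the low end and a whole number of granules at the high end, so the number of headers in the header section can vary as the text describes.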
The organization of data in blocks with headers has the advantage of saving significant amounts of time when erasing data. To erase data, a valid bit in each header can be modified to indicate that the data associated with that header is no longer valid. In this way, changing a single bit can effectively erase any amount of data. On the other hand, this process may leave invalid data between two units of valid data in a block, fragmenting the unallocated space. Thus, upon a request to store a large unit of data, the blocks may only have small units of free or dirty space. Free space is an area of memory that has been erased and is ready to be written to, whereas dirty space is an area of memory containing invalid data. So when writing the large unit of data to the data storage device, the data can either be broken up into several small units (each an area of memory of one or more granules) or the small units of space may be consolidated into a single large contiguous unit of unallocated space. In many cases, it is more efficient for the memory device to consolidate the memory in a process called reclaiming memory. Reclaiming memory is a process of erasing dirty space, generally a block at a time, and may entail moving valid units of data to another block, then moving them back in a substantially consecutive sequence. For example, each time a write is requested, an algorithm may check to see if sufficient free space is available to write the data. When insufficient free space is available, a procedure to reclaim memory may initiate. The reclamation process can reclaim as much memory as necessary or as possible in a contiguous unit. When the space is still insufficient, the write command may fail, resulting in a significant amount of wasted time by the agent waiting to see if the write can be performed.
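The erase, reclaim, and write sequence described above may be illustrated by the following minimal sketch. It models a single block as an append-only list of units, so that free space is always the contiguous region after the last unit. The class FlashBlock and its methods are hypothetical names chosen for this example, and the reclaim step only models the effect of copying valid units out, erasing the block, and copying them back consecutively.

```python
class FlashBlock:
    """Toy model of one block; not an actual flash driver."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.units = []  # units in storage order: {"name", "size", "valid"}

    def free_space(self):
        # Dirty (invalid) units still occupy space until the block is reclaimed.
        return self.capacity - sum(u["size"] for u in self.units)

    def dirty_space(self):
        return sum(u["size"] for u in self.units if not u["valid"])

    def invalidate(self, name):
        # "Erasing" a unit only clears the valid flag in its header.
        for u in self.units:
            if u["name"] == name and u["valid"]:
                u["valid"] = False
                return True
        return False

    def reclaim(self):
        # Keep only valid units, packed consecutively; dirty space becomes free.
        self.units = [u for u in self.units if u["valid"]]

    def write(self, name, size):
        # Check for sufficient free space; reclaim when it is insufficient.
        if self.free_space() < size:
            self.reclaim()
        # When the space is still insufficient, the write fails.
        if self.free_space() < size:
            return False
        self.units.append({"name": name, "size": size, "valid": True})
        return True


block = FlashBlock(capacity=100)
block.write("a", 40)
block.write("b", 40)
block.invalidate("a")            # 40 units of dirty space, 20 free
ok = block.write("c", 50)        # triggers a reclaim, then succeeds
failed = block.write("d", 30)    # reclaim finds nothing to free; write fails
```

The last write shows the wasted effort the text refers to: the caller waits through a reclaim attempt only to learn that the write cannot be performed.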
Undoing and redoing the data write can compound that wasted time. For example, when a write is requested, the largest contiguous free space is typically chosen for storing the data to increase the efficiency of writing. As a result, a small unit of data may be written into a large unit of free space. The next write in sequence, however, may comprise a unit of data that is too large for any remaining unallocated space but that could fit in the free space just used for the smaller unit of data. The smaller unit of data must then be moved into a smaller unit of free space, the large unit of space reclaimed, and the larger unit of data stored in the reclaimed space. This problem can compound even more when several smaller units of data are written into one or more larger units of unallocated space prior to writing a larger unit of data.
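The placement problem described above may be illustrated with a short sketch. The function place_in_largest and the region sizes are hypothetical and chosen only to reproduce the scenario in the text, in which each write is placed in the largest contiguous free region.

```python
def place_in_largest(free_regions, size):
    """Place `size` in the largest free region.

    `free_regions` is a list of contiguous free-region sizes and is updated
    in place. Returns the index of the region used, or None if nothing fits.
    """
    if not free_regions:
        return None
    idx = max(range(len(free_regions)), key=lambda i: free_regions[i])
    if free_regions[idx] < size:
        return None
    free_regions[idx] -= size
    return idx


free = [100, 30]                      # two contiguous free regions
first = place_in_largest(free, 20)    # small unit lands in the 100-unit region
second = place_in_largest(free, 90)   # large unit no longer fits anywhere
```

After the first write, the regions hold 80 and 30 units, so the 90-unit write cannot be placed even though it would have fit in the original 100-unit region had the small unit been stored in the 30-unit region instead.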