Storage devices generally use various methods for placing and allocating data at designated locations on the applicable storage medium, and these methods typically depend on the type of storage medium. Most storage media therefore have an “address space” that is associated with the physical “storage space” on the medium, and this address space is presented to consumers of the data storage as the available storage. Each type of storage medium may present unique challenges and uses for managing its address space.
For example, a typical flash memory device may in fact have 1.2 TB of physical storage but only 800 GB of address space available for use by a data consumer. As described below in greater detail, this arrangement manages a unique property of flash memory: before data in a memory cell that already holds data can be updated, the existing data must first be erased (not overwritten), and erasure happens at a coarser granularity than writes. In other words, a complete erase block (which will be denoted as a “row” of memory cells in some examples) must first be erased before new data can be written to any currently occupied/in-use memory cell in that erase block. When data to be deleted or updated co-exists in an erase block with live data (which therefore cannot be erased), flash devices are presented with a problem that they have evolved various ways to address.
In flash devices, a data address space is used to manage this flash-specific issue, particularly when (1) other memory cells within an erase block hold data that must be maintained and (2) an increasing number of the available rows of memory have at least one cell with live data. In such cases, data must be reallocated to ensure that there remain rows of memory cells that can be entirely erased if necessary. The address space is configured so that a data address originally associated with memory blocks in one row (i.e., erase block), where that row also contains other live and possibly unrelated data, can be re-associated with memory blocks in other rows that do not suffer the same limitation (meaning that data can be written to them without erasing the entire row, or that the remaining memory blocks in the row can be erased without destroying other live data). To avoid the scenario in which no rows can be safely erased because all or nearly all rows contain at least one block with live data, the flash device may be configured to continually re-associate the data in the data address space to preserve available rows. In some cases, the flash device is overprovisioned with memory relative to the address space; in some cases, it is also overprovisioned with processing capability, for example with a dedicated high-speed processor to manage the allocation of the data; and in some cases, all or some combination of these are implemented.
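By way of illustration only, the reallocation described above can be sketched in Python as follows. The sketch assumes a greedy policy (reclaim the row holding the most stale pages) and a single, fully erased spare row. The names Page, Row, pick_victim and reclaim, and the policy itself, are hypothetical and are not drawn from any particular device. Updating the data address space to point at the relocated pages is omitted for brevity.

```python
from typing import List, Optional, Tuple

# A page is None when erased; otherwise (state, data) with state "live" or "stale".
Page = Optional[Tuple[str, str]]
Row = List[Page]  # one erase block ("row")


def pick_victim(rows: List[Row], spare: int) -> Optional[int]:
    """Return the index of the row with the most stale pages (assumed greedy policy)."""
    best, best_stale = None, 0
    for i, row in enumerate(rows):
        if i == spare:
            continue
        stale = sum(1 for p in row if p is not None and p[0] == "stale")
        if stale > best_stale:
            best, best_stale = i, stale
    return best


def reclaim(rows: List[Row], spare: int) -> Optional[int]:
    """Copy live pages from a victim row into the spare row, then erase the victim.

    Returns the index of the newly erased row, or None if no row holds stale data.
    """
    victim = pick_victim(rows, spare)
    if victim is None:
        return None
    live = [p for p in rows[victim] if p is not None and p[0] == "live"]
    free_slots = [j for j, p in enumerate(rows[spare]) if p is None]
    if len(free_slots) < len(live):
        raise ValueError("spare row cannot hold the live pages")
    for slot, page in zip(free_slots, live):
        rows[spare][slot] = page               # relocate live data
    rows[victim] = [None] * len(rows[victim])  # erase the whole row at once
    return victim


rows = [
    [("live", "a"), ("stale", "b"), ("stale", "c"), ("live", "d")],
    [("live", "e"), ("stale", "f"), ("live", "g"), ("live", "h")],
    [None, None, None, None],  # erased spare row
]
erased_row = reclaim(rows, spare=2)
```

In this example the first row holds the most stale pages, so its two live pages are copied into the spare row and the first row becomes the new fully erased row. Without at least one spare row, which overprovisioning helps to guarantee, the relocation would have nowhere to place the live data.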
While the above issue relates to flash memory devices, similar translation layers may exist in other forms of data storage devices in which the physical locations, or the addresses or other indicators of the actual storage locations within the storage media, are abstracted by virtual addresses (which may be referred to herein as data addresses). For example, in spinning disk devices, data is frequently defragmented, collocated within the same track on a disk, or moved to an outer track (from which data retrieval may be faster), since locating data in close proximity improves performance in spinning disk media. To keep track of the data, a translation layer is maintained, such as a register that holds a log or table of data addresses and physical location indicators. The data addresses exist in a data address space and, from the perspective of the physical storage media, are not currently managed so as to affect the performance of the device.
From the perspective of the data consumer at any layer of abstraction (for example, a user, a file system, or the forwarding tables on a network switch), neither this translation nor the underlying physical locations are visible. The data consumer views the storage device as having a capacity equal in size to the data address space, and in some cases the device may in fact be indistinguishable from that data address space. From the perspective of a data consumer, a flash device, for example, is a storage resource with a capacity that is the same as the address space.
Flash memory is becoming a widely used storage medium due to its improved performance in some respects: fast access speeds, low power consumption, non-volatility, and rugged operation. Most flash devices comprise a flash translation layer (FTL), which generally comprises an FTL driver that works in conjunction with an existing operating system (or, in some embedded applications, as the operating system) to make linear flash memory, with erase blocks that are larger than individual write blocks, appear to the system like a single memory resource. The FTL does this in three ways. First, it creates “virtual” small blocks of data, or sectors, out of the flash's large erase blocks. Next, it manages data on the flash so that it appears to be “write in place” when in fact it is being stored in different spots in the flash. Finally, the FTL manages the flash so that there are clean/erased places to store data.
File systems and other data consumers, such as operating systems (for example, DOS) on data-consuming devices, typically use drivers that perform input and output in structured pieces called blocks. Block devices may include all disk drives and other mass-storage devices on the computer. The FTL emulates a block device: the flash media appears as a contiguous array of storage blocks numbered from zero to one less than the total number of blocks. In the example of DOS interacting with a flash memory device, the FTL acts as a translation layer between the native DOS BPB/FAT file system and the flash. The FTL remaps the data to the physical location at which the data is to be written. This allows the DOS file system to treat flash like any other block storage device and remain ignorant of flash device characteristics. The FTL appears simply to take the data from the file system and write it at the specified location (sector). In reality, the FTL places the data at a free or erased location on the flash media and notes the real location of the data. It may in some cases also invalidate the block that previously contained the data (if any), either in the FTL or in a separate data tracking module elsewhere on, or in data communication with, the FTL or the computing processor resources that are running the FTL. If the file system asks for previously written data, it requests the data at the specified data address, and the FTL finds and reads back the proper data from the actual physical storage location. Flash media allows only two states: erased and non-erased. In the erased state, a byte may be either all ones (0xFF) or all zeroes (0x00), depending on the flash device. A given bit of data may only be written when the media is in an erased state. After it is written to, the bit is considered dirty and unusable. To return the bit to its erased state, a significantly larger block of flash called an Erase Zone (also known as an erase block) must be erased.
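The remapping behaviour described above, in which the FTL appears to write in place while actually writing to a fresh location and noting where the data now resides, can be illustrated with the following minimal Python sketch. It is a toy model under stated assumptions: the class name TinyFTL, its fields, and the simple append-style allocation of erased pages are all hypothetical, and reclamation of invalidated pages is deliberately left out.

```python
from typing import Dict, List, Optional


class TinyFTL:
    """Toy translation layer: logical sectors mapped onto write-once physical pages."""

    def __init__(self, num_pages: int) -> None:
        self.pages: List[Optional[bytes]] = [None] * num_pages  # None means erased
        self.valid: List[bool] = [False] * num_pages  # does the page hold current data?
        self.mapping: Dict[int, int] = {}             # logical sector -> physical page
        self.next_free = 0                            # next erased page (assumed append-style)

    def write(self, sector: int, data: bytes) -> int:
        """Store data for a logical sector and return the physical page used."""
        if self.next_free >= len(self.pages):
            raise RuntimeError("no erased pages left; reclamation is not modelled here")
        old = self.mapping.get(sector)
        if old is not None:
            self.valid[old] = False    # invalidate the previous copy; it is not overwritten
        phys = self.next_free
        self.next_free += 1
        self.pages[phys] = data        # place the data at an erased location
        self.valid[phys] = True
        self.mapping[sector] = phys    # note the real location of the data
        return phys

    def read(self, sector: int) -> bytes:
        """Return the current data for a logical sector via the mapping."""
        data = self.pages[self.mapping[sector]]
        assert data is not None
        return data


ftl = TinyFTL(num_pages=8)
first = ftl.write(5, b"v1")
second = ftl.write(5, b"v2")   # the consumer sees an overwrite of sector 5
current = ftl.read(5)
```

From the consumer's point of view sector 5 was simply overwritten, yet the two writes landed on different physical pages, and the first page still physically holds the stale data until its erase block is reclaimed.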
Flash technology does not allow the toggling of individual bits or bytes from a non-erased state back to an erased state. The FTL shields the file system from these details and remaps the data passed to it by writing to unused data areas in the flash media. This presents the illusion to DOS, or another data consumer, that a data block is simply overwritten when it is modified. In fact, the amended data has been written somewhere else on the media. The FTL may, in some cases, also take care of reclaiming the discarded data blocks for reuse. Although there are many types and manufacturers of flash memory, the most common type is known as NOR flash. NOR flash, such as that available from Intel Corporation™, operates in the following fashion: the erased state is 1, the programmed state is 0, a 0 cannot be changed back to a 1 except by an erase, and an erase must occur on a full erase block. For additional detail, reference may be made to the following document, which is incorporated herein fully by reference: “Understanding the Flash Translation Layer (FTL) Specification”, Intel, 1998.
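The NOR programming rules stated above can be modelled in a few lines of Python. This is only an illustrative sketch: it assumes the erased state is all ones (0xFF), represents an erase block as a bytearray, and uses the hypothetical helper names program and erase. Because a programming operation can only clear bits, the stored result is the bitwise AND of the old and new values.

```python
ERASED = 0xFF  # assumed erased value: every bit is 1


def program(block: bytearray, offset: int, value: int) -> None:
    """Program one byte. Bits can only move from 1 to 0, so the result is old AND new."""
    block[offset] &= value


def erase(block: bytearray) -> None:
    """Erase the whole block; this is the only way to turn a 0 back into a 1."""
    for i in range(len(block)):
        block[i] = ERASED


block = bytearray([ERASED] * 4)
program(block, 0, 0xF0)        # erased byte accepts the value as written
after_first = block[0]
program(block, 0, 0x3C)        # second write cannot restore the cleared bits
after_second = block[0]
erase(block)                   # erasing affects every byte in the block
after_erase = bytes(block)
```

The second programming attempt leaves 0x30 rather than 0x3C in the cell, which is why an FTL writes amended data to an erased location elsewhere instead of attempting an update in place.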
The emergence of commodity PCIe flash marks a remarkable shift in storage hardware, introducing a three-order-of-magnitude performance improvement over traditional mechanical disks in a single release cycle. PCIe flash provides a thousand times more random IOPS than mechanical disks (and 100 times more than SAS/SATA SSDs) at a fraction of the per-IOP cost and power consumption. However, its high per-capacity cost makes it unsuitable as a drop-in replacement for mechanical disks in all cases. As such, systems that combine the high performance of flash with the cheap capacity of magnetic disks, in order to balance these competing concerns, may become desirable. In such systems, the question of how to arrange data not only across both media but also within each medium is important in optimizing for the requirements of wide sets of data. For example, in the time that a single request can be served from disk, thousands of requests can be served from flash. Worse still, IO dependencies on requests served from disk could potentially stall deep request pipelines, significantly impacting overall performance.
This background information is provided to reveal information believed by the applicant to be of possible relevance. No admission is necessarily intended, nor should be construed, that any of the preceding information constitutes prior art.