The present invention generally relates to memory media and technologies for use with computers and other processing apparatuses. The invention particularly relates to a solid-state mass storage device using non-volatile, solid-state memory components for permanent storage of data and methods suitable for promoting more efficient storage of data on such devices.
Non-volatile, solid-state memory technologies are widely used in a variety of applications, nonlimiting examples including universal serial bus (USB) drives, digital cameras, mobile phones, smart phones, tablet personal computers (PCs), memory cards, and solid-state drives (SSDs). Non-volatile, solid-state memory technologies used with computers and other processing apparatuses (referred to herein as host computer systems) are currently largely focused on NAND flash memory technologies, with other emerging non-volatile, solid-state memory technologies including phase change memory (PCM), resistive random access memory (RRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), organic memories, and nanotechnology-based storage media such as carbon nanofiber/nanotube-based substrates. These and other non-volatile, solid-state memory technologies will be collectively referred to herein as solid-state media or solid-state memory components. Mainly for cost reasons, at present the most common solid-state memory components used in SSDs are NAND flash memory components, commonly referred to as flash memory devices, flash memory components, flash-based memory devices, flash-based storage devices, flash-based media, or raw flash. As used herein, the term solid-state mass storage device refers to any device that uses non-volatile, solid-state memory components for permanent storage of data and has means for providing for interaction between a host computer system and the memory components. A nonlimiting example of a solid-state mass storage device as used herein is a solid-state drive having a host interface for communicating with a host computer system, a memory controller, and an array of non-volatile solid-state memory components accessible by the memory controller for storing data of the host computer system therein.
Briefly, flash memory components store information in an array of floating-gate transistors, referred to as memory cells. A memory cell of a NAND flash memory component has a top gate (TG) and a floating gate (FG), the latter being sandwiched between the top gate and the channel of the cell. The floating gate is separated from the channel by a layer of tunnel oxide. Data are stored in (written to or programmed to) a memory cell in the form of a charge on the floating gate which, in turn, defines the channel properties of the memory cell by either augmenting or opposing a charge on the top gate. This charge on the floating gate is achieved by applying a programming voltage to the top gate. Data are erased from a NAND flash cell by applying an erase voltage to the device substrate, which then pulls electrons from the floating gate. The charging (programming) of the floating gate is unidirectional, that is, programming can only inject electrons into the floating gate, but not release them. In general, each of the memory cells may be a single-level cell (SLC) or a multi-level cell (MLC). An SLC is a memory cell that stores one bit of information, and an MLC is a memory cell that stores multiple bits of information.
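The unidirectional nature of programming described above can be illustrated with a minimal sketch. It assumes the common convention that an erased cell reads as a logical 1 and a programmed cell as a logical 0, so that programming behaves like a bitwise AND with the existing contents. The function names (program, erase, levels) are hypothetical and chosen only for illustration.

```python
ERASED_BYTE = 0xFF  # assumption: erased cells read as logical 1s


def program(cell_byte, new_byte):
    """Program eight single-level cells: bits can go 1 -> 0, never 0 -> 1."""
    return cell_byte & new_byte


def erase():
    """Erasure is the only way to return cells to the all-ones state."""
    return ERASED_BYTE


def levels(bits_per_cell):
    """Number of distinct charge levels a cell must resolve."""
    return 2 ** bits_per_cell


cells = erase()
cells = program(cells, 0xA5)   # writing to erased cells stores the value
cells = program(cells, 0xFF)   # trying to set bits back to 1 has no effect
print(hex(cells), levels(1), levels(2))
```

Under these assumptions, a single-level cell distinguishes two charge levels, while a cell storing two bits must distinguish four.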
NAND flash memory cells are typically organized in what are commonly referred to as pages, which in turn are organized in what are referred to as blocks, memory blocks, erase blocks, or sectors. Each block is a predetermined section of the NAND flash memory component that comprises a plurality of pages, and each of the pages comprises a plurality of memory cells. A NAND flash memory component allows data to be stored and retrieved on a page-by-page basis and erased on a block-by-block basis. Erasing memory cells involves the application of a positive voltage to the device substrate, which does not allow isolation of individual memory cells or even pages, so that erasure must be done on a per-block basis. As a result, the minimum erasable unit is an entire block, and a block must be erased before any memory cell within it can be rewritten.
Once a page has been programmed, it may not be programmed again until the whole block in which it resides has been erased. When a flash memory component receives a program command to replace a page of current data with new data, the flash memory component typically stores the new data in a new page having an erased state, and it invalidates the current data in the old page. In other words, the flash memory component does not overwrite the current data at its current page location, but merely invalidates the current data and stores the new data in another page.
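The out-of-place update behavior described above can be sketched as follows. This is a simplified model under stated assumptions, not the behavior of any particular flash memory component: the class names (Block, Flash), the logical-to-physical mapping table, and the tiny geometry of two blocks with four pages each are all hypothetical.

```python
ERASED, VALID, INVALID = "erased", "valid", "invalid"


class Block:
    def __init__(self, pages_per_block):
        self.state = [ERASED] * pages_per_block
        self.data = [None] * pages_per_block

    def program(self, page, payload):
        # A page may be programmed only while it is in the erased state.
        if self.state[page] != ERASED:
            raise ValueError("page must be erased before it is programmed")
        self.state[page] = VALID
        self.data[page] = payload

    def erase(self):
        # Erasure always covers the whole block.
        n = len(self.state)
        self.state = [ERASED] * n
        self.data = [None] * n


class Flash:
    def __init__(self, num_blocks=2, pages_per_block=4):
        self.blocks = [Block(pages_per_block) for _ in range(num_blocks)]
        self.mapping = {}  # logical page -> (block index, page index)

    def _find_erased(self):
        for b, block in enumerate(self.blocks):
            for p, s in enumerate(block.state):
                if s == ERASED:
                    return b, p
        raise RuntimeError("no erased page; garbage collection would be needed")

    def write(self, logical_page, payload):
        # Store the new data in an erased page, then invalidate the old copy.
        b, p = self._find_erased()
        self.blocks[b].program(p, payload)
        old = self.mapping.get(logical_page)
        if old is not None:
            ob, op = old
            self.blocks[ob].state[op] = INVALID
        self.mapping[logical_page] = (b, p)

    def read(self, logical_page):
        b, p = self.mapping[logical_page]
        return self.blocks[b].data[p]


flash = Flash()
flash.write(0, "a")
flash.write(0, "b")  # rewrite of the same logical page lands in a new page
print(flash.read(0), flash.blocks[0].state)
```

In this sketch the second write leaves the first physical page marked invalid rather than overwritten, which is the source of the invalid pages that later accumulate.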
As the flash memory component continues to operate, invalid pages tend to accumulate in blocks that have not been recently erased. The accumulation of invalid pages generally reduces the amount of total usable storage space available in the flash memory component, and can also slow down the operation of the flash memory component. Accordingly, so-called garbage collection (GC) operations may be performed on blocks comprising undesirably large numbers of invalid pages in order to reclaim some of the storage space.
A typical garbage collection operation performed on an SSD is undertaken by its flash memory controller and involves moving any remaining valid data from a target block to a different block and then erasing the target block. Garbage collection operations are typically performed automatically by memory controllers as part of memory management performed by an SSD (or other solid-state mass storage device). As a result of the garbage collection operation, incoming commands (read and write) from a host computer system may be stalled, mainly due to the fact that erasure operations on a flash memory component take much longer to complete than read or write operations and no other operation may be started on a flash memory component until the erasure operation is completed. For an SSD, a single flash memory controller may be responsible for managing an array of many flash memory components, accessed via multiple physical memory bus lanes or channels, each channel being functionally coupled to multiple flash memory components. At any time while a garbage collection operation is in progress, individual flash memory components may be inaccessible while erasure operations are in progress and access to whole channels may be blocked while page data transfers are in progress. Therefore, the garbage collection operation, which involves copying valid pages to new locations and block erasure operations, consumes time and resources from the flash memory components and their memory controller, thereby reducing the overall performance of the SSD and hence reducing the Input/Output workload potential of the SSD.
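The copy-then-erase sequence described above can be sketched as follows. The sketch assumes a greedy victim selection policy (the block with the most invalid pages), which is one common choice and is not stated in the text above. The function names pick_victim and garbage_collect and the list-based block representation are hypothetical, and the update of the logical-to-physical mapping is omitted for brevity.

```python
def pick_victim(blocks):
    """Return the index of the block with the most invalid pages (greedy policy)."""
    return max(range(len(blocks)), key=lambda i: blocks[i].count("invalid"))


def garbage_collect(blocks, data, victim, spare):
    """Copy valid pages from the victim to the spare block, then erase the victim.

    Returns the number of pages that had to be copied.
    """
    copied = 0
    for p, state in enumerate(blocks[victim]):
        if state == "valid":
            dst = blocks[spare].index("erased")  # ValueError if the spare is full
            blocks[spare][dst] = "valid"
            data[spare][dst] = data[victim][p]
            copied += 1
    n = len(blocks[victim])
    blocks[victim] = ["erased"] * n  # whole-block erasure
    data[victim] = [None] * n
    return copied


blocks = [
    ["invalid", "valid", "invalid", "invalid"],
    ["valid", "valid", "invalid", "erased"],
    ["erased"] * 4,
]
data = [[None, "B", None, None], ["C", "D", None, None], [None] * 4]

victim = pick_victim(blocks)
copied = garbage_collect(blocks, data, victim, spare=2)
print(victim, copied, blocks[victim], blocks[2])
```

Every valid page that remains in the victim block costs a read and a program operation before the erasure can begin, which is why the number of valid pages per victim block has a direct bearing on the time and resources consumed.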
The host computer system can assist the SSD's memory controller by informing the memory controller of invalid data locations via a Trim command. The Trim command is designed to enable an operating system (OS) to notify the SSD which pages no longer contain valid data due to file deletions by the user or the operating system itself. Previously, with hard disk storage media, a file delete operation only resulted in file system sectors being marked as deleted in the sector map or metadata, without the data within the sectors themselves being deleted. With an SSD that receives no such notification, a file delete operation leaves the pages occupied by these deleted sectors marked as valid until the sectors are eventually overwritten with new data. As such, a garbage collection operation is less likely to identify the blocks comprising these pages as candidates for consolidation, since the pages have yet to be overwritten and are therefore still valid from the point of view of the SSD's memory controller.
The Trim command was introduced for SSDs to facilitate the early release of these pages into the pool of available space. After a file delete operation, the OS marks the file system sectors as free for new data as done conventionally but also sends a Trim command to the SSD to instruct the flash memory controller to mark the pages occupied by the sectors as not containing valid data. As such, the Trim command allows the SSD to free up valuable space much sooner than simply waiting for data sectors to be eventually overwritten, resulting in less write amplification with fewer writes to the flash memory component, higher write speed, and increased drive life.
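The effect of a Trim command on the controller's bookkeeping can be sketched as follows. This is an illustrative model only and does not reproduce the command format of any host interface: the function names trim and pages_to_copy and the flat logical-to-physical mapping are hypothetical.

```python
def trim(page_state, mapping, logical_pages):
    """Mark the physical pages behind the given logical pages as invalid.

    Logical pages that are not mapped are ignored. Returns the number of
    physical pages released.
    """
    released = 0
    for lp in logical_pages:
        phys = mapping.pop(lp, None)
        if phys is not None:
            page_state[phys] = "invalid"
            released += 1
    return released


def pages_to_copy(page_state):
    """Valid pages that garbage collection would have to relocate."""
    return sum(1 for s in page_state if s == "valid")


page_state = ["valid"] * 4
mapping = {10: 0, 11: 1, 12: 2, 13: 3}  # logical page -> physical page

before = pages_to_copy(page_state)
released = trim(page_state, mapping, [11, 12, 99])
after = pages_to_copy(page_state)
print(before, released, after)
```

In this sketch, trimming two of the four pages halves the number of pages that a later garbage collection operation on the block would need to copy, which is the mechanism behind the reduced write amplification noted above.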
Though providing the above-noted benefits, Trim operations may only be done at a page granularity. In other words, Trim operations do not address invalid data smaller than the size of a page (sub-page). Applications such as databases (including traditional relational and more recent NoSQL types) often use data objects or structures (hereinafter referred to individually or collectively as data structures) with a small size, e.g., on the order of tens of bytes, such that a single page typically contains multiple data structures. Over an application's lifetime, data structures are continually being inserted and deleted, causing fragmentation across the storage media. In order to reduce the capacity consumed by its data, the application performs a compaction process, removing deleted data and shrinking the space actually occupied via defragmentation. The compaction process is performed irrespective of the storage media type. Notably, the two processes, compaction and garbage collection, are conventionally performed separately without any coordination. Furthermore, the small sizes of the data structures prevent the application from informing the SSD of invalid data, since the sizes are generally smaller than the Trim command granularity, that is, an individual page. Consequently, an SSD that contains data of a database application will contain pages that contain both valid and invalid data structures.
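The granularity mismatch described above can be made concrete with a short sketch. The page size of 4096 bytes and the record size of 64 bytes are assumptions chosen for illustration, as are the function name classify_pages and the premise that records are packed contiguously into pages in index order.

```python
def classify_pages(record_size, page_size, deleted_records, total_records):
    """Split pages into those that are fully invalid (trimmable) and those
    that hold a mix of valid and deleted records (not trimmable)."""
    per_page = page_size // record_size
    num_pages = -(-total_records // per_page)  # ceiling division
    deleted = set(deleted_records)
    trimmable, mixed = [], []
    for page in range(num_pages):
        first = page * per_page
        last = min(first + per_page, total_records)
        dead = sum(1 for r in range(first, last) if r in deleted)
        if dead == last - first:
            trimmable.append(page)   # every record on the page is deleted
        elif dead > 0:
            mixed.append(page)       # invalid data hidden below page granularity
    return trimmable, mixed


# 256 records of 64 bytes fill four 4096-byte pages (64 records per page).
deleted = list(range(0, 64)) + [70, 130]
trimmable, mixed = classify_pages(64, 4096, deleted, 256)
print(trimmable, mixed)
```

Under these assumptions only the first page could be reported with a page-granular command, while the two pages that each contain a single deleted record would continue to look entirely valid to the memory controller.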
A similar concern was addressed in U.S. Pat. No. 8,037,112 to Nath et al. (Nath). While not intending to promote any particular interpretation, it appears that Nath discloses a process called “semantic compression” in order to prevent log entries from growing indefinitely over time. In semantic compression, log entries having opposite semantics are discarded during compaction. In addition to this compaction process, Nath discloses a log garbage collection component that may be used to reclaim space from dirty log entries. However, it is important to note that these two processes are different from garbage collection processes performed on an SSD. In particular, the semantic compression operates to compress or compact a list of log entries, each of which on its own represents a valid entry, but which may be compressed in view of the presence of other entries in the log. The log garbage collection thereafter reclaims space from the compressed log entries. In contrast, garbage collection is performed on an SSD in order to remove invalid data from pages and thereby provide additional storage space. Although Nath mentions conventional garbage collection, it discloses that the garbage collection and log garbage collection processes are different processes performed by different components. As such, Nath's semantic compression and conventional garbage collection are entirely independent processes.
U.S. Patent Application Publication No. 2014/0365719 to Kuzmin et al. (Kuzmin) discloses a process of host-controller cooperation in managing NAND flash memory. While not intending to promote any particular interpretation, it appears that Kuzmin discloses a controller that maintains information for each erase unit which tracks memory usage. This information assists the host in making decisions about specific operations, for example, initiating garbage collection, space reclamation, wear leveling, or other operations. By redefining host-controller responsibilities in this manner, much of the overhead associated with flash translation layer (FTL) functions can be substantially removed from the memory controller. However, while not intending to promote a particular interpretation, it appears that the host simply manages and schedules garbage collection within the storage device, but does not take an active role in identifying invalid data. Further, the issue of data structures that are smaller than the size of a page was not addressed by Kuzmin, that is, the host is not disclosed as analyzing data in a page and identifying sub-page data that is invalid.
In view of the above, it can be appreciated that there are certain problems, shortcomings or disadvantages associated with the prior art, and that it would be desirable if a system and method were available that allow for interaction between host computer systems and solid-state mass storage devices to improve garbage collection processes in the solid-state mass storage devices, particularly in terms of the ability to perform a garbage collection routine capable of addressing data structures smaller than the size of a page.