A solid state drive (SSD) typically consists of one or a small set of flash memory controller devices and an array of non-volatile memory, which typically comprises NAND flash memory devices. The SSD also typically includes a volatile memory buffer, comprising Dynamic Random Access Memory (DRAM) memory devices. Each controller of the SSD may control its own array of NAND flash memory devices, arranged in groups, or memory channels, with a separate memory data/address/control bus for each channel.
Each NAND flash memory device of the SSD may consist of one or more semiconductor dies connected to a single memory bus interface on the device. The dies may operate independently, with the restriction that only one die at a time may be transferring data to or from the memory controller attached to the memory bus interface.
The operation of an individual non-volatile NAND flash memory device can be split into three distinct main functions: erasure, programming (writing) and reading. Before any writing to the device can occur, the memory cells must be first erased. Data storage in the cells is achieved by storing a charge on an insulated floating gate of a transistor. In order to write new data to a cell, any existing charge must be first removed, which is termed erasure of the cell. In NAND flash memory, cells are erased in large groups called erasure blocks.
Erasure requires the control gate to be held at a high negative voltage relative to the drain in order to remove the charge from the floating gate (generally achieved by applying a positive voltage to the drain terminal of the transistor and grounding the control gate terminal). The charge is removed via a process known as quantum or Fowler-Nordheim tunneling, in which the electrons are ‘pulled off’ the floating gate. The erase voltage must be applied for a considerable time, and because large blocks of memory cells are erased at the same time, considerable power is consumed by this function.
Writing new data involves the reverse process: by applying a high voltage to the control gate and drain while holding the source terminal at 0 V, electrons are caused to migrate to the floating gate via a process of hot electron injection. This operation is known as ‘programming’, and the memory must be programmed in units of pages, where a page is a sub-unit of an erase block. To maintain accuracy in the amount of charge injected, especially for Multi-Level Cell (MLC) type NAND memory, programming may be performed in a series of incremental steps. Nevertheless, the page programming time is generally much less than the block erase time, and because a page is much smaller than an erase block, the power consumed for programming is much less than for erasure.
Reading of cells that have been programmed is also performed on a page by page basis. However, the voltage which is applied to the control gate is less than that used for erasure and programming, as it is only required to apply a threshold voltage sufficient to determine if the transistor conducts or not, with no movement of charge to or from the floating gate. The reading time is therefore very much less than that required for writing (programming) and very much less power is consumed for this operation than either programming or erasure.
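The erase, program and read granularities described above can be illustrated with a toy model. This is only a sketch: the class name `EraseBlock`, the default of four pages per block, and the use of `None` to mark an erased page are assumptions made for illustration and do not describe any particular device.

```python
class EraseBlock:
    """Toy model of one NAND erase block (sizes are illustrative assumptions)."""

    def __init__(self, pages_per_block=4):
        # None marks a page in the erased state, ready to be programmed.
        self.pages = [None] * pages_per_block

    def program(self, page_index, data):
        # A page can only be programmed while it is in the erased state.
        if self.pages[page_index] is not None:
            raise ValueError("page must be erased before it is programmed")
        self.pages[page_index] = data

    def read(self, page_index):
        # Reading is page-granular and does not change the stored data.
        return self.pages[page_index]

    def erase(self):
        # Erasure always applies to every page in the block at once.
        self.pages = [None] * len(self.pages)
```

Rewriting a single page in this model requires erasing the whole block first, which is why erasure is the most expensive of the three functions in both time and power.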
Referring to FIG. 1, a flash memory device will therefore consume power according to the power states of the individual dies which make up the device, where each die may be in one of the following states: power off 110, quiescent 120 (also known as the pre-charge state), read 130, program 140, or erase 150. Given that each of the dies may be in a different state, the device as a whole has more power states than the five available to a single die, although some combinations are not possible because the device as a whole supports only one read or write data transfer at any time.
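The growth in device-level power states can be made concrete by enumerating the per-die combinations. The sketch below rests on a simplifying assumption that is stricter than the text: a die in the read or program state is treated as occupying the shared bus for the whole operation. The names `DieState`, `BUS_STATES` and `device_states` are hypothetical.

```python
from enum import Enum
from itertools import product


class DieState(Enum):
    # Values reuse the reference numerals of FIG. 1 purely as labels.
    POWER_OFF = 110
    QUIESCENT = 120
    READ = 130
    PROGRAM = 140
    ERASE = 150


# Simplifying assumption: a die in READ or PROGRAM occupies the shared bus.
BUS_STATES = {DieState.READ, DieState.PROGRAM}


def device_states(num_dies):
    """Return every per-die state combination with at most one die on the bus."""
    return [
        combo
        for combo in product(DieState, repeat=num_dies)
        if sum(state in BUS_STATES for state in combo) <= 1
    ]
```

Under this assumption a two-die device has 21 permitted combinations out of 25, and a four-die device has 297 out of 625.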
The flash memory devices are attached to a flash memory controller which may support several memory data/address/control buses (known as channels), with several memory devices attached to each channel. Only one data read/write transfer may be in progress at any time to one device on the channel. However, the other devices on the channel may at the same time be conducting an operation, such as erase, that does not require a memory bus data transfer. In addition, operations that require a short data transfer followed by a longer period of no data transfer, such as program, may be interleaved or offset in time so that they proceed in parallel.
Therefore, at any given time, there may be a read/write (program) data transfer operation in progress on each of the memory controller channels, plus several simultaneous or interleaved operations to multiple devices or dies. The power consumption of the flash memory array will therefore be heavily dependent on the read/write workload of the SSD as a whole, which will vary according to the rates of reading and writing, the size of the data being transferred with each read and write, and which locations are being read or written to.
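The effect of interleaving program operations on one channel can be sketched with a small timing model. The function name `channel_program_time_us`, the round-robin assignment of pages to dies, and the timing figures used in the usage note are all assumptions for illustration, not values taken from any device.

```python
def channel_program_time_us(pages, dies, transfer_us, program_us):
    """Estimate the time to program `pages` pages spread round-robin over `dies`.

    Each page needs a bus transfer (one at a time per channel) followed by
    an internal program phase that does not use the bus.
    """
    bus_free = 0.0               # time at which the shared bus is next idle
    die_free = [0.0] * dies      # time at which each die finishes programming
    for page in range(pages):
        die = page % dies
        # The transfer starts once both the bus and the target die are idle.
        start = max(bus_free, die_free[die])
        bus_free = start + transfer_us
        die_free[die] = bus_free + program_us
    return max(die_free)
```

With a hypothetical 50 μs transfer and 200 μs program time, four pages take 1000 μs on a single die but 400 μs when spread over four dies, because the program phases overlap while only the transfers are serialised. More dies are therefore active, and drawing power, at the same instant.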
The overall power consumption of an SSD will depend on the total of the individual power consumptions of the individual flash memory devices in the flash memory array, plus the power consumption of the flash controller, the DRAM buffer, and power losses in power supply and voltage regulation circuitry. This overall value will be largely dominated by the power consumed by the flash memory array, with the next largest contribution from the flash controller, and lastly the DRAM buffer.
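The power budget described above can be written as a simple sum. In the sketch below the regulation losses are modelled as a single efficiency figure, which is an assumption made for brevity; the function name `ssd_power_w` and the example values are likewise hypothetical.

```python
def ssd_power_w(flash_device_w, controller_w, dram_w, regulator_efficiency):
    """Estimate total SSD input power in watts.

    flash_device_w       -- iterable of per-device flash power draws (W)
    controller_w         -- flash controller power (W)
    dram_w               -- DRAM buffer power (W)
    regulator_efficiency -- assumed overall supply efficiency, 0 < e <= 1
    """
    if not 0 < regulator_efficiency <= 1:
        raise ValueError("regulator_efficiency must be in (0, 1]")
    # Power delivered to the components themselves.
    load_w = sum(flash_device_w) + controller_w + dram_w
    # Power drawn from the supply, including conversion losses.
    return load_w / regulator_efficiency
```

For example, eight flash devices at 0.5 W each, a 2 W controller and a 0.5 W DRAM buffer behind a 90% efficient supply would draw 6.5 / 0.9 W, or about 7.2 W, with the flash array contributing the largest share.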
The power consumed by the flash memory array is spread across many individual devices, so the power dissipation by individual flash memory devices is not normally a concern. However, the flash memory controller will be the largest single power dissipating device, and while a heat sink is not normally required for the controller, excessive power consumption may cause problems with dissipation of heat and the device may not be able to regulate its temperature under a defined maximum if subjected to a particularly demanding workload.
In the prior art, the host computer may be provided with a limited ability to implement power management. An example of a power management system is described in the NVM Express (NVMe) 1.1b Specification, published by NVM Express, Inc. in July 2014, the contents of which are hereby incorporated by reference. The NVM Express specification permits a power manager associated with a host (e.g., host software) to modify an NVM Express power state, as shown in FIG. 2. The host computer can dynamically modify the NVM Express SSD's power state to best satisfy changing power and performance objectives.
As shown in FIG. 2, the Host Computer 200 is connected to an NVMe SSD 250 and has a Host Power Manager application program 210 which takes a Power Objective input 220 and an input 235 which is the difference between a Performance Objective 230 and Performance Statistics 240 reported by the SSD controller 260. The Host Power Manager 210 then issues a command to set the Power State 245 of the SSD Controller 260.
The power states are discrete power states, and the number of power states implemented by a controller is returned in the Number of Power States Supported (NPSS) field in the Identify Controller data structure. A controller has to support at least one power state and may optionally support up to a total of 32 power states.
Associated with each power state are a number of parameters, as shown in Table 1:
TABLE 1

Power   Maximum   Entry      Exit       Relative     Relative   Relative     Relative
State   Power     Latency    Latency    Read         Read       Write        Write
        (MP)      (ENTLAT)   (EXLAT)    Throughput   Latency    Throughput   Latency
                                        (RRT)        (RRL)      (RWT)        (RWL)
0       25 W      5 μs       5 μs       0            0          0            0
1       18 W      5 μs       7 μs       0            0          1            0
2       18 W      5 μs       8 μs       1            0          0            0
3       15 W      20 μs      15 μs      2            0          2            0
4       10 W      20 μs      30 μs      1            1          3            0
5       8 W       50 μs      50 μs      2            2          4            0
6       5 W       20 μs      5000 μs    4            3          5            1
The host can then dynamically change the power state according to the required performance and power objectives in relation to the measured performance statistics. This external power management is not designed to be a replacement for the autonomous power management conducted by the controller internal to the SSD. In particular, the NVMe Specification states that “[t]his power management mechanism is meant to complement and not replace autonomous power management performed by a controller.”
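A host-side policy of the kind shown in FIG. 2 might be sketched as follows, using the values of Table 1. The stepping policy itself is hypothetical and is not defined by the NVMe Specification. It also assumes that lower-numbered states perform better, which is only roughly true of Table 1 (states 1 and 2 trade read throughput against write throughput). The names `PowerState`, `TABLE_1` and `next_power_state` are illustrative.

```python
from typing import NamedTuple


class PowerState(NamedTuple):
    ps: int
    max_power_w: float
    entry_latency_us: int
    exit_latency_us: int
    rrt: int
    rrl: int
    rwt: int
    rwl: int


# Values transcribed from Table 1.
TABLE_1 = [
    PowerState(0, 25, 5, 5, 0, 0, 0, 0),
    PowerState(1, 18, 5, 7, 0, 0, 1, 0),
    PowerState(2, 18, 5, 8, 1, 0, 0, 0),
    PowerState(3, 15, 20, 15, 2, 0, 2, 0),
    PowerState(4, 10, 20, 30, 1, 1, 3, 0),
    PowerState(5, 8, 50, 50, 2, 2, 4, 0),
    PowerState(6, 5, 20, 5000, 4, 3, 5, 1),
]


def next_power_state(current_ps, power_objective_w, perf_objective, perf_measured):
    """Hypothetical host policy: pick the power state to request next."""
    allowed = [s.ps for s in TABLE_1 if s.max_power_w <= power_objective_w]
    if not allowed:
        raise ValueError("no power state satisfies the power objective")
    fastest_allowed = min(allowed)
    if current_ps < fastest_allowed:
        # Current state exceeds the power objective: drop to the budget.
        return fastest_allowed
    # Corresponds to input 235: objective minus reported statistics.
    shortfall = perf_objective - perf_measured
    if shortfall > 0:
        # Under-performing: step toward a higher-power state if allowed.
        return max(current_ps - 1, fastest_allowed)
    if shortfall < 0:
        # Over-performing: step toward a lower-power state.
        return min(current_ps + 1, TABLE_1[-1].ps)
    return current_ps
```

For instance, with an 18 W power objective, a drive in state 4 that misses its performance objective would be moved to state 3, while a drive already in state 1 would stay there because state 0 exceeds the budget.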
However, in the prior art, the SSD controller was typically provided with only an extremely limited capability to adjust the power consumption of the SSD. Methods to achieve this in the prior art have included inserting a delay between workload commands received from the host in order to reduce the rate at which the SSD executes the commands. However, a problem with this technique is that there may be a maximum limit to the delay that can be inserted. For example, U.S. Pat. No. 8,862,807 describes a technique to calculate an idle time based on previous workload and insert the idle time between SSD operations. However, in addition to any inherent limitations in inserting a delay, the control technique is also comparatively primitive. Among other problems with this approach, the prediction of future workload behavior is based on previously collected workload behavior, so the approach is not sufficiently flexible or agile to cope with continuously varying workloads or variations in workload over short periods. Thus, any local autonomous control can be coarse and very limited in nature.
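The general idea of delay insertion, and the effect of a maximum delay, can be illustrated with a generic calculation. This is not the method of U.S. Pat. No. 8,862,807; it is a minimal sketch that assumes a two-level power model (one busy power, one idle power) and sizes the idle time so that the average power over one busy-plus-idle period meets a target. All names and figures are hypothetical.

```python
def idle_time_s(busy_time_s, busy_power_w, idle_power_w, target_power_w, max_delay_s):
    """Return the idle time to insert after an operation, capped at max_delay_s.

    Solves (busy_power * busy_time + idle_power * d) / (busy_time + d) = target
    for the delay d, under an assumed two-level power model.
    """
    if busy_power_w <= target_power_w:
        # Already within the target: no delay is needed.
        return 0.0
    if target_power_w <= idle_power_w:
        # The target cannot be reached by idling; use the largest delay allowed.
        return max_delay_s
    needed = busy_time_s * (busy_power_w - target_power_w) / (
        target_power_w - idle_power_w
    )
    # The cap is the limitation noted above: beyond it the target is missed.
    return min(needed, max_delay_s)
```

For a 1 ms operation at 10 W with 2 W idle power, a 6 W target needs 1 ms of idle time, whereas a 3 W target would need 7 ms and is clipped if the maximum delay is 5 ms. Because the inputs are measured from past operations, the result also lags any sudden change in workload.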