The present disclosure relates to computer technology, and more specifically, to a method, apparatus, and computer program product for data storage control.
In a traditional hard disk drive (HDD), the logical addresses and physical addresses of data can have a one-to-one correspondence. In the present disclosure, the logical address can refer to an address processed by an operating system, while the physical address can refer to a physical location on a storage medium. The physical address can be transparent to the operating system, and a controller of the HDD can be responsible for the correspondence between the logical address and the physical address. Where HDDs are used, each time the operating system writes data to the same logical address, the data may be written to the same location on the storage medium. For data deletion, the operating system will generally mark the logical address corresponding to the data pending deletion as free, so that the logical address may be used by a subsequent write operation. Until the subsequent write operation actually happens, the data pending deletion is not erased from the storage medium and thus can be read and recovered by technical means.
For some sensitive data with high security requirements, the operating system's deletion operation may be a “secure deletion”, i.e., the data is completely erased from the storage medium. The operating system may execute the subsequent write operation immediately after marking the logical address as free, e.g., by writing pseudo-data to the logical address. Since logical addresses may have a one-to-one correspondence to physical addresses, the data pending deletion is overwritten by the pseudo-data at the physical address where the sensitive data is stored, and thus may not be read or recovered.
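By way of illustration only, the HDD behavior described above can be sketched as a toy model in Python. The class name `SimpleHDD`, its methods, and the single-byte pseudo-data value are hypothetical and are not part of the present disclosure; the sketch only assumes that each logical address maps to one fixed physical location.

```python
class SimpleHDD:
    """Toy model: each logical address maps to one fixed physical location."""

    def __init__(self, num_sectors):
        # The list index serves as both logical and physical address
        # (one-to-one correspondence).
        self.medium = [None] * num_sectors
        self.free = set(range(num_sectors))

    def write(self, logical_addr, data):
        # Data is always written in place, at the same physical location.
        self.medium[logical_addr] = data
        self.free.discard(logical_addr)

    def delete(self, logical_addr):
        # Ordinary deletion: only mark the logical address as free.
        self.free.add(logical_addr)

    def secure_delete(self, logical_addr, pseudo_data=b"\x00"):
        # Mark the address as free, then immediately overwrite the same
        # physical location with pseudo-data.
        self.delete(logical_addr)
        self.medium[logical_addr] = pseudo_data

    def raw_read(self, physical_addr):
        # Stands in for reading the medium directly "by technical means".
        return self.medium[physical_addr]


hdd = SimpleHDD(num_sectors=8)
hdd.write(3, b"secret")
hdd.delete(3)
after_delete = hdd.raw_read(3)   # ordinary deletion leaves the data in place
hdd.secure_delete(3)
after_secure = hdd.raw_read(3)   # the data has been overwritten
```

In this sketch, ordinary deletion leaves the sensitive data on the medium, whereas the secure deletion overwrites it because the write lands on the same physical location.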
Compared with traditional hard disk drives (HDDs), solid-state drives (SSDs) based on flash technology may have a performance (speed) advantage. Unlike in HDDs, logical addresses in an SSD may not have a one-to-one correspondence to physical addresses. If two consecutive write operations are performed to the same logical address, the data of each write may be stored at a different physical address. As a result, where SSDs are used, the operating system may be unable to erase sensitive data from the storage medium by writing pseudo-data to the logical address where the sensitive data is stored.
Specifically, in SSDs, when a first write operation is performed to a logical address so as to write a first data element, the SSD may allocate a first physical address to the logical address for storing the first data element, so that the logical address corresponds to the first physical address. When a second write operation is performed to the same logical address to write a second data element, the SSD can allocate a second physical address to the logical address for storing the second data element, rather than erasing the first data element from the first physical address before writing the second data element to the first physical address. Thus, the physical address mapped to the logical address may change from the first physical address to the second physical address. One reason for this is that, in an SSD, erase operations on the storage medium may be implemented in units of a storage medium block. The size of the storage medium block is typically 2^20 bits. On the other hand, the operating system's data operations are typically in units of 2^9 (512) bytes. If, for example, the operating system writes data of 512 bytes to the same logical address twice consecutively, then the involved data is merely a very small part of the data in one storage medium block. In order to modify this small part of the data, it may be impractical to erase the data of the whole storage medium block.
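As a rough worked example of the size mismatch, the following Python fragment computes how small a single 512-byte write is relative to one storage medium block of 2^20 bits. The figures are the example values given above, and the variable names are illustrative only.

```python
# Example figures from the text above; real devices may differ.
BLOCK_SIZE_BITS = 2 ** 20      # size of one storage medium block, in bits
WRITE_SIZE_BYTES = 512         # size of one operating system write, in bytes

block_size_bytes = BLOCK_SIZE_BITS // 8                   # 131072 bytes
writes_per_block = block_size_bytes // WRITE_SIZE_BYTES   # 256 writes fit in one block
fraction = WRITE_SIZE_BYTES / block_size_bytes            # 1/256 of the block

print(f"One write covers {fraction:.4%} of a block")
```

Under these example figures, erasing a whole block in order to update in place would mean erasing 256 times as much data as the write itself modifies.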
Therefore, in the case of using an SSD, even if the operating system writes pseudo-data to a certain logical address immediately after deleting sensitive data stored at the logical address, the operating system may be unable to erase the sensitive data from the SSD. The pseudo-data will be stored at a different physical address from the sensitive data, while the sensitive data remains stored at the original physical address. As a result, the sensitive data can be read and recovered, by technical means, from the physical address where the sensitive data is stored.
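The SSD behavior described above can likewise be sketched as a toy model, offered by way of illustration only. The class `SimpleSSD`, its sequential page allocation, and the omission of garbage collection and block erasure are simplifying assumptions and do not describe any particular SSD controller.

```python
class SimpleSSD:
    """Toy mapping layer in which every write goes to a new physical page."""

    def __init__(self, num_pages):
        self.medium = [None] * num_pages   # physical pages
        self.mapping = {}                  # logical address -> physical address
        self.next_free = 0                 # next unwritten physical page

    def write(self, logical_addr, data):
        if self.next_free >= len(self.medium):
            # Garbage collection and block erasure are not modeled here.
            raise RuntimeError("no free physical pages left")
        physical_addr = self.next_free
        self.next_free += 1
        self.medium[physical_addr] = data
        # Remap the logical address; the previously mapped page is untouched.
        self.mapping[logical_addr] = physical_addr
        return physical_addr

    def read(self, logical_addr):
        # What the operating system sees through the logical address.
        return self.medium[self.mapping[logical_addr]]

    def raw_read(self, physical_addr):
        # Stands in for reading the medium directly "by technical means".
        return self.medium[physical_addr]


ssd = SimpleSSD(num_pages=8)
first_pa = ssd.write(3, b"secret")        # first write: sensitive data
second_pa = ssd.write(3, b"\x00" * 6)     # second write: pseudo-data, same logical address
logical_view = ssd.read(3)                # the operating system now sees pseudo-data
stale = ssd.raw_read(first_pa)            # the original page still holds the sensitive data
```

In this sketch the two writes land on different physical addresses, so the pseudo-data hides the sensitive data from the logical view without removing it from the medium.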
Therefore, there is a need for a new solution to enhance the security of SSD storage devices.