A conventional hard disk drive is equipped with several rotational disk platters and a small amount of dynamic random access memory (DRAM). The disk platters are magnetic media to store data, and the DRAM is used as a data buffer between the disk platters and a host operating system.
FIG. 1 shows a schematic diagram of a hybrid disk 100 coupled to a host operating system 101. The hybrid disk 100 has a disk media 102, a DRAM 104, a non-volatile (NV) cache 106 and a disk controller 108. The NV cache 106 may be a non-volatile (NV) memory, such as a flash chip. The major difference between a conventional hard disk and the hybrid disk 100 is that the hybrid disk 100 integrates this NV memory into the disk drive. The disk media 102 usually has high capacity but low speed, while the NV cache 106 has low capacity but high speed. The NV cache 106 is used as a read/write cache to accelerate data accesses to the hybrid disk 100.
To manage the NV cache 106, the hybrid disk 100 needs to set up a metadata (index) structure inside the disk drive. To ensure performance, the cache metadata must be queried and updated very efficiently. One portion of the DRAM 104 inside the hybrid disk 100 is reserved to store the cache metadata 110. However, the size of the DRAM 104 is quite small, and most of it must be used as the disk buffer 112. Thus, the space available for the cache metadata inside the DRAM 104 is limited. In addition, the more data is held in the DRAM 104, the more power the DRAM 104 consumes.
The disk media 102 may have a plurality of data blocks. Each data block on the disk media 102 can have a single sector or multiple consecutive sectors. If the data block has a single sector, the data block is represented by its Logical Block Address (LBA). If the data block has multiple consecutive sectors, the data block is represented by the LBA of the first sector.
The NV cache 106 may have a plurality of cache blocks. Each cache block in the NV cache 106 can be defined with the same block size as a data block on the disk media 102 and is represented by its Cache Block Address (CBA). The cache metadata maintains the mapping between an LBA and a CBA (indicating that the LBA data block is cached in the CBA cache block), and also the status (e.g., CLEAN, DIRTY, or FREE) of each cache block. The status of each cache block may contain only one description for the whole block, or each sector within the cache block may have its own description.
A conventional cache management scheme uses a set-associative hash table to store the cache metadata. The entire NV cache CBA space is divided into N sets, and each set has a plurality of blocks as shown in FIG. 2. Each LBA in the disk media is hashed into one of the sets of the NV cache using the hash function: target set = (LBA / block size / set size) mod (number of sets).
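The hash function above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes that LBAs count 512-byte sectors, that the block size is expressed in sectors per block (8 for a 4 KB block), that the set size is expressed in blocks per set (512, matching Block 0 to Block 511 of FIG. 2), and that both divisions are integer divisions. The constant and function names are hypothetical.

```python
# Assumed geometry (illustrative values, not taken from the figures):
SECTORS_PER_BLOCK = 8   # assumption: 4 KB block / 512-byte sectors
BLOCKS_PER_SET = 512    # assumption: one set holds Block 0 .. Block 511
NUM_SETS = 4096         # assumption: 2**21 cache blocks / 512 blocks per set


def target_set(lba: int) -> int:
    """Return the set index for a sector LBA.

    Implements: (LBA / block size / set size) mod (number of sets),
    with both divisions taken as integer divisions.
    """
    block_index = lba // SECTORS_PER_BLOCK      # which data block the LBA falls in
    group_index = block_index // BLOCKS_PER_SET  # which run of 512 consecutive blocks
    return group_index % NUM_SETS
```

Under these assumptions, all LBAs within the same run of 512 consecutive data blocks map to the same set, and the mapping wraps around after 4096 such runs.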
Within one set (e.g., set i of FIG. 2), the LBAs are stored linearly from the first entry (e.g., Block 0 of FIG. 2) to the last entry (e.g., Block 511 of FIG. 2) of the set. Therefore, to query whether an LBA exists in the hash table, the scheme must first compute the corresponding hash set for the LBA, and then search linearly from the beginning to the end of the set to check whether the LBA exists in the set. Similarly, to store an LBA into the hash table, the scheme must first compute the corresponding hash set for the LBA, and then search linearly within the set to find a free entry in which to store the LBA. Because both operations require a linear scan of the set, the metadata search and update of the conventional cache management scheme are inefficient.
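The linear query and store operations within a single set can be sketched as follows. This is an illustrative sketch only: the set is modeled as a Python list of entries, `None` is an assumed marker for a free entry, and the function names are hypothetical. Selecting the set with the hash function is assumed to have been done already.

```python
FREE_ENTRY = None  # assumption: marker for an unused entry in a set


def find_lba(entries, lba):
    """Scan the set from first to last entry; return the entry index or -1."""
    for index, stored in enumerate(entries):
        if stored == lba:
            return index
    return -1


def store_lba(entries, lba):
    """Scan the set for the first free entry and store the LBA there.

    Returns the entry index used, or -1 if the set has no free entry.
    """
    for index, stored in enumerate(entries):
        if stored is FREE_ENTRY:
            entries[index] = lba
            return index
    return -1
```

Both functions visit up to every entry of the set, so with 512 entries per set a miss costs 512 comparisons, which illustrates the inefficiency noted above.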
For a hybrid disk drive 100, the disk media size may be about 1 TB, the DRAM size may be about 16 MB, and the NV cache size may be about 8 GB. If the data block size is about 4 KB, the NV cache will have a total of 8 GB / 4 KB = 2^21 cache blocks (CBAs). In the ATA standard, each disk block LBA is represented by 6 bytes. Thus, for each cache block (CBA), the corresponding entry in the hash set is represented by 6 bytes for the LBA and 2 bits for the status of the cache block. FIG. 3 illustrates the metadata table 300 of the NV cache under the conventional set-associative hash scheme. As a result, the total size of the in-DRAM hash table 300 is about 2^21 × (6 bytes + 2 bits) = 12.5 MB, which is large compared to the DRAM size of about 16 MB. Therefore, the conventional cache management scheme is impractical for hybrid disks.
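The size estimate above can be checked with a short calculation. This sketch assumes binary units (1 KB = 2^10 bytes, 1 MB = 2^20 bytes, 1 GB = 2^30 bytes) and exactly one table entry per cache block; the variable names are hypothetical.

```python
KB = 2**10  # assumption: binary units throughout
MB = 2**20
GB = 2**30

nv_cache_bytes = 8 * GB
block_bytes = 4 * KB

# One cache block (CBA) per 4 KB of NV cache.
num_cache_blocks = nv_cache_bytes // block_bytes

# Each entry holds a 6-byte LBA plus 2 status bits.
bits_per_entry = 6 * 8 + 2

table_bytes = num_cache_blocks * bits_per_entry // 8
table_mb = table_bytes / MB
```

Under these assumptions, `num_cache_blocks` evaluates to 2^21 and `table_mb` to 12.5, leaving only about 3.5 MB of a 16 MB DRAM for the disk buffer.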