1. Field of the Invention
The present invention relates to the computer field, and particularly, to a storage system and a method for realizing the storage system.
2. Description of the Related Art
The storage system is one of the most important components in a computer system. It is always desired that the storage system has high speed (high IOPS and high throughput), high reliability and low cost.
Most existing servers and PCs still use hard disk drives (HDDs) as the primary storage system. Although an HDD-based storage system has low cost, the improvement in HDD access latency in recent years has lagged far behind the improvement in capacity. High latency greatly reduces the IOPS (I/O operations per second) value. For a workload in which random I/Os dominate, e.g., a database workload, the HDD becomes a performance bottleneck, whereas for a workload in which sequential accesses dominate, the access performance of the HDD is better. For example, a current single HDD can reach a sustained transfer rate of up to 160 MB/s.
Compared with HDDs, the solid state disks (SSDs) that have emerged in recent years have lower random access latency (0.03 ms for an SSD vs. 5 ms for an HDD) and higher read IOPS (35000 for an SSD vs. 200 for an HDD). Although SSD capacities were small at first, in recent years they have been gradually catching up with those of HDDs. For example, some high-end SSDs now offer 2 TB. The energy consumption of SSDs is also lower than that of HDDs. However, SSDs have the following drawbacks. Firstly, SSDs cost far more than HDDs (taking current market prices as an example, $10/GB for an SSD vs. $0.6/GB for an HDD). In an SSD, the controller is extraordinarily expensive. Secondly, each cell of an SSD has only a limited number of write cycles, which impacts its reliability. Although several techniques have been introduced to overcome this problem, they increase the cost of the controller and reduce the IOPS and throughput. Thirdly, the read and write performance of an SSD is asymmetrical. The write latency of an SSD is 10 times larger than the read latency. This is because a write operation must first erase the whole block (each block is about 0.5-1 MB) and then write the original data of the block back to the block together with the new data, which is very slow.
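The figures above can be related by simple arithmetic. The following is a minimal sketch, not part of the described system: it assumes one outstanding request at a time (so that IOPS is bounded by the reciprocal of latency) and a hypothetical 4 KB update written into a 0.5 MB block. The function names max_iops and rewrite_overhead are illustrative only.

```python
def max_iops(latency_ms: float) -> float:
    """Upper bound on I/O operations per second when requests are
    served one at a time, each taking latency_ms milliseconds."""
    return 1000.0 / latency_ms


def rewrite_overhead(block_bytes: int, payload_bytes: int) -> float:
    """Bytes physically rewritten per byte requested when the whole
    block must be erased and written back together with the new data."""
    return block_bytes / payload_bytes


# Latencies quoted in the text: 5 ms for an HDD, 0.03 ms for an SSD.
hdd_iops = max_iops(5.0)
ssd_iops = max_iops(0.03)

# Assumed example: a 4 KB update landing in a 0.5 MB block.
overhead = rewrite_overhead(512 * 1024, 4 * 1024)

print(f"HDD bound: {hdd_iops:.0f} IOPS")
print(f"SSD bound: {ssd_iops:.0f} IOPS")
print(f"Rewrite overhead: {overhead:.0f}x")
```

Under these assumptions the HDD bound matches the 200 IOPS quoted above, and the SSD bound is of the same order as the quoted 35000. The overhead figure illustrates why small random writes are costly when a whole block must be erased and rewritten.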
An important goal is a storage system that has high read and write performance, high reliability and low cost. Currently, there are several solutions, as follows:
The first is to form a RAID (Redundant Array of Independent Disks) from HDDs and distribute random reads and writes across several HDDs, so as to improve the read and write performance and reliability. However, this solution has a high energy cost, occupies a large amount of data center rack space, and offers only limited performance improvement.
The second is to utilize the lower read latency of SSDs while alleviating the disadvantageous effects of random writes by means of partially-occupied block merging and a wear-leveling algorithm implemented in the controller firmware. This includes writing data in large continuous segments, merging the partially-occupied blocks in the background, performing garbage collection on free blocks, and avoiding continual writes to certain hot blocks. However, constant background file system scanning to identify the partially-occupied blocks and free blocks consumes considerable bandwidth of the SSD controller. Moreover, it is difficult and costly for the firmware to implement a complex garbage collection algorithm. In addition, a single SSD is unable to provide high availability, while an SSD array is excessively expensive.
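The idea of avoiding continual writes to hot blocks can be illustrated as follows. This is a minimal sketch of one simple wear-leveling policy (always allocate the free block with the fewest erases); it is an assumption made for illustration and does not represent any particular controller firmware. The class WearLeveler and its methods are hypothetical names.

```python
class WearLeveler:
    """Toy allocator that spreads erases evenly across blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))

    def allocate(self) -> int:
        """Return the free block with the fewest erases so far
        (ties broken by the lowest block index)."""
        if not self.free:
            raise RuntimeError("no free blocks")
        block = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(block)
        return block

    def release(self, block: int) -> None:
        """Erase a block and return it to the free pool."""
        self.erase_counts[block] += 1
        self.free.add(block)


# Eight write/erase rounds over four blocks.
leveler = WearLeveler(4)
for _ in range(8):
    blk = leveler.allocate()
    leveler.release(blk)

print(leveler.erase_counts)
```

In this toy run the erases are spread evenly over the four blocks instead of concentrating on one. A real controller must also track valid data and perform the merging and garbage collection described above, which is where the firmware cost arises.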
The third is to use SSDs as the cache of HDDs. This solution does not provide high reliability, and it still suffers from the disadvantageous effects of SSD writes, e.g., the write cycles are limited and the write latency is long.
The fourth is to combine SSDs and HDDs and partition the data manually. For example, in GPFS (General Parallel File System), metadata is stored in SSDs and data is stored in HDDs through manual distribution. This solution not only imposes a great management burden, but also fails to achieve high data read and write performance and high data availability.