1. Field of the Invention
The present invention relates generally to an apparatus for storing data, and more particularly, to a secondary index structure of a Solid State Drive (SSD) for efficient storage of data.
2. Description of the Related Art
A Solid State Drive (SSD) is a storage medium which stores data. Though the usage, appearance, and installation method of the SSD are similar to those of a hard disk, the SSD has a built-in memory, and thus enables more rapid reading or writing of data than a hard disk. In addition, because no part of an SSD physically rotates or moves as in a hard disk, no noise is generated and power consumption is less than that of a hard disk. Thus, if the SSD is used for a portable computer, battery life increases. SSD products are divided into those that include a Random-Access Memory (RAM) as a storage medium and those that include a flash memory as a storage medium.
Recently, with growing demand for big data and data analysis, a scale-out key-value storage system which functions as a secondary index is desired. In addition, an SSD is needed to increase performance and lower power consumption. However, a conventional secondary index has an Input/Output (I/O) pattern which does not consider the features of the SSD, and therefore, writing is slower than the SSD is capable of, and overhead is incurred due to garbage collection.
FIG. 1A illustrates a B-tree structure and FIG. 1B illustrates a secondary index using the B-tree structure.
Referring to FIGS. 1A and 1B, a secondary index using the B-tree sorts and maintains data according to keys, updates the sorted data when additional data is inserted or deleted, and records the updated data. In this scheme, when additional data is inserted or deleted, the contents of the affected data area are modified and then re-written. In FIG. 1B, when 2 and 5 are inserted into the table (1,4,7), the finally-sorted data table (1,2,4,5,7) is re-written.
However, the related art has a drawback in that re-writing of data occurs in a random writing format. The random writing makes it hard for the SSD to achieve maximum performance, and increases the garbage collection load.
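The rewrite behavior described for FIG. 1B can be sketched as follows. This is a minimal illustration in Python, not the implementation of any particular index; the function name `insert_and_rewrite` and the in-memory list standing in for the stored data area are assumptions made for the example.

```python
import bisect


def insert_and_rewrite(table, new_keys):
    """Return the re-sorted table and the number of entries written back.

    Hypothetical helper: a Python list stands in for the sorted data area.
    """
    updated = list(table)
    for key in new_keys:
        bisect.insort(updated, key)  # keep the table sorted by key
    # The whole sorted table is written back over the old one, so every
    # entry counts as written, not only the newly inserted keys.
    return updated, len(updated)


table, entries_written = insert_and_rewrite([1, 4, 7], [2, 5])
print(table)            # [1, 2, 4, 5, 7]
print(entries_written)  # 5
```

In this sketch, two keys are inserted but five entries are written, and the write targets the location of the existing table rather than the next free sequential location.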
FIG. 2A illustrates a Log-Structured Merge (LSM) tree which performs merge, sort, or compaction, and FIG. 2B illustrates a secondary index using the LSM tree.
A secondary index using the LSM tree sorts and maintains data according to keys, but when additional data is inserted or deleted, the secondary index records the additional data independently, as newly sorted data. As illustrated in FIG. 2B, when additional data is inserted or deleted, the data for the additional operation is separately sorted and stored.
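The separate recording described for FIG. 2B can be sketched as follows, reusing the keys from the FIG. 1B example for comparison. This is a minimal illustration in Python under stated assumptions: the function name `append_run` is hypothetical, and each inner list stands in for one independently sorted set of data.

```python
def append_run(runs, new_keys):
    """Store new keys as a separate sorted run; earlier runs are not modified.

    Hypothetical helper: each inner list stands in for one sorted set of
    data written sequentially to storage.
    """
    return runs + [sorted(new_keys)]


runs = [[1, 4, 7]]
runs = append_run(runs, [5, 2])
print(runs)  # [[1, 4, 7], [2, 5]]
```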
In this technology, I/O traffic occurs with sequential writing, and thus, maximum performance may be achieved. However, as the amount of independently sorted data increases, each separately stored set of data must be accessed to search for data, and consequently, performance is reduced. When reading data, a Bloom filter is used to reduce I/O traffic, but there is a limit to the reduction, and when a merge is performed, overhead is incurred accordingly.
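The read path described above, in which each separately sorted set of data may have to be accessed and a Bloom filter is consulted first, can be sketched as follows. This is an illustration in Python under stated assumptions, not the structure of any particular LSM implementation: the names `BloomFilter`, `build_run`, and `lookup`, the filter size of 64 bits, and the use of three SHA-256-derived hash positions are all choices made for the example.

```python
import hashlib
from bisect import bisect_left


class BloomFilter:
    """Small Bloom filter; sizes are illustrative assumptions."""

    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array stored in a single integer

    def _positions(self, key):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # False means the key is definitely absent; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def build_run(keys):
    """Return one sorted run together with a Bloom filter over its keys."""
    run = sorted(keys)
    bloom = BloomFilter()
    for key in run:
        bloom.add(key)
    return run, bloom


def lookup(runs, key):
    """Search runs from newest to oldest; return (found, runs_read)."""
    runs_read = 0
    for run, bloom in reversed(runs):
        if not bloom.might_contain(key):
            continue  # filter rules the run out, so its data is not read
        runs_read += 1  # this run's data has to be accessed
        i = bisect_left(run, key)
        if i < len(run) and run[i] == key:
            return True, runs_read
    return False, runs_read


runs = [build_run([1, 4, 7]), build_run([2, 5])]
found, runs_read = lookup(runs, 5)
print(found)  # True
```

Because a Bloom filter can report false positives, some runs are still read without containing the key, which is one reason the reduction in I/O traffic is limited.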
Therefore, there is a need for a secondary index structure to reduce I/O traffic cost, reduce merge overhead, and improve space efficiency.