Enterprise-class storage requires a highly reliable storage medium with very fast performance. Present solutions fall loosely into one of two categories: hard disk drive (HDD) based platforms or solid state disk (SSD) based platforms. Both accomplish the same effect, that of allowing users to access long-term storage via some means of communication such as Ethernet. Where they differ, however, is in their expense, performance, power requirements, and space requirements. HDD-based solutions are generally less expensive, offer lower performance, consume more power, and require more physical space. In contrast, SSD-based storage solutions are generally much more expensive, offer much higher performance, consume less power, and require less physical space.
One major issue for either an SSD-based or HDD-based enterprise-class storage device is that the host, typically some form of Intel x86-class computer, must run a disk operating system (DOS) in order to translate between the application's request for data (e.g., a file) and the actual physical contents on the HDD(s) or SSD(s). In conventional HDD and SSD systems, the host controls everything that the drive does. Essentially, the drive is “dumb” because it only knows how to respond to requests for a sector of data. As such, there is a centralized bottleneck: some number of HDDs or SSDs must be attached to a host so that the host can convert between the file request and the actual sector address of the data on the drives.
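The host-side conversion described above can be illustrated with a minimal sketch. It is not taken from any particular operating system or file system: the 512-byte sector size, the extent-based file table, the file name, and the function `sectors_for_read` are all hypothetical, chosen only to show why a drive that understands nothing but sector addresses depends on the host for every file request.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # bytes per sector; an assumed, typical value


@dataclass
class Extent:
    """A run of contiguous sectors belonging to one file (hypothetical layout)."""
    start_lba: int     # first sector address of the run on the drive
    sector_count: int  # number of sectors in the run


# Hypothetical host-side file table: file name -> ordered list of extents.
FILE_TABLE = {
    "report.dat": [Extent(start_lba=2048, sector_count=4),
                   Extent(start_lba=9000, sector_count=8)],
}


def sectors_for_read(name, offset, length):
    """Translate a (file, byte offset, byte length) request into the
    sector addresses the host must request from the drive."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    first = offset // SECTOR_SIZE                # first file-relative sector
    last = (offset + length - 1) // SECTOR_SIZE  # last file-relative sector
    lbas = []
    base = 0  # file-relative index of the first sector of the current extent
    for extent in FILE_TABLE[name]:
        for i in range(extent.sector_count):
            if first <= base + i <= last:
                lbas.append(extent.start_lba + i)
        base += extent.sector_count
    if len(lbas) != last - first + 1:
        raise ValueError("read extends past the end of the file")
    return lbas
```

In this sketch, a 1024-byte read starting at byte 1536 of the file straddles the two extents, so the host must issue requests for two sector addresses that are not adjacent on the drive. The drive itself never learns that those sectors belong to the same file.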
Though conventional enterprise-class storage solutions are adequate for many applications, their limitations pose serious problems in applications where very large amounts of data must be stored and processed. For example, the PanSTARRS program run by the University of Hawaii will consist of four 2-meter telescopes, each of which will have a 1.4 gigapixel camera attached and will take a few hundred pictures of the sky every night looking for near-earth objects, that is, those which might conceivably hit the earth. To do this, PanSTARRS must run each 3 gigabyte image through a number of mathematical operations, reduce the data, and digitally look for objects that have moved from frame to frame. From this data an ephemeris is calculated which predicts the path of the found objects. The data processing and storage requirements for this are simply staggering. Something like 4.8 terabytes of raw data are produced and must be processed and reduced.
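The figures quoted above can be checked against one another with a short calculation. The sketch below rests on two assumptions that are not stated in the text: that each pixel is stored as 2 bytes, and that the 4.8 terabyte figure is a per-night total across all four telescopes. The function names are illustrative only.

```python
GB = 10**9   # decimal gigabyte
TB = 10**12  # decimal terabyte


def image_size_bytes(pixels, bytes_per_pixel):
    """Raw size of one image for a given pixel count and sample width."""
    return pixels * bytes_per_pixel


def images_per_telescope(raw_bytes, image_bytes, telescopes):
    """Number of images each telescope contributes to a raw data total."""
    return raw_bytes // image_bytes // telescopes


# 1.4 gigapixels at an ASSUMED 2 bytes per pixel gives 2.8 GB,
# which rounds to the 3 gigabyte image size quoted in the text.
raw_image = image_size_bytes(1_400_000_000, 2)

# 4.8 TB of raw data, ASSUMED to be a nightly total, divided into
# 3 GB images across four telescopes.
nightly_images = images_per_telescope(4800 * GB, 3 * GB, 4)

print(raw_image / GB, "GB per image")
print(nightly_images, "images per telescope")
```

Under these assumptions the total corresponds to 400 images per telescope, which is consistent with the "few hundred pictures" per night mentioned above.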
Similarly, the Large Hadron Collider (LHC) is the world's largest collider project, with an underground ring that is 17 miles around. The LHC produces the world's most energetic collisions between particles. The detector array senses approximately 300 Gb/sec of data, generating 27 terabytes of raw data per day, which is placed in a repository along with the reduced data set.
Both the PanSTARRS and LHC examples represent a new class of computational and storage requirements that are often referred to as tera-scale data sets or exacomputing. Today, these requirements are typically met by massive arrays of PCs tied together to form a network. This provides a large amount of computational horsepower at a reasonable cost, but with fairly large infrastructure, area, and cooling overhead.
In all of these applications (enterprise-class storage, tera-scale data sets, or exacomputing), reliability is of vital importance. SSDs are based on flash memory technology, typically NAND flash. In order to achieve high reliability using flash memory technology, high-end wear leveling hardware and good hardware-based spare sectoring technology are required. In addition, wear leveling algorithms, error coding and correction (ECC) algorithms, and spare sectoring methodology are required to obtain the desired high reliability. These requirements can change over the life of the flash device, as well as over the life of the product which uses the flash device. SSDs are often built using hard-coded Application Specific Integrated Circuit (ASIC) technology and thus cannot be changed later without completely replacing the hard-coded ASICs, which is expensive and time consuming.
In addition, as new flash memory device technologies come on the market, conventional systems that use hard-coded ASICs may not be able to utilize the new flash memory, since their wear leveling algorithms, ECC algorithms, and spare sectoring methodologies may not be compatible with it. In these instances, to utilize the new flash memory technologies, the hard-coded ASICs must be replaced, involving significant expense and effort.
Accordingly, there is a need for a method and apparatus for data storage using flash memory that can adapt to changing requirements of the flash memory. In addition, there is a need for a method and apparatus for data storage that will allow for easily changing wear leveling algorithms, ECC algorithms, and spare sectoring methodology. Also, there is a need for a method and apparatus for data storage that will allow for easily implementing new flash memory technologies. Moreover, there is a need for a method and apparatus for data storage that will overcome the limitations of conventional data storage systems that include massive arrays of PCs tied together to form a network. Furthermore, there is a need for a method and apparatus for data storage that will reduce the large amount of physical volume that present-day SSD and HDD solutions require.