Demand for computer disk storage has increased sharply in the last decade. Computer hard-disk technology and the resulting storage densities have grown rapidly. Despite application-program bloat, a substantial increase in web sites and their storage requirements, and wide use of large multimedia files, disk-drive storage densities have been able to keep up. Disk performance, however, has lagged. Access time and rotational speed of disks, key performance parameters in many applications, have improved only incrementally in the last 10 years.
Web sites on the Internet may store vast amounts of data, and large web server farms may host many web sites. Storage Area Networks (SANs) are widely used as a centralized data store. Another widespread storage technology is Network Attached Storage (NAS). These disk-based technologies are now widely deployed but consume substantial amounts of power and can become a central-resource bottleneck. The recent rise in energy costs makes further expansion of these disk-based server farms undesirable. Newer, lower-power technologies are desirable.
FIG. 1 highlights a prior-art bottleneck problem with a distributed web-based database server. A large number of users access data in database 16 through servers 12 on web 10. Web 10 can be the Internet, a local Intranet, or other network. As the number of users accessing database 16 increases, additional servers 12 may be added to handle the increased workload. However, database 16 is accessible only through database server 14. The many requests to read or write data in database 16 must funnel through database server 14, creating a bottleneck that can limit performance.
FIG. 2 highlights a coherency problem when a database is replicated to reduce bottlenecks. Replicating database 16 by creating a second database 16′ that is accessible through second database server 14′ can reduce the bottleneck problem by servicing read queries. However, a new coherency problem is created with any updates to the database. One user may write a data record on database 16, while a second user reads a copy of that same record on second database 16′. Does the second user read the old record or the new record? How does the copy of the record on second database 16′ get updated? Complex distributed database software or a sophisticated scalable clustered hardware platform is needed to ensure coherency of replicated data accessible by multiple servers.
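The stale-read scenario described above can be illustrated with a minimal sketch. The code below is only a hypothetical illustration: the `Database` class, the `replicate` helper, and the record key are assumptions introduced here and are not part of any described system. Two in-memory dictionaries stand in for database 16 and second database 16′, and replication is modeled as an explicit copy step that may lag behind a write.

```python
# Hypothetical illustration only: two independent copies of one record store.
class Database:
    """A trivial in-memory record store standing in for one database copy."""

    def __init__(self):
        self.records = {}

    def write(self, key, value):
        self.records[key] = value

    def read(self, key):
        return self.records.get(key)


def replicate(source, target):
    """Copy every record from source to target (the update-propagation step)."""
    target.records.update(source.records)


primary = Database()  # plays the role of database 16
replica = Database()  # plays the role of second database 16'

primary.write("record-1", "old")
replicate(primary, replica)        # both copies now agree

primary.write("record-1", "new")   # first user updates the primary copy
stale = replica.read("record-1")   # second user reads the replica before propagation

replicate(primary, replica)        # the update finally reaches the replica
fresh = replica.read("record-1")   # the same read now returns the new record
```

In this sketch the second user sees the old record until `replicate` runs again. Deciding when and how that propagation happens, and what readers may observe in the meantime, is the coherency problem that distributed database software or clustered hardware must solve.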
Adding second database 16′ increases power consumption, since a second set of disks must be rotated and cooled. Operating the motors that physically spin the hard disks, and running the fans and air conditioners that cool them, requires a substantial amount of power.
It has been estimated (by J. Koomey of Stanford University) that aggregate electricity use for servers doubled from 2000 to 2005, both in the U.S. and worldwide. Total power for servers and the required auxiliary infrastructure represented about 1.2% of total U.S. electricity consumption in 2005. As the Internet and its data storage requirements appear to be growing exponentially, these power costs can be expected to keep rising.
Flash memory has replaced floppy disks for personal data transport. Many small key-chain flash devices are available that can each store a few GB of data. Flash storage may also be used for data backup and some other specialized applications. Flash memory uses much less power than rotating hard disks, but the different interfacing requirements of flash have limited its use in large server farms. The slow write time of flash memory complicates the coherency problem of distributed databases.
What is desired is a large storage system that uses flash memory rather than hard disks to reduce power consumption. A flash memory system with many nodes that acts as a global yet shared address space is desirable. A global, shared flash memory spread across many nodes that can coherently share objects is desirable.