Increasing advances in computer technology (e.g., microprocessor speed, memory capacity, data transfer bandwidth, software functionality, and the like) have generally contributed to increased computer application in various industries. Ever more powerful server systems, which are often configured as an array of servers, are often provided to service requests originating from external sources such as the World Wide Web, for example.
Typically, a continuing problem in computer systems remains handling of the growing amount of information or data available. The sheer amount of information being stored on disks or other media for databases in some form has been increasing dramatically. While files and disks were measured in thousands of bytes a few decades ago, now databases of a million megabytes (terabytes) and even billions of megabytes are being created and employed in day-to-day activities.
Furthermore, today applications run on different tiers, in different service boundaries, and on different platforms (e.g. server, desktop, devices). For example, in a typical web application, many applications reside on a server supporting a large number of users; however, some client components of the application can run on desktops, mobile devices, and web browsers, and the like. In addition, advances in connectivity and cheap storage combined with the complexity of software management facilitate on-line services and software-as-a-service. In such services models, applications (and their data) are hosted in central data centers (e.g., referred to as the “cloud”) and are accessible and shared over the web.
The distributed applications require support for large number of users, high performance, throughput and response time. Such services orientation also requires the cost of service to be low, thereby requiring the scalability and performance at low cost.
A further challenge in implementing storage systems is support for distribution and heterogeneity of data and applications. Applications are composing (e.g. mashups) data and business logic from sources that can be local, federated, or cloud-based. Composite applications require aggregated data to be shaped in a form that is most suitable for the application. Data and logic sharing is also an important requirement in composite applications.
As explained earlier, data/applications can reside in different tiers with different semantics and access patterns. For example, data in back-end servers/clusters or in the cloud tends to be authoritative; data on the wire is message-oriented; data in the mid-tier is either cached data for performance or application session data; and data on the devices can be local data or data cached from back-end sources. With the costs of memory falling, considerably large caches can be configured on the desktop and server machines. With the maturity of 64-bit hardware, 64-bit CPUs are becoming mainstream for client and server machines. True 64-bit architectures support 64-bit CPUs, data or address buses, virtual addressability and dramatically increase memory limits (to 264 bytes). Operating systems (e.g. Windows, Linux) are also upgraded to support and take advantage of 64 bit address-space and large memories. For example, desktops can be configured with 16 GB RAM, and servers can be configured with up to 2 TB of RAM. Large memory caches allow for data to be located close to the application, thereby providing significant performance benefits to such applications. In addition, in a world where hundreds of gigabytes of storage is the norm, the ability to work with most data in memory (large caches), and readily shift such data from tables/trees to graphs of objects is the key to programmer productivity for next generation applications. Moreover, supplying replication capability to highly available store remains limited to specific type of storage mediums, and is not readily designed for generic use and optimized for substantially high latency requirements.