Advances in computer technology (e.g., microprocessor speed, memory capacity, data transfer bandwidth, software functionality, and the like) have generally contributed to increased computer application in various industries. Ever more powerful server systems, which are often configured as an array of servers, are commonly provided to service requests originating from external sources such as the World Wide Web.
A continuing problem in computer systems is handling the growing amount of available information or data. The sheer amount of information stored on disks or other media for databases has been increasing dramatically. A few decades ago, files and disks were measured in thousands of bytes, then in millions of bytes (megabytes), followed by billions of bytes (gigabytes); now databases of a million megabytes (terabytes) and even billions of megabytes are being created and employed in day-to-day activities.
Moreover, various forms of storage devices allow information to be held over a relatively long period without information degradation. A common storage medium is flash memory, a non-volatile form of storage that retains information without drawing upon a constant source of power. This type of memory is often employed in a variety of consumer electronic devices such as memory cards, universal serial bus (USB) flash drives, personal digital assistants (PDAs), digital audio players, digital cameras, mobile phones, and so forth.
Another common type of non-volatile storage medium is a magnetic disk, which enables information to be recorded according to a magnetization pattern. Similar to other storage media, magnetic disks can be configured in a variety of manners (e.g., Magnetoresistive Random Access Memory) as well as employed in many different applications. This type of storage device is commonly used in connection with databases and analog recordings. Likewise, volatile forms of storage exist that provide certain benefits that may also be accompanied by particular disadvantages. For example, retrieval times for volatile media are generally faster than those for non-volatile media, and many operations have increased uniformity due to well-established standards.
Moreover, applications today run on different tiers, in different service boundaries, and on different platforms (e.g., server, desktop, devices). For example, in a typical web application, many application components reside on a server supporting a large number of users; however, some client components of the application may run on desktops, mobile devices, web browsers, and the like. Furthermore, advances in connectivity and cheap storage, combined with the complexity of software management, facilitate on-line services and software-as-a-service. In such service models, applications (and associated data) are typically hosted in central data centers (also sometimes referred to as the ‘cloud’) and are accessible and shared over the web.
Such distributed applications require support for a large number of users, along with high performance, high throughput, and low response time. Such a service orientation also requires the cost of service to be low, and therefore requires scalability and performance at low cost.
A further challenge in implementing storage systems is support for the distribution and heterogeneity of data and applications. Applications compose (e.g., as mashups) data and business logic from sources that can be local, federated, or cloud-based. Composite applications require aggregated data to be shaped in the form that is most suitable for the application. Data and logic sharing is also an important requirement in composite applications.
As explained earlier, data and applications can reside in different tiers with different semantics and access patterns. For example, data in back-end servers/clusters or in the cloud tends to be authoritative; data on the wire is message-oriented; data in the mid-tier is either cached data for performance or application session data; data on the devices could be local data or data cached from back-end sources. As memory costs fall, considerably large caches can be configured on desktop and server machines. With the maturity of 64-bit hardware, 64-bit CPUs are becoming mainstream for client and server machines. True 64-bit architectures support 64-bit CPUs, data or address buses, and virtual addressability, and dramatically increase memory limits (to 2^64 bytes). Operating systems (e.g., Windows, Linux) have also been upgraded to support and take advantage of the 64-bit address space and large memories. For example, desktops can be configured with 16 GB of RAM, and servers can be configured with up to 2 TB of RAM. Large memory caches allow data to be located close to the application, thereby providing significant performance benefits to applications. In addition, in a world where hundreds of gigabytes of storage is the norm, the ability to work with most data in memory (large caches) and to shift easily from tables to trees to graphs of objects is the key to programmer productivity for next-generation applications.
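The scale of the address-space limits discussed above can be checked with a short calculation. The following is a minimal sketch in Python, not part of the described system; the function name addressable_bytes and the use of binary units (1 GiB = 2^30 bytes, 1 TiB = 2^40 bytes, 1 EiB = 2^60 bytes) are assumptions made for illustration only.

```python
# Illustrative arithmetic only: compares the byte-addressable range of
# 32-bit and 64-bit address spaces. Names and units are hypothetical choices.

GIB = 2 ** 30  # bytes in one gibibyte (binary unit, assumed here)
TIB = 2 ** 40  # bytes in one tebibyte
EIB = 2 ** 60  # bytes in one exbibyte


def addressable_bytes(address_bits: int) -> int:
    """Return the number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


limit_32 = addressable_bytes(32)
limit_64 = addressable_bytes(64)

# A 32-bit address space covers 4 GiB; a 64-bit address space covers 16 EiB.
print(limit_32 // GIB, "GiB addressable with 32 bits")
print(limit_64 // EIB, "EiB addressable with 64 bits")

# How many 2 TiB server memories would fit in a 64-bit address space.
print(limit_64 // (2 * TIB), "servers' worth of 2 TiB RAM")
```

Under these assumptions, even a server configured with 2 TB of RAM uses only a small fraction of the range that a 64-bit address can name, which is why large in-memory caches become practical once the hardware and operating system support 64-bit addressing.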