Storage devices are employed to store data that is accessed by computer systems. Examples of basic storage devices include volatile and non-volatile memory, floppy drives, hard disk drives, tape drives, and optical drives.
Disk drives contain at least one magnetic disk that rotates relative to a read/write head and that stores data in a non-volatile manner. Data to be stored on a magnetic disk is generally divided into a plurality of equal-length data sectors. A typical data sector, for example, may contain 512 bytes of data. A disk drive is capable of performing a write operation and a read operation. During a write operation, the disk drive receives data from a host computer along with instructions to store the data to a specific location, or set of locations, on the magnetic disk. The disk drive then moves the read/write head to that location, or set of locations, and writes the received data. During a read operation, the disk drive receives instructions from a host computer to access data stored at a specific location, or set of locations, and to transfer that data to the host computer. The disk drive then moves the read/write head to that location, or set of locations, reads the data stored there, and transfers that data to the host computer.
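The sector-oriented write and read operations described above can be illustrated with a minimal sketch. The class name `SimulatedDisk`, its methods, and the zero-padding of a partially filled sector are assumptions made for illustration only; an in-memory list stands in for the magnetic disk, and no head movement is modeled.

```python
SECTOR_SIZE = 512  # bytes per sector, the typical size mentioned above


class SimulatedDisk:
    """Hypothetical in-memory stand-in for a disk divided into equal-length sectors."""

    def __init__(self, sector_count):
        # Every sector starts out filled with zero bytes.
        self.sectors = [bytes(SECTOR_SIZE) for _ in range(sector_count)]

    def write(self, start_sector, data):
        """Store data starting at start_sector, one sector at a time."""
        for offset in range(0, len(data), SECTOR_SIZE):
            chunk = data[offset:offset + SECTOR_SIZE]
            index = start_sector + offset // SECTOR_SIZE
            # Assumption: a short final chunk is padded with zeros to a full sector.
            self.sectors[index] = chunk.ljust(SECTOR_SIZE, b"\x00")

    def read(self, start_sector, sector_count):
        """Return the contents of sector_count consecutive sectors."""
        return b"".join(self.sectors[start_sector:start_sector + sector_count])


disk = SimulatedDisk(sector_count=8)
disk.write(2, b"a" * 600)      # 600 bytes span sectors 2 and 3
payload = disk.read(2, 2)      # the host reads both sectors back
```

In this sketch the host always transfers whole sectors, so a 600-byte write occupies two sectors and a read of those sectors returns 1024 bytes.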
Virtually all computer application programs rely on such storage devices, which may be used to store computer code and the data manipulated by that code. A typical computer system includes one or more host computers that execute such application programs and one or more storage systems that provide storage.
The host computers may access data by sending access requests to the one or more storage systems. Some storage systems require that the access requests identify units of data to be accessed using logical volume (“LUN”) and block addresses that define where the units of data are stored on the storage system. Such storage systems are known as “block I/O” storage systems. In some block I/O storage systems, the logical volumes presented by the storage system to the host correspond directly to physical storage devices (e.g., disk drives) on the storage system, so that the specification of a logical volume and block address specifies where the data is physically stored within the storage system. In other block I/O storage systems (referred to as intelligent storage systems), internal mapping technology may be employed so that the logical volumes presented by the storage system do not necessarily map in a one-to-one manner to physical storage devices within the storage system. Nevertheless, the specification of a logical volume and a block address used with an intelligent storage system specifies where associated content is logically stored within the storage system, and from the perspective of devices outside of the storage system (e.g., a host) is perceived as specifying where the data is physically stored.
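The difference between a direct mapping and the internal mapping of an intelligent storage system can be sketched as follows. Both class names and the round-robin placement policy are hypothetical and chosen only for illustration; actual storage systems may use very different mapping technology.

```python
class DirectStorageSystem:
    """Each logical volume corresponds directly to one physical device."""

    def resolve(self, lun, block):
        # The logical address is also the physical address.
        return (lun, block)


class IntelligentStorageSystem:
    """Keeps an internal table from logical addresses to physical locations."""

    def __init__(self, device_count):
        self.device_count = device_count
        self.mapping = {}                     # (lun, block) -> (device, physical block)
        self.next_free = [0] * device_count   # next unused block on each device

    def resolve(self, lun, block):
        key = (lun, block)
        if key not in self.mapping:
            # Assumed policy: place newly addressed blocks round-robin across devices.
            device = len(self.mapping) % self.device_count
            self.mapping[key] = (device, self.next_free[device])
            self.next_free[device] += 1
        return self.mapping[key]


direct = DirectStorageSystem()
intelligent = IntelligentStorageSystem(device_count=2)
same_place = direct.resolve(0, 10)         # (0, 10): volume 0 is device 0
elsewhere = intelligent.resolve(0, 10)     # physical location chosen internally
```

In both cases the host supplies the same logical volume and block address; only the intelligent system is free to place the block on a device that differs from the volume number.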
Block I/O storage systems can be abstracted by utilizing a file system. A file system is a logical construct that translates physical blocks of storage on a storage device into logical files and directories. In this way, the file system aids in organizing content stored on a disk. For example, an application program having ten logically related blocks of content to store on disk may store the content in a single file in the file system. Thus, the application program may simply track the name and/or location of the file, rather than tracking the block addresses of each of the ten blocks on disk that store the content. In general, since file systems provide computer application programs with access to data stored on storage devices in a logical, coherent way, file systems hide the details of how data is stored on storage devices from application programs.
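The ten-block example above can be made concrete with a small sketch. The class `TinyFileSystem`, its 4-byte block size, and its append-only allocation are illustrative assumptions and do not describe any particular file system.

```python
class TinyFileSystem:
    """Hypothetical file system that hides block addresses behind file names."""

    BLOCK_SIZE = 4  # deliberately tiny so a short string spans several blocks

    def __init__(self):
        self.blocks = []       # simulated block storage; list index = block address
        self.file_table = {}   # file name -> ordered list of block addresses

    def write_file(self, name, content):
        addresses = []
        for offset in range(0, len(content), self.BLOCK_SIZE):
            self.blocks.append(content[offset:offset + self.BLOCK_SIZE])
            addresses.append(len(self.blocks) - 1)
        self.file_table[name] = addresses

    def read_file(self, name):
        # The caller supplies only the name; block addresses stay internal.
        return b"".join(self.blocks[address] for address in self.file_table[name])


fs = TinyFileSystem()
fs.write_file("report", b"0123456789" * 4)   # 40 bytes occupy ten 4-byte blocks
content = fs.read_file("report")
```

The application tracks only the name `"report"`, while the file table records which ten blocks hold the content.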
File systems can maintain several different types of files, including regular files and directory files. Files can be presented to application programs through directory files that form a tree-like hierarchy of files and subdirectories containing more files. A filename is unique within its directory but need not be unique across a file system volume. Application programs identify files by pathnames composed of the filename and the names of all enclosing directories. The complete directory structure is called the file system namespace. For each file, file systems may maintain attributes such as ownership information, access privileges, access times, and modification times.
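Pathname resolution over a tree-like namespace can be sketched with nested dictionaries. The function name `resolve_path`, the `/` separator, and the sample directory names are assumptions made for illustration.

```python
def resolve_path(root, pathname):
    """Walk the directory tree one name at a time and return the node reached."""
    node = root
    for name in (part for part in pathname.split("/") if part):
        node = node[name]   # raises KeyError if the name is absent from this directory
    return node


# Directories are dicts; regular files are strings holding their content.
namespace = {
    "home": {
        "alice": {"notes.txt": "alice's notes"},
        "bob": {"notes.txt": "bob's notes"},
    }
}

alice_notes = resolve_path(namespace, "/home/alice/notes.txt")
bob_notes = resolve_path(namespace, "/home/bob/notes.txt")
```

The filename `notes.txt` appears twice in the volume, yet each pathname identifies exactly one file because the enclosing directories differ.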
In contrast to block I/O storage systems and file systems, some storage systems receive and process access requests that identify a data unit or other content unit (also referred to as an object) using an object identifier, rather than an address that specifies where the data unit is physically or logically stored in the storage system. Such storage systems are referred to as object-based storage systems. In object-based storage, a content unit may be identified (e.g., by host computers requesting access to the content unit) using its object identifier, and the object identifier may be independent of both the physical and logical location(s) at which the content unit is stored. In some cases, however, the storage system may use the object identifier to inform where a content unit is stored in the storage system. From the perspective of the host computer or user accessing a content unit on an object-based system, the object identifier does not control where the content unit is logically or physically stored. Thus, if the physical or logical location at which the unit of content is stored changes, the identifier by which host computer(s) access the unit of content may remain the same. In contrast, in a block I/O storage system, if the location at which the unit of content is stored changes in a manner that impacts the logical volume and block address used to access it, any host computer accessing the unit of content must be made aware of the location change and then use the new location of the unit of content for future accesses.
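The location independence of an object identifier can be illustrated with a minimal sketch. The class `ObjectStore`, the node names, and the use of a random UUID as the identifier are hypothetical choices for illustration; an actual object-based system may generate identifiers and place content quite differently.

```python
import uuid


class ObjectStore:
    """Hypothetical object-based store; hosts see identifiers, never locations."""

    def __init__(self):
        self.nodes = {"node-a": {}, "node-b": {}}   # simulated storage locations
        self.locations = {}                         # object identifier -> node name

    def put(self, content):
        object_id = uuid.uuid4().hex   # assumed scheme: identifier unrelated to location
        self.nodes["node-a"][object_id] = content
        self.locations[object_id] = "node-a"
        return object_id

    def get(self, object_id):
        return self.nodes[self.locations[object_id]][object_id]

    def migrate(self, object_id, target):
        """Move a content unit internally; the identifier held by hosts is unchanged."""
        source = self.locations[object_id]
        self.nodes[target][object_id] = self.nodes[source].pop(object_id)
        self.locations[object_id] = target


store = ObjectStore()
oid = store.put(b"quarterly report")
store.migrate(oid, "node-b")     # the location changes inside the storage system
content = store.get(oid)         # the host still uses the same identifier
```

A block I/O host in the same situation would need to learn the new logical volume and block address before its next access.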
One example of an object-based system is a content addressable storage (CAS) system. In a CAS system, the object identifiers that identify content units are content addresses. A content address is an identifier that is computed, at least in part, from at least a portion of the content of its corresponding unit of content. For example, a content address for a unit of content may be computed by hashing the unit of content and using the resulting hash value as the content address.
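Computing a content address by hashing can be sketched as follows. The choice of SHA-256, the class name `ContentAddressableStore`, and the integrity check on retrieval are assumptions made for illustration; the text above does not specify a particular hash function.

```python
import hashlib


class ContentAddressableStore:
    """Hypothetical CAS sketch: the identifier is derived from the content itself."""

    def __init__(self):
        self.units = {}   # content address -> content

    def put(self, content):
        # Assumption: SHA-256 over the entire unit of content is the content address.
        address = hashlib.sha256(content).hexdigest()
        self.units[address] = content
        return address

    def get(self, address):
        content = self.units[address]
        # Re-hashing on retrieval detects content that no longer matches its address.
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("stored content does not match its content address")
        return content


cas = ContentAddressableStore()
first = cas.put(b"hello world")
second = cas.put(b"hello world")    # identical content yields the identical address
third = cas.put(b"hello world!")    # different content yields a different address
```

Because the address depends only on the content, storing the same unit of content twice in this sketch results in a single stored copy.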
Data can also be stored and managed in a system by using database management systems (DBMSs). The relational approach to database management typically represents all information as “tables.” A “database” can be a collection of tables, each table having rows and columns. In a relational database, the rows of a table may represent records (collections of information about separate items) and the columns may represent fields (particular attributes of a record). In conducting searches, a relational database matches information from a field (column) in one table with information from a corresponding field (column) of another table to produce a third table that combines requested data from both tables.
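The matching of a field in one table against the corresponding field of another can be illustrated with Python's built-in `sqlite3` module. The table names, column names, and sample rows below are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 'disk'), (11, 2, 'tape'), (12, 1, 'cable');
""")

# Match the customer_id column of one table against the same column of the other,
# producing a result table that combines fields from both.
joined = conn.execute("""
    SELECT customers.name, orders.item
    FROM customers
    JOIN orders ON customers.customer_id = orders.customer_id
    ORDER BY orders.order_id
""").fetchall()
```

Each row of `joined` pairs a customer name taken from one table with an item taken from the other, which corresponds to the combined third table described above.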
Databases generally require a consistent structure, termed a schema, to organize and manage the information. In a relational database, the schema can consist of a collection of tables. Similarly, for each table, there is generally one schema to which it belongs. Once the schema is designed, the DBMS is used to build the database and to operate on data within the database. Database management systems generally provide mechanisms for building databases and operating on the data they contain. One such mechanism involves specifying data retrieval operations, often called “queries,” to, for example, search the database and then retrieve and display the requested information.
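The sequence of designing a schema, building the database, and issuing a query can be sketched with Python's built-in `sqlite3` module. The `devices` table, its columns, and the helper names `build_database` and `query_by_kind` are hypothetical and serve only as an illustration.

```python
import sqlite3

# The schema fixes the structure that every row of the table must follow.
SCHEMA = """
CREATE TABLE devices (
    device_id   INTEGER PRIMARY KEY,
    kind        TEXT NOT NULL,
    capacity_gb INTEGER NOT NULL
);
"""


def build_database():
    """Create an in-memory database from the schema and load sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO devices (kind, capacity_gb) VALUES (?, ?)",
        [("disk", 500), ("tape", 800), ("disk", 2000)],
    )
    return conn


def query_by_kind(conn, kind):
    """Retrieve the identifier and capacity of every device of the given kind."""
    return conn.execute(
        "SELECT device_id, capacity_gb FROM devices WHERE kind = ? ORDER BY device_id",
        (kind,),
    ).fetchall()


conn = build_database()
disks = query_by_kind(conn, "disk")
```

Because the schema declares `kind` as `NOT NULL`, an attempt to insert a row without it is rejected by the DBMS, which is one way a schema keeps the stored information consistent.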