The present invention relates to data storage and, in particular, to the distribution of data storage over a computer network.
A conventional network computer system is comprised of a number of computers that each have an operating system, a network for communicating data between the computers, and at least one data storage device that is attached to at least one of the computers but not directly attached to the network. In such a system, the transfer of data between the data storage device and a computer in the system other than the computer with which the device is associated requires that the operating system of the computer with which the data storage device is associated devote a certain amount of time to the processing of the data transfer. Because the operating system of the computer is typically servicing requests from various applications (e.g., a word processing application) executing on the computer, the operating system typically is only able to devote a limited amount of time to the processing of the data transfer.
While data transfer rates over networks were relatively slow, the operating systems were typically able to service data transfer requests quickly enough to utilize any available time on the network for data transfers between computers in the system. In other words, the networks, due to their relatively low transfer rates, were the bottleneck in transferring data between a data storage device associated with one computer in the system and other computers in the system. However, as the data transfer rates for networks improved, the operating system became the bottleneck because the operating system was typically servicing requests from various applications when the network was available for data transfers to or from the data storage device.
To avoid the operating system bottleneck, data storage devices were developed that directly attached to a network, i.e., network data storage devices. Due to this direct attachment, any computer in the networked computer system is able to directly communicate with the network storage device.
A further advance has been the development of distributed network data storage in which two or more network data storage devices are utilized and a mechanism exists for defining a logical volume, i.e., a unit of data storage, that physically extends over the two or more data storage devices. Consequently, to computers in a networked computer system, the logical volume appears to be a single storage device. An example of a network computer system that employs distributed network storage is comprised of: (a) two fibre channel disk drives; (b) a computer; and (c) a network for facilitating data transfers between the drives and the computer. The computer comprises a driver (a program that allows an operating system to communicate with a device) for each of the drives and a logical volume manager that controls the drivers so as to define a logical or virtual volume that extends over the two fibre channel disk drives.
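By way of illustration only, the following minimal sketch shows one way a logical volume manager might make two drives appear as a single storage device. It assumes simple concatenation of the drives, which is not stated above; other layouts, such as striping, are equally possible. The names Device, locate, drive0, and drive1 and the block counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """A hypothetical network data storage device with a fixed block count."""
    name: str
    num_blocks: int


def locate(devices, logical_block):
    """Map a logical block number to (device name, block on that device).

    Assumes the logical volume is the concatenation of the devices in order.
    """
    if logical_block < 0:
        raise ValueError("logical block must be non-negative")
    remaining = logical_block
    for dev in devices:
        if remaining < dev.num_blocks:
            return dev.name, remaining
        remaining -= dev.num_blocks
    raise ValueError("logical block is beyond the end of the volume")


# A logical volume that extends over two drives (sizes are illustrative).
volume = [Device("drive0", 1000), Device("drive1", 500)]
```

In this sketch, a computer addresses blocks 0 through 1499 of the volume without knowing which drive holds a given block; locate(volume, 1000) resolves to the first block of the second drive.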
The present invention is directed to a system for use in achieving distributed network data storage in a network and that provides the flexibility to achieve additional functionality, such as the ability to scale the data storage, stripe data, replicate data, migrate data, snapshot data, and provide shared access.
In one embodiment, the system is comprised of a storage server system that is, in turn, comprised of one or more data storage servers which provide data storage and data transfer capability for application clients in a networked computer system. An application client is a computer in a networked computer system that is executing or will execute a particular application program (e.g., a database management program) that requires or will likely require data storage and transfer capability. A data storage server is comprised of a data storage device (e.g., a disk drive) and a network interface for communicating, via a network, with an application client and a management storage server.
The system is further comprised of a management storage server system that is, in turn, comprised of one or more management storage servers which each provide certain storage management functionality relative to any application clients and the storage server system. A management storage server is comprised of a network interface for communicating, via a network, with an application client and the storage servers in the storage server system. A management storage server is further comprised of a data storage device (e.g., a disk drive or tape drive).
Each of the management storage servers comprises a data storage configuration identifier that is used to coordinate the operation of the storage servers. The value of the identifier is indicative of an allocation of data storage within the storage server system at a particular point in time. In one embodiment, the value is a time stamp. Other types of values are feasible. The allocation of data storage within the storage server system comprises defining any number of virtual or logical volumes that are each distributed over one or more of the storage servers. Each of the management storage servers is capable of providing a first value for the identifier to an application client. For example, a management storage server provides a first value for the identifier to an application client as part of the allocation of data storage to the application client. Further, each of the management storage servers is capable of providing an updated value for the identifier to each of the storage servers after there is a change in allocation of data storage within the storage server system.
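By way of illustration only, the distribution of the identifier described above can be sketched as follows. The sketch assumes an integer counter in place of a time stamp so that values are deterministic, and it assumes in-process method calls in place of network messages. All class, method, and volume names are hypothetical.

```python
class StorageServer:
    """Hypothetical storage server that records the latest identifier value."""

    def __init__(self):
        self.config_id = None

    def set_config_id(self, value):
        self.config_id = value


class ManagementServer:
    """Hypothetical management storage server that owns the identifier."""

    def __init__(self, storage_servers):
        self.storage_servers = storage_servers
        self.config_id = 0   # assumption: a counter stands in for a time stamp
        self.volumes = {}    # volume name -> indexes of the servers it spans

    def allocate_volume(self, name, server_indexes):
        """Change the allocation, then push the updated value to every server."""
        self.volumes[name] = list(server_indexes)
        self.config_id += 1
        for server in self.storage_servers:
            server.set_config_id(self.config_id)
        # The returned value is what an application client would receive
        # as part of the allocation of data storage to it.
        return self.config_id


servers = [StorageServer(), StorageServer()]
management = ManagementServer(servers)
first_value = management.allocate_volume("vol0", [0, 1])
```

A later call such as management.allocate_volume("vol1", [1]) changes the allocation again, so every storage server receives a new value while a client still holding first_value is out of date.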
The storage servers use the identifier in deciding whether or not to carry out a data related request from an application client. To elaborate, a data related request that a storage server receives from an application client comprises the most recent value of the data storage configuration identifier in the application client's possession. The storage server compares the most recent value of the identifier in its possession to the value of the identifier associated with the received request. If the values are the same, both the application client and the storage server understand the data storage allocation to be the same. In this case, the storage server proceeds with the processing of the data related request. If, however, the value of the identifier in the storage server's possession and the value of the identifier associated with the request are different, the application client and the storage server understand the data allocation to be different. Stated differently, the application client is operating based upon an out of date data storage allocation. In this case, the storage server does not proceed with the processing of the request because to do so might corrupt data. In one embodiment, the storage server causes an error to be generated that is provided, via the network, to a management storage server. In response, the management storage server provides the application client with an updated identifier that the application client is then capable of utilizing to retry the data related request, if desired.
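By way of illustration only, the comparison performed by a storage server can be sketched as follows. For brevity the sketch collapses the error path into an exception raised to the caller, whereas the embodiment described above routes the error through a management storage server. The names CheckingStorageServer and StaleConfigurationError, the write method, and the integer identifier values are hypothetical.

```python
class StaleConfigurationError(Exception):
    """Raised when a request carries an out-of-date identifier value."""


class CheckingStorageServer:
    """Hypothetical storage server that checks the identifier on each request."""

    def __init__(self, config_id):
        self.config_id = config_id  # most recent value in the server's possession
        self.blocks = {}            # block number -> stored data

    def write(self, client_config_id, block, data):
        # Refuse the request unless both sides hold the same value, because
        # acting on an out-of-date allocation might corrupt data.
        if client_config_id != self.config_id:
            raise StaleConfigurationError(
                f"request carries {client_config_id}, server holds {self.config_id}"
            )
        self.blocks[block] = data


server = CheckingStorageServer(config_id=2)
try:
    server.write(client_config_id=1, block=7, data=b"old")  # stale: rejected
    rejected = False
except StaleConfigurationError:
    rejected = True

# After obtaining the updated value, the client retries the same request.
server.write(client_config_id=2, block=7, data=b"new")
```

The rejected request leaves the stored data untouched; only the retry made with the matching value is processed.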