1. Field of the Invention
The present invention relates generally to computer data storage systems, and more particularly to data networks having plural network-attached storage devices for distributed storage of data.
2. Description of the Related Art
Storage area networks (SANs) permit multiple computers, referred to herein as “principals”, to access data that is stored across multiple storage devices (such as disks, tapes, or DVDs). In such a distributed system, because each principal can directly access each storage device as required, no central server is required. Consequently, no bottlenecks exist through which all data must be sent or received. Accordingly, a comparatively high rate of data can be sent and received in SANs. Moreover, SANs are robust in the sense that they can operate acceptably should one or a few machines fail, whereas a storage system that depends on a central server cannot operate if the server fails.
SANs do, however, introduce considerations that are either absent in central server systems or are less complicated (hence more easily resolved) in central server systems. These considerations include authenticating that a principal requesting a data access (e.g., a read or write) is authorized to perform the access. Relatedly, in SANs it is difficult to fence off principals from data for which they have been previously authenticated but for which they are no longer authenticated. Also, in SANs the consideration of serialized access to objects becomes an issue, because the storage devices in SANs are aware only of data blocks, not data objects, and it consequently is difficult for the storage device to protect against race conditions and data inconsistency when multiple computers access the same object. Further, without a central server managing storage access, it is difficult to guarantee distributed cache coherency, i.e., it is difficult to guarantee that local copies of data retained by principals to be written into their respective caches are identical with each other.
Despite the fact that storage devices are associated with their own onboard processors, these processors tend to have limited processing power. The present invention recognizes that as a consequence, conventional solutions to the above issues, such as, for example, using distributed shared memory for cache coherency or running a distributed authentication service at each storage node, are less than optimum and indeed are often not viable. Further, implementing conventional solutions requires the sharing of large amounts of data between storage devices, consequently limiting the scalability of SANs to comparatively small data storage systems. Accordingly, the present invention recognizes a need for implementing a SAN while maintaining data consistency and cache coherency in the SAN and while providing for the authentication of principals, without requiring that the storage devices know the identity of the requester or be aware of data objects, files, or metadata.
The invention is one or more general purpose computers programmed according to the inventive steps herein to manage data access in a data network, such as (but not limited to) a storage area network (SAN). The invention can also be embodied as an article of manufacture (a machine component) that is used by a digital processing apparatus and which tangibly embodies a program of instructions that are executable by the digital processing apparatus to execute the present logic. This invention is realized in a critical machine component that causes a digital processing apparatus to perform the inventive method steps herein.
The invention can be implemented by a computer system including plural general purpose computers, each of which establishes a “principal” as used herein. The system further includes a data storage system that in turn includes plural storage devices, such as hard disks. At least one storage area network interconnects the principals and the storage devices, and at least one computer-implemented service communicates with the storage area network for receiving data access requests from the principals and for selectively issuing data class and access authorizations thereto in response to the requests. In accordance with the present invention, the data class and access authorizations are presentable by the principals to one or more of the storage devices to allow a presenting principal to access data classes on the storage devices, with the service managing data access among the storage devices by means of the data class and access authorizations. Notionally, the data class and access authorizations are referred to herein as “tickets”, and the data classes are notionally regarded herein as “colors”. A ticket can be used by a principal only to access the indicated color. Preferably, each ticket includes a cryptographically strong hash or one-way function of a principal's password. Also, each color is represented by a large integer.
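By way of non-limiting illustration, the ticket and color notions described above can be sketched in Python as follows. The names `Ticket`, `hash_password`, `new_color`, `issue_ticket`, and `ticket_permits` are hypothetical, as are the use of SHA-256 as the one-way function, the 128-bit color width, and the string encoding of the access type; none of these is specified in the present disclosure.

```python
import hashlib
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    """A data class and access authorization (hypothetical layout)."""
    color: int          # the data class, represented by a large integer
    access: str         # assumed encoding: "read" or "write"
    password_hash: str  # one-way function of the principal's password


def hash_password(password: str) -> str:
    # Assumption: SHA-256 stands in for "a cryptographically strong hash
    # or one-way function"; the disclosure does not name an algorithm.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def new_color() -> int:
    # Assumption: a color is drawn as a random 128-bit integer.
    return secrets.randbits(128)


def issue_ticket(color: int, access: str, password: str) -> Ticket:
    # The ticket carries only the hash, never the password itself.
    return Ticket(color=color, access=access,
                  password_hash=hash_password(password))


def ticket_permits(ticket: Ticket, color: int, access: str) -> bool:
    # A ticket can be used only to access the indicated color,
    # and only for the access type it names.
    return ticket.color == color and ticket.access == access
```

In this sketch a storage device would call `ticket_permits` with the color of the requested data, so that the device need not know the identity of the requester or anything about data objects.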
As set forth in detail below, the service includes computer readable code means for determining, prior to issuing a ticket in response to a request from a principal, whether granting a data access request will violate coherence of data in the data storage system. Also, computer readable code means in at least some principals define the distribution of data in the data storage system. Furthermore, computer readable code means are associated with the service for establishing a salt (a random number that is concatenated with a message, key, or other string) for at least some colors, the salts being included in at least some tickets, with each salt being established by a random string.
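The coherence determination and per-color salts described above might be sketched as follows. This is a minimal sketch under stated assumptions: the class name `TicketService`, its methods, the dictionary form of a ticket, and in particular the coherence rule (plural readers or a single writer per color) are hypothetical choices made for illustration and are not taken from the present disclosure.

```python
import secrets
from collections import defaultdict


class TicketService:
    """Hypothetical service that checks coherence before issuing tickets."""

    def __init__(self):
        self.salts = {}                    # color -> current salt
        self.holders = defaultdict(dict)   # color -> {principal: access}

    def salt_for(self, color):
        # Each salt is established by a random string, one per color.
        if color not in self.salts:
            self.salts[color] = secrets.token_hex(16)
        return self.salts[color]

    def would_violate_coherence(self, principal, color, access):
        # Assumed policy: a color may have many readers or one writer.
        others = {p: a for p, a in self.holders[color].items()
                  if p != principal}
        if access == "write":
            return bool(others)            # any other holder interferes
        return "write" in others.values()  # a reader conflicts with a writer

    def request(self, principal, color, access):
        # The determination is made prior to issuing the ticket.
        if self.would_violate_coherence(principal, color, access):
            return None
        self.holders[color][principal] = access
        return {"color": color, "access": access,
                "salt": self.salt_for(color)}
```

Returning `None` here corresponds to simply declining to issue a ticket; a fuller implementation could instead queue the request or notify the current holders.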
In the preferred embodiment, the service includes computer readable code means for managing data access among the storage devices by informing one or more principals of a potentially interfering request. The principals include computer readable code means for invalidating local caches or discarding tickets and requesting new tickets in response thereto. Also, the service includes computer readable code means for managing data access among the storage devices by rejecting a potentially interfering request or by queuing the potentially interfering request until it no longer potentially interferes with a prior request. Still further, the service includes computer readable code means for managing data access among the storage devices by revoking one or more tickets issued prior to a predetermined ticket. And, the service can include computer readable code means for instructing the storage devices to ignore tickets identifying one or more predetermined salts, or for instructing the storage devices to ignore tickets identifying one or more predetermined colors. The service also preferably includes computer readable code means for refreshing selected principals after at least some ticket revocations by issuing new tickets to the selected principals.
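Revocation by salt or by color, followed by a refresh of selected principals, can be illustrated by the following sketch. The names `StorageDevice`, `ignore_salt`, `ignore_color`, `accepts`, and `revoke_and_refresh`, and the dictionary form of a ticket, are hypothetical and are assumed for illustration only.

```python
import secrets


class StorageDevice:
    """Hypothetical device that keeps only lists of salts and colors to ignore."""

    def __init__(self):
        self.ignored_salts = set()
        self.ignored_colors = set()

    def ignore_salt(self, salt):
        self.ignored_salts.add(salt)

    def ignore_color(self, color):
        self.ignored_colors.add(color)

    def accepts(self, ticket):
        # The device checks the ticket against its ignore lists; it needs
        # no knowledge of principals, objects, files, or metadata.
        return (ticket["salt"] not in self.ignored_salts
                and ticket["color"] not in self.ignored_colors)


def revoke_and_refresh(devices, color, old_salt, principals_to_refresh):
    # Instruct every device to ignore tickets bearing the old salt,
    # which revokes all tickets issued under that salt at once.
    for device in devices:
        device.ignore_salt(old_salt)
    # Establish a new random salt and issue new tickets only to the
    # selected principals; all others are thereby fenced off.
    new_salt = secrets.token_hex(16)
    return {p: {"color": color, "salt": new_salt}
            for p in principals_to_refresh}
```

Under these assumptions, the state kept on each storage device is limited to two sets, consistent with the limited processing power of onboard processors noted above.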
In another aspect, for a distributed data storage system having plural data storage devices and plural principals accessing the devices, a computer-implemented method is disclosed for managing data access in the data storage system. The method includes notionally associating data elements in the data storage system with colors, and receiving requests for data access from the principals. Also, the method includes selectively issuing tickets in response to the requests, with each ticket authorizing at least one type of data access with respect to at least one color. As intended by the present invention, the tickets are generated or not in response to requests to manage data access among the storage devices on the basis of the colors.
In still another aspect, a computer program device includes a computer program storage device that is readable by a digital processing apparatus, and a program is on the program storage device. The program includes instructions that can be executed by the digital processing apparatus for performing method acts for managing data access in a data storage system having plural data storage devices accessible by plural principals. These acts include grouping data into plural colors and receiving requests from principals for access to the data. The acts also include responding to the requests while maintaining data coherency by selectively issuing and revoking tickets to principals. Each ticket represents at least one data access authorization to at least one color of data.
The details of the present invention, both as to its structure and operation, can best be understood in reference to the accompanying drawings, in which like reference numerals refer to like parts, and in which: