1. Field of the Invention
The present invention relates to a shared storage network system and to a method for operating a shared storage network system.
2. Description of the Related Art
The improvement of computer storage devices, e.g. hard disks, and of computer storage management is a central concern in the development of computer technology.
According to the so-called OceanStore system disclosed in [1], a utility infrastructure is designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. OceanStore objects are identified by a globally unique identifier (GUID), which is the secure hash (SHA-1) of the owner's key and some human-readable name.
In the following, the data storing according to the OceanStore concept will be described.
Objects are replicated and stored on multiple servers. A replica of an object is located through one of two routing mechanisms. Objects in the OceanStore are modified through updates. In principle, every update to an OceanStore object creates a new version. OceanStore objects exist in both active and archival forms. An active form of an object is the latest version of its data together with a handle for update. An archival form represents a permanent read-only version of the object.
Routing is realized according to the OceanStore system as described in the following.
There are two routing algorithms co-existing in the OceanStore system. In the first, fast, probabilistic routing algorithm, if a query cannot be satisfied by a server, local information is used to route the query to a likely neighbour. In the second, slower but reliable hierarchical method, every server in the system is assigned a random node-ID. These node-IDs are then used to construct a mesh of neighbour links. Each link is labelled with a level number that denotes the stage of routing that uses this link. The Nth-level neighbour links for some Node X point at the sixteen closest neighbours whose node-IDs match the lowest N-1 nibbles of Node X's ID and who have different combinations of the Nth nibble. Each object is mapped to a single node whose node-ID matches the object's GUID in the most bits; this node may be called the object's root. If information about the GUID were stored at its root, then anyone could find this information simply by following neighbour links until they reached the root node for the GUID.
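The nibble-matching rule of the hierarchical method can be illustrated by the following minimal Python sketch. It is not taken from [1]; the function names, the 160-bit ID width and the tie-breaking rule are assumptions made for illustration, and "matches in the most bits" is interpreted here as the longest run of agreeing low-order nibbles.

```python
from typing import Sequence

def matching_low_nibbles(a: int, b: int, nibbles: int = 40) -> int:
    """Count how many consecutive lowest-order 4-bit nibbles of a and b agree.

    nibbles=40 corresponds to 160-bit IDs (an assumption for this sketch).
    """
    count = 0
    for _ in range(nibbles):
        if (a & 0xF) != (b & 0xF):
            break
        count += 1
        a >>= 4
        b >>= 4
    return count

def root_node(guid: int, node_ids: Sequence[int]) -> int:
    """Pick the node whose ID shares the most low-order nibbles with the GUID.

    Ties are broken in favour of the smallest node-ID (hypothetical rule).
    """
    return max(node_ids, key=lambda n: (matching_low_nibbles(n, guid), -n))
```

For example, with node-IDs 0xAB34, 0x9994 and 0x0000, the GUID 0x1234 would be rooted at 0xAB34, since that node-ID agrees with the GUID in its two lowest nibbles.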
Security features are provided by the OceanStore system, as described in the following.
To prevent unauthorized reads, all data in the system that are not completely public are encrypted, and the encryption key is distributed to those users with read permission (“restricting readers”). To revoke read permission, the owner requests that replicas be deleted or re-encrypted with a new key. A recently revoked reader is still able to read old data from cached copies or from misbehaving servers.
To prevent unauthorized writes, all writes are required to be signed so that well-behaved servers and clients can verify them against an access control list (ACL) (“restricting writers”).
In [2], the so-called “PAST” system is disclosed.
PAST is based on a self-organizing, Internet-based overlay network of storage nodes that cooperatively route file queries, store multiple replicas of files and cache additional copies of popular files.
The naming features of PAST will be described in the following.
Storage nodes and files are each assigned a uniformly distributed identifier. A 160-bit fileID is computed as the secure hash (SHA-1) of the file's name, the owner's public key and a randomly chosen salt. Each PAST node is assigned a 128-bit node identifier. The nodeID assignment is quasi-random (e.g., SHA-1 of the node's public key). The statistical assignment of files to storage nodes approximately balances the number of files stored on each node. However, non-uniform storage node capacities and file sizes require more explicit storage load balancing.
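The identifier derivation described above may be sketched in Python as follows. The concatenation order, the UTF-8 encoding of the file name, and the truncation of the 160-bit SHA-1 digest to its first 128 bits for the nodeID are assumptions of this sketch and are not stated in [2]; the function names are hypothetical.

```python
import hashlib

def past_file_id(name: str, owner_public_key: bytes, salt: bytes) -> int:
    """160-bit fileID: SHA-1 over file name, owner's public key and salt."""
    # Assumption: the three inputs are simply concatenated in this order.
    digest = hashlib.sha1(name.encode("utf-8") + owner_public_key + salt).digest()
    return int.from_bytes(digest, "big")

def past_node_id(node_public_key: bytes) -> int:
    """128-bit nodeID derived from SHA-1 of the node's public key."""
    digest = hashlib.sha1(node_public_key).digest()
    # Assumption: keep the first 16 bytes (128 bits) of the 20-byte digest.
    return int.from_bytes(digest[:16], "big")
```

Because the salt enters the hash, the same file name and owner key yield a different fileID whenever a different salt is chosen.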
Routing is realized according to PAST, as explained in the following.
PAST routes an associated message towards the node whose nodeID is numerically closest to the 128 most significant bits (msbs) of the fileID among all live nodes. NodeIDs and fileIDs are considered as a sequence of digits with base 2^b. A node's routing table is organized into ⌈log_{2^b} N⌉ levels with 2^b-1 entries each. The 2^b-1 entries at level n of the routing table each refer to a node whose nodeID shares the first n digits with the present node's nodeID, but whose (n+1)th digit has one of the 2^b-1 possible values other than the (n+1)th digit in the present node's ID. Each node maintains the IP addresses of the nodes in its routing table.
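The quantities used in this routing scheme can be made concrete with the following Python sketch. It is an illustration under stated assumptions, not the implementation of [2]: the function names are hypothetical, b = 4 is an arbitrary example value, and numerical closeness is computed as a plain absolute difference without wrap-around.

```python
def file_key(file_id: int) -> int:
    """Return the 128 most significant bits of a 160-bit fileID."""
    return file_id >> 32

def shared_prefix_digits(a: int, b: int, b_bits: int = 4, id_bits: int = 128) -> int:
    """Number of leading base-2^b digits that two IDs have in common."""
    digits = id_bits // b_bits
    mask = (1 << b_bits) - 1
    for i in range(digits):
        shift = id_bits - (i + 1) * b_bits
        if (a >> shift) & mask != (b >> shift) & mask:
            return i
    return digits

def closest_node(key: int, live_nodes):
    """Live node whose nodeID is numerically closest to the key.

    Assumption: plain absolute difference; ties go to the smaller nodeID.
    """
    return min(live_nodes, key=lambda n: (abs(n - key), n))

def routing_table_levels(n_nodes: int, b_bits: int = 4) -> int:
    """Compute ceil(log_{2^b} N) with integer arithmetic only."""
    levels, capacity = 0, 1
    while capacity < n_nodes:
        capacity *= 1 << b_bits
        levels += 1
    return levels
```

With b = 4, each level of the routing table holds 2^4-1 = 15 entries, and a network of 256 nodes needs two levels under this formula.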
PAST provides security features, as explained below.
Each PAST node and each user of the system holds a smartcard (read-only clients do not need a card). A private/public key pair is associated with each card, and each card's public key is signed with the smartcard issuer's private key for certification purposes.
[3] discloses the so-called Chord system.
The Chord File System (CFS) is a peer-to-peer read-only storage system. CFS servers provide a distributed hash table (Dhash) for block storage. CFS clients interpret Dhash blocks as a file system. Dhash distributes and caches blocks at a fine granularity to achieve load balance, uses replication for robustness and decreases latency with server selection. Dhash finds blocks using the Chord location protocol, which operates in time logarithmic in the number of servers.
Naming is realized by the Chord system as follows.
Each Chord node has a unique m-bit node identifier (ID) obtained by hashing the node's IP address and a virtual node index. Chord views the IDs as occupying a circular identifier space. Keys are also mapped into this ID space, by hashing them to m-bit key IDs. Chord defines the node responsible for a key to be the “successor” of that key's ID. The successor of an ID j is the node with the smallest ID that is greater than or equal to j (with wrap-around). Chord assigns each server an identifier drawn from the same 160-bit identifier space as block identifiers. These identifiers can be considered as points on a circle. The mapping that Chord implements takes a block's ID and yields the block's successor, the server whose ID most closely follows the block's ID on the identifier circle. When a node n joins the network, certain keys previously assigned to n's successor become assigned to n. When node n leaves the network, all of n's assigned keys are reassigned to its successor.
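The successor rule can be illustrated by the following minimal Python sketch. The function names and the way the IP address and virtual node index are combined before hashing are assumptions made for illustration; only the successor definition itself is taken from the description above.

```python
import bisect
import hashlib

def chord_node_id(ip_address: str, virtual_index: int) -> int:
    """160-bit node ID from the node's IP address and a virtual node index."""
    # Assumption: the two values are joined as "ip:index" before hashing.
    data = f"{ip_address}:{virtual_index}".encode("utf-8")
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

def successor(key_id: int, node_ids) -> int:
    """Node with the smallest ID >= key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]  # i == len(ring) wraps to the smallest ID
```

With nodes 1, 8 and 20, key 5 is assigned to node 8 and key 21 wraps around to node 1. If a node with ID 6 joins, key 5 moves from node 8 to node 6, which matches the join behaviour described above.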
Data storing is performed according to Chord, as explained in the following.
The publisher inserts the file system's blocks into the CFS system, using a hash of each block's content as its identifier. Then the publisher signs the root block with its private key and inserts the root block into CFS, using the corresponding public key as the root block's identifier. Dhash places a block's replicas at the k servers immediately after the block's successor on the Chord ring.
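The content-hash identifier and the replica placement rule may be sketched in Python as follows. The function names are hypothetical, and capping the number of replicas at the number of other servers on a small ring is an assumption of this sketch.

```python
import bisect
import hashlib

def block_id(content: bytes) -> int:
    """Identifier of a block: SHA-1 hash of the block's content."""
    return int.from_bytes(hashlib.sha1(content).digest(), "big")

def replica_servers(block: int, node_ids, k: int):
    """IDs of the k servers immediately after the block's successor."""
    ring = sorted(node_ids)
    # Index of the block's successor (smallest ID >= block, with wrap-around).
    start = bisect.bisect_left(ring, block) % len(ring)
    # Assumption: never list the successor itself as one of its own replicas.
    count = min(k, len(ring) - 1)
    return [ring[(start + 1 + j) % len(ring)] for j in range(count)]
```

On a ring with servers 1, 8, 20 and 30, a block with ID 5 has successor 8, so with k = 2 its replicas would be placed on servers 20 and 30.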
The Chord system performs routing as described in the following.
A Chord node uses two data structures to perform lookups: a successor list and a finger table. Only the successor list is required for correctness; the finger table merely accelerates lookups. Every Chord node maintains a list of the identities and IP addresses of its r immediate successors on the Chord ring. If the desired key is between the node and its successor, the latter node is the key's successor; otherwise the lookup can be forwarded to the successor, which moves the lookup strictly closer to its destination. A new node n learns of its successors when it first joins the Chord ring, by asking an existing node to perform a lookup for n's successor. Then, n asks that successor for its successor list.
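A lookup that relies only on successor pointers, without the finger table, can be sketched as follows. The dictionary `successor_of`, which maps each node ID to the ID of its first successor, and the function names are assumptions introduced for this illustration; a real node would contact its successor over the network instead of reading a shared table.

```python
def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], with wrap-around."""
    if a < b:
        return a < x <= b
    # Interval wraps past zero; a == b (single node) covers the whole ring.
    return x > a or x <= b

def lookup(start: int, key_id: int, successor_of: dict):
    """Follow successor pointers from start until the key's successor is found.

    Returns the responsible node's ID and the number of forwarding hops.
    """
    node, hops = start, 0
    while not in_half_open(key_id, node, successor_of[node]):
        node = successor_of[node]  # forward the lookup one step along the ring
        hops += 1
    return successor_of[node], hops
```

On a ring of nodes 1, 8 and 20, a lookup for key 15 starting at node 1 is forwarded once to node 8, which recognises that the key lies between itself and its successor 20. The number of hops grows linearly with the ring size in this sketch, which is why the finger table is used to accelerate lookups.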
As explained in the following, security features are included in the Chord system.
Clients name a file system using the public key. They can check the integrity of the root block using that key, and the integrity of blocks lower in the tree with the content-hash identifiers that refer to those blocks. CFS authenticates updates to root blocks by checking that the new block is signed by the same key as the old block. A timestamp prevents replays of old updates. CFS allows updates, but in a way that allows only the publisher of a file system to modify it. A CFS server will accept a request to store a block under either of two conditions. If the block is marked as a content-hash block, the server will accept the block if the supplied key is equal to the SHA-1 hash of the block's content. If the block is marked as a signed block, the block must be signed by a public key whose SHA-1 hash is the block's CFS key.
[4] is related to the SNIA (Storage Networking Industry Association) standard, which is a layered architecture for a network system of nodes sharing common memory resources.
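The two acceptance conditions can be sketched in Python as follows. The function name, the string labels for the block types and the `signature_ok` flag are hypothetical; in particular, `signature_ok` stands in for an actual public-key signature verification, which is outside the scope of this sketch.

```python
import hashlib

def accept_block(kind: str, key: bytes, content: bytes,
                 signer_public_key: bytes = b"",
                 signature_ok: bool = False) -> bool:
    """Decide whether a server would accept a request to store a block."""
    if kind == "content-hash":
        # The supplied key must equal the SHA-1 hash of the block's content.
        return key == hashlib.sha1(content).digest()
    if kind == "signed":
        # Assumption: signature_ok is the result of verifying the block's
        # signature against signer_public_key (verification not shown).
        return signature_ok and key == hashlib.sha1(signer_public_key).digest()
    return False  # unknown block type
```

A content-hash block is thus self-certifying, whereas a signed block is tied to the key pair whose public key hashes to the block's CFS key.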