Distributed computing architectures enable large computational and data storage and retrieval operations to be performed by a number of different computers, thus reducing the time required to perform these operations. Distributed computing architectures are used for applications where the operations to be performed are complex, or where a large number of users are performing a large number of transactions using shared resources. A distributed shared storage system is a kind of distributed computing architecture.
If a distributed shared storage system is used to provide high bandwidth real-time media data, such as video data, that is shared by a large number of users, several complexities arise. In such an application, the high bandwidth real-time media data is distributed among multiple storage devices or servers, and multiple client applications or machines may access the data. The data may be divided into blocks, and the blocks distributed among the storage devices.
There are many ways to distribute data among multiple storage devices. For example, the data may be distributed randomly, pseudorandomly, or simply sequentially. However, it has been shown that performance can be reduced if two or more files are distributed with the same pattern, or if two adjacent blocks from a single file are stored on the same server. Thus, a random, pseudorandom or other irregular pattern that is, or has a high likelihood of being, unique for each file is generally desirable.
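The difference between these distribution patterns can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not drawn from any particular system: the function names, the use of Python's standard `random.Random` generator, and the choice of a per-file seed are all hypothetical. The sketch shows that sequential placement gives every file the same pattern, whereas a per-file seeded pattern can be made to avoid placing two adjacent blocks on the same server.

```python
import random


def sequential_placement(num_blocks, num_servers, start=0):
    """Assign block i to server (start + i) mod num_servers (round-robin)."""
    return [(start + i) % num_servers for i in range(num_blocks)]


def pseudorandom_placement(num_blocks, num_servers, seed):
    """Assign blocks using a per-file seed (a hypothetical scheme).

    Each block is placed on a pseudorandomly chosen server other than the
    one holding the previous block, so adjacent blocks never share a server.
    """
    if num_servers < 2:
        raise ValueError("need at least two servers to separate adjacent blocks")
    rng = random.Random(seed)  # private generator, reproducible per seed
    placement = []
    for _ in range(num_blocks):
        candidates = [s for s in range(num_servers)
                      if not placement or s != placement[-1]]
        placement.append(rng.choice(candidates))
    return placement


def adjacent_collisions(placement):
    """Count pairs of adjacent blocks stored on the same server."""
    return sum(1 for a, b in zip(placement, placement[1:]) if a == b)


if __name__ == "__main__":
    # Two files placed sequentially from the same start share one pattern.
    file_a = sequential_placement(8, 4)
    file_b = sequential_placement(8, 4)
    print(file_a == file_b)

    # A per-file seed yields an irregular pattern with no adjacent collisions.
    file_c = pseudorandom_placement(8, 4, seed=101)
    print(file_c, adjacent_collisions(file_c))
```

In this sketch the seed plays the role of the per-file identity: the same seed always reproduces the same pattern, and different seeds are likely, though not guaranteed, to produce different patterns.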
While it is simple to produce a random or pseudorandom sequence or other irregular pattern using an appropriate algorithm, most such algorithms apply a function iteratively, starting with a seed value, to generate each value in the sequence. Thus, the computation of the nth value in the sequence requires n computations, and access to the nth block of data in a file would require n computations to determine its storage location. Such computations may be avoided if the sequence is stored. However, such sequences are generally long, and a separate sequence would have to be stored for each file in the shared storage system.
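The cost described above can be made concrete with a short sketch. It assumes a Lehmer-style multiplicative congruential generator with the well-known constants 48271 and 2^31 - 1; the generator choice, the function names, and the mapping of a value to a server by taking a remainder are assumptions made for illustration only. The sketch contrasts the two options discussed: iterating from the seed on each access, or storing one table entry per block for each file.

```python
MODULUS = 2**31 - 1   # assumed generator parameters (Lehmer-style)
MULTIPLIER = 48271


def next_value(state):
    """Advance the generator by one step."""
    return (MULTIPLIER * state) % MODULUS


def nth_value(seed, n):
    """Return (nth value, steps taken) by iterating n times from the seed."""
    state = seed
    steps = 0
    for _ in range(n):
        state = next_value(state)
        steps += 1
    return state, steps


def server_for_block(seed, n, num_servers):
    """Locate block n (counting from 1) of a file; the cost grows with n."""
    value, steps = nth_value(seed, n)
    return value % num_servers, steps


def precompute_table(seed, num_blocks, num_servers):
    """Store the whole sequence: constant-time lookup, one entry per block."""
    table = []
    state = seed
    for _ in range(num_blocks):
        state = next_value(state)
        table.append(state % num_servers)
    return table


if __name__ == "__main__":
    seed = 12345  # hypothetical per-file seed, must be in 1..MODULUS-1
    server, steps = server_for_block(seed, 1000, num_servers=8)
    print(server, steps)              # steps equals the block number

    table = precompute_table(seed, 1000, num_servers=8)
    print(table[999] == server, len(table))
```

The table removes the per-access iteration but holds one entry for every block, and a shared storage system would need one such table for each file.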
An additional problem to be addressed in a large, shared storage system is what should be done in the event that storage devices fail, are added to the system, or are removed from it. The technique for distributing data among the various storage elements also should be resilient to such changes in the system.