Growing efforts in computer automation and digital data processing have made a significantly larger share of companies' revenues dependent on computer-generated data and digital end products. For instance, pictures and movies are no longer created and kept in analog format; they are created, stored and sold in digital format. Information from databases (e.g. marketing or medical databases) is no longer created and exchanged on paper, but in digital format. The research and development of products (e.g. semiconductors, cars, airplanes or other sophisticated systems) is highly dependent on computer simulation, processing and manufacturing.
In many industries, companies tend to generate vast amounts of digital data in a dynamic and continuous fashion when developing products. At every stage of development, the digital data associated with these products needs to be stored and managed. Furthermore, these developments yield vast amounts of digital end products, which also need to be stored so that they can be accessed when purchased by or exchanged with clients.
The dependency on digital processing and digital data is accompanied by an increasing demand for data storage across multiple data storage resources. Furthermore, companies that run multiple concurrent projects on a fixed or finite amount of storage space often face the daunting task of coordinating data storage consumption and the use of these data storage devices. Inefficient use of data storage resources often leads to the purchase or acquisition of additional data storage resources, which compounds the data coordination problems because of the increased cost and time involved in the management (e.g. finding, retrieval etc.), backup and recovery of data on these data storage devices.
An approach to balancing the cost of data storage against the cost of network performance in a distributed network is discussed by J. C. Chuang and M. A. Sirbu in a paper entitled “Distributed network storage service with quality-of-service guarantees”, published in the Proceedings of the Internet Society INET '99 Conference, June 1999, pp. 1-26. Two techniques are proposed to strike this balance, namely caching and replication. The paper promotes consuming additional storage by replicating data throughout the network, as opposed to using faster networks with a single copy of the data, as a mechanism to meet performance objectives (see also a product called “NetCache” by Network Appliance Inc., published on www.netapp.com/products/#netcache).
To better manage data storage from a user or administrator point of view, the prior art teaches different solutions that can generally be classified into two approaches. One prior art approach relates to the abstraction of multiple data storage devices as a single apparent “virtual” data storage device (see for instance U.S. Pat. No. 6,438,642 assigned to KOM Networks Inc.; U.S. Pat. No. 6,421,711 assigned to EMC Corporation; U.S. Pat. No. 6,415,373 assigned to Avid Technology Inc.; or U.S. Pat. No. 6,401,183 assigned to Flash Vos Inc.). In the art this approach is also referred to as block level virtualization or abstraction. It improves the management of the actual storage devices, but not of the data stored on these devices. Although this approach helps a system administrator manage the data storage devices, it provides very little intelligence or knowledge about what data is actually stored on them.
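The idea of block level virtualization can be illustrated with a minimal sketch. It is not taken from any of the cited patents; the class names, device labels and the simple concatenation layout are assumptions made purely for illustration. Several devices are presented as one linear block address space, and a virtual block number is resolved to a device and a block on that device. Note that the mapping knows nothing about which files or projects the blocks belong to, which is the limitation described above.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str        # hypothetical device label
    num_blocks: int  # capacity in fixed-size blocks


class VirtualVolume:
    """Concatenates several devices into one linear block address space."""

    def __init__(self, devices):
        self.devices = list(devices)
        # starts[i] is the first virtual block number served by devices[i].
        self.starts = []
        total = 0
        for dev in self.devices:
            self.starts.append(total)
            total += dev.num_blocks
        self.total_blocks = total

    def resolve(self, virtual_block):
        """Map a virtual block number to (device name, block on that device)."""
        if not 0 <= virtual_block < self.total_blocks:
            raise IndexError("virtual block out of range")
        # Find the last device whose starting block is <= virtual_block.
        i = bisect_right(self.starts, virtual_block) - 1
        return self.devices[i].name, virtual_block - self.starts[i]


# Two hypothetical devices appear to the user as one 150-block volume.
volume = VirtualVolume([Device("disk-a", 100), Device("disk-b", 50)])
print(volume.resolve(99))
print(volume.resolve(100))
```

The user of `volume` addresses blocks 0 through 149 without knowing where one device ends and the next begins.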
Another prior art approach relates to the abstraction of a vast number of files stored on different data storage devices as a single file system (see for instance U.S. Pat. No. 6,185,574 assigned to 1 Vision Inc., and a paper by NuView Inc. entitled “Aggregate and File System Management with NuView Storage X”, published on www.nuview.com). In the art this is also referred to as file level virtualization. This approach allows, for instance, servers to share data among different data storage devices. It provides more intelligence or knowledge than block level virtualization or abstraction; however, it still lacks the organization, and the ability to coordinate files among different users at a higher level of intelligence, needed to make important decisions according to business objectives.
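File level virtualization can be sketched in a similar way. The following is an illustrative assumption, not the mechanism of the cited patent or product: a single logical namespace maps directory prefixes to hypothetical file servers, and a logical path is resolved to the device that actually holds the file. The table knows where each file lives, but it carries no notion of users, projects or business objectives.

```python
from pathlib import PurePosixPath


class UnifiedNamespace:
    """Presents directories on several devices as one logical file tree."""

    def __init__(self):
        # logical directory -> (device name, directory on that device)
        self._mounts = {}

    def mount(self, logical_dir, device, device_dir):
        self._mounts[PurePosixPath(logical_dir)] = (device, PurePosixPath(device_dir))

    def locate(self, logical_path):
        """Return (device name, path on that device) for a logical path."""
        path = PurePosixPath(logical_path)
        # Walk from the path itself up to the root: the longest matching mount wins.
        for candidate in (path, *path.parents):
            if candidate in self._mounts:
                device, device_dir = self._mounts[candidate]
                return device, str(device_dir / path.relative_to(candidate))
        raise FileNotFoundError(logical_path)


# Hypothetical servers and export paths, chosen only for the example.
ns = UnifiedNamespace()
ns.mount("/projects", "filer-1", "/export/vol1")
ns.mount("/projects/archive", "filer-2", "/export/old")
print(ns.locate("/projects/chip/sim.dat"))
print(ns.locate("/projects/archive/2001/a.dat"))
```

Clients see one tree under `/projects`, while the files themselves remain spread across the two servers.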
Accordingly, there is a need to develop new systems and methods that would allow companies to more efficiently manage and enforce the storage of vast amounts of digital data according to important business decisions and objectives.