A portion of the disclosure of this patent document may contain material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the United States Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below, inclusive of the drawing figures where applicable: Copyright © 2000, Undoo Technologies.
The present invention relates, in general, to the field of hash file systems and commonality factoring systems. More particularly, the present invention relates to a system and method for determining a correspondence between electronic files in a distributed computer data environment and particular applications therefor.
Economic, political, and social power are increasingly managed by data. Transactions and wealth are represented by data. Political power is analyzed and modified based on data. Human interactions and relationships are defined by data exchanges. Hence, the efficient distribution, storage, and management of data is expected to play an increasingly vital role in human society.
The quantity of data that must be managed, in the form of computer programs, databases, files, and the like, increases exponentially. As computer processing power increases, operating system and application software becomes larger. Moreover, the desire to access larger data sets such as multimedia files and large databases further increases the quantity of data that is managed. This increasingly large data load must be transported between computing devices and stored in an accessible fashion. The exponential growth rate of data is expected to outpace the improvements in communication bandwidth and storage capacity, making data management using conventional methods even more urgent.
Many factors must be balanced and often compromised in conventional data storage systems. Because the quantity of data is extremely large, there is continuing pressure to reduce the cost per bit of storage. Also, data management systems should be scaleable to contemplate not only current needs, but future needs as well. Preferably, storage systems are incrementally scaleable so that a user can purchase only the capacity needed at any particular time. High reliability and high availability are also considered as data users are increasingly intolerant of lost, damaged, and unavailable data. Unfortunately, conventional data management architectures must compromise these factors so that no one architecture provides a cost-effective, reliable, high availability, scaleable solution.
Conventional RAID (Redundant Array of Independent Disks) systems are a way of storing the same data in different places (thus, redundantly) on multiple storage devices such as hard disks. By placing data on multiple disks, input/output (“I/O”) operations can overlap in a balanced way, improving performance. Since the use of multiple disks increases the mean time between failure (“MTBF”), storing data redundantly also increases fault-tolerance. A RAID system relies on a hardware or software controller to hide the complexities of the actual data management so that a RAID system appears to an operating system as a single logical hard disk. However, RAID systems are difficult to scale because of physical limitations in the cabling and controllers. Also, the availability of RAID systems is highly dependent on the functionality of the controllers themselves so that when a controller fails, the data stored behind the controller becomes unavailable. Moreover, RAID systems require specialized, rather than commodity hardware, and so tend to be expensive solutions.
NAS (network-attached storage) refers to hard disk storage that is set up with its own network address rather than being attached to an application server. File requests are mapped to the NAS file server. NAS may provide transparent I/O operations using either hardware or software based RAID. NAS may also automate mirroring of data to one or more other NAS devices to further improve fault tolerance. Because NAS devices can be added to a network, they enable scaling of the total capacity of the storage available to a network. However, NAS devices are constrained in RAID applications to the abilities of the conventional RAID controllers. Also, NAS systems do not enable mirroring and parity across nodes, and so are a limited solution.
In addition to data storage issues, data transport is rapidly evolving with improvements in wide area network (“WAN”) and internetworking technology. The Internet, for example, has created a globally networked environment with almost ubiquitous access. Despite rapid network infrastructure improvements, the rate of increase in the quantity of data that requires transport is expected to outpace improvements in available bandwidth.
Philosophically, the way data is conventionally managed is inconsistent with the hardware devices and infrastructures that have been developed to manipulate and transport data. For example, computers are characteristically general-purpose machines that are readily programmed to perform a virtually unlimited variety of functions. In large part, however, computers are loaded with a fixed, slowly changing set of data that limit their general-purpose nature to make the machines special-purpose. Advances in processing speed, peripheral performance and data storage capacity are most dramatic in commodity computers. Yet many data storage solutions cannot take advantage of these advances because they are constrained rather than extended by the storage controllers upon which they are based. Similarly, the Internet was developed as a fault tolerant, multi-path interconnected network. However, network resources are conventionally implemented in specific network nodes such that failure of the node makes the resource unavailable despite the fault-tolerance of the network to which the node is connected. Continuing needs exist for high availability, high reliability, highly scaleable data storage solutions.
Disclosed herein is a system and method for a computer file system that is based and organized upon hashes and/or strings of digits of certain, different, or changing lengths and which is capable of eliminating or screening redundant copies of the blocks of data (or parts of data blocks) from the system. Also disclosed herein is a system and method for a computer file system wherein hashes may be produced by a checksum generating program, engine or algorithm such as industry standard Message Digest 4 (“MD4”), MD5, Secure Hash Algorithm (“SHA”) or SHA-1 algorithms. Further disclosed herein is a system and method for a computer file system wherein hashes may be generated by a checksum program, engine, algorithm or other means that generates a probabilistically unique hash value for a block of data of indeterminate size based upon a non-linear probabilistic mathematical algorithm or any industry standard technique for generating pseudo-random values from an input text or other data/numeric sequence.
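As a minimal sketch of the hashing step described above, the following Python fragment computes MD5 and SHA-1 digests of a data block using the standard hashlib module. The function name block_hashes and the sample blocks are illustrative assumptions and do not appear in the disclosure. MD4 is left out only because its availability in hashlib depends on the local OpenSSL build.

```python
import hashlib

def block_hashes(block: bytes) -> dict:
    """Return hex digests of one data block under two standard algorithms."""
    return {
        "md5": hashlib.md5(block).hexdigest(),    # 128-bit digest
        "sha1": hashlib.sha1(block).hexdigest(),  # 160-bit digest
    }

# Hypothetical sample blocks of very different sizes.
small = block_hashes(b"abc")
large = block_hashes(b"abc" * 1_000_000)

# Each hex character encodes 4 bits, so the digest width is fixed
# regardless of how large the input block is.
print(len(small["sha1"]) * 4, len(large["sha1"]) * 4)
```

The point of the sketch is that a block of indeterminate size maps to a fixed-width value, which is what allows the hash to stand in for the block elsewhere in the system.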
The system and method of the present invention may be utilized, in a particular application disclosed herein, to automatically factor out redundancies in data, allowing potentially very large quantities of unfactored storage to be reduced in size, often by several orders of magnitude. In this regard, the system and method of the present invention would allow all computers, regardless of their particular hardware or software characteristics, to share data simply, efficiently and securely and to provide a uniquely advantageous means for effectuating the reading, writing or referencing of data. The system and method of the present invention is especially efficacious with respect to networked computers or computer systems but may also be applied to isolated data storage with comparable results.
The hash file system of the present invention advantageously solves a number of problems that plague conventional storage architectures. For example, the system and method of the present invention eliminates the need for managing a huge collection of directories and files, together with all the wasted system resources that inevitably occur with duplicates and slightly different copies. The maintenance and storage of duplicate files plagues traditional corporate and private computer systems and generally requires painstaking human involvement to “clean up disk space”. The hash file system of the present invention effectively eliminates this problem by eliminating the disk space used for copies and nearly entirely eliminating the disk space used in partial copies. For example, in a traditional computer system copying a gigabyte directory structure to a new location would require another gigabyte of storage. In particular applications, the hash file system of the present invention reduces the disk space used in this operation by up to a hundred thousand times or more.
Currently, some file systems have mechanisms to eliminate copies, but none can accomplish this operation in O(1) (“on the order of constant time”) time as the system scales. That is, the time needed to factor a copy is constant, whereas other systems would require O(N**2), O(N) or O(log(N)) time, meaning the time grows with the amount of storage being factored. Factoring storage in non-constant time may be marginally satisfactory for systems where the amount of storage is small, but as a system grows to large sizes, even the most efficient non-constant factoring systems become untenable. The hash file system of the present invention is designed to factor storage on a scale never previously attempted and, in a first implementation, is capable of factoring 2 million petabytes of storage, with the ability to expand to much larger sizes. Existing file systems are incapable of managing data on such scales.
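The copy scenario and the constant-time factoring described above can be illustrated with a toy content-addressed store. This is only a sketch under stated assumptions: the class name BlockStore, its put and get methods, and the in-memory dictionary are hypothetical stand-ins, and an actual implementation of the disclosed system would distribute the index across many machines.

```python
import hashlib

class BlockStore:
    """Toy content-addressed store: each unique block is kept exactly once."""

    def __init__(self):
        self.blocks = {}  # SHA-1 hex digest -> block bytes

    def put(self, block: bytes) -> str:
        key = hashlib.sha1(block).hexdigest()
        # A dictionary lookup/insert is average-case O(1): its cost does not
        # grow with the number of blocks already stored.
        self.blocks.setdefault(key, block)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]

store = BlockStore()
# Store 1000 hypothetical blocks, then "copy" them by storing them again.
original = [store.put(b"block-%d" % i) for i in range(1000)]
copy = [store.put(b"block-%d" % i) for i in range(1000)]

print(len(store.blocks))  # number of blocks physically kept
```

In this sketch the copy yields the same list of keys as the original and adds no new blocks to the store, so the only cost of the copy is the list of hashes that references the existing blocks.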
Moreover, the hash file system of the present invention may be utilized to provide inexpensive, global computer system data protection and backup. Its factoring function operates very efficiently on typical backup data sets because computer file systems rarely change more than a few percent of their overall storage between each backup operation. Further, the hash file system of the present invention can serve as the basis for an efficient messaging (e-mail) system. E-mail systems are fundamentally data copying mechanisms wherein an author writes a message and sends it to a list of recipients. An e-mail system implements this “sending” operation effectively by copying the data from one place to another. The author generally keeps copies of the messages he sends and the recipients each keep their own copies. These copies are often, in turn, attached in replies that are also kept (i.e. copies of copies). The commonality factoring feature of the present invention can eliminate this gross inefficiency while transparently allowing e-mail users to retain this familiar copy-oriented paradigm.
Because, as previously noted, most data in computer systems rarely change, the hash file system of the present invention allows for the reconstruction of complete snapshots of entire systems which can be kept, for example, for every hour of every day they exist or even continuously, with snapshots taken at intervals of a minute or less depending on the system needs. Further, since conventional computer systems often provide limited versioning of files (e.g. Digital Equipment Corporation's VAX® VMS® file system), the hash file system of the present invention also provides significant advantages in this regard. Versioning in conventional systems has both good and bad aspects: it helps prevent accidents, but it requires regular purging to reduce the disk space it consumes. The hash file system of the present invention provides versioning of files with little overhead by factoring identical or edited copies so that they consume little extra space. For example, saving one hundred revisions of a typical document typically requires about one hundred times the space of the original file. Using the hash file system disclosed herein, those revisions might require only three times the space of the original (depending on the document's size, the degree and type of editing, and external factors).
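The space saving for revisions can be made concrete with a deliberately favorable toy case. The sketch below assumes fixed-size 64-byte chunks and revisions that only append to the document; both are illustrative assumptions rather than details taken from the disclosure, and the names CHUNK, chunks and revision are hypothetical. Edits in the middle of a file would shift fixed-size chunk boundaries and share far less.

```python
import hashlib

CHUNK = 64  # hypothetical fixed chunk size, in bytes

def chunks(data: bytes) -> list:
    """Split data into consecutive CHUNK-sized pieces."""
    return [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]

# A 4096-byte "document" made of 64 distinct 64-byte lines.
base = b"".join(b"%063d\n" % i for i in range(64))

def revision(k: int) -> bytes:
    """Revision k is the base document plus k appended 64-byte edit lines."""
    return base + b"".join(b"edit %058d\n" % j for j in range(k))

store = {}        # SHA-1 hex digest -> chunk bytes (each unique chunk once)
naive_bytes = 0   # space needed if every revision were stored whole
for k in range(1, 101):
    rev = revision(k)
    naive_bytes += len(rev)
    for c in chunks(rev):
        store.setdefault(hashlib.sha1(c).hexdigest(), c)

factored_bytes = sum(len(c) for c in store.values())
print(naive_bytes, factored_bytes, len(base))
```

Here one hundred revisions stored whole occupy far more than the original, while the factored store holds only the 64 original chunks plus one new chunk per revision. The ratio obtained depends entirely on the assumed editing pattern.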
Still other potential applications of the hash file system of the present invention include web-serving. In this regard, the hash file system can be used to efficiently distribute web content because the method of factoring commonality (hashing) also produces uniform distribution over all hash file system servers. This even distribution permits a large array of servers to function as a gigantic web server farm with an evenly distributed load. In other applications, the hash file system of the present invention can be used as a network accelerator inasmuch as it can be used to reduce network traffic by sending proxies (hashes) for data instead of the data itself. A large percentage of current network traffic is redundant data moving between locations. Sending proxies for the data would allow effective local caching mechanisms to operate, possibly reducing the traffic on the Internet by several orders of magnitude.
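The even distribution of hashed content over servers can be sketched as follows. The mapping used here, the hash value modulo the number of servers, is an assumption chosen for brevity; the passage above does not specify how hashes are assigned to servers, and server_for and the sample page names are hypothetical.

```python
import hashlib
from collections import Counter

def server_for(block: bytes, num_servers: int) -> int:
    """Pick a server index from the SHA-1 hash of the block's contents."""
    digest = hashlib.sha1(block).digest()
    return int.from_bytes(digest, "big") % num_servers

# Assign 10,000 hypothetical pieces of web content to 4 servers.
counts = Counter(server_for(b"page-%d" % i, 4) for i in range(10000))
for server in sorted(counts):
    print(server, counts[server])
```

Because the hash values are spread uniformly over their range, each server in this sketch receives close to a quarter of the content without any central coordinator deciding placement. A plain modulo mapping would reassign most content whenever the number of servers changes, so a production design would likely need a more stable mapping.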
As particularly disclosed herein, the hash file system and method of the present invention may be implemented using 160 bit hashsums as universal pointers. This differs from conventional file systems which use pointers assigned from a central authority (e.g. in Unix a 32 bit “vnode” is assigned by the kernel's file systems in a lock-step operation to assure uniqueness). In the hash file system of the present invention, these 160 bit hashsums are assigned without a central authority (i.e. without locking, without synchronization) by a hashing algorithm.
Known hashing algorithms produce probabilistically unique numbers that uniformly span a range of values. In the case of the hash function SHA-1, that range is between 0 and 2**160, or roughly 10**48. This hashing operation is done by examining only the contents of the data being stored and, therefore, can be done in complete isolation, asynchronously, and without interlocking.
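The size of the 160 bit range can be checked with a few lines of arithmetic. The sketch below treats a SHA-1 digest as an unsigned integer pointer; the sample block and the variable names are illustrative assumptions.

```python
import hashlib

HASH_BITS = 160
range_size = 2 ** HASH_BITS  # number of distinct 160-bit values

# Interpret the 20-byte SHA-1 digest of a hypothetical block as an integer.
digest = hashlib.sha1(b"some block of data").digest()
pointer = int.from_bytes(digest, "big")

print(f"{range_size:.3e}")        # size of the range in scientific notation
print(0 <= pointer < range_size)  # the pointer always falls inside the range
```

The pointer is computed from the block's contents alone, so two machines that never communicate would derive the same pointer for the same block.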
Hashing is an operation that can be verified by any component of the system, eliminating the need for trusted operations across those components. The hash file system and method of the present invention disclosed herein is, therefore, functional to eliminate the critical bottleneck of conventional large scale distributed file systems, that is, a trusted encompassing central authority. It permits the construction of a large scale distributed file system with no limits on simultaneous read/write operations, that can operate without risk of incoherence and without the limitation of certain conventional bottlenecks.
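The verification property described above can be sketched in a few lines: any component that receives a block together with its hash pointer can recompute the hash and compare, without trusting the sender. The function name verify and the sample data are hypothetical, and SHA-1 is used only because it is one of the algorithms named in this disclosure.

```python
import hashlib
import hmac

def verify(pointer_hex: str, block: bytes) -> bool:
    """Return True if the block's recomputed SHA-1 matches the given pointer."""
    recomputed = hashlib.sha1(block).hexdigest()
    return hmac.compare_digest(recomputed, pointer_hex)

block = b"contents of a stored block"
pointer = hashlib.sha1(block).hexdigest()

print(verify(pointer, block))                # intact block
print(verify(pointer, block + b" altered"))  # modified block
```

Since the check needs only the block and its pointer, it can run on any node independently, which is what removes the need for a trusted central authority in this sketch.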