Modern society relies heavily on computers and computer networks (collectively, computer systems) and, by extension, on the Internet, which is essentially one large computer system. Managing which person or entity has rights to a particular file is extremely difficult and resource intensive, yet critical to maintaining the privacy of data. Current implementations of rights management and file identification are so inefficient that large concessions and compromises have been made across the computing field.
Rights management, in accordance with the prior art, is typically done on a per-user basis. A user is generally authenticated and is granted rights on an individual or group basis. It is common for files associated with each user to be compared bit for bit or through the use of some algorithm. Common comparison methods include a CRC-32 signature, a file size check, and a more robust MD5 method that is typically employed for larger files. For large downloads, file comparison is typically a manual process: a large file is supplied with an MD5 checksum, and the user downloads the file, runs an MD5 utility on the downloaded file, and verifies that the checksums match. Traditionally, separate copies of files are maintained for each user.
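The manual download check described above can be sketched in Python as follows. This is a minimal illustration and not part of the prior-art systems themselves; the function names `md5_of_file` and `verify_download` and the 1 MiB chunk size are assumptions made for the example.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat for large downloads.
    The helper name and chunk size are illustrative assumptions.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, published_checksum):
    """Compare the computed digest with the checksum supplied with the file."""
    return md5_of_file(path) == published_checksum.strip().lower()
```

A user would call `verify_download` with the path of the downloaded file and the checksum published alongside it; a result of `False` indicates that the download does not match the published file.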
One inefficiency introduced when comparisons are performed relates to the process of computer backup. Most backup techniques rely on taking an initial image or full backup of the entire contents of a computer system, which may result in very large backups. There are various techniques for reducing the size of backups that exclude parts of the file system and/or do not copy data already backed up. One such technique is to copy only files that do not already exist in the repository. This technique utilizes CRC-32 as a checksum along with file name designators to determine whether a file is already in the repository. Employing the CRC-32 technique does not reasonably guarantee the uniqueness of the file: a CRC-32 value is only 32 bits long, so many different combinations of data of the same size will generate the same CRC-32 value. Moreover, using the CRC-32 technique has an inherent file rights problem and file identity problem. If duplicate files are not backed up, the backup space is essentially a shared system, and with this shared system one must determine which files each backup client has rights to.
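The uniqueness problem can be illustrated with a short Python sketch. It is a hypothetical model, not a description of any particular backup product: the class `Crc32Repository`, the helper `find_crc32_collision`, and the file name `report.dat` are assumptions introduced for the example. The helper uses a simple birthday search over seeded random 16-byte blocks to find two different blocks of the same size that share a CRC-32 value.

```python
import random
import zlib


def find_crc32_collision(seed=0, max_tries=2_000_000):
    """Birthday search for two distinct 16-byte blocks with equal CRC-32.

    Because CRC-32 has only 2**32 possible values, a collision among
    random inputs is typically found after on the order of 2**16 tries.
    """
    rng = random.Random(seed)
    seen = {}  # CRC-32 value -> first block that produced it
    for _ in range(max_tries):
        block = rng.getrandbits(128).to_bytes(16, "big")
        crc = zlib.crc32(block)
        earlier = seen.get(crc)
        if earlier is not None and earlier != block:
            return earlier, block
        seen[crc] = block
    raise RuntimeError("no collision found within max_tries")


class Crc32Repository:
    """Hypothetical backup store that skips files it believes it already holds.

    A file is considered present when its (name, CRC-32) pair is known.
    """

    def __init__(self):
        self.files = {}

    def backup(self, name, data):
        key = (name, zlib.crc32(data))
        if key in self.files:
            return False  # treated as a duplicate; the data is not stored
        self.files[key] = data
        return True


if __name__ == "__main__":
    first, second = find_crc32_collision()
    repo = Crc32Repository()
    print(repo.backup("report.dat", first))   # first version is stored
    print(repo.backup("report.dat", second))  # different content, same key
```

In this sketch the second call to `backup` reports the file as already present even though its content differs from the stored copy, so the newer data is never backed up.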
Another such inefficiency is in the process of configuration management. Configuration management is the process of managing the configuration of a computer system. This process includes capturing and restoring configuration sets. Configuration sets are complete bundles of relevant data and may contain file structures and configuration information, as well as scripts to update configuration information or manipulate a file system. The inefficiency lies in the inability of the system to guarantee the uniqueness of files. Many configuration sets will have multiple copies of the same files simply because of the complexity of managing files that are not assured to be unique.
The list of inefficiencies with computer storage, management, and data backup systems continues almost indefinitely. For example, a significant amount of inefficiency exists in the process of email storage and other message storage techniques as well as application data storage.