A datastore may be a repository for storing, managing, and distributing electronic data, and may store any of the data resources produced and/or utilized by an individual and/or an organization. Each data resource is stored as a series of electronic and/or magnetic bits in a memory (e.g., a hard disk, a solid state drive, a random access memory). Data resources may include any electronic data such as files, documents, media like music and video, records, sensor data, user profiles, and/or portions of data of each. Each of the data resources may have a rich set of relationships and/or interactions. For example, a user associated with a first user profile (e.g., a first data resource) may request to view an electronic report (e.g., a second data resource) made by a user associated with a second user profile (e.g., a third data resource).
The set of bits that make up the data resource may generally be copied by a data processing system such as a computer to yield two instances: an original data resource and a copied data resource. As a result, the original set of bits (e.g., the original data resource) is no longer unique. Copying in this way generally occurs when the data resource is communicated to another computer, and thus uniqueness of the original data resource does not survive a transmission event between the computer storing the original data resource and the computer receiving the copied data resource (e.g., emailing a document as an attachment, downloading a music file from a web page).
In general, it may be difficult to define computing processes that distinguish the original data resource from copies, as the sets of bits that comprise the copies may be identical and/or virtually indistinguishable (whether the sets of bits reside on the same computer or are spread across a network of computers). At the same time, copies that purport to be the same by retaining identical identifying data (e.g., a file name, an identifying field) may begin to change in other aspects that are not readily apparent (e.g., not apparent without analyzing the contents of the data resource). Therefore, inconsistencies and/or a progressing divergence may arise between the set of bits of the original data resource and the set of bits of the copy.
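The two situations described above may be illustrated with a minimal Python sketch. The `DataResource` type, its `name` and `content` fields, and the helper functions are hypothetical names introduced only for illustration. The use of a SHA-256 digest to compare contents is likewise an assumption, not a technique prescribed here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class DataResource:
    """Hypothetical stand-in for a data resource: identifying data plus its bits."""
    name: str        # identifying data (e.g., a file name)
    content: bytes   # the set of bits that make up the resource


def content_digest(resource: DataResource) -> str:
    """Summarize the bits of a resource; computing this requires reading all of them."""
    return hashlib.sha256(resource.content).hexdigest()


def same_identity(a: DataResource, b: DataResource) -> bool:
    """Compare only the identifying data, as a cheap check might."""
    return a.name == b.name


def same_bits(a: DataResource, b: DataResource) -> bool:
    """Compare the contents themselves via their digests."""
    return content_digest(a) == content_digest(b)


original = DataResource("report.txt", b"Q3 revenue: 100")

# A copy reproduces the same bits; nothing in the data marks which is the original.
copy = DataResource(original.name, bytes(original.content))

# A second copy keeps the identifying data but has diverged in content.
diverged = DataResource(original.name, b"Q3 revenue: 120")

print(same_identity(original, copy), same_bits(original, copy))
print(same_identity(original, diverged), same_bits(original, diverged))
```

In this sketch the exact copy matches the original on both checks, so neither check may identify which instance is the original. The diverged copy still matches on identifying data and is revealed only by analyzing the contents.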
Indistinguishable copies and/or purportedly identical but inconsistent copies may increase the difficulty in managing and/or administering the datastore. It may require significant computing overhead to sort, find, and delete some or all copies of the data resource when requested by a user, especially where the copies of the data resource are spread across several nodes of a distributed datastore and/or a network of computers. Undeleted and/or unaccounted-for data resources may be a security risk, for example when copies of a sensitive data resource persist in a vulnerable portion of the datastore after a user believes the data resource to have been deleted or placed into a secured partition of the datastore.
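The overhead of locating copies may be illustrated with a minimal Python sketch, assuming a datastore modeled as a hypothetical in-memory mapping from node names to stored resources. The `find_copies` function, the node names, and the paths are illustrative assumptions only. Because a copy may carry any name, the sketch must read and hash every stored resource on every node.

```python
import hashlib


def find_copies(nodes: dict[str, dict[str, bytes]], target: bytes):
    """Return the (node, path) locations whose bits match `target`,
    along with the number of resources that had to be scanned."""
    target_digest = hashlib.sha256(target).hexdigest()
    matches = []
    scanned = 0
    for node_name, resources in nodes.items():
        for path, content in resources.items():
            scanned += 1  # every resource is read, whether or not it matches
            if hashlib.sha256(content).hexdigest() == target_digest:
                matches.append((node_name, path))
    return matches, scanned


# Hypothetical distributed datastore with a sensitive resource copied under other names.
sensitive = b"account numbers"
nodes = {
    "node-a": {"/secure/accounts.txt": sensitive, "/tmp/notes.txt": b"misc"},
    "node-b": {"/backup/old.bin": sensitive, "/media/song.mp3": b"audio"},
    "node-c": {"/home/u1/draft.txt": b"draft"},
}

matches, scanned = find_copies(nodes, sensitive)
print(matches)
print(scanned)
```

The scan cost in this sketch grows with the total number of stored resources rather than with the number of copies. A copy that has diverged by even one bit would not be found at all.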
Further, identical copies and/or purportedly identical but inconsistent copies may make it difficult to analyze the datastore. Interactions and/or relationships between data resources of the datastore may be difficult to observe. Where several indistinguishable copies of the data resource exist, it may also be hard to create a consistent and accurate accounting record that shows which set of users accessed a particular data resource and how it was utilized (e.g., used or managed). Schemes external to the datastore may be used for building accounting records of the data resources, and these potentially complex external schemes may expend significant computing overhead in an effort to ensure fidelity in observing and recording utilization. Such complex external schemes may also prevent rapid response to data security breaches and may preclude efficient post-breach analysis.
An inability to distinguish between the original data resource and the copy may limit the economic value of the set of bits of the original data resource. First, the transfer of ownership of the data resource may be less meaningful when it is unclear what indistinguishable copies of the data resource exist. Similarly, difficulty in distinguishing between the original and the copy may make it difficult for computing processes such as an application program to enforce agreements regarding the data resource that are made between a first user and a second user. For example, it may be difficult to define computing processes that provide temporary use and/or limited use of the set of bits of the original data resource when an essentially indistinguishable copy is made by a computer.
An inability to distinguish between original data resources and copied data resources may make managing information within a datastore difficult. Audit records may be incomplete, and analyzing data of the datastore may be complex and limited and may require significant computing overhead. Where computing processes cannot distinguish between the original data resource and the copy of the data resource, an individual and/or an enterprise may be at greater risk of a security breach and may not be able to derive economic value from the transfer of and/or controlled use of the data resource.