The proliferation of cloud-based content management services and the rapid adoption of cloud-based collaboration have changed the way personal and corporate electronically stored information objects (e.g., files, images, videos) are stored, shared, and managed. One benefit of using cloud-based platforms is the ability to securely create and share large volumes of content among trusted collaborators. For example, a large enterprise with thousands of users (e.g., employees) and many terabytes of content might use a cloud-based content storage platform to efficiently and securely provide content access to various individual users and/or collaborative groups of users. The users can have access to content the users own (e.g., content they create or first upload to the cloud-based platform) and content shared with the users. In all cases, the users expect their content to be available when access is requested.
The cloud-based content storage provider can provide such data availability at least in part by implementing certain data retention policies. For example, data retention techniques can be used to manage selective deletion of content and/or manage purges of data on a schedule, which can be set in line with regulatory obligations (e.g., government regulations). In many cases, data retention policies can be specified according to a service level agreement with an enterprise. In other cases, data retention might be influenced by other factors, such as a “legal hold”. Under a legal hold order, an organization has a duty to preserve relevant information once it learns, or reasonably should have learned, of pending or threatened litigation or of a regulatory investigation.
In order to comply with hold policy or hold order preservation obligations, the organization should inform records custodians of the respective custodians' duty to preserve relevant information. Organizational use of cloud services introduces the need for techniques that go beyond managing file cabinets and boxes of paper. A hold that is acted on by an organization might extend to relevant information in the form of materials that are stored as cloud-based stored content. A hold might pertain to cloud-based stored content that is associated with particular users, and/or to cloud-based activities that have occurred within a certain period of time, and/or to certain subject matter.
Unfortunately, legacy techniques for implementing holds on cloud-based shared content are limited at least with respect to efficiently managing the selected portion of the content that is governed by the parameters of a given hold order. Specifically, some approaches might apply a hold order to any content associated with users identified as participants in the hold order. However, some of the users might have permission (e.g., by ownership, by collaboration, by group membership, etc.) to access many thousands of content objects (e.g., folders, files, file versions, etc.) even though only a small percentage of those many thousands of content objects are pertinent to the parameters of the hold order.
For example, a hold order might pertain only to objects actually accessed (e.g., opened, viewed, etc.) by the user. In such cases, large volumes of data might be unnecessarily marked, sequestered and/or retained longer than needed. Marking and/or sequestering and/or retaining more data than the hold order requires can cause decreased performance and/or increased storage demands on the cloud-based storage system. Some legacy approaches apply a hold order to any or all folders and/or files accessible by certain users associated with the hold order.
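The difference in scope between holding everything a user can access and holding only what the user actually accessed can be illustrated with a minimal sketch. All names below (`AccessEvent`, `hold_by_permission`, `hold_by_access`, and the sample data) are hypothetical and chosen only for illustration; they do not describe any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    """Hypothetical record of one user opening or viewing one object."""
    user: str
    object_id: str


def hold_by_permission(permissions, custodians):
    """Broad scope: every object that any custodian is permitted to access."""
    return {obj for user in custodians for obj in permissions.get(user, set())}


def hold_by_access(events, custodians):
    """Narrow scope: only objects a custodian actually accessed."""
    return {e.object_id for e in events if e.user in custodians}


# Illustrative data (assumed): alice may access four objects but opened only one.
permissions = {"alice": {"f1", "f2", "f3", "f4"}, "bob": {"f3", "f5"}}
events = [
    AccessEvent("alice", "f2"),
    AccessEvent("bob", "f5"),
    AccessEvent("carol", "f1"),
]
custodians = {"alice"}

broad = hold_by_permission(permissions, custodians)
narrow = hold_by_access(events, custodians)
```

In this sketch the permission-based scope covers all four of the custodian's accessible objects, while the access-based scope covers only the single object the custodian opened. At enterprise scale the same gap corresponds to the unnecessary marking and retention described above.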
Other legacy approaches apply a hold order to any or all folders and files that were created and/or modified over the time period associated with the hold order. However, in highly collaborative environments, certain files can have many versions, some of which might have been created or modified outside the bounds of the hold order time period. Acting on a hold order such that versions outside the hold order time period are marked and/or sequestered and/or retained and/or otherwise acted upon can cause decreased performance and/or increased storage demands on the cloud-based storage system. Moreover, legacy approaches fail to reconcile conflicts that arise between hold policies, such as when two or more policies are in force at the same time but are in conflict with respect to their specifications or with respect to the then-current application of their specifications.
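The version-level issue can be sketched in the same way. The following assumes a hypothetical `FileVersion` record and a hold period with inclusive start and end dates; it shows that only some versions of a file fall inside the period, whereas a file-level hold would capture all of them.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical record of one version of one file."""
    file_id: str
    version: int
    modified: date


def versions_in_window(versions, start, end):
    """Return only the versions modified within the hold period (inclusive)."""
    return [v for v in versions if start <= v.modified <= end]


# Illustrative data (assumed): three versions of one file, one inside the period.
versions = [
    FileVersion("report", 1, date(2015, 1, 10)),
    FileVersion("report", 2, date(2015, 6, 2)),
    FileVersion("report", 3, date(2016, 3, 15)),
]

held = versions_in_window(versions, date(2015, 5, 1), date(2015, 12, 31))
held_version_numbers = [v.version for v in held]
```

Here a file-level hold would retain all three versions of the file, while the version-level filter retains only the one version modified during the hold period.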
What is needed is a technique or techniques to improve over legacy and/or over other considered approaches. Some of the approaches described in this background section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.