A primary copy of data is generally a production copy or other “live” version of the data which is used by a software application and is generally in the native format of that application. Primary copy data may be maintained in a local memory or other high-speed storage device that allows for relatively fast data access if necessary. Such primary copy data is typically intended for short term retention (e.g., several hours or days) before some or all of the data is stored as one or more secondary copies, for example, to prevent loss of data in the event a problem occurs with the data stored in primary storage.
To protect primary copy data or for other purposes, such as regulatory compliance, secondary copies (alternatively referred to as “data protection copies”) can be made. Examples of secondary copies include a backup copy, a snapshot copy, a hierarchical storage management (“HSM”) copy, an archive copy, and other types of copies.
A backup copy is generally a point-in-time copy of the primary copy data stored in a backup format as opposed to in native application format. For example, a backup copy may be stored in a backup format that is optimized for compression and efficient long-term storage. Backup copies generally have relatively long retention periods and may be stored on media with slower retrieval times than other types of secondary copies and media. In some cases, backup copies may be stored at an offsite location.
After an initial, full backup of a data set is performed, periodic, intermittent, or continuous incremental backup operations may be subsequently performed on the data set. Each incremental backup operation copies only the primary copy data that has changed since the last full or incremental backup of the data set was performed. In this way, even if the entire set of primary copy data that is backed up is large, the amount of data that must be transferred during each incremental backup operation may be significantly smaller, since only the changed data needs to be transferred to secondary storage. One or more full backup copies and subsequent incremental copies may be combined to periodically or intermittently create a synthetic full backup copy. More details regarding synthetic storage operations are found in commonly-assigned U.S. patent application Ser. No. 12/510,059, entitled “Snapshot Storage and Management System with Indexing and User Interface,” filed Jul. 27, 2009, now U.S. Pat. No. 7,873,806, which is hereby incorporated by reference herein in its entirety.
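The relationship between full, incremental, and synthetic full backup copies described above can be illustrated with a minimal sketch. The sketch is illustrative only and is not drawn from the incorporated application: the names BackupCopy, full_backup, incremental_backup, and synthetic_full are hypothetical, the data set is modeled as an in-memory mapping of paths to contents, and changes are detected by comparing contents, whereas an actual system might rely on timestamps, change journals, or block-level tracking.

```python
from dataclasses import dataclass, field


@dataclass
class BackupCopy:
    """One backup operation, either 'full' or 'incremental' (hypothetical structure)."""
    kind: str
    changed: dict = field(default_factory=dict)  # path -> content copied by this operation
    deleted: set = field(default_factory=set)    # paths removed since the previous backup


def full_backup(primary):
    """Copy every item in the primary copy."""
    return BackupCopy("full", changed=dict(primary))


def incremental_backup(primary, previous_state):
    """Copy only items that differ from the state captured by the last backup."""
    changed = {p: c for p, c in primary.items() if previous_state.get(p) != c}
    deleted = set(previous_state) - set(primary)
    return BackupCopy("incremental", changed=changed, deleted=deleted)


def synthetic_full(chain):
    """Replay a full backup and its incrementals into one synthetic full copy."""
    if not chain or chain[0].kind != "full":
        raise ValueError("chain must start with a full backup")
    state = {}
    for copy in chain:
        state.update(copy.changed)
        for path in copy.deleted:
            state.pop(path, None)
    return BackupCopy("full", changed=state)
```

In this sketch, an incremental operation transfers only the changed entries, and replaying the chain yields a copy equivalent to a new full backup without reading the primary copy again.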
An archive copy is generally a copy of the primary copy data, but typically includes only a subset of the primary copy data that meets certain criteria and is usually stored in a format other than the native application format. For example, an archive copy might include only that data from the primary copy that is larger than a given size threshold or older than a given age threshold and that is stored in a backup format. Often, archive data is removed from the primary copy, and a stub is stored in the primary copy to indicate its new location. When a user requests access to the archive data that has been removed or migrated, systems use the stub to locate the data and often make recovery of the data appear transparent, even though the archive data may be stored at a location different from the remaining primary copy data.
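The archive-and-stub behavior described above can likewise be sketched in a few lines. The sketch is a simplified illustration under stated assumptions: the names Stub, archive, and read are hypothetical, primary and archive storage are modeled as in-memory mappings, each primary item is a (data, last-modified time) pair, and the size and age thresholds are arbitrary parameters rather than values taken from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Stub:
    """Placeholder left in the primary copy that records where the data now lives."""
    archive_key: str


def archive(primary, archive_store, now, min_size, min_age):
    """Move items larger than min_size or older than min_age to the archive."""
    for path, item in list(primary.items()):
        if isinstance(item, Stub):
            continue  # already archived
        data, modified = item
        if len(data) > min_size or now - modified > min_age:
            archive_store[path] = data
            primary[path] = Stub(archive_key=path)  # replace the data with a stub


def read(primary, archive_store, path):
    """Return the data for path, following a stub transparently if one is present."""
    item = primary[path]
    if isinstance(item, Stub):
        return archive_store[item.archive_key]
    return item[0]
```

A caller of read in this sketch receives the same data whether or not the item has been migrated, which corresponds to the transparent recovery described above.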
Many countries explicitly or implicitly regulate the retention of data for organizations operating within those countries. For example, in 2005 Italy adopted a European Union Directive on Privacy and Electronic Communications and requires Internet service providers to retain all data for at least 12 months. In response to the same European Union Directive, Denmark began requiring all telephone and Internet providers to log certain data regarding communications through their systems, e.g., caller phone numbers, communication cells used for telephone calls, sender Internet Protocol (IP) addresses, and receiver IP addresses. For two years, until Germany's high court overturned the law in 2010, Germany required any communications data, such as email messages, to be retained for at least 6 months. Other countries, such as the United States, do not have explicit regulations in place for which data must be retained, but instead punish organizations for failure to retain or destroy data in a predetermined and systematic way. For example, a court in the United States determined that a college was negligent for deleting an email mailbox of a former employee because the college appeared to destroy the mailbox in a way that was inconsistent with a college-wide and systematic policy. The simple deletion of the email mailbox cost the college approximately $750,000.
For information technology (IT) groups of companies and organizations operating within multiple countries, implementing data retention policies for each country can be challenging, especially when employees travel as part of their work. As an example, consider an employee who travels with a mobile device, such as a laptop, from the U.S. to Denmark for a week, to Italy for a week, and then back to the U.S. If an IT group implements a data retention policy that complies with the most restrictive regulation of any country the employee works within, many more data storage resources may be consumed than are required under the least restrictive regulations. However, if members of the IT group fail to properly retain data in compliance with each country's regulations, fines for non-compliance against the company or organization could become costly. Additionally, dedicating members of an IT group to the task of tracking the travels of each employee for the purpose of changing the data retention policies of each employee's mobile device may increase overhead costs associated with operations of the company or organization.
The need exists for systems and methods that overcome the above problems, as well as systems and methods that provide additional benefits. Overall, the examples herein of some prior or related systems and methods and their associated limitations are intended to be illustrative and not exclusive. Other limitations of existing or prior systems and methods will become apparent to those of skill in the art upon reading the following Detailed Description.