1. Technical Field
Present invention embodiments relate to analyzing data sources for inactive data, and more specifically, to analyzing data sources by applying information from a profile to identify inactive data within those data sources.
2. Discussion of the Related Art
Organizations have traditionally grappled with how much data to store, largely because of the overall cost of storage devices. For example, many organizations historically stored less than one terabyte of data, with the amount of stored data effectively limited by the cost of storage.
In contrast, now that data storage has become relatively inexpensive and the supply of data available as a resource is seemingly inexhaustible, organizations are less concerned about over-preservation of data. For example, a terabyte of disk space may currently be purchased for a fraction of its former cost. Accordingly, organizations are now actively storing hundreds of terabytes, and in some cases petabytes or exabytes, of data.
Corporate personnel generally recognize data as a strategic resource from which competitive advantages can be attained. Accordingly, organizations are entering a mode of collecting and storing all corporate data, without regard to the business value derived from it.
Currently, identifying data that is not actively consumed by an organization is largely based upon a “feeling” among various corporate personnel, or upon input from an Information Technology (IT) department that evaluates gross data consumption based upon general portfolios of supported software. Such subjective methods are unreliable: they may fail to identify key documents or key data that could be leveraged, and they may be subject to user bias, e.g., bias arising from a user's role in an organization.