1. Technical Field
Present invention embodiments relate to computerized information analysis, and more specifically, to generating a fingerprint for data including data attributes, and utilizing the fingerprint for analysis of the data when the data is no longer available.
2. Discussion of the Related Art
Data may be stored for a predetermined duration for various purposes. For example, personal information (PI) may include information pertaining to patients, customers, suppliers, citizens, and/or employees. This information is processed in many different types of systems within an enterprise (e.g., Human Resource (HR) systems for employee information, Master Data Management (MDM) systems for customer, supplier, or employee information, order entry systems (e.g., e-Commerce platforms), customer or supplier relationship management systems (e.g., SAP CRM, SAP SRM, Siebel CRM, etc.), and order fulfillment systems (e.g., ERP solutions)).
This type of information should be retained for a minimum time period sufficient to satisfy business requirements. In particular, various regulations demand deletion of this information after a required minimum time period, and provide strict guidelines with respect to the persons and/or entities permitted to view and work with the information. Examples include: HIPAA in the U.S.; the Bundesdatenschutzgesetz (Germany's Federal Data Protection Act); the Data Protection Act of 1984 (United Kingdom), which covers storage of personal information (PI) for only a limited amount of time; and European Union (EU) Directive 95/46/EC on the Protection of Personal Data, which covers three major areas related to personal information (PI), namely transparency, legitimate purpose, and proportionality, where the last aspect also covers maintaining personal information (PI) only as long as minimally needed. The latter directive also mandates that personal information (PI) be protected from loss, unauthorized disclosure, and modification while the information is in transit or at rest.
Unauthorized access to, or potential compromise of, the personal information (PI) may occur prior to deletion of that information. When the potential compromise is detected after the personal information (PI) has been deleted, the scope and/or consequences of the occurrence (e.g., which information has been potentially compromised, which regulations apply, etc.) are difficult to determine because the affected personal information (PI) is no longer available. This in turn hinders the ability to analyze the occurrence (e.g., the analysis may inadvertently expose regulated information to persons and/or entities beyond those prescribed by the regulations) and to comply with the appropriate regulations (e.g., since the particular regulations that apply are difficult to determine without the affected personal information (PI)).