The present invention relates to retention of forensic information, and more specifically, to methods, systems and computer programs for selective retention of data in long-term storage.
Increasingly large amounts of electronic data are generated and shared within enterprises and across networks. Retaining such large amounts of data can be useful for a variety of purposes, including forensic, business, or security purposes. For example, security monitoring and forensic investigation require the ability to replay network activity between devices on a network. To accomplish this, large amounts of data, including full packets of data or a data flow across a network, are captured. In a monitored network, the large amounts of data can be captured as the data flows across a network interface and saved in a high-capacity data system.
Retaining such data requires ever-increasing storage capacity and can also detrimentally impact the performance of computing systems. Conventional data retention systems can rely upon time-based decisions on whether to discard captured data, or can otherwise use manual analysis and inspection of historical data, which can be tedious, time-consuming, and cost-prohibitive. At the same time, much of the data that is initially stored or captured is not needed. Moreover, over time and as conditions change, some data can become less valuable and more amenable to deletion while other data retains its value for forensic, security, or business purposes. Thus, time-based retention of such data could lead to undesirable loss of valuable data.