Embodiments presented herein generally relate to data management, and more specifically, to identifying relationship and importance information for a given item in a data set.
Managing large amounts of data is a known issue in many organizations. For example, in a business-to-business (B2B) or a business-to-consumer (B2C) setting, electronic data interchange (EDI) communications between two organizations can give rise to large volumes of various transaction documents, such as purchases, payments, and invoices. In addition, a B2B entity may exchange numerous transactions of varied proportions with different B2B partners on a daily basis.
Generally, EDI transactions reside in a relational database or file system. Further, because an entity may generate numerous transactions on a regular basis, an organization may purge older transactions or maintain those transactions in an archive database, e.g., according to some retention policy. However, one concern regarding this approach is that many retention policies make archival decisions based only on date. Given that some past communications may be vital for an organization, using date as the sole criterion may be ineffective. A user may manually flag a given transaction to prevent the transaction from being purged. However, doing so may be time-consuming and inconsistently applied.