The present invention relates to archiving documents, and more specifically, to archiving Portable Document Format (PDF) reports as individual PDF statements.
Report management systems and archival retrieval systems are frequently used to store large quantities of data. For example, such systems may be used to store credit card statements, bank account statements, monthly utility statements, and similar data. The length of time that such data must be kept may determine the storage medium used, such as hard drive, optical disk, or tape. In any case, it is desirable to store data in a manner that makes the data easily and rapidly accessible while also minimizing the required storage space.
Archived documents may be stored in various formats, including Advanced Function Presentation (AFP) and Portable Document Format (PDF). Archived PDF documents may include “statements”, such as bank statements or other documents, as well as “reports”, each of which comprises a collection of individual statements. When archiving PDF reports containing many statements, these statements may need to be stored as individual PDF documents in order to satisfy performance or functional requirements. Such functional requirements may include, for example, placing legal holds on a subset of statements, or placing one or more of those statements into a workflow process. In these instances, it is necessary to store the individual statements as stand-alone documents.
When individual PDF statements are stored in this manner, as stand-alone documents, the PDF “shared resources” need to be duplicated along with each and every statement stored. PDF shared resources are the parts of a PDF document that enable the data to be displayed in a particular manner. For example, the shared resources in a PDF report may include overlays that define boxes around certain parts of the text, custom fonts, logos that are placed in the same position on each page, images, etc. However, duplicating each of these shared resources in each archived statement greatly increases storage requirements. Some archival systems simply accept this increased storage cost. Other archival systems do not extract the statements at all and instead archive the entire report as a single entity. In this way, storage requirements are not greatly increased unless a statement has to be held or placed into a workflow process. However, retrieval performance suffers, because whenever a statement has to be retrieved for viewing or printing, the entire report has to be retrieved in order to extract the requested statement.
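By way of illustration only, the trade-off between the two conventional approaches can be expressed as simple arithmetic. The following Python sketch is not taken from any particular archival system: the function names and all sizes (a report of 10,000 statements, 20,000 bytes of statement-specific data per statement, and 200,000 bytes of shared resources) are hypothetical assumptions chosen to make the comparison concrete.

```python
# Illustrative model only; every name and figure below is a hypothetical assumption.

def storage_with_duplicated_resources(num_statements, statement_bytes, shared_bytes):
    """Total bytes stored when each statement is archived as a stand-alone
    PDF that carries its own copy of the shared resources."""
    return num_statements * (statement_bytes + shared_bytes)


def storage_as_whole_report(num_statements, statement_bytes, shared_bytes):
    """Total bytes stored when the report is archived as a single entity,
    so the shared resources are stored only once."""
    return num_statements * statement_bytes + shared_bytes


# Hypothetical report: 10,000 statements of 20,000 bytes each,
# plus 200,000 bytes of shared fonts, logos, and overlays.
NUM_STATEMENTS = 10_000
STATEMENT_BYTES = 20_000
SHARED_BYTES = 200_000

duplicated_total = storage_with_duplicated_resources(
    NUM_STATEMENTS, STATEMENT_BYTES, SHARED_BYTES)   # 2,200,000,000 bytes
whole_report_total = storage_as_whole_report(
    NUM_STATEMENTS, STATEMENT_BYTES, SHARED_BYTES)   # 200,200,000 bytes

# Bytes that must be read to view or print one statement under each approach.
retrieve_stand_alone = STATEMENT_BYTES + SHARED_BYTES  # 220,000 bytes
retrieve_from_whole_report = whole_report_total        # the entire report

print(f"stand-alone statements: {duplicated_total:,} bytes stored, "
      f"{retrieve_stand_alone:,} bytes read per retrieval")
print(f"single report:          {whole_report_total:,} bytes stored, "
      f"{retrieve_from_whole_report:,} bytes read per retrieval")
```

Under these assumed figures, storing stand-alone statements requires roughly eleven times the storage of the single report, whereas archiving the single report requires reading the entire report to retrieve any one statement. The actual ratio depends entirely on how large the shared resources are relative to the statement-specific data.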