1. Field of the Disclosure
The present application relates to a computer system and a computer-implemented method for generating page-oriented data for printing dynamic documents. Specifically, the present invention relates to a computer system and a computer-implemented method for generating, from variable data stored in a data store, page-oriented data output for printing dynamic documents, whereby the variable data is pre-processed by one or more data processing modules prior to generating the page-oriented data.
2. Related Art
There are many data processing applications where data is retrieved from a data store and processed at various processing stages, before it is passed to a data consumer which generates some output based on this pre-processed data. Typical data stores include local or remote (networked) hard disks, data tape, or removable data carriers such as compact discs (CD), digital versatile discs (DVD) or flash memory devices. Particularly in data processing applications such as dynamic and high-speed printing based on variable data, where different processing modules with different processing functions can be selected and combined freely for pre-processing the data, the variable data is typically loaded at each processing stage as a complete data set from the data store into the local memory of the computerized processing system. For example, the variable data is loaded completely into random access memory (RAM). After data processing is completed at a processing stage, the processed data is stored back into the data store. Thus, at each processing stage, computer resources and time are used for allocating/de-allocating local memory, and transferring data between the data store and the local memory. In some applications it may be possible to reduce data transfer between the data store and the local memory, if the different processing modules are designed to use the local memory for passing data from one processing stage to the next. Nevertheless, using the local memory for transferring data between processing modules requires additional memory space and necessitates allocation/de-allocation of local memory at each processing stage. Dynamic printing applications that generate page-oriented data output from variable data input, however, typically require high throughput in order to handle high volumes of data records and the corresponding page output.
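The per-stage load and store cycle described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited systems: the JSON file standing in for the data store, the helper names `run_stage` and `run_all`, and the two example stages are hypothetical and chosen only to make the data transfers visible.

```python
import json
import os
import tempfile


def run_stage(store_path, stage):
    """Conventional stage: load the complete data set, process it, store it back."""
    with open(store_path, encoding="utf-8") as f:
        records = json.load(f)              # whole data set is held in local memory
    processed = [stage(r) for r in records]
    with open(store_path, "w", encoding="utf-8") as f:
        json.dump(processed, f)             # whole data set is written back
    return len(processed)


def uppercase_name(record):
    # Hypothetical formatting stage.
    return {**record, "name": record["name"].upper()}


def add_greeting(record):
    # Hypothetical enrichment stage.
    return {**record, "greeting": "Dear " + record["name"]}


def run_all(records, stages):
    """Run every stage against a temporary file acting as the data store."""
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(records, f)
        transfers = 0
        for stage in stages:
            # Each stage reads and writes every record once.
            transfers += 2 * run_stage(path, stage)
        with open(path, encoding="utf-8") as f:
            return json.load(f), transfers
    finally:
        os.remove(path)
```

In this sketch the number of record transfers grows with the product of the record count and the number of stages, which is the overhead the paragraph above attributes to freely combinable processing modules.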
For example, printing of variable information letters and/or invoices directed to subscribers of a telecom provider, or other personalized mass mailing applications, may involve hundreds of thousands or even millions of records and corresponding print pages. For each individual page to be printed, page-oriented data is generated from variable data, e.g. address, invoice and/or other custom-oriented information associated with individual persons, whereby the page-oriented data defines a layout of a page including the placement of properly formatted data fields derived from the variable data. For example, the page-oriented data output is in the form of a print stream such as Adobe PDF (Portable Document Format) as defined in ISO 32000-1:2008, Adobe PostScript, IBM AFPDS (Advanced Function Presentation Data Stream), PPML (Personalized Print Markup Language), IJPDS (InkJet Printer Data Stream), or other representations of two-dimensional documents. On one hand, great flexibility may be obtained from selecting and combining various processing modules for pre-processing the variable data, prior to generating the page-oriented data output, e.g. for filtering, formatting and merging the variable data into a page-oriented layout. On the other hand, performance and throughput are reduced by the data exchange taking place between individual modules and the data store, as well as by the allocation and de-allocation of local memory performed at the individual processing stages. However, the page-oriented data output must be generated fast enough to support high-speed printing systems. In particular, the pre-processing and merging of the variable data into the page-oriented layout needs to be performed fast enough to keep the printer system running continuously throughout a defined production period, e.g. several hours or even a whole day.
Thus, in the known methods for generating page-oriented data from variable data for printing dynamic documents, there is a tradeoff between flexible pre-processing of the variable data at run-time (during production) and fast, continuous printing of the dynamic documents.
U.S. Pat. No. 7,127,520 describes a system for transforming an input data stream of one format into an output data stream of another format. According to U.S. Pat. No. 7,127,520, various input connector modules receive input data streams and are each connected to one of several input queues which store the input data stream. Filters are used to remove irrelevant data from the received input data streams. Several job threads format the input data streams in parallel to produce output data streams which are stored in output queues. Thread job managers detect events in the input data stream and generate messages associated with the detected events. Based on the messages, the output data streams are produced and stored in output queues. For example, a “pageout” process produces a page layout for creating documents for printing or faxing.
US 2005/0050442 describes a system for generating customized documents using dynamic information selected from a database based on a customer's business rules and specific transaction data, e.g. a customer's name and/or address. The transaction data is received through batch data files available through FTP (File Transfer Protocol) servers and/or web-enabled interactive ordering systems. When a complete set of content is assembled for a transaction, a dynamic document generator and assembler assembles the customized document.
US 2002/0049702 describes a method for creating a series of customized document instances from a single dynamic variable information document, referred to as a dynamic document. A dynamic document includes a dynamic document template with placeholders indicating the locations where dynamic objects are to be placed. According to US 2002/0049702, a dynamic document is associated with a plurality of pointers to a plurality of data sources such as database data and media items. Specifically, the values to be associated with a dynamic object are defined in terms of logical tables and attributes of their records, or references to external systems such as a file name or a URL (Uniform Resource Locator) identifying a content object.
EP 0837401 describes a method for creating complex layouts with variable data for high-speed variable data printing, e.g. using ink jet or laser jet systems printing on paper moving at speeds of up to 305 meters per minute. Merge software performs data reformatting functions, such as case conversion or word concatenation, and re-flows text based on variable data insertion, associating fields in variable data records with variable data placeholders for appropriate locations in a layout template. User-callable program routines can be linked to the merge software for performing custom user functions.
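The association of record fields with layout placeholders, together with simple reformatting functions, can be sketched as follows. This is only an illustration under stated assumptions and not the merge software of EP 0837401: the template text, the field names, and the helpers `reformat` and `merge` are hypothetical, and text re-flow is not modeled. The sketch uses `string.Template` from the Python standard library.

```python
from string import Template

# Hypothetical layout template with two variable data placeholders.
LAYOUT = Template("To: $full_name\n$city")


def reformat(record):
    """Reformatting functions: word concatenation and case conversion."""
    return {
        "full_name": " ".join([record["first"], record["last"]]),
        "city": record["city"].upper(),
    }


def merge(record, layout=LAYOUT):
    """Fill the layout placeholders with the reformatted record fields."""
    return layout.substitute(reformat(record))
```

A custom user routine could be modeled in the same way, by passing a different reformatting function to the merge step.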
US 2001/0047369 describes a three-stage pipeline process for generating dynamic documents. At the first stage, a data iterator performs a data processing task by selecting the next record from a recipients list and computing the set of page layouts and content objects needed for this instance. The result of the computation is forwarded to a document instantiator via a content objects buffer. The data iterator continues the data processing task as long as the recipients list has not been exhausted and the buffer storage is not full. At the second stage, the document instantiator retrieves the next collection of layouts and content objects from the content objects buffer, and employs the appropriate layout engine for creating the specific document instance and code specifying the rendering of the document instance. At the third stage, a merge processor generates an output stream based on the code specifying the rendering of the document instance.
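The three stages described above can be sketched with bounded queues, where a blocking `put` models the data iterator pausing while the content objects buffer is full. This is a simplified illustration and not the implementation of US 2001/0047369: the function names, the record fields, the string used as "rendering code", and the second queue between the instantiator and the merge processor are all assumptions made for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the recipients list


def data_iterator(recipients, buffer):
    """Stage 1: compute layout and content objects for each recipient."""
    for record in recipients:
        content = {"layout": "letter", "name": record["name"]}
        buffer.put(content)  # blocks while the content objects buffer is full
    buffer.put(DONE)


def document_instantiator(buffer, instances):
    """Stage 2: turn each content collection into a document instance."""
    while True:
        content = buffer.get()
        if content is DONE:
            instances.put(DONE)
            return
        instances.put(f"{content['layout']}:Dear {content['name']}")


def merge_processor(instances):
    """Stage 3: collect the document instances into an output stream."""
    output = []
    while True:
        item = instances.get()
        if item is DONE:
            return output
        output.append(item)


def run_pipeline(recipients, buffer_size=2):
    buffer = queue.Queue(maxsize=buffer_size)     # content objects buffer
    instances = queue.Queue(maxsize=buffer_size)  # assumed hand-off to stage 3
    workers = [
        threading.Thread(target=data_iterator, args=(recipients, buffer)),
        threading.Thread(target=document_instantiator, args=(buffer, instances)),
    ]
    for worker in workers:
        worker.start()
    output = merge_processor(instances)
    for worker in workers:
        worker.join()
    return output
```

Because each queue has a single producer and a single consumer, the output order follows the order of the recipients list, and memory use is bounded by `buffer_size` rather than by the size of the list.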