1. Technical Field
The disclosure is directed to printing systems and, more particularly, to processing steps that split a print job into its preamble and chunks so that a parallel system can process jobs with very large preambles efficiently. Eliminating the need to copy, move, or process the large preamble more than once avoids performance and system reliability problems.
2. Description of the Related Art
Generating print-ready documents to be printed by a printing system requires acquiring all the information (content, graphics, production specs, etc.) needed to view, process and output the desired document in an electronic form understandable by a print engine. Such systems can range from those that are simple and modestly expensive, such as are known to consumer users of personal computer systems, up to commercial printing systems that are capable of generating in the range of hundreds or even thousands of pages per minute in full color. All systems, though, have a high-level objective of printing faster.
There are three general approaches which have been applied in the past for accomplishing this objective. First, faster serial processing methods suggest optimizing the software and using faster and more expensive processors. Second, job parallel processing sends separate jobs to separate systems and then prints them on a common printer. In such a job parallel processing system, each job (i.e., jobs 1-3) is taken from a queue and handed to a separate RIP processor to be converted in parallel and then output in serial order (job 1, job 2, job 3). Third, Portable Document Format (“PDF”) based page or chunk parallel systems convert the job to PDF, and then split the PDF file into pages or chunks, which are converted to print-ready form on multiple independent processors, with the job being printed on a common printer. In a page or “chunk” processing system, an individual job is taken from the queue and broken down into pages or other divisible “chunks,” with the chunks being sent to multiple RIP processors to be converted in parallel so that individual pages or “chunks” can be output in logical page order (e.g., chunk 1, 2, 3).
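The chunk parallel flow described above, in which chunks are converted in parallel and then output in logical page order, can be illustrated with a minimal sketch. It is not taken from any of the systems discussed here: `rip_chunk` is a hypothetical stand-in for a RIP processor, `process_job` and `rip_count` are illustrative names, and a thread pool merely models the multiple independent processors.

```python
from concurrent.futures import ThreadPoolExecutor


def rip_chunk(chunk):
    """Hypothetical stand-in for a RIP: label each page as converted."""
    return [f"raster(page {page})" for page in chunk]


def process_job(chunks, rip_count=3):
    """Convert chunks concurrently, then return pages in logical page order."""
    with ThreadPoolExecutor(max_workers=rip_count) as pool:
        # Executor.map runs the conversions concurrently but yields the
        # results in the order of the input chunks (chunk 1, 2, 3, ...).
        ripped = list(pool.map(rip_chunk, chunks))
    # Flatten the per-chunk results into a single page-ordered output stream.
    return [page for chunk in ripped for page in chunk]


# One job broken into three chunks (illustrative page numbers).
job_chunks = [[1, 2], [3, 4], [5]]
output = process_job(job_chunks)
```

Because the results are collected in input order, a chunk that finishes early simply waits for the chunks ahead of it, which mirrors the requirement that pages reach the common printer in logical order.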
Of these general approaches, software optimization has its limits, and faster processors are limited by currently available technology. Job parallel processing may result in poor single job performance, unpredictable job time and reduced throughput when there is only one long job in the queue. The third approach (existing PDF-based solutions) may be slow because the input must often be converted from a different input language into PDF and the PDF file then written to an input spool disk. Between page and chunk parallel systems, page parallel processing suffers a throughput disadvantage because per-job overhead is incurred on a per-page basis. Thus, “chunk” processing may be the most promising for improvement.
Chunk parallelism is an intermediate level of parallelism between job parallelism and page parallelism. A chunk is a collection of data consisting of at least one page and not more than one job. A chunk may be an integer number of pages less than an entire job but has a startup overhead occurring on a per-chunk basis as opposed to a per-page basis. A more detailed description of “chunk” parallelism can be found in U.S. Pat. Nos. 7,161,705 and 6,817,791, U.S. Publication No. 2004/0196497, and U.S. Publication No. 2004/0196496, the disclosures of which are hereby incorporated herein in their entirety.
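The definition of a chunk and its effect on startup overhead can be made concrete with a short sketch. The function names, the 10-page job, the chunk size of 4 pages, and the overhead of 2.0 time units per processing unit are all assumptions chosen for illustration and do not come from the cited references.

```python
def split_into_chunks(page_count, pages_per_chunk):
    """Split pages 1..page_count into chunks of at most pages_per_chunk pages.

    Every chunk holds at least one page and never more than the whole job.
    """
    if page_count < 1 or pages_per_chunk < 1:
        raise ValueError("page_count and pages_per_chunk must be at least 1")
    pages = list(range(1, page_count + 1))
    return [pages[i:i + pages_per_chunk]
            for i in range(0, page_count, pages_per_chunk)]


def total_startup_overhead(unit_count, overhead_per_unit):
    """Startup overhead is paid once per processing unit (job, chunk, or page)."""
    return unit_count * overhead_per_unit


PAGE_COUNT = 10   # assumed job length
OVERHEAD = 2.0    # assumed startup cost per unit, in arbitrary time units

chunks = split_into_chunks(PAGE_COUNT, pages_per_chunk=4)

page_parallel_cost = total_startup_overhead(PAGE_COUNT, OVERHEAD)    # one unit per page
chunk_parallel_cost = total_startup_overhead(len(chunks), OVERHEAD)  # one unit per chunk
job_parallel_cost = total_startup_overhead(1, OVERHEAD)              # one unit per job
```

Under these assumed numbers the job splits into three chunks, so the startup overhead is paid three times rather than ten times as in page parallel processing, while still leaving three units of work that can be converted in parallel.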
A more detailed description of a job parallel system can be found in U.S. Pat. No. 5,819,014, the disclosure of which is hereby incorporated herein in its entirety, which describes a printer architecture using network resources to create a “distributed” printer controller or translator. By distributing the translators across the network, print jobs may be processed in parallel. Each job is formatted in the system in a particular data type comprising a Page Description Language (“PDL”) such as a PostScript file, ASCII, PCL, etc. A distributed set of the translators is used for each data type, the translators each comprising a plurality of CPUs to simultaneously rasterize each data type. In real time operation, each translator on the network can formulate the rasterized image, which is then fed over the network to the print engine. Job parallelism increases the flexibility of the printing system by allowing slow jobs to be processed while quicker jobs are completed and printing. However, it can be easily appreciated that where the jobs require substantially different processing times, waits will necessarily occur and overall system efficiency will suffer.
A known commercially available system exploiting page parallelism is Adobe® Extreme. In this system, the data input for a print job is normalized into a PDF format and stored on disk. The PDF format is essentially guaranteed to be page independent and thus facilitates segregating the job into page units for page parallel processing. A “sequencer” processing node takes the PDF jobs off the disk and writes them back onto a disk again a page at a time as individual files, one file per page. Rasterizing Image Processing nodes (RIP nodes) then convert the files into a print-ready form acceptable by a print engine. It is important to note that in terms of processing efficiency, Adobe Extreme must access the disk twice, thus slowing the system down, and that the RIP nodes can only process a file consisting of a single page. Of course, an entire job may be limited to one page, but when a job comprises several pages, Adobe Extreme must sequence it to individual pages only.