The subject invention relates to printing systems and, more particularly, to processing steps that split a print job into segregated portions to facilitate independent processing of the portions.
Generating print-ready documents to be printed by a printing system requires acquiring all the information (content, graphics, production specs, etc.) required to view, process and output the desired document in an electronic form understandable by a print engine. Such systems range from those that are simple and modestly expensive, such as are well known to consumer users of personal computer systems, up to commercial printing systems that are capable of generating in the range of one hundred pages per minute in full color. All such systems, though, share the high-level objective of printing faster.
Three general approaches have been applied in the past to accomplish this objective. First, faster serial processing methods suggest optimizing the software and using faster and more expensive processors. Second, job parallel processing sends separate jobs to separate systems and then prints them on a common printer. Third, Portable Document Format (“PDF”) based page parallel systems convert the job to PDF, and then split the PDF file into pages which are converted to print-ready form on multiple independent processors, with the job being printed on a common printer. Software optimization has its limits, and faster processors are also limited by currently available technology. Job parallel processing results in poor single-job performance, unpredictable job time and reduced throughput when there is only one long job in the queue. The existing PDF-based solutions are slow because they often need to convert from a different input language into PDF and then write the PDF file to an input spool disk. Page parallel processing has suffered from a throughput disadvantage because per-job overhead is incurred on a per-page basis.
Accordingly, given the continuing need to improve efficiency and speed in printing systems, there is a need for a system that is not limited to mere job or page parallelism and that can facilitate control and data flow of a print job to the printing system while splitting the print job into a plurality of print job portions, each of which can be processed independently and in parallel. How a print job can be better split while ensuring page or chunk parallelism is a subject of this invention.
In addition to parallel processing, there are various other reasons for page independence to be valuable. A document manager may be called upon to reverse the order of the pages of a document prior to printing on a printer that prints pages face up. A user may wish to reprint only a portion of a long document, possibly due to an error in the original printing process or subsequent processing. In this case the document manager would be called upon to extract a sub-document containing the desired pages from the entire document before it is converted to print-ready form. In either of these cases the document manager must construct a valid document that will, when converted to print-ready form, produce the same set of pages as would have been produced had the entire document been physically printed and then either mechanically reversed (in the first case) or the desired pages extracted from the larger set of (physical) pages. When the content of a given page depends on the content of a previous page, this is not possible using prior art techniques. In this case, page independence has been violated. When the content of any given page does not depend in any way on the previous pages processed, the document is page independent.
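The two document-manager tasks described above can be illustrated with a minimal sketch. It assumes the document has already been separated into a shared header, a list of page-independent page bodies, and a trailer; the function names `assemble` and `reversed_document` are hypothetical and are not taken from the invention. The sketch does not renumber page comments, and its output is only valid when the pages really are independent.

```python
def assemble(header, pages, order, trailer=""):
    """Build a document from the header, the pages chosen by the
    1-based page numbers in `order`, and the trailer."""
    return header + "".join(pages[n - 1] for n in order) + trailer


def reversed_document(header, pages, trailer=""):
    """Build a document whose pages appear in reverse order."""
    return assemble(header, pages, range(len(pages), 0, -1), trailer)


def sub_document(header, pages, first, last, trailer=""):
    """Build a document containing only pages first..last (inclusive)."""
    return assemble(header, pages, range(first, last + 1), trailer)
```

If page 2 relied on a definition made on page 1, `sub_document(header, pages, 2, 2)` would produce a document that fails or prints differently, which is the violation of page independence described above.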
Document Structuring Conventions (“DSC”) conformant PostScript® is one system making page independent processing available; however, there are exceptions in this convention so that page independence cannot always be guaranteed.
In order for a PostScript master to be conformant it must obey the grammar specified by Adobe's report #5001, PostScript Language Document Structuring Conventions Specification, available from Adobe's developer support web site. While many PostScript masters violate the rules, there is still a substantial number of conformant documents. Several reasons support this conclusion. First, the conventions are now approaching ten years old, which has given application and driver writers time to modify their software, and pre-DSC software time to fall out of use. Second, the PostScript masters of interest are all automatically produced by a small set of applications (or an even more limited set of drivers called by other applications). If these applications ever fail to produce conformant documents, they do so in a very limited set of ways. Experience supports this view: the majority of applications appear to produce conformant PostScript, while the exceptions appear to break in predictable ways.
Document management systems are sometimes called upon to perform such tasks as job subsetting and page re-ordering (typically page reversal). The requirements of such a system are much like those for a splitter, which divides the job into independent pages or groups of pages: each group when printed must print correctly despite having been removed from the environment of the job in which it originated.
Accordingly there is a need for a system which is not limited to manipulating the pages in perfectly conformant documents, but can handle documents that are close to conformance, breaking the rules in predictable ways. Such a system is a subject of this invention.
The conventions describe material contained in specially formatted comments, which means that a PostScript document need not conform in order to print correctly. Certain print services depend on conformance, which supplies the motivation for applications writers to conform. A DSC-conformant document begins with the comment “%!PS-Adobe-3.0 <type>opt”, where the type indicates whether it is a regular file, an encapsulated PostScript file (EPSF), or of type Query, ExitServer or Resource. For the present invention, interest primarily rests in regular files, for which a type is not supplied, and EPSF, when it occurs as a sub-document in a regular file. A document manager (which could be a splitter) is expected to assume that a document is conformant if it begins with this comment. Experience has shown that files with version 2.1 are equally likely to be page independent.
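By way of illustration only, the header check described above might be sketched as follows. The function name `parse_dsc_header` and the exact regular expression are assumptions made for this example; the sketch accepts any version number (so that version 2.1 files are recognized as well as 3.0) and treats the type as optional.

```python
import re

# First line of a file claiming DSC conformance, for example
# "%!PS-Adobe-3.0" or "%!PS-Adobe-3.0 EPSF-3.0".
HEADER_RE = re.compile(r"^%!PS-Adobe-(\d+\.\d+)(?:\s+(\S+))?\s*$")


def parse_dsc_header(first_line):
    """Return (version, type) for a DSC header comment.

    `type` is None for a regular file. Returns None when the line does
    not claim conformance at all.
    """
    match = HEADER_RE.match(first_line.rstrip("\r\n"))
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A splitter could call this on the first line of the job and fall back to conventional serial processing when the result is None.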
The conventions describe a document as containing a prolog and a script. The prolog contains material that must be copied to the beginning of every sub-document when a document is split; the script contains a small amount of setup material that also must be copied, followed by the independent page material. The setup material begins with a “%%BeginSetup” comment and ends with an “%%EndSetup” comment, which should be followed immediately by the first “%%Page: <label> #” comment.
The content for a page normally begins with a “%%Page:” comment, and ends with a “%%PageTrailer” comment, although the “%%PageTrailer” comment is optional.
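The document layout just described suggests a simple line-oriented split, sketched below under stated assumptions. The function name `split_dsc` is hypothetical. The sketch treats everything before the first page comment as shared header material (prolog plus setup), uses the DSC “%%Trailer” comment to mark the end of the last page, and does not attempt to skip page comments that belong to an embedded sub-document.

```python
def split_dsc(text):
    """Split DSC-style PostScript text into (header, pages, trailer).

    header  : everything before the first "%%Page:" comment
    pages   : one string per page, each starting at its "%%Page:" comment
    trailer : everything from "%%Trailer" onward, or "" if absent
    """
    header, pages, trailer = [], [], []
    current = header
    for line in text.splitlines(keepends=True):
        if line.startswith("%%Trailer"):
            current = trailer
        elif line.startswith("%%Page:") and current is not trailer:
            # "%%PageTrailer", "%%Pages:" and "%%PageOrder:" do not match
            # this prefix, so they stay with the section they appear in.
            pages.append([])
            current = pages[-1]
        current.append(line)
    return "".join(header), ["".join(p) for p in pages], "".join(trailer)
```

Because the optional page trailer comment is simply kept with its page, the split behaves the same whether or not the creator emitted it.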
The convention specifications clearly indicate that only one %%EOF should appear in a document, and that a document manager should take the first occurrence as indicating end of file. However, PageMaker™ has been known to combine multiple documents by appending them (including the %%EOF) into one file. This is one example of an error in conformance that is easily recognized and fixed.
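One possible way to recognize appended documents is sketched below. It is an illustration, not the corrective action of the invention: the function name `split_appended_documents` is hypothetical, and the choice to separate the file at each top-level %%EOF is an assumption. The sketch ignores %%EOF comments that fall inside an embedded sub-document bracketed by the DSC “%%BeginDocument” and “%%EndDocument” comments.

```python
def split_appended_documents(text):
    """Separate a file at each top-level %%EOF comment.

    Returns a list of documents; a conformant file yields one entry.
    """
    docs, current, depth = [], [], 0
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.startswith("%%BeginDocument"):
            depth += 1                      # entering an embedded document
        elif line.startswith("%%EndDocument"):
            depth = max(depth - 1, 0)       # leaving an embedded document
        elif depth == 0 and line.strip() == "%%EOF":
            docs.append("".join(current))
            current = []
    if "".join(current).strip():
        # Keep any material that follows the last %%EOF.
        docs.append("".join(current))
    return docs
```

A result with more than one entry signals the appended-file case, after which each entry can be handled as a document in its own right.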
Besides the comment structuring conventions, the creator should put all the PostScript material needed on all pages before the first “%%Page:” comment, with the caveat that a creator is allowed to signal a failure to do so with a “%%PageOrder: Special” comment. If a document manager sees this comment, it is normally expected to assume the document is not page independent. However, at least one application always uses that sequence, effectively disabling any document management features that require page independence. It is the goal of this invention to allow a document manager to ignore the “%%PageOrder: Special” comment (for known applications) without generating incorrect output.
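The decision described above can be sketched as a small header test. This is an illustration under assumptions: the function names are hypothetical, the creator is assumed to be identified by the DSC “%%Creator:” header comment, and the list of known applications is supplied by the caller rather than taken from the source.

```python
def header_comment(header, key):
    """Return the value of the first "%%<key>:" comment in the header,
    or None if the comment is absent."""
    prefix = "%%" + key + ":"
    for line in header.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def may_split(header, known_creators):
    """Decide whether the job may be treated as page independent.

    A "%%PageOrder: Special" comment normally forbids splitting; it is
    ignored only when the creator is on the caller's list of known
    applications.
    """
    if header_comment(header, "PageOrder") != "Special":
        return True
    creator = header_comment(header, "Creator") or ""
    return any(name in creator for name in known_creators)
```

For example, `may_split(header, ["ExampleApp"])` returns True for a job whose header names the placeholder creator “ExampleApp” even though it declares the special page order, and False when the list is empty.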
Accordingly, there is a need for a system or method to identify preselected tokens or idioms which are known to preclude independent handling of selected portions of the print job. The print job then needs to be adjusted to facilitate its splitting with minimal adjustment of the print job itself. The subject invention satisfies these needs and thus overcomes the problems specified above, as well as others.
In a nearly page-independent document print job, such as is typically generated by modern applications and drivers, there is enough information in the header material of the files of the print job to identify the creator. For those creators known to generate incorrect files or files that would be out of page independent conformance due to the inclusion of certain predetermined idioms or tokens, a search is made for those idioms in the files that cause the processing of the files to fail when split into segregated pages. Corrective action is implemented while splitting the files into pages or chunks so that the files may be safely reordered, interpreted and/or printed in parallel, subsetted, or treated in any other way that requires page or chunk independence. Implementation of the subject invention facilitates page parallel RIP (Rasterizing Image Processing), as well as other applications including page reversal before RIP, subset RIP and print, and page parallel print on multiple printers.
The subject invention comprises a unique implementation of parallelism for which we can find no satisfactory defined term, and thus, functioning as our own lexicographer, we will refer to this concept as “chunk” parallelism. Chunk parallelism is an intermediate level of parallelism between job parallelism and page parallelism. A chunk is a collection of rasterized data consisting of at least one page and not more than one job. A chunk may be an integer number of pages less than an entire job but has a rasterizing overhead occurring on a chunk basis as opposed to a per page basis.
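The effect of incurring overhead per chunk rather than per page can be illustrated with a short sketch. The fixed chunk size, the function names and the simple cost model (one fixed overhead per unit handed to a processing node) are assumptions made for this example only.

```python
import math


def make_chunks(pages, pages_per_chunk):
    """Group consecutive pages into chunks of at most pages_per_chunk."""
    return [pages[i:i + pages_per_chunk]
            for i in range(0, len(pages), pages_per_chunk)]


def total_overhead(num_pages, pages_per_chunk, overhead_per_unit):
    """Total start-up overhead when it is paid once per chunk."""
    return math.ceil(num_pages / pages_per_chunk) * overhead_per_unit
```

Under this model a 10-page job with 2 time units of overhead costs 20 units when split page by page (`total_overhead(10, 1, 2)`), 6 units when split into chunks of four pages (`total_overhead(10, 4, 2)`), and 2 units when not split at all, which is why a chunk sits between page parallelism and job parallelism.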
The printing system of the subject invention comprises a printer; a plurality of processing nodes, each processing node being disposed for processing a portion of a print job into a printer dependent format; and a processing manager for splitting the print job into segregated portions for independent processing by the processing nodes into the printer dependent format. The processing manager includes means for identifying selected idioms within the print job known to preclude splitting of the print job into a plurality of the portions for independent processing. The processing manager adds the selected identified idioms or portions of the print job associated with the idioms that manipulate the print job, to the segregated portions during the splitting to enable the successful processing. The idioms are attached to a header of the print job and prefixed to each of the segregated portions.
In accordance with another aspect of the present invention, a method is provided for splitting a nearly-page independent print job into a plurality of job chunks for independent parallel processing by a plurality of processing nodes. The method comprises searching the print job for predetermined idioms known to preclude the successful independent processing of the chunks. Idioms are saved in the header portion of the print job. The job is split into the job chunks and the idioms are added to the job chunks to enable their successful independent processing. The adding preferably comprises prefixing the header to the job chunks.
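The method steps above (search, save to the header, split, prefix) can be sketched as follows. This is a simplified illustration, not the claimed implementation: the function name `build_chunks` is hypothetical, idioms are assumed to be recognizable as substrings of single lines, and the idiom strings themselves must be supplied by the caller because the source does not enumerate them.

```python
def build_chunks(header, pages, idioms, pages_per_chunk):
    """Split pages into chunks that each carry the header plus any
    idiom lines found anywhere in the job."""
    # Step 1: search the pages for lines containing a known idiom.
    saved = []
    for page in pages:
        for line in page.splitlines(keepends=True):
            if any(idiom in line for idiom in idioms) and line not in saved:
                saved.append(line)

    # Step 2: save the idiom lines in the header portion.
    full_header = header + "".join(saved)

    # Steps 3 and 4: split into chunks and prefix the header to each.
    chunks = []
    for i in range(0, len(pages), pages_per_chunk):
        chunks.append(full_header + "".join(pages[i:i + pages_per_chunk]))
    return chunks
```

The page that originally contained an idiom still contains it, so the idiom runs twice in that chunk; the sketch therefore assumes that repeating an idiom is harmless, which would need to be confirmed for each idiom handled.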
A first particular advantage of the subject invention is parallel RIP node processing functionality when the print job is not guaranteed to be page independent.
A second advantage is print job splitting so that the files of the print job may be safely reordered, interpreted and/or printed in parallel, subsetted or treated in any other way that requires page independence. Such splitting particularly enables page parallel RIP as well as page reversal before RIP, subset RIP and print, and page parallel print on multiple printers.
Other advantages and benefits of the present invention will become apparent to those of ordinary skill in the art upon a reading and understanding of the following detailed description of the preferred embodiments.