Document creation and production (e.g. printing) often involves making changes to a document. The changes may result from iteration in the content creation phase, corrections identified after the content creation phase or requirements of the production phase. Inevitably, different versions of a document result. Persons working with different versions of documents desire tools for identifying differences between versions. In some circumstances, there is a desire to merge some content from one version of a document with other content from another version of the document.
Prior art for accomplishing this is well known. Microsoft® Word® 2003 software includes features capable of identifying differences between documents consisting primarily of textual content. For example, two documents, having some common textual content, can be compared to identify components common and unique to each document. Furthermore, one document can be merged with the other, based on information obtained during the comparison.
In the graphic arts field, however, documents often comprise content including combinations of text, photographic images and artwork.
Microsoft® Word® 2003 provides only limited support for comparing and merging non-text elements. For example, a Microsoft® Word® 2003 document containing a combination of text, inserted images and artwork (drawn with the integrated drawing function provided by Microsoft® Word® 2003) can be compared. The comparison does not recognize changes that involve substituting an inserted image file with a file having a different filename corresponding to a modified form of the original image. Similarly, the comparison does not recognize certain changes in the drawn artwork (e.g. changing the dimensions of a drawn rectangle). Other changes in artwork, such as changing the fill color of a drawn rectangle, cause the entire drawing frame to be recognized as different during a comparison.
Other document creation software, such as Adobe® FrameMaker® 7.0, exhibits similar behavior. The user documentation for Adobe® FrameMaker® 7.0 indicates that artwork objects placed in an anchored frame, within the text flow of an Adobe® FrameMaker® 7.0 document, are compared. If the objects are different, or if they are in different positions (for example, if they have a different front-to-back order), the entire anchored frame is marked as changed. Experimentation reveals that some changes to objects, such as resizing, are not recognized during a comparison. Similarly, changes to artwork inserted as an encapsulated PostScript® (.eps) file are not recognized during a comparison.
Document interchange formats can represent documents having mixed content. Some document interchange formats, such as TIFF and CT/LW, normalize content as raster pixels. An advantage of this format is that conversion to a production format is relatively simple, since most display and printing devices are raster-oriented. A disadvantage of this format is that information about the structure of the content is lost during the rendering process that produces raster pixels.
Software tools exist for comparing raster documents. Such tools may compare raster pixels to determine differences. Typically, these differences are displayed visually by highlighting individual pixels in a contrasting color or by highlighting a region surrounding any changed pixels. Merging two raster documents can be accomplished by manually selecting pixels from each document. This is not practical where significant differences occur. Automation is also difficult since there is little context information upon which to determine the document to select for each pixel. An example of a tool that compares raster images is Artwork Systems ArtPro™ 6.5, which provides an “export differences” function that operates to compare two jobs. When calculating the differences, ArtPro™ scans the job in pixels; it does not look at vector information.
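The pixel-level comparison described above can be illustrated with a minimal sketch. It is not the method of any named product; the `Raster` type and the function names `changed_pixels` and `bounding_region` are hypothetical, and each page is assumed to have been rendered already to a grid of integer pixel values of identical dimensions.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a rendered page: rows of integer pixel values.
Raster = List[List[int]]


def changed_pixels(a: Raster, b: Raster) -> List[Tuple[int, int]]:
    """Return (row, col) coordinates at which two equally sized rasters differ."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("rasters must have identical dimensions")
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(a, b))
        for c, (pa, pb) in enumerate(zip(row_a, row_b))
        if pa != pb
    ]


def bounding_region(
    pixels: List[Tuple[int, int]]
) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) enclosing all changed pixels, or None."""
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


if __name__ == "__main__":
    before = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    after = [[0, 0, 0], [0, 2, 0], [0, 0, 3]]
    diff = changed_pixels(before, after)
    print(diff, bounding_region(diff))
```

The result identifies where pixels differ but carries no information about which graphic element produced them, which is the loss of structure noted above.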
Other document interchange formats, such as Adobe® PostScript® and Adobe® Portable Document Format (PDF), represent content as vector elements. A document includes page description language statements that define vector-based graphic elements (e.g. text, images and symbol clipping paths). The language describes elements with attributes identifying their characteristics and their layout on a page. The language also describes the order in which each element is to be displayed on a page. In this context, vector format has advantages and disadvantages opposite those of raster format.
Adobe® Acrobat® provides a document comparison function with three levels of analysis detail. Experiments using PDF files created by printing from modified versions of an Adobe® Illustrator® document suggest that pixel comparison is being performed. For example, comparing with the most detailed level of analysis, Acrobat® can detect a single pixel variation in an imported image. This is highlighted visually as a path surrounding the vicinity of any changed pixels. Similarly, changes made to a PDF file using a PDF editor application (e.g. Enfocus Pitstop™) to increase the size of a path graphic element (e.g. a triangle shape) are detected by Acrobat® and visually highlighted as changes in a small portion of the boundary of the path graphic element. The entire path graphic element is not highlighted as having been changed.
Enfocus Pitstop™ allows a user who is editing graphic elements in a PDF document to identify differences based on session logs that track edits made to graphic elements within that document.
Creo® Seps2Comp™ software examines attributes of graphic elements from multiple pages of a single document. Each page of the document represents a different printing colorant, generated from a composite-color document during the step of creating the document interchange format. Seps2Comp™ examines attributes of graphic elements to infer the composite graphic element based on similarity between attributes of the color-separated graphic elements. Similar elements from separate pages can be composited by combining their colorants and tonalities from separate pages into a single graphic element on a single page. Seps2Comp™ only operates in an automated fashion. In some situations, it can inappropriately declare graphic elements as being similar or different. The algorithms and rules for determining similarity are not ideal and no method for compensating for mistakes exists.
Thus, there is an unfulfilled need for systems and methods for comparing documents containing a variety of types of elements. Printing of packaging materials is one field where the need is acute, and two factors make it more so. First, packaging documents are often produced with variations to suit the needs of different regions or markets. The variations are usually included in the original native document format and may be manifested as separate layers that can be selectively enabled prior to producing the document interchange format for a specific region or market. Thus, a number of different documents may be printed from each original document. The multiple documents can include a significant number of common graphic elements.
Second, during the print production phase, a packaging converter will invest significant time and skill in preparing a document for printing. This can include trap-processing, which adds graphic elements at boundaries between graphic elements to improve the quality of the printed material. It can also include halftone screen assignment, which specifies the nature of the rendered pixels, on a graphic element basis, to improve the quality of the printed material. It can also include editing the graphic elements to make corrections in content, such as fixing spelling mistakes. Other print production processing activities can also occur.
Packaging converters, faced with two or more documents that share significant content, cannot afford to absorb the significant costs associated with duplicating production activities to account for regional variations and last-minute content changes. Furthermore, the process for producing printing plates is time-consuming, and packaging converters require tools for visualizing the differences between documents prior to making plates. Visualizing differences at the graphic element level, instead of the pixel level, is important. In many cases, regional variations or content changes affect only specific plates corresponding to specific colors (usually black and spot colors).
In some scenarios where changes are made to a document, content elements that are unchanged will appear to be changed when compared with the original document. As an example, changes in production content (e.g. marks, scales, job information) may cause the page size and/or position to change. Positions for content elements can be affected by such changes since their position may be specified relative to the page's size and position, which may have changed. However, the printer may only be interested in identifying explicit changes to content and not implicit changes caused by indirect actions.
As another example, content elements produced from a creative application (e.g. encapsulated PostScript) can specify content position relative to a media box defined by the extent of configured elements. During a content revision that only adds one content element, for instance, the resulting media box change may cause a change in the position of otherwise unchanged content.
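One way to separate such implicit position changes from explicit ones is sketched below under strong assumptions. Elements in the two versions are assumed to be matched already by a hypothetical identifier, and a page or media box change is assumed to displace unchanged elements by one common offset. The names `dominant_offset` and `explicitly_moved` and the tolerance value are illustrative only.

```python
from collections import Counter
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def dominant_offset(ref: Dict[str, Point], rev: Dict[str, Point]) -> Point:
    """Return the most common displacement among elements present in both versions."""
    offsets = Counter(
        (round(rev[k][0] - ref[k][0], 3), round(rev[k][1] - ref[k][1], 3))
        for k in ref.keys() & rev.keys()
    )
    if not offsets:
        return (0.0, 0.0)
    return offsets.most_common(1)[0][0]


def explicitly_moved(
    ref: Dict[str, Point], rev: Dict[str, Point], tol: float = 0.01
) -> List[str]:
    """List matched elements whose displacement differs from the dominant offset."""
    dx, dy = dominant_offset(ref, rev)
    return sorted(
        k
        for k in ref.keys() & rev.keys()
        if abs(rev[k][0] - ref[k][0] - dx) > tol
        or abs(rev[k][1] - ref[k][1] - dy) > tol
    )


if __name__ == "__main__":
    reference = {"logo": (10.0, 10.0), "title": (50.0, 20.0), "barcode": (80.0, 90.0)}
    # Adding "mark" is assumed to have shifted the media box by 5 units in x.
    revised = {"logo": (15.0, 10.0), "title": (55.0, 20.0),
               "barcode": (80.0, 95.0), "mark": (0.0, 0.0)}
    print(dominant_offset(reference, revised), explicitly_moved(reference, revised))
```

In this sketch only the barcode is reported as moved; the logo and title share the offset attributed to the media box change and are treated as unchanged.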
Also, a printer may not be interested in having a comparison identify other changes, such as changes that affect content element attributes but without changing the element's appearance. As an example, a change in an element's color space and corresponding color values will result in attribute changes but the different color attributes may represent visually similar colors.
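The distinction between an attribute change and a visible change can be illustrated as follows. This is a sketch only: it uses the naive device CMYK-to-RGB formula rather than color-managed conversion, and the `Color` representation, the function names and the tolerance are assumptions made for illustration. A production comparison would more plausibly convert through color profiles and use a perceptual color difference.

```python
import math
from typing import Sequence, Tuple

# Hypothetical color attribute: (color space name, component values in 0..1).
Color = Tuple[str, Sequence[float]]


def to_rgb(color: Color) -> Tuple[float, float, float]:
    """Convert a color to RGB using the naive device formula (not color-managed)."""
    space, values = color
    if space == "rgb":
        r, g, b = values
        return (r, g, b)
    if space == "cmyk":
        c, m, y, k = values
        return ((1 - c) * (1 - k), (1 - m) * (1 - k), (1 - y) * (1 - k))
    raise ValueError(f"unsupported color space: {space}")


def visually_similar(a: Color, b: Color, tolerance: float = 0.02) -> bool:
    """Treat two colors as similar if their RGB distance is within the tolerance."""
    return math.dist(to_rgb(a), to_rgb(b)) <= tolerance


if __name__ == "__main__":
    gray_rgb = ("rgb", (0.5, 0.5, 0.5))
    gray_cmyk = ("cmyk", (0.0, 0.0, 0.0, 0.5))
    # The attributes differ, but the converted colors coincide.
    print(gray_rgb != gray_cmyk, visually_similar(gray_rgb, gray_cmyk))
```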
In some scenarios, content display order can be changed intentionally or accidentally. Content attributes can then remain unchanged while the visual appearance changes as a result of the change in painting order. For example, an element painting early in the display may be partially obscured by another element painting later. Hence, a change that causes the element to paint later may be worth identifying during a comparison. If the printer's intent is to merge changed content into the reference document, it may be desirable to automatically compensate for changes that only affect an element's display order.
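A change that affects only display order can be distinguished from a change in content with a sketch such as the following. It assumes each element is summarized by a unique, hashable descriptor listed in painting order; the function names `classify` and `reordered` are hypothetical, and the sketch does not test whether the reordered elements actually overlap on the page.

```python
from collections import Counter
from typing import List


def classify(ref: List[str], rev: List[str]) -> str:
    """Classify a revision as 'identical', 'order-only' or 'content'."""
    if ref == rev:
        return "identical"
    if Counter(ref) == Counter(rev):
        return "order-only"  # same elements, different painting order
    return "content"


def reordered(ref: List[str], rev: List[str]) -> List[str]:
    """Return common elements (assumed unique) found at a different relative position."""
    rev_set, ref_set = set(rev), set(ref)
    common_ref = [e for e in ref if e in rev_set]
    common_rev = [e for e in rev if e in ref_set]
    return [a for a, b in zip(common_ref, common_rev) if a != b]


if __name__ == "__main__":
    reference = ["background", "photo", "text"]
    revised = ["background", "text", "photo"]
    print(classify(reference, revised), reordered(reference, revised))
```

An "order-only" result is the case in which a merge could plausibly retain the reference painting order, whereas a "content" result would call for element-level review.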