1. Field of the Invention
The present invention relates to an information processing apparatus, a PDL data conversion method, and a storage medium.
2. Description of the Related Art
In recent years, the Print On Demand (hereinafter abbreviated as “POD”) market has been expanding along with an increase in print speed and image quality of electrophotographic and inkjet digital printers. In general, POD is a service for printing electronic data with use of a digital printer, and POD enables performing relatively small-lot print jobs in a shorter turnaround time than when conventional offset printing or the like is used.
With POD, a printing method known as variable data printing (hereinafter abbreviated as “VDP”) is performed, which takes advantage of the characteristic that electronic data is used. A collection of logical information such as page layout, data source, or the like required for one VDP is referred to as a “VDP document”. A VDP document is divided into a fixed portion and a variable portion. An information processing apparatus that performs print processing for a VDP document acquires data for the variable portion from a data source such as an RDB (Relational Database) or a CSV (Comma Separated Values) file. The information processing apparatus associates a column (field) in the data source with a variable portion in a template document, and applies data in that column of the data source row-by-row (record-by-record), thus enabling printing slightly different content each time.
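The association between columns of the data source and variable portions of the template can be illustrated by the following minimal sketch. The template text, the field names `name` and `product`, and the CSV content are hypothetical and are given only for illustration; an actual VDP data creation system is not limited to this form.

```python
import csv
import io
from string import Template

# Hypothetical template document: fixed text plus two variable portions.
TEMPLATE = Template("Dear $name, see our new offer on $product.")

# Hypothetical CSV data source; the header row names the columns (fields).
CSV_DATA = "name,product\nAlice,cameras\nBob,printers\n"


def merge_records(template, csv_text):
    """Apply the data source to the template row-by-row (record-by-record)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row is a mapping from column name to value, so each variable
    # portion is filled from the column associated with it by name.
    return [template.substitute(row) for row in reader]


pages = merge_records(TEMPLATE, CSV_DATA)
for page in pages:
    print(page)
```

Each record yields one output whose fixed portion is identical and whose variable portion differs.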
The application of VDP enables creating, for example, direct mail in which product information to be provided is changed in accordance with customer information. Changing the product information in accordance with customer information in this way can provide a higher advertising effect than normal printing.
Here, the physical electronic data of a VDP document is referred to as “VDP data”. The application or the system for creating VDP data is referred to as a “VDP data creation system”. The application or the system for performing interpretation processing for VDP data and outputting VDP data using a digital printing machine is referred to as a “VDP data processing system”.
As VDP data, electronic data described in an arbitrary page description language (hereinafter abbreviated as “PDL”) may be employed. However, VDP data described in a PDL designed exclusively for VDP (hereinafter referred to as a “VDP language”) is advantageous in terms of processing efficiency. This is because the VDP language allows an object of the fixed portion of a VDP document (hereinafter referred to as a “fixed object”) to be defined in advance and referred to later. When VDP data described in the VDP language is subjected to print processing using a VDP data processing system, the VDP data processing system holds the result of interpretation processing for each fixed object, and copies the held result each time the fixed object is referred to. This increases the speed of processing for the entire VDP data.
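The reuse of the result of interpretation processing can be sketched as follows. The class name `FixedObjectCache`, its methods, and the function `toy_interpret` are hypothetical names introduced for illustration only; `toy_interpret` merely stands in for actual interpretation processing, which is far more costly.

```python
import copy


class FixedObjectCache:
    """Holds the interpretation result of each fixed object (hypothetical)."""

    def __init__(self, interpret):
        self._interpret = interpret
        self._results = {}
        self.interpret_calls = 0  # counts how often interpretation ran

    def define(self, name, source):
        # Interpretation processing is performed once, at definition time.
        self.interpret_calls += 1
        self._results[name] = self._interpret(source)

    def refer(self, name):
        # Each reference copies the held result instead of re-interpreting.
        return copy.deepcopy(self._results[name])


def toy_interpret(source):
    # Stand-in for interpretation processing of a fixed object.
    return [ch.upper() for ch in source]


cache = FixedObjectCache(toy_interpret)
cache.define("logo", "abc")

# Three records refer to the same fixed object and add their own variable data.
pages = [cache.refer("logo") + [variable] for variable in ("x", "y", "z")]
```

In this sketch the fixed object is interpreted once even though three pages refer to it.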
Among the VDP languages, PPML (Personalized Print Markup Language) can express the structure of a document using a hierarchical structure. The structure of a document indicates a semantic unit of pages in a document. For example, the structure of a document has a semantic unit such as one record in VDP, the front cover and the text therein, the chapter configuration in the text, or the like.
The expression of the structure of a document by means of PDL is effective for a user of an information processing apparatus for performing image formation processing to make print settings. In general, JDF (Job Definition Format) is often used for making print settings.
Here, assume the case where a user controls printing using PDL by which the structure of a document cannot be expressed. When JDF is used for making print settings, a user makes print settings for each page or a group of pages. For example, when a user wishes to make print settings for “chapter 2” in a document consisting of a plurality of chapters, the user must make print settings for “chapter 2” with knowledge of pages in “chapter 2”.
On the other hand, assume the case where a user controls printing using PDL by which the structure of a document can be expressed. When JDF is used for making print settings, a user can make print settings for a semantic unit by utilizing the structure of a document. For example, for a document consisting of a plurality of chapters, a user can make print settings for printing only the pages to which a meaning of “chapter 2” has been given, or print settings for using a high quality paper sheet only for a page to which a meaning of “front cover” has been given and using plain paper sheets for the other pages.
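The difference from page-based settings can be sketched as follows. The page labels, the media names, and the function names are hypothetical, and the sketch does not use actual JDF syntax; it only shows that settings keyed by a meaning need no knowledge of page numbers.

```python
# Hypothetical document: each page carries the meaning of its semantic unit.
PAGE_LABELS = {
    1: "front cover",
    2: "chapter 1",
    3: "chapter 1",
    4: "chapter 2",
    5: "chapter 2",
}


def media_for_pages(page_labels, media_by_label, default_media):
    """Resolve the paper for every page from settings keyed by meaning."""
    return {
        page: media_by_label.get(label, default_media)
        for page, label in sorted(page_labels.items())
    }


def pages_with_label(page_labels, label):
    """Return the pages to which the given meaning has been given."""
    return sorted(page for page, lbl in page_labels.items() if lbl == label)


# The settings name only meanings, never page numbers.
media = media_for_pages(PAGE_LABELS, {"front cover": "high quality"}, "plain")
chapter2 = pages_with_label(PAGE_LABELS, "chapter 2")
```

If the number of pages in a chapter changes, the same settings remain valid because they refer to the meaning rather than to a page range.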
Even when reprinting is performed with different print settings, creating a new JDF with this print setting method is not troublesome. In other words, a user who makes print settings does not need to examine one by one which page belongs to which chapter, which makes this method very convenient. Specifically, print settings using PDL by which the structure of a document can be expressed can be made at a higher conceptual level than print settings using PDL by which the structure of a document cannot be expressed, which is convenient for a user.
In the work flow of POD, PDF (Portable Document Format) is generally used as print data. Thus, PDF is also often used for VDP. Also, there is a format called “PDF/X” that facilitates data exchange and printing by imposing various limitations on PDF. PDF/X is also widely used as print data. However, since PDF and PDF/X are not VDP languages, it has been impossible with them to perform high-speed print processing and to make print settings at a higher conceptual level.
Accordingly, in 2010, the International Organization for Standardization (hereinafter abbreviated as “ISO”) developed PDF/VT, which is a PDL based on PDF/X to which VDP functions have been added.
In PDF/VT, the processing performed by a print processing system can be made more efficient by referring to the definition of a rendering object in a document multiple times. PDF/VT has a page object structuring function and a metadata setting function. The page object structuring function is realized by a DPart (Document Part) hierarchical structure and the metadata setting function is realized by DPM (DPart Metadata). Any pair of a key and a value can be set in DPM. A meaning can be given to a DPart by setting a pair of a key and a value in DPM. As a result, a user can make print settings for each DPart, instead of for each page, using the metadata set in DPM as a condition.
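The relationship between the DPart hierarchical structure and DPM can be sketched with a simplified in-memory model as follows. The class `DPart` below, the DPM keys `role` and `number`, and the function names are hypothetical; the sketch does not reproduce the actual PDF/VT object syntax and only assumes that pages are attached to DParts having no children.

```python
from dataclasses import dataclass, field


@dataclass
class DPart:
    """Simplified model of a DPart node (not actual PDF/VT syntax)."""
    metadata: dict = field(default_factory=dict)   # stands in for DPM
    pages: list = field(default_factory=list)      # pages of a leaf node
    children: list = field(default_factory=list)   # lower-level DParts


def all_pages(node):
    """Collect the pages of a node and of every node below it."""
    pages = list(node.pages)
    for child in node.children:
        pages.extend(all_pages(child))
    return pages


def select_pages(node, key, value):
    """Return pages of every DPart whose metadata has the given key and value."""
    if node.metadata.get(key) == value:
        return all_pages(node)
    found = []
    for child in node.children:
        found.extend(select_pages(child, key, value))
    return found


document = DPart(children=[
    DPart(metadata={"role": "front cover"}, pages=[1]),
    DPart(metadata={"role": "chapter", "number": 1}, pages=[2, 3, 4]),
    DPart(metadata={"role": "chapter", "number": 2}, children=[
        DPart(metadata={"role": "section"}, pages=[5, 6]),
        DPart(metadata={"role": "section"}, pages=[7]),
    ]),
])

chapter2_pages = select_pages(document, "number", 2)
cover_pages = select_pages(document, "role", "front cover")
```

A print setting conditioned on a key and a value thus applies to all pages under the matching DPart without the pages being listed individually.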
Prior art for giving a meaning to print data or a paper document using the existing PDF functions has been proposed. Japanese Patent Laid-Open No. 2004-289357 discloses a method for setting additional information to be added to each component such as images, graphics, text, and the like constituting print data upon conversion of print data into PDF. Also, Japanese Patent Laid-Open No. 2010-109420 discloses an image forming apparatus that sets a chapter dividing mark on the read original document image as desired by a user, and stores an electronic document of the original document to which link information has been added at the chapter dividing mark.
However, for print data or a paper document, the prior art for giving a meaning to a document using the existing PDF function only marks a certain position in a document or gives a meaning to an object using the existing PDF function such as “bookmark”, “annotation”, or the like. The PDF function “bookmark” does not indicate a range of pages in a PDF document but indicates an arbitrary position in a PDF document. Also, the PDF function “annotation” is to mainly add a comment or the like to text, and is not intended to structure pages. In other words, the prior art cannot structure pages, and thus, a user cannot make print settings using information to which a meaning has been given by the prior art.
In order to make print settings more flexibly while using a print work flow based on the already widely used PDF or PDF/X, using PDF/VT as print data is the most effective solution. However, flexible print settings cannot be made for data that has already been created in PDF.
In the actual POD work site, PDF data has been widely used, either by printing the PDF data itself or by printing the PDF data using JDF corresponding to the PDF data. Thus, it is contemplated that PDF data having a bookmark, or PDF data and JDF associated therewith, is converted into PDF/VT. In PDF/VT obtained by the conversion, the page structure (logical structure) of the PDL data to be input must be properly expressed by the DPart hierarchical structure of PDF/VT to be output.
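One difficulty of such a conversion is that a bookmark indicates a position rather than a range of pages. The following minimal sketch derives page ranges from bookmark positions under the assumption, made here only for illustration, that each unit extends up to the page preceding the next bookmark. The function name `bookmarks_to_parts` and the data layout are hypothetical, and the sketch is not presented as an established conversion method.

```python
def bookmarks_to_parts(bookmarks, page_count):
    """Turn (title, start_page) bookmarks into units having page ranges.

    Assumption for illustration: a unit runs from its bookmark to the page
    before the next bookmark, and the last unit runs to the final page.
    """
    ordered = sorted(bookmarks, key=lambda bookmark: bookmark[1])
    parts = []
    for index, (title, start) in enumerate(ordered):
        if index + 1 < len(ordered):
            end = ordered[index + 1][1] - 1
        else:
            end = page_count
        parts.append({"title": title, "pages": list(range(start, end + 1))})
    return parts


parts = bookmarks_to_parts(
    [("Chapter 1", 2), ("Front cover", 1), ("Chapter 2", 5)],
    page_count=7,
)
```

Each resulting unit could then serve as a candidate for one node of the output hierarchical structure.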
However, there has conventionally not been proposed an information processing apparatus that analyzes the logical structure of input PDL data and outputs the input PDL data by converting the input PDL data into PDL data having a hierarchical structure, which properly represents the analyzed logical structure, based on the result of analysis.