This disclosure relates to a document file generating device and a document file generation method that generate a document file, for example in PDF (Portable Document Format), from character image data of a manuscript obtained by character recognition.
PDF (Portable Document Format) is one standard for electronic documents. The font information used in a document can be embedded in the PDF-format file (hereinafter, a PDF file) itself. Therefore, a PDF file with embedded fonts can be rendered (displayed or printed) as its creator intended, using the fonts embedded in the document, even in an environment other than the one in which it was created.
In PDF, high-compression technology is used to keep the file size down when a document is digitized. In this technology, each object contained in an image, such as a character or a figure, is separated into its own image layer, and image processing and image data compression are performed according to the object contained in each image layer. High image quality and high compression are thereby achieved at the same time.
Vectorization of a character image is one technique for rendering character images sharply in PDF.
However, if a character image is vectorized, the drawing instructions for the vectorized font data must be described in the PDF file, and the file becomes larger than it would be with text data.
To address this problem, one option is to apply the document creation method disclosed in, for example, Patent Document 1. Specifically, Patent Document 1 proposes a document creation method in which: font names and point-size thresholds, which serve as the criteria for whether to embed a font in a document file, are entered in a table; the font names and point sizes used in the source data to be processed are acquired from that data; if a font name used in the source data is registered in the table, the point size in the source data is compared with the point-size threshold in the table; and if any character is larger than the threshold, the font is determined to be embedded.
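The table-based decision described for Patent Document 1 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the implementation of Patent Document 1: the names `TextRun`, `fonts_to_embed`, and `threshold_table` are hypothetical, the table is assumed to map each font name to a single point-size threshold, and the comparison is assumed to be strictly "larger than", following the wording above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRun:
    """Hypothetical record: one run of text in the source data."""
    font_name: str
    point_size: float


def fonts_to_embed(runs, threshold_table):
    """Return the set of font names that should be embedded.

    threshold_table is assumed to map font name -> point-size threshold.
    A font is selected only if it is registered in the table and at least
    one run using it is strictly larger than its threshold.
    """
    selected = set()
    for run in runs:
        threshold = threshold_table.get(run.font_name)
        if threshold is None:
            # Font name is not registered in the table: no decision is made.
            continue
        if run.point_size > threshold:
            selected.add(run.font_name)
    return selected


if __name__ == "__main__":
    # Illustrative values only; they do not come from the source document.
    table = {"Mincho": 12.0, "Gothic": 18.0}
    runs = [
        TextRun("Mincho", 10.5),
        TextRun("Mincho", 14.0),   # exceeds the Mincho threshold
        TextRun("Gothic", 18.0),   # equal to the threshold, not larger
        TextRun("Courier", 48.0),  # not registered in the table
    ]
    print(fonts_to_embed(runs, table))
```

In this sketch a font that is absent from the table is never selected, whatever its size, which is one possible reading of the description; the source does not state how unregistered fonts are handled.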