1. Technical Field
The present invention relates to an electronic document generation system for generating an electronic document and a technique related thereto.
2. Background Art
Techniques for scanning an original document to generate an electronic document are known for use in an image forming apparatus such as a Multi-Functional Peripheral (MFP).
Such techniques include, in addition to a technique for generating an electronic document directly from a scanned image of an original document, a technique for generating an electronic document with text data (see JP 2012-73749A, for example). More specifically, a scanned image (in particular, an image representing characters) of an original document is subjected to optical character recognition processing (hereinafter, also referred to as “OCR processing”) so that text data of the characters in the scanned image is automatically recognized, and the recognized text data is overlaid on and embedded in the scanned image without being displayed. This produces an electronic document with text data in a predetermined format, for example, a PDF (Portable Document Format) document with invisible text (also known as a searchable PDF document).
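As an illustration of how text can be embedded without being displayed, the following minimal sketch builds the page content operators for an invisible text run in a PDF. It relies on text rendering mode 3 of the PDF format, in which text is neither filled nor stroked. The function names, the font resource name `F1`, and the coordinates are assumptions made for this example; the source does not describe how the embedding is implemented, and the sketch handles only simple ASCII strings.

```python
def escape_pdf_string(text: str) -> str:
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_text_ops(text: str, x: float, y: float, font_size: float,
                       font_name: str = "F1") -> str:
    """Return content-stream operators that place `text` invisibly at (x, y).

    `font_name` is a hypothetical resource name; it must match a font
    declared in the page's resource dictionary for the PDF to be valid.
    """
    return (
        "BT\n"                                   # begin text object
        f"/{font_name} {font_size:g} Tf\n"       # select font and size
        "3 Tr\n"                                 # rendering mode 3: invisible
        f"{x:g} {y:g} Td\n"                      # move to the text position
        f"({escape_pdf_string(text)}) Tj\n"      # show (but do not paint) the text
        "ET"                                     # end text object
    )


ops = invisible_text_ops("Hello (OCR)", 72, 700, 12)
print(ops)
```

Because the text is positioned and sized by the `Td` and `Tf` operators, the generator needs both a position and a character size for each recognized string in order for the hidden text to line up with the character image.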
Separately, techniques are known for using a cloud server to provide services related to various types of application software. One example of such cloud services (application services) is a general-purpose OCR processing service. However, general-purpose OCR processing services provide only a fundamental function (OCR processing alone) and do not generate an electronic document with text data. The final processing for generating an electronic document with text data therefore needs to be performed on the client device side.
In the case of using a general-purpose OCR processing service, for example, a scanned image is first transmitted from a client device (specifically, an application that is being executed by the client device) to a cloud server (specifically, another application that is being executed by the cloud server). The cloud server then executes OCR processing on the entire scanned image and returns the result of the processing to the client device. The client device embeds the OCR processing result received from the cloud server into the original scanned image to generate an electronic document with text data (e.g., a searchable PDF document, that is, a PDF document with invisible text). Using such a general-purpose OCR processing service allows the OCR processing to be performed by a device (the cloud server) other than the client device (e.g., an image forming apparatus or an apparatus for generating a scanned image) that requested the OCR processing, which reduces the processing load on the client device.
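The division of work described above can be sketched as follows. This is only an illustrative model: `SearchableDocument`, `generate_document`, and `fake_cloud_ocr` are hypothetical names, and the cloud service is replaced by a local stub that makes no network call, since the source does not specify any service interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SearchableDocument:
    """Hypothetical stand-in for an electronic document with text data."""
    image: bytes             # the original scanned image (visible layer)
    hidden_text: List[str]   # recognized strings, embedded but not displayed


def generate_document(scanned_image: bytes,
                      ocr_service: Callable[[bytes], List[str]]) -> SearchableDocument:
    """Client-side flow: request OCR remotely, then embed the result locally."""
    # Steps 1 and 2: the client sends the entire scanned image to the service,
    # which returns only the recognized character strings.
    recognized = ocr_service(scanned_image)
    # Step 3: the client performs the final processing itself, combining the
    # OCR result with the original scanned image.
    return SearchableDocument(image=scanned_image, hidden_text=recognized)


def fake_cloud_ocr(image: bytes) -> List[str]:
    """Stub standing in for the general-purpose OCR service (assumption)."""
    return ["Quarterly report", "Total: 42"]


doc = generate_document(b"fake-scan-bytes", fake_cloud_ocr)
print(doc.hidden_text)
```

Passing the service in as a callable keeps the client code independent of any particular service, which mirrors the situation in which the client cannot change what the service returns.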
However, when the client device instructs another device (the cloud server) to perform OCR processing on the scanned image and uses the OCR processing result to generate an electronic document with text data as described above, the text data may be arranged at a position shifted from the corresponding character image in the scanned image. For example, when a character string of the OCR processing result (text data) and the corresponding character string of the scanned image (a character string serving as a character image) have different character sizes and are arranged in the same page, the character string of the OCR processing result ends up at a markedly different position, in the arrangement direction, from the character string serving as the character image. More specifically, although the two character strings may start at the same position, the positional shift between them accumulates along the arrangement direction and becomes most evident at the ends of the character strings.
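A small worked example shows why the shift is most visible at the end of a string. The sketch below assumes, purely for illustration, that every character in a string has the same advance width; the function name and the numeric values are hypothetical and not taken from the source.

```python
from typing import List


def cumulative_shift(num_chars: int, image_char_width: float,
                     text_char_width: float) -> List[float]:
    """Offset between text data and character image after each character.

    Both strings are assumed to start at the same position and to use a
    uniform advance width per character (a simplifying assumption).
    """
    per_char = text_char_width - image_char_width
    # Entry i is the offset after i characters; entry 0 is the shared start.
    return [i * per_char for i in range(num_chars + 1)]


# Character image: 5 pt per character. Embedded text data: 6 pt per character.
shifts = cumulative_shift(num_chars=40, image_char_width=5.0, text_char_width=6.0)
print(shifts[0], shifts[1], shifts[-1])
```

With these values the two strings coincide at the start, differ by 1 pt after the first character, and differ by 40 pt after the fortieth, so a per-character size mismatch that is barely noticeable at the start of a line grows in proportion to the length of the string.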
This problem occurs because the cloud server returns only the OCR processing result (the result of character string recognition) to the client device and does not return the sizes of the recognized characters. The problem can become particularly evident when the application on the cloud server side outputs its processing results in a substantially fixed form (e.g., when an electronic document generation application on the client side cannot arbitrarily specify that output form).