1. Field of the Invention
The present invention relates to a data processing apparatus, a method for controlling a data processing apparatus, and a non-transitory computer readable storage medium that are configured to set a document name for electronic document data generated by reading an original.
2. Description of the Related Art
There has been known a technique of performing character recognition processing (hereinafter referred to as “OCR (Optical Character Recognition)”) on electronic document data generated through scanning in a data processing apparatus, such as a digital multifunction peripheral (MFP) or scanner. Also, there has been generally known a technique of setting a character string extracted by performing OCR as a document name of the document data (see Japanese Patent Laid-Open No. 9-134406 which is discussed below).
Furthermore, there has been known a technique of allowing a user to specify the type of language (for example, Japanese, English, etc., hereinafter referred to as a “language”) before performing OCR, and performing OCR using the specified language. By performing OCR using the specified language, character recognition accuracy can be increased in the OCR.
According to another example of the related art, in a case where electronic document data generated through scanning has been sent to a specified destination, the document name of the sent document data may be displayed on a send history screen, together with items such as the sender and the date and time of sending. Displaying a document name set for document data on a digital MFP in this way is common practice. In the case of displaying characters of a document name or the like on a digital MFP, the characters are normally displayed in the language that has been set using a language setting in an operation unit of the digital MFP.
As described above, a document name that is a character string extracted through OCR and set using the method according to Japanese Patent Laid-Open No. 9-134406 may be displayed on a digital MFP, for example, when the document name of sent document data is displayed on a send history screen. In this case, the character codes assigned to characters recognized through OCR in the specified language may not be assigned in the character encoding scheme of the language that has been set in the language setting in the operation unit of the digital MFP.
For example, assume a case where the language specified before performing OCR is “Japanese” and the language set in the language setting in the operation unit of the digital MFP is “English”. In this case, the codes of characters extracted through OCR in Japanese include codes that are not assigned in the character encoding scheme for English (for example, Windows-1252). Thus, character garbling may occur when the digital MFP, with its language setting set to English, displays the set document name.
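The situation in the example above can be illustrated with a minimal Python sketch. The function name `undisplayable_chars` and the sample document name are hypothetical and do not come from the cited reference; the sketch only assumes that the display side is limited to a single encoding scheme such as Windows-1252 (`cp1252` in Python).

```python
def undisplayable_chars(document_name, display_encoding):
    """Return the characters of document_name that have no code assigned
    in display_encoding (hypothetical helper for illustration only)."""
    missing = []
    for ch in document_name:
        try:
            ch.encode(display_encoding)
        except UnicodeEncodeError:
            # No code is assigned to this character in the display scheme.
            missing.append(ch)
    return missing


# Hypothetical document name produced by OCR with "Japanese" specified.
ocr_name = "請求書 2013"

# Display language "English": Windows-1252 has no codes for the kanji.
print(undisplayable_chars(ocr_name, "cp1252"))

# Display language "Japanese": Shift_JIS can represent every character.
print(undisplayable_chars(ocr_name, "shift_jis"))
```

A non-empty result for the display encoding indicates that the document name cannot be shown as recognized and that some alternative handling would be needed.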
Character garbling also occurs when the code of a character recognized in a specified language is assigned to a different character in the character encoding scheme of the language that is set in the language setting in the operation unit of the digital MFP.
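This second kind of garbling can be sketched as follows, assuming for illustration that the recognized characters are stored as Shift_JIS codes and that the display side interprets the same codes as Windows-1252. The function name `display_with_wrong_scheme` is hypothetical.

```python
def display_with_wrong_scheme(text, source_encoding, display_encoding):
    """Encode text with source_encoding, then interpret the resulting codes
    with display_encoding, as a display with a different language setting
    might do (hypothetical helper for illustration only)."""
    raw_codes = text.encode(source_encoding)
    # errors="replace" substitutes U+FFFD for codes undefined in the display scheme.
    return raw_codes.decode(display_encoding, errors="replace")


name = "日本"
garbled = display_with_wrong_scheme(name, "shift_jis", "cp1252")

# Each Shift_JIS byte is assigned to some other character in Windows-1252,
# so the two kanji appear as four unrelated Western characters.
print(garbled)
```

Characters in the ASCII range are assigned the same codes in both schemes in this sketch, so a document name consisting only of such characters would pass through unchanged.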
An aspect of the present invention is to provide, in view of the above-described problems, a feature for performing appropriate processing when the code of a character recognized in character recognition processing is not assigned to the character encoding scheme of the language that is set in a language setting in an operation unit.