Recently, advancements in hardware and software have stimulated the release of electronic publications as a new form of publication replacing existing paper media. Electronic publications are publications whose contents have been turned into electronic data, and are stored in a storage medium or storage device. Electronic publications can also incorporate so-called multimedia-type data, including, for example, voice, still pictures, moving pictures, and animation. Currently, the contents, or data, of electronic publications are for the most part in a text-based format, and therefore are made up primarily of text data using character codes.
Currently, about 500,000 works are released each year as paper publications using a paper medium, that is, in the format of so-called “books.” The total number of published paper publications is enormous. Of the works released as paper publications, only an extremely small number are released as electronic publications as well, and most works are released only in the form of a paper publication.
Conventionally, when creating electronic publications of works already released as paper publications, data showing the text printed on each page of the paper publication was created either manually or using OCR (optical character recognition). Thus, creating the contents of an electronic publication required a large amount of time, so the timely release of large volumes of electronic publications to the market was difficult. Also, it is difficult to turn paper publications like comic books and photo-journals into data contents, as most of each such paper publication is made up of objects other than text, such as illustrations. In light of these circumstances, the number of electronic publications published has conventionally been about several hundred titles, which is fewer than the number of paper publications published. Furthermore, conventionally published electronic publications have tended to be reference materials. At the moment, the circulation of electronic publications does not reach even 1% of the circulation of paper publications. In particular, the lack of diversity in contents has become a significant obstacle to the circulation of electronic publications.
To solve the above-mentioned problems, it is conceivable to put the contents of an electronic publication into an image-based format. Image-based contents are made from the data of images of the contents of a work. To create image data for the image-based contents of existing paper publications, it is sufficient to read in each page of the existing paper publication with a scanner. Thus, a large number of electronic publications can be supplied to the market in a short period of time. When the contents are put into an image-based format, it is possible to release those titles that were difficult to process as text-based contents, such as comic books and photo-journals, to the market as electronic publications. When the contents are put into an image-based format, text that includes characters that do not match current character code systems, such as text using foreign characters or variant Chinese characters, or old manuscripts, for example, can be easily turned into electronic contents. When the contents are put into an image-based format, the overseas expansion and circulation of viewers and authoring systems for electronic publications is easy, because the electronic contents do not depend on language or character code. Thus, image-based contents solve all the problems associated with text-based contents.
A variety of processes are performed when creating image-based contents from paper publications, including reading in each page of a paper publication with a scanner equipped with an ADF (auto document feeder), and processing the image data obtained by the scanner into a manuscript structure. An outer border occurs at the edge of the page in the image obtained by reading in the page. The outer border within the image stands out, and gives an unpleasant feeling to the reader. When the image is displayed in a viewer equipped with a CRT (cathode ray tube) or a liquid crystal display device, the edge portion of the CRT or the liquid crystal display device forms a reference line when viewed, so the outer border within the image gives further discomfort to the reader. For these reasons, when creating the image-based contents of a paper publication, corrections must be performed to eliminate the outer border from the image of each page. Manually performing the corrections for erasing outer borders requires a significant amount of work, and thus increases the time required for creating electronic contents.
Japanese Unexamined Patent Publication JP-A 5-199398 (1993) discloses an image processing device for erasing outer borders when printing using a storage device for the negative-positive reversal of microfilm with negative images. The image processing device scans portions of the microfilm with negative images, and based on the image signal obtained from the results of that scan, the border between the negative image and the portion around the negative image is detected, and the portion within the image signal outside the detected border is converted to a predetermined value.
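The single-image processing described above can be illustrated with a minimal sketch. It is not the method disclosed in JP-A 5-199398; it assumes, purely for illustration, that the image is a grid of grayscale values, that the border is found by simple thresholding, and that the detected border is the smallest rectangle enclosing every pixel at or above the threshold. The names detect_border, erase_outside_border, threshold, and fill_value are hypothetical.

```python
def detect_border(image, threshold):
    """Return (top, bottom, left, right) of the smallest rectangle that
    holds every pixel with value >= threshold, or None if there is none.

    Assumption for illustration: thresholding stands in for whatever
    border detection the actual device performs on its image signal.
    """
    width = len(image[0]) if image else 0
    rows = [y for y, row in enumerate(image)
            if any(v >= threshold for v in row)]
    cols = [x for x in range(width)
            if any(row[x] >= threshold for row in image)]
    if not rows or not cols:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def erase_outside_border(image, threshold, fill_value):
    """Return a copy of image in which every pixel outside the detected
    border is replaced by the predetermined value fill_value."""
    box = detect_border(image, threshold)
    if box is None:
        # No image area detected: the whole frame becomes fill_value.
        return [[fill_value] * len(row) for row in image]
    top, bottom, left, right = box
    return [
        [v if top <= y <= bottom and left <= x <= right else fill_value
         for x, v in enumerate(row)]
        for y, row in enumerate(image)
    ]


# Hypothetical 4x5 scan: a bright 2x2 image area surrounded by a dim margin.
scan = [
    [10, 10, 10, 10, 10],
    [10, 200, 180, 10, 10],
    [10, 190, 50, 10, 10],
    [10, 10, 10, 10, 10],
]
cleaned = erase_outside_border(scan, threshold=128, fill_value=0)
```

In this sketch each image is handled on its own, so the detected rectangle depends only on the contents of that one image.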
The image processing device of JP-A 5-199398 presupposes that images will be processed one at a time. To create electronic contents, the borders of a large number of images must be erased, so if the process for erasing the border of each image were to be performed individually, there would be a large increase in the time required for erasing the borders of all the images. Thus, a border eliminating process using the image processing device disclosed in JP-A 5-199398 is not suited to eliminating borders when creating electronic contents.
The large number of images that make up the electronic contents, once processed into data, have a regular arrangement of characters and illustrations. If the borders of these images are erased individually, however, the part of each image from which the border is erased shifts from image to image. Therefore, after the borders of the images that make up the electronic contents have been erased individually, the images become unpleasant to look at when they are viewed in succession. For these reasons, when creating electronic contents from paper publications, it is difficult to individually erase the borders of a large number of images.
An object of the present invention is to provide a border eliminating device and method, wherein unnecessary outer borders can be accurately removed from the images of a plurality of pages of a paper publication, and an authoring device which uses this border eliminating device.