This section provides background information relating to the present disclosure, which is not necessarily prior art.
In some files that have words or pictures written or printed on both sides of a page, the words or pictures on the back side show through and appear in the background region of the front side. A similar phenomenon occurs on the back side. This phenomenon is referred to as show-through or bleed-through, as shown in FIG. 1A and FIG. 1B. Generally, show-through in a scanned file image is removed by one of two methods: removing the show-through based only on information from a single-side image, or removing the show-through based on information from the two-side images. Compared with the former method, the latter obtains the foreground region of the file image from the information of both sides, and can remove the show-through region more accurately because more information is utilized. In the latter method, it is crucial to extract the foreground region of the file image efficiently.
As shown in FIG. 2A and FIG. 2B, the foreground layer of a file image includes two parts: pictures and words. The foreground layer of the image may be extracted using many conventional local or global binarization methods. For example, image pixels are grouped into foreground pixels and background pixels based on a set color or brightness threshold. However, these local or global binarization methods utilize only information on a single pixel or within a local region. In some picture regions of the file, the color of certain pixels is very close to the color of the background pixels, and hence these pixels may be mistakenly grouped as background pixels, resulting in a loss of foreground pixels. Although some of these methods may be used to locate a text line, pictures in the image have complicated color distributions and irregular shapes, so it is difficult to extract the foreground layer of a picture region efficiently using these methods. Therefore, the present disclosure provides a solution for efficiently extracting the foreground region of the file.
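By way of illustration only, the threshold-based grouping used by conventional global binarization, and the loss of foreground pixels it may cause in picture regions, can be sketched as follows. The function name `binarize_global`, the threshold value of 128, and the pixel values in `patch` are hypothetical and chosen solely for this example; they do not appear in the present disclosure.

```python
def binarize_global(gray, threshold=128):
    """Group pixels by a single brightness threshold (illustrative assumption).

    Pixels darker than the threshold are labeled foreground (1);
    all other pixels are labeled background (0).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical 3x4 grayscale patch (0 = black, 255 = white):
#   240 -> page background
#    30 -> dark text stroke
#   225 -> pale picture pixel whose brightness is close to the background
patch = [
    [240,  30,  30, 240],
    [240, 225, 225, 240],
    [240, 240, 240, 240],
]

mask = binarize_global(patch, threshold=128)

for row in mask:
    print(row)
# prints:
# [0, 1, 1, 0]
# [0, 0, 0, 0]
# [0, 0, 0, 0]
```

In this sketch the dark text pixels are kept as foreground, whereas the two pale picture pixels in the second row are grouped as background because the decision for each pixel depends only on that pixel's own brightness.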