Whenever information is electronically encoded as original, or clean, data, and then transferred from a data source to a data destination, noise may be introduced by the transfer process, resulting in alteration of the original, clean data and reception by the data destination of noisy data. For example, when information is electronically encoded as a sequence of binary bits and sent through a communications network, such as a local Ethernet, to a destination node, there is a small probability that any given bit within the original, or clean, sequence of binary bits is corrupted during transfer through the Ethernet, resulting in a “0” bit in the clean data being altered to a “1” bit in the noisy data received at the destination node, or a “1” bit in the clean data being altered to a “0” bit in the noisy data received at the destination node. Although electronic communications media are classic examples of noisy channels, almost any type of data transfer or storage may result in data corruption, and may be modeled as a noisy channel. For example, there is a small probability associated with each bit of a block of binary data that the bit will be altered when the block of data is stored on, and then retrieved from, a hard disk, or even when the block of data is transferred from local cache memory to global random-access memory within a computer system. In general, redundant data, including checksums and cyclic redundancy codes, are embedded into data encodings to allow noise-corrupted data to be detected and repaired. However, the amount of redundant data needed, along with the accompanying costs and inefficiencies, grows as the acceptable level of undetected or unrepaired data corruption decreases.
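The bit-flip channel and the use of redundant data described above can be illustrated with a minimal sketch. It assumes a binary symmetric channel, in which each bit is flipped independently with a fixed probability, and it uses a CRC-32 checksum as the redundant data. The function names (`transmit`, `encode`, `check`) and the framing layout are hypothetical choices made for illustration and are not taken from the text. Note that a CRC of this kind only detects corruption; repairing it would require an error-correcting code.

```python
import random
import zlib


def bytes_to_bits(data):
    """Unpack bytes into a list of 0/1 integers, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits):
    """Pack a list of 0/1 integers (length a multiple of 8) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def transmit(bits, flip_prob, rng):
    """Assumed channel model: flip each bit independently with probability flip_prob."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in bits]


def encode(payload):
    """Append a 4-byte CRC-32 of the payload as redundant data (hypothetical framing)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check(frame):
    """Return True if the trailing CRC-32 matches the payload it follows."""
    payload, received_crc = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == received_crc


if __name__ == "__main__":
    rng = random.Random(0)
    clean = encode(b"clean data")
    noisy = bits_to_bytes(transmit(bytes_to_bits(clean), 0.01, rng))
    print("corruption detected:", not check(noisy))
```

In this sketch the four CRC bytes are the cost of detection: a longer or stronger code would lower the chance of undetected corruption at the price of more redundant data per frame.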
In many cases, data corruption may occur prior to a point in a process at which redundant information can be embedded into a data signal to facilitate error detection and correction. As one example, a scanner that optically scans a printed document to produce a digital, electronic encoding of an image of the document can be viewed as a noisy channel in which discrepancies between the digitally encoded image of the document and the original document may arise. Such discrepancies may be introduced by a variety of optical and electronic components within the scanner that focus an optical image of the document onto a light-detecting component that transforms the detected optical image into an electronically encoded image of the document. When the digitally encoded image of the document is displayed or printed, different types of noise may be perceived as graininess, irregularities along the edges of text characters or objects within graphical images, uneven shading or coloration, random speckling, or other such visually distinguishable differences between the printed or displayed version of the digitally encoded data and the original document.
Denoising techniques can be applied to a noisy, digitally encoded image in order to produce a denoised, digitally encoded image that more accurately represents the original document that was scanned to produce the noisy, digitally encoded image. Recently, a discrete universal denoiser method (“DUDE”) has been developed for denoising the noisy output signal of a discrete, memoryless data-transmission channel without relying on knowledge of, or assumptions concerning, the statistical properties of the original, or clean, signal input to the discrete, memoryless channel. Even more recently, the DUDE method has been extended for denoising continuous-tone images, such as scanned documents or images. The extended DUDE method is referred to as the “DUDE-CTI method,” or simply as the “DUDE-CTI.” The DUDE-CTI is intended for use in a variety of image and data scanning, processing, and transfer applications.
One problem encountered in scanning printed documents is that, for a variety of reasons, information printed on the opposite side of a page from the side being scanned may physically or optically bleed through the document substrate and interfere with information printed on the side being scanned. In such cases, the resulting digital representation of the scanned side of the document may include not only noise introduced by the scanning process, such as Gaussian-like and salt-and-pepper noise, but also a generally attenuated, mirrored image of the information printed on the opposite side of the document, or, in other words, bleed-through noise. This type of distortion, when optically induced, is referred to as “show-through” noise. The term “bleed-through” is used to refer collectively to both physically and optically induced partial appearance of the reverse side of the document in the scanned image.
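The composition of a scanned image described above can be sketched as a simple simulation. The model below is an assumption made for illustration only: pixel values are 8-bit grayscale with 255 as blank paper, the reverse side is mirrored left to right, its ink is scaled by an attenuation factor `alpha` and added to the ink of the front side, and Gaussian-like and salt-and-pepper noise are then applied. The function names, parameters, and the additive ink model are hypothetical, and real bleed-through may be nonlinear.

```python
import random

WHITE = 255  # assumed 8-bit grayscale value for blank paper


def mirror(image):
    """Flip an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def simulate_scan(front, back, alpha=0.2, sigma=0.0, sp_prob=0.0, rng=None):
    """Simulate scanning `front` with bleed-through from `back`.

    alpha   -- assumed attenuation applied to the mirrored reverse side
    sigma   -- standard deviation of additive Gaussian-like noise
    sp_prob -- probability that a pixel is replaced by salt (255) or pepper (0)
    """
    rng = rng or random.Random()
    scanned = []
    for front_row, back_row in zip(front, mirror(back)):
        row = []
        for f, b in zip(front_row, back_row):
            # Additive model in the "ink" domain: front ink plus attenuated back ink.
            ink = (WHITE - f) + alpha * (WHITE - b)
            value = WHITE - ink
            if sigma > 0:
                value += rng.gauss(0.0, sigma)
            if rng.random() < sp_prob:
                value = rng.choice((0, WHITE))
            # Clip to the valid range and quantize to an integer pixel value.
            row.append(int(round(max(0, min(WHITE, value)))))
        scanned.append(row)
    return scanned


if __name__ == "__main__":
    front = [[255, 255, 255], [255, 0, 255]]
    back = [[0, 255, 255], [255, 255, 255]]
    for row in simulate_scan(front, back, alpha=0.2, sigma=2.0, sp_prob=0.05,
                             rng=random.Random(0)):
        print(row)
```

A simulation of this kind is mainly useful for generating test inputs: the clean front and back images are known, so the output of a candidate denoiser can be compared against them.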
Bleed-through noise is a well-recognized problem in scanning, and has been addressed by a number of different techniques based on information theory and mathematics. The problem has been addressed, with varying levels of success, using special-purpose methods and components that may add cost and complexity to scanning and copying devices, and that may interfere with other denoising techniques used to address Gaussian-like, salt-and-pepper, and other types of noise encountered in optical scanning processes. Therefore, information-theory researchers, denoiser-method developers, and manufacturers and users of a variety of devices and systems that employ optical scanning of printed documents, such as scanners and copiers, have all recognized the need for more effective, general denoising techniques and systems that can, in an integrated fashion, address general types of noise introduced by optical scanning processes as well as the bleed-through noise frequently encountered in optical scanning of printed documents. Similarly, researchers, signal processors, and users of various types of signal-processing equipment have recognized the need for general denoising techniques and systems that can, in an integrated fashion, address different types of noise introduced into various types and numbers of signals, resulting in mutually interfering signals.