The present invention relates to video image processing and more particularly to a method for reducing line width variations in bilevel video images captured at low sampling rates.
Because travel costs are rising and because a traveler's time in transit can seldom be used productively, there is an increasing interest in the use of teleconferencing as an alternative to face-to-face business meetings between people from different locations. In a typical teleconferencing system, people in different cities or even different countries meet in special teleconferencing rooms at their respective home locations. Each room normally includes a room camera for capturing a wide-angle view of the people, a document camera which can be focused on letters, drawings or other documents, a room monitor for permitting people in one room to see those in the other, and a document monitor for viewing documents being presented in the other room. Communications between the two rooms are established over conventional teleprocessing links, such as leased or switched telephone lines or satellite communication channels.
To reduce communications costs, freeze-frame teleconferencing techniques are often employed. The video image captured by a room camera is updated only periodically, perhaps on the order of once every 30 seconds. People at the receiver see the same "frozen" room image between updates. Audio signals are transmitted on a "real time" basis so that there is no significant delay in voice communications. Document images are updated only when the person presenting a document pushes a "send" button in the teleconferencing room.
After a "send" button is pushed, the image of the presented document does not appear immediately on a display or monitor in the receiving teleconferencing room. A finite period of time is required to scan, capture and process image data at the originating teleconferencing room, to transmit the processed data over teleprocessing links and to process data at the receiving teleconferencing room in order to reconstruct the image of the presented document. The length of the delay can be critical in a teleconferencing system. Delays exceeding a few seconds produce unnatural pauses in the smooth flow of a business meeting.
The length of the delay is generally proportional to the amount of data which must be transmitted in order to construct an acceptable video image and is inversely proportional to the bandwidth of the teleprocessing link over which the data must be transmitted. While the amount of delay can be reduced by using a higher bandwidth channel, the drawback to this approach is that communications costs are a function of required bandwidth. Therefore, it is desirable to use as little bandwidth as possible.
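The relationship described above can be illustrated with a minimal sketch. The function name, the image size of 150,000 bits and the link speeds of 9,600 and 4,800 bits per second are assumptions chosen only for illustration; the sketch also ignores the capture and processing time at either end, so it gives only the transmission component of the delay.

```python
def transmission_delay_seconds(data_bits, bandwidth_bits_per_second):
    """Estimate the time needed to send data_bits over a link of the given bandwidth.

    Only the transmission time is modeled: delay = data / bandwidth.
    """
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    return data_bits / bandwidth_bits_per_second


# Hypothetical figures, chosen only to show the proportionality.
IMAGE_BITS = 150_000

delay_fast = transmission_delay_seconds(IMAGE_BITS, 9_600)
delay_slow = transmission_delay_seconds(IMAGE_BITS, 4_800)

print(f"9600 bit/s link: {delay_fast:.3f} s")
print(f"4800 bit/s link: {delay_slow:.3f} s")
```

Under these assumed figures, halving the bandwidth doubles the delay, and halving the amount of data would halve it.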
Attempts have been made to minimize delay time and to maintain low communication costs by compressing the data which must be transmitted over a low bandwidth channel in order to reconstruct a video image of a presented document. For example, documents which are normally bilevel (e.g., black printing on white paper) can be digitized by assigning a binary value to each picture element (pel) captured by the camera scanning the document. Each pel would represent either black or white. The binary data can be encoded using known two-dimensional run length encoding techniques to significantly reduce the amount of data which must be transmitted.
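The principle behind run length encoding can be sketched as follows. This is a simplified one-dimensional version operating on a single scan line, not the two-dimensional techniques referred to above; the function names and the sample scan line are assumptions made for illustration, with 0 standing for a white pel and 1 for a black pel.

```python
from itertools import groupby


def run_lengths(scan_line):
    """Collapse a scan line of 0/1 pels into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(scan_line)]


def expand_runs(runs):
    """Rebuild the original scan line from (value, run_length) pairs."""
    return [value for value, length in runs for _ in range(length)]


# Hypothetical scan line crossing two black strokes on white paper.
scan_line = [0] * 12 + [1] * 3 + [0] * 20 + [1] * 3 + [0] * 12

runs = run_lengths(scan_line)
print(len(scan_line), "pels encoded as", len(runs), "runs:", runs)
```

Because a bilevel document consists mostly of long runs of white pels broken by short runs of black pels, the number of runs is typically far smaller than the number of pels.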
Another technique for minimizing transmission delay and communication costs has been to reduce the resolution at which the original image is scanned and encoded. Instead of scanning at 40 pels per inch, the scanning rate may be reduced to 20 or even 10 pels per inch. The amount of data which must be encoded and transmitted is directly related to the scanning rate. Therefore, a lower scanning rate can reduce transmission delays and communication costs.
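The effect of the scanning rate on the amount of data can be illustrated with a short calculation. The page size of 8.5 by 11 inches, the assumption that the same rate applies horizontally and vertically, and the function name are all illustrative assumptions; the counts are for uncompressed bilevel data at one bit per pel.

```python
def pel_count(width_inches, height_inches, pels_per_inch):
    """Number of pels needed to cover a page at a given scanning rate."""
    columns = round(width_inches * pels_per_inch)
    rows = round(height_inches * pels_per_inch)
    return columns * rows


# Hypothetical page dimensions.
PAGE_WIDTH_INCHES = 8.5
PAGE_HEIGHT_INCHES = 11.0

pels_by_rate = {
    rate: pel_count(PAGE_WIDTH_INCHES, PAGE_HEIGHT_INCHES, rate)
    for rate in (40, 20, 10)
}

for rate, pels in pels_by_rate.items():
    print(f"{rate} pels per inch: {pels} pels")
```

Under these assumptions, each halving of the scanning rate in both directions reduces the number of pels by a factor of four.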
However, video images captured at a reduced sampling rate can become distorted. One type of distortion is unintended variation in line width. Most business documents presented during teleconferencing sessions consist primarily of typed, printed or hand-lettered text with some graphs or line drawings. In such materials, the lines in the alphanumeric characters and in the graphs usually have the same nominal width. When such a document is scanned at low resolutions, however, certain alignments between the scan pel positions and a line on the image may cause lines of the same nominal width to be represented by different numbers of pels of a certain value. When the image is reconstructed on a video display at a remote teleconferencing site, the resulting variations in line width cause the displayed document to take on an odd, visually irritating appearance. Even though a visually irritating document may still be comprehensible to the teleconferencing participants who are viewing it, such a document can have a negative psychological impact on the smooth conduct and effectiveness of presentations or discussions.
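The alignment effect can be demonstrated with a simplified one-dimensional model. The model is an assumption made for illustration: each pel is taken to be black if the single sample point at its center falls on the line, whereas an actual camera integrates light over an area before thresholding. Distances are expressed in arbitrary integer units, with a pel pitch of 10 units and a nominal line width of 15 units (one and a half pels).

```python
def black_pel_count(line_start, line_width, pel_pitch, num_pels):
    """Count the pels whose center sample point falls inside the line.

    The line covers the half-open interval [line_start, line_start + line_width).
    """
    count = 0
    for k in range(num_pels):
        sample_point = k * pel_pitch + pel_pitch // 2  # center of pel k
        if line_start <= sample_point < line_start + line_width:
            count += 1
    return count


# Hypothetical dimensions in arbitrary integer units.
PEL_PITCH = 10
LINE_WIDTH = 15

# Slide the same line across one pel pitch and record its sampled width.
widths_by_offset = {
    offset: black_pel_count(offset, LINE_WIDTH, PEL_PITCH, num_pels=8)
    for offset in range(PEL_PITCH)
}

for offset, width in widths_by_offset.items():
    print(f"line starting at {offset}: {width} black pel(s)")
```

In this model a line of fixed nominal width is represented by either one or two black pels depending only on where it falls relative to the sample points, which is the kind of variation described above.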
One solution to the problem of line width variations would be, of course, to increase the sampling resolution. However, increased sampling resolution leads to the increased transmission delays and communications costs that the reduced resolution was intended to avoid in the first place.