Many people are most comfortable dealing with documents in hardcopy format. In general, hardcopy documents are easier to read, handle, and store than documents kept in the digital domain. No special expertise or computer hardware is needed.
However, manipulating documents in the digital domain is generally far easier. Text can be indexed, searched, reformatted, extracted, and otherwise changed. Stored documents can be easily duplicated, without loss of quality, and transmitted from person to person (for example, via e-mail). Significantly, all of this manipulation can be accomplished without using paper. Moreover, digital copiers and scanners are becoming far more prevalent in both office and home settings.
On the other hand, handling documents in the digital domain typically requires access to a computer system or network. If the user of the computer system does not have a baseline level of expertise or competence in using the system, then productivity can suffer. This consideration can be a serious impediment to the implementation of a “paperless office,” in which digital documents are the prevailing document type.
Accordingly, there is a need to be able to effectively manage documents in the digital domain, as well as to ease the transition from hardcopy documents to digital documents.
Previous attempts to facilitate handling digital documents have often used traditional user-interface paradigms. For example, when a hardcopy document is to be scanned and entered into a document repository, commands to that effect are first entered into a computer terminal or scanning device, which then performs the desired service with the document. A similar sequence of steps is performed when the hardcopy is to be scanned and faxed, scanned and e-mailed, scanned and recognized (via optical character recognition software), or any of numerous other possibilities. Although the entry of commands can be facilitated via user-friendly software or self-explanatory commands, these extra steps are still tedious and may still require a certain level of expertise. Moreover, the sequence of commands entered may be lost once the operation has been performed, and there is a potential for error even with experienced users.
Another possibility is to employ a cover sheet that includes a form for specifying commands. The cover sheet is filled out as the user desires (either by hand-writing commands or by marking check-boxes, for example), and the scanner interprets the commands on the cover sheet and processes the following document accordingly. This approach, too, can be tedious and relatively inefficient, as the approach requires a special-purpose cover sheet to be used for each job. Maintaining a supply of the proper cover sheets can be inconvenient.
Various one- and two-dimensional data codes are known and available to be used to store digital data on hardcopy documents. For example, various types of barcodes (such as the familiar UPC symbol used as a retail product code) are very well known and are robustly decodable. Other examples of linear barcodes are known as Code 39, Code 128, Interleaved 2 of 5, and POSTNET. Two-dimensional codes, such as the PDF417 code and the UPS MaxiCode used by United Parcel Service to track packages, are becoming increasingly widespread.
Self-clocking glyph codes, such as Xerox DataGlyphs, are attractive for embedding machine-readable digital information in images of various types, including ordinary hardcopy documents. These codes have substantial tolerance to image distortion and noise because the digital information they encode is embedded in and fully defined by explicit machine-readable marks, referred to herein as “glyphs.” The term “glyphs” as used herein is not intended to be limited to Xerox DataGlyphs, but rather is intended to cover all machine-readable marks. These glyphs not only encode the information that is embedded in the code, but also define the sample clock that is employed to extract that information from the code, so they are responsible for the “self-clocking” property of the code as well as its distortion and noise tolerance.
Another known advantage of self-clocking glyph codes is that they ordinarily have an unobtrusive visual appearance. This is especially true of codes composed of glyphs that are written on a two-dimensional spatially periodic pattern of centers, such as a regular lattice-like pattern of centers, because the spatial periodicity of the glyphs causes the code to have a more-or-less uniformly textured appearance. For example, logically ordered single bit digital quanta typically are encoded by respective elongated slash-like glyphs which are written on a two-dimensional, spatially periodic pattern of centers in accordance with a predetermined spatial formatting rule, with the individual glyphs being tilted to the left or right of vertical by approximately +45° and −45° for encoding logical “0's” and “1's”, respectively. The mutual orthogonality of the glyph encodings for the two logical states of these single bit digital quanta enhances the discriminability of the code sufficiently to enable the embedded information to be recovered, even when the code pattern is written on a sufficiently fine grain pattern of centers to cause the code pattern to have a generally uniform grayscale appearance. Self-clocking glyph codes can also be designed to encode multi-bit digital quanta in the glyphs.
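The single-bit slash-glyph scheme described above can be illustrated with a minimal sketch. It is not an implementation of Xerox DataGlyphs or of any particular glyph code. It assumes, purely for illustration, that a backslash character stands in for the glyph encoding a logical “0” and a forward slash for the glyph encoding a logical “1”, and that the spatial formatting rule is simple row-by-row ordering on a rectangular lattice. The names `encode_glyphs` and `decode_glyphs` are hypothetical.

```python
# Assumed, illustrative mapping: "\" stands in for the glyph that encodes
# logical 0 and "/" for the glyph that encodes logical 1.
GLYPH_FOR_BIT = {0: "\\", 1: "/"}
BIT_FOR_GLYPH = {glyph: bit for bit, glyph in GLYPH_FOR_BIT.items()}


def encode_glyphs(bits, columns):
    """Lay bits out row by row on a lattice that is `columns` glyphs wide.

    Row-major ordering is an assumed spatial formatting rule.
    """
    rows = []
    for start in range(0, len(bits), columns):
        row = bits[start:start + columns]
        rows.append("".join(GLYPH_FOR_BIT[bit] for bit in row))
    return rows


def decode_glyphs(rows):
    """Read the lattice back in the same row-major order."""
    return [BIT_FOR_GLYPH[glyph] for row in rows for glyph in row]


if __name__ == "__main__":
    payload = [0, 1, 1, 0, 1, 0, 0, 1]
    lattice = encode_glyphs(payload, columns=4)
    for line in lattice:
        print(line)
    # The round trip should recover the original bits.
    assert decode_glyphs(lattice) == payload
```

Because every lattice position holds exactly one of two visually similar marks, the printed pattern has a roughly uniform texture, which is the unobtrusive appearance noted above. A real decoder would additionally have to locate the glyph centers in a scanned image, a step this sketch omits.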
Accordingly, providing a solution that facilitates the use of documents in the digital domain and the transition of documents from hardcopy to digital formats is desirable. Such a solution should be simple, efficient, convenient, and require little or no expertise on the part of the user.