1. Field of the Invention
This invention relates generally to interactive document processing in an electronic domain, and particularly to a new modality incorporating hybrid (digital and analog) signal processing for the transmission, storage, and retrieval of documents to optimize informational content and processing time.
2. General Terminology
While the terminology utilized in the detailed description of the preferred embodiments is basic to those of ordinary skill in the art relating to designing equipment of the types discussed herein (i.e., television cameras, optical laser disc recorders, personal computer interfaces, local area networks, data transmission systems), the general terminology relating to interactive document processing operations is uncertain. A clear and uniform delineation of this terminology is therefore considered necessary to a proper understanding of the subject matter of this invention.
A document is any information recorded in a substantial (tangible) or insubstantial (electronic) form structured for human visual comprehension either directly or through the aid of enabling equipment. Documents were at one time considered written or tangible information, and this definition was later broadened to information fixed in any tangible medium such as magnetic tapes or optical disks. Currently, it is necessary to view documents even more broadly, since they may be created and reside exclusively in electronic memory, and may be transformed, presented, or transmitted without ever being reduced to a tangible medium.
Documents are therefore one subset of all the types of informational bundles, with non-documents generally being referred to simply as recordings. Documents and recordings may coexist in the same tangible medium, such as a magnetic video tape having both an audio recording coupled with a sequence of visual frames, or they may exist in related or completely separate and distinct mediums.
The differentiating requirement imposed upon documents is that they be capable of direct or assisted visual comprehension. For example, a page written in braille is normally structured for both visual and tactile comprehension. The fact that it is intended for tactile comprehension does not defeat the fact that it is also structured for visual comprehension. (In this instance, comprehension does not imply recognition or interpretation of the braille characters or their assigned meanings, but merely acknowledgement or appreciation of their existence as having content and structured form in the document which may be visually observed.) Several types of informational bundles that are structured for comprehension only by senses other than visual or auditory do exist or have been postulated, but no uniform or concise body of terminology has been developed to classify or categorize their nature or properties. Furthermore, only recordings and documents have developed as recognized means for information processing and interpersonal communication.
It may be appreciated that where documents are integrated with recordings, the term "recordings" has been adopted as the descriptive or identifying name for both components of the informational bundle predominantly because of the particular nature of the existing technologies and mediums currently employed on a standardized basis. It remains to be seen whether this tendency will change with the proliferation of multimedia processing (that is, processing both documents and recordings in an electronic domain as integrated but separable components of a unitary informational bundle).
Although documents existed and were used for interpersonal communication long before recordings, the technological improvements in recording over the last century have so overshadowed the improvements in processing tangible documents that the basic recording technologies were applied extensively and uniformly to both tangible and electronic documents. The current methods utilized for processing both substantial and insubstantial documents have therefore remained conceptually static since the introduction of electronic documents, due in large part to the limiting effect caused by the recent revolution in semiconductor memory and the correspondingly complete acceptance of a digital standard controlling both quantitative and qualitative precision for all insubstantial documents.
A document may exist within any one of three domains: tangible (paper, microfilm or fiche, photographic negatives or prints, etc.); electronic image (two dimensional bitmaps or arrays consisting of rows and columns of pixels each having a particular informational depth describing a monochromatic, grayscale, or color value); and electronic content (alphanumeric data strings, formatted text, spreadsheets, machine-readable programs, etc.)
There have traditionally been considered six basic operations that may be performed in association with a document: creation, reproduction, transformation, display, communication, and storage and retrieval. These operations may be performed on any document regardless of the domain in which it resides. The commonly accepted definitions for each of these operations were distilled and articulated concurrently with the development of the existing technologies for converting tangible documents to electronic images, and for converting portions of electronic images to electronic content.
As may be seen by the following discussion, recent developments in the field of document processing (and particularly the technology disclosed in this application) necessitate expanding this list and refining the definitions of those operations. An appreciation of the interrelationships among and distinctions between these operations and their intended definitions is therefore a prerequisite to understanding the modality disclosed herein.
The definitions of "creation" and "reproduction" as used herein differ only slightly from the traditional protocol.
Creation is the initial authoring and fixing of a document in a specific medium. Creation can then be said to comprise the interrelated steps of composition and recordation in which the document is given content and form, both of which may be dependent on human perception and physical limitations of the medium.
Reproduction is the recording of a document's current content and form on multiple instances of the document's current medium.
At this point it is necessary to diverge more significantly from the existing protocol to interpose distinct and broader operational terms. As discussed subsequently in greater detail, current technology has produced expectations regarding operational precision in document processing that have focussed principally on the qualitative precision in reproduction of tangible documents and electronic images and the quantitative precision in transmission, storage, and retrieval of electronic content. These expectations are no longer valid, particularly when discussing the qualitative precision associated with document processing in the electronic image domain, and to be accurate the terms must therefore reflect the fact that certain processes affect either the content or form of a document to the degree that those processes must be reclassified as different operations.
For example, photocopying has traditionally been thought of as reproduction since it is the production of a "duplicate" of an original image on a similar tangible medium. This duplicate image retains sufficient qualitative precision in both content and form that it may be utilized for interpersonal communication in place of the original for many legal or business practices. However, current black-and-white and color photocopying processes result in the loss of such a substantial amount of information in some situations that it constitutes a transformation or material alteration in the basic content of the original document compared with the levels of qualitative precision established by the technology disclosed herein.
As such, for purposes of this discussion photocopying is actually the creation of a distinct derivative document based upon the form and content of the original. In some instances, due to the nature of the original document and the photocopying technique employed, the derivative document will retain sufficient information and qualitative precision to constitute a reproduction. However, a photocopy will not in all cases be a reproduction.
Representation is the recording of a document's current content and form on a different or distinct medium, or recording a portion of the current content and form on the same or different medium, which results in a change in the informational composition of the document. Representation may therefore be thought of as involving some intermediate transition in the domain, content, or form of a document. Representation will frequently entail the transition between two domains, such as the printing of an electronic image on a tangible medium or rendering the image as electronic content, but as with reproduction the original document may continue to exist and reside in the same domain.
Because documents are more frequently being created in the electronic image and electronic content domains, it is important to remember that many people incorrectly regard the initial tangible representation of a document as the original. Those representations are in fact only derivative documents which do not possess the same informational composition as the original, and many representations of an original document result in transitions or transformations that are unintentional from the viewpoint of the operator, but which are intentional and necessary from the viewpoint of the designer of the technology being utilized to produce the representation.
Thus, any intermediate transition between domains or mediums is assumed to encompass some form of a representation, unless the particular nature and character of the original document and the processes used for the transition are sufficiently precise and compatible for the representation to be considered a reproduction for the given functional purposes being considered.
Transformation is the change in content, form, medium, or domain of a document that produces a new or derivative version which is itself a unique document. Transformation can easily be thought of as the creation of a new document, and conversely the process of creation can be thought of as a series of transformations eventually resulting in a document having a desired informational content or form within a specified medium. In some cases (such as manipulating an electronic image contained in semiconductor memory) the original document ceases to exist and is replaced by the new or derivative document. It should also be remembered that other operations such as representation may inherently produce or require transformations due either to the technology employed or the limitations imposed by transitions between mediums or domains, and those transformations and manipulations are frequently transparent to or not appreciated by the operator. As with representation, transformation involves some intentional transition in the document invoked by the operator or the technology designer. The term manipulation is therefore regarded as being more appropriate to describe transformations that are consciously made, selected, instructed, or invoked directly by the operator to intentionally affect the content or form of the document in a predetermined manner.
Transformation has traditionally included some processes involving a change in medium or domain. However, since we must assume for definitional purposes that a transition to a new medium or domain may have a substantial and often deleterious effect on the actual informational content of a document, any process involving the transition between mediums or domains is also necessarily regarded as a transformation that gives rise to either a representation of the document or the creation of a new document.
Presentation is the visual manifestation of a selected portion of the content and form of a document for human comprehension. Presentation would include processes such as displaying an electronic image as a rasterized image on a cathode-ray tube (CRT), as a bitmap image on a liquid crystal display (LCD) or light-emitting diode (LED) display, or the process of projecting a visible image onto a tangible surface using an LCD, LED, or similar device. Presentation would also include other methods of projection, such as refractively projecting a visible image from a tangible document such as film, fiche, slides, negatives, or X-rays.
Transmission is the transportation of a document through space or between remote locations. Transmission is believed to be a more accurate term than communication, since a document may be satisfactorily communicated and comprehended without requiring transmission (such as by displaying or representing the document.)
One could theoretically distinguish communication by defining cognitive boundaries for each individual involved in the processes of document handling (such as the author/creator, editor/operator, interpreter/reader) and treat communication as transporting the informational content of the document between cognitive boundaries. In comparison, transmission is the physical transportation of a document through space without regard or reference to cognitive boundaries. Transmission may then be thought of as a subset of communication, but devoid of any comprehension requirements. To the extent that communication requires comprehension of information at some level, it is strictly speaking not an operation that is performed on a document, but simply the utilization of one or more document processing operations as intermediate steps in the overall process to achieve the result of interpersonal communication.
Storage and Retrieval is the transportation of a document through time in a static state in a manner permitting the selective acquisition of that individual document from its storage medium. The provision that the document be in a static state is an addition to the traditional definition of storage and retrieval, since a document may theoretically be transported through time by holding the document in active memory without being stored and retrieved. Similarly, the provision that the document be selectively acquired from its storage medium is an addition to the traditional definition, distinguishing storage and retrieval from the independent and unrelated operations of recording and replaying.
An electronic image normally exists in active volatile memory, and what is perceived by the operator as the document is actually a display of the document being repeatedly and instantaneously refreshed by information drawn from that memory, and manipulations of the displayed image are inserted into and held in that memory. Alternately, an electronic image may be swapped to a separate portion of memory such as a volatile RAM disk or cache, which simulates the existence of a magnetic or optical storage medium but at higher speeds. The process of exchanging informational bits in active memory, and the intervening holding of those informational bits, is not considered storage and retrieval. Conversely, one could consider the process of writing an electronic image to nonvolatile semiconductor memory such as read-only memory (ROM) to accomplish the storage and retrieval function, however semiconductor memory is rarely (if ever) utilized for the selective acquisition of electronic images.
The fact that storage and retrieval have conventionally been treated within the boundaries of one operation underscores the unique and reciprocal nature of those two processes, wherein storage implies the ability to retrieve selected documents individually and non-sequentially from among a plurality of sequentially indexed documents. The nature of document storage may be contrasted with that for audio recordings, where there is no reciprocal function corresponding to retrieval. Instead, recordings are either reproduced or they are replayed in a manner corresponding to presentation or representation. Thus, for most purposes the "storage and retrieval" function for recordings forms a closed loop through various other operations, and is normally not treated as a reciprocal function or single operation. Electronic images may be processed through very similar closed loops involving other operations than storage and retrieval when those documents are being treated as (or form portions of) recordings, such as in the case of multimedia processing applications, but these closed loops are not deemed to be storage and retrieval.
The execution of one or more of the operations described above is termed "document processing," which is considered to be a subset of "information processing" since information can exist in forms other than documents. There are similar sets of operations applied to processing information in fields other than the three document domains, each operation having a definition which may be unique or peculiar to that field. However, there is frequently some overlap between the terminology used in document processing and other fields of information processing, as well as the informal or nontechnical use of the same terminology, which can result in some inaccuracy and inconsistency.
For example, electronic content is frequently but erroneously referred to as "data" because people equate that term for computer files with the discrete elements of visual images that they recognize as conveying exact or immutable information. A person recognizes certain elements (such as numbers, letters, or symbols we term "characters") with a degree of precision that corresponds to their appreciation of how computers read data, without reference to the conceptual principles involved with communicating, understanding, or comprehending the content of that information apart from the incremental elements. A person can view a page of text and differentiate discrete elements such as characters, each of which has an assigned meaning to that individual. The informational content of the page is considered relatively precise or exact, since it is presumed that individuals to whom the document is directed will assign the same meanings to each element. For purposes of determining precision or accuracy, we disregard the fact that the aggregation and interpretation of those discrete elements may convey completely different concepts to each individual, and we dispense with individuals who assign different meanings to those informational elements than do members of the target group. In the same way, people understand computers to read data with exact accuracy independent from the need for recognition or interpretation, and there are basic hierarchical groupings of formats, programs, languages, and machine architectures that define what meanings will be assigned to discrete data elements for certain purposes.
This appreciation may change as the general public becomes more aware of and familiar with the technological processes involved in the conversion of electronic images to electronic content through optical character recognition (OCR), the application of artificial intelligence or fuzzy logic to visual object recognition in robotics systems, or the use of electronic photography to store electronic images on optical recording media in place of film negatives.
Electronic content can more properly be defined as encoded data plus a set of structural linkages. The data or information is encoded so that it may be utilized or operated on directly without human or artificial intelligence being applied to "interpret" or "recognize" specific information embedded within the document. The structural linkages are usually defined as accepted information storage or interchange formats and vary in complexity from simple linear data strings in which numbers are stored in a one-dimensional forward-reading sequence, through formats such as rich text format (RTF) or symbolic link format (SYLK) files representing data and particular visual attributes for displaying or representing that data on a page, to page description languages such as PostScript in which an electronic image is reduced to and expressed exclusively as mathematical formulas, callable subroutines, or vectors representing individual components from which the electronic image can be interpreted and reconstituted for display or representation.
Two distinct standards have developed for judging the threshold for integrity and requisite precision in the transmission, storage, and retrieval of electronic content versus electronic images.
The integrity of electronic content is based on a quantitative threshold, and complete or error-free precision is frequently presumed. Lack of quantitative precision may have varying degrees of impact on the qualitative integrity of electronic content. For example, a single discrete or one-bit error in an RTF file may have only a minor effect on the textual representation of a character, and therefore a negligible effect on the qualitative integrity of that file when displayed or represented as a page; the same numerical error in a SYLK file may disrupt a localized portion of the matrix such as a row or column in a spreadsheet, but leave a large portion of the spreadsheet intact and unaffected from a qualitative standpoint; a one-bit error in a program file could be fatal to the program's operation and therefore completely destroy all qualitative integrity. Consequently, preventing even one-bit errors in the transmission, storage, and retrieval of any electronic content document is assumed to be mission critical because there is an expectation of absolute quantitative precision (and therefore complete qualitative integrity) associated with electronic content.
Conversely, relatively significant quantitative "errors" may be introduced into electronic images and yet be accepted, because they leave the content and form of the electronic image effectively intact without approaching the threshold established for qualitative integrity in the electronic image domain. Again, it should be noted that thresholds for qualitative integrity in the electronic image domain are currently set at artificially low levels because they are being established or judged as a function of the human visual comprehension of presentations (raster or LCD displays) and representations (paper or film printouts) of the corresponding image.
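By way of illustration only, the contrast between the two thresholds may be sketched as follows. The helper name flip_bit, the sample byte strings, and the choice of which bit to invert are all assumptions made for this example and form no part of the disclosed modality. A single inverted bit in a string of electronic content silently changes its meaning, whereas a single inverted low-order bit in a row of 8-bit grayscale pixels shifts one pixel by one gray level. (An inverted high-order bit in a pixel would produce a larger shift, although still confined to that one pixel.)

```python
def flip_bit(data: bytes, byte_index: int, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit_index
    return bytes(corrupted)


# Electronic content (hypothetical sample): the character '1' is 0x31,
# and inverting bit 3 yields 0x39, the character '9'.
content = b"PAY 1000"
damaged_content = flip_bit(content, byte_index=4, bit_index=3)

# Electronic image (hypothetical sample): four adjacent 8-bit gray levels.
# Inverting the lowest bit of the second pixel moves it from 201 to 200.
pixel_row = bytes([200, 201, 199, 200])
damaged_row = flip_bit(pixel_row, byte_index=1, bit_index=0)

print(content, "->", damaged_content)
print(list(pixel_row), "->", list(damaged_row))
```

In this sketch the content error changes the amount stated in the document, while the image error alters one pixel by a single gray level out of 256.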
As discussed below, the conventional wisdom in interactive document processing is to capture an image from a source document, and to keep that electronic image exclusively in the digital domain once it has been digitized. It is commonly believed that maintaining the digital nature of the electronic image preserves the accuracy and integrity of its informational content. In fact, this is a mistaken assumption.
First, it must be remembered that any digitization process relies upon an initial transition from the analog to digital domains, and there is an equal probability of introducing errors in informational content at that step as there would be for any other subsequent analog-to-digital conversion. That is, once an image is digitized it is relatively simple to verify that the "data" content does not subsequently change; however, there is no assurance that the original "data" accurately reflects the true informational content of the source document. Added to this is the problem that in order to obtain reasonable transmission and storage times for most document processing, images are routinely captured and processed at bit-depths much less than 8-bit grayscale, and often as monochromatic images of very low resolution. Thus, a great deal of informational content is being intentionally discarded for no reason other than to facilitate subsequent digital processing techniques. The quantity of "errors" that might be introduced by repeated analog-to-digital and digital-to-analog conversion of an electronic image still remains many orders of magnitude below the quantity of "errors" that are interjected into images by current document processing technologies.
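The pressure toward low bit-depths noted above may be appreciated from a simple size calculation, sketched below. The function name raw_image_bytes and the 1000 by 1000 pixel page dimensions are hypothetical values chosen only for the arithmetic, and the calculation assumes an uncompressed bitmap.

```python
def raw_image_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Uncompressed bitmap size in bytes, rounded up to a whole byte."""
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8


# Hypothetical page of one million pixels (an assumption for illustration).
WIDTH_PX, HEIGHT_PX = 1000, 1000

monochrome_bytes = raw_image_bytes(WIDTH_PX, HEIGHT_PX, bits_per_pixel=1)
grayscale_bytes = raw_image_bytes(WIDTH_PX, HEIGHT_PX, bits_per_pixel=8)

print("1-bit monochrome:", monochrome_bytes, "bytes")
print("8-bit grayscale: ", grayscale_bytes, "bytes")
print("ratio:", grayscale_bytes // monochrome_bytes)
```

Under these assumptions the 8-bit image occupies eight times the storage of the monochromatic image, and a digital channel of fixed capacity would require correspondingly longer to transmit it.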
Second, if the capture device defines the rate limiting operation in the signal conversion process, the technology disclosed herein provides informational integrity equivalent to exclusively digital processes. If the storage device defines the rate limiting operation, and the capture/storage/retrieval/transmission pathway involves four conversions between analog and digital signals, the system has only increased the probability of losing a "pixel" by three times when compared to an exclusively digital process. Since the informational content is several orders of magnitude higher than a conventionally processed digital image, a three times increase in the probability of an error in one "pixel" is insignificant. It should also be remembered that this discussion focusses on the complete loss of a pixel, whereas in an analog frame of reference an error might actually amount only to shifting a pixel to a slightly higher or lower gray level. Isolated errors of this type would be completely imperceptible when imbedded in an image comprising more than a million adjacent pixels, with each pixel having two hundred or more possible discrete grayscale values. The actual effect of an error of this type is further reduced when one considers the potential for applying oversampling and signal averaging techniques for each analog-to-digital conversion.
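The effect of oversampling and signal averaging mentioned above may be sketched with a small simulation. The Gaussian noise model, the noise magnitude, the sample count of sixteen, and all function names are assumptions made for this example only; no particular converter is being modeled. For independent noise, the spread of the averaged reading is expected to fall by roughly the square root of the number of samples averaged.

```python
import random
import statistics


def convert_once(true_level: float, noise_sd: float, rng: random.Random) -> float:
    """One noisy reading of a pixel's gray level (hypothetical Gaussian noise)."""
    return true_level + rng.gauss(0.0, noise_sd)


def convert_oversampled(true_level: float, noise_sd: float,
                        samples: int, rng: random.Random) -> float:
    """Average several independent readings of the same pixel."""
    return statistics.fmean(
        convert_once(true_level, noise_sd, rng) for _ in range(samples)
    )


rng = random.Random(0)   # fixed seed so the sketch is repeatable
TRUE_LEVEL = 128.0       # assumed true gray level of the pixel
NOISE_SD = 2.0           # assumed conversion noise, in gray levels
TRIALS = 2000

single_errors = [convert_once(TRUE_LEVEL, NOISE_SD, rng) - TRUE_LEVEL
                 for _ in range(TRIALS)]
averaged_errors = [convert_oversampled(TRUE_LEVEL, NOISE_SD, 16, rng) - TRUE_LEVEL
                   for _ in range(TRIALS)]

single_spread = statistics.pstdev(single_errors)
averaged_spread = statistics.pstdev(averaged_errors)
print("spread of single readings:  ", round(single_spread, 3))
print("spread of 16-sample average:", round(averaged_spread, 3))
```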
The application of hybrid processing may thus be equated with trading the sanctity of bit-value integrity for vastly increased processing speeds. However, the loss of bit-value integrity is relatively meaningless when applying this technology, because the techniques for maintaining reasonably high precision in analog-to-digital conversions reduce the frequency at which errors may occur, and any error becomes statistically insignificant if one considers the many-order increase in the magnitude of informational content when using greater bit-depths that effectively dilute any error.
The use of 8-bit grayscale is preferable for most current interactive document processing operations since it provides a displayable image having informational content far superior to that of conventional techniques for storing tangible source documents (which often treat the original source document as line art and rely on conversion to monochromatic or one-bit levels, which discards the majority of the document's actual informational content). It has been shown that an 8-bit grayscale image is actually more readable than a lower level or monochromatic image corresponding to the same document, due primarily to the wealth of informational content in the zone lying above the full content of a lower level image and below the portion of the 8-bit grayscale image that is beyond the limits of visual perception. As such, there is a large portion of a true 8-bit grayscale image that cannot be visually perceived without the aid of enabling equipment, and which would be discarded in normal viewing. The ability to display an electronic image in 8-bit grayscale therefore also permits a wide range of user-definable adjustments in the contrast levels and grayscale filters to be applied to the viewed image that facilitate interactive discrimination or enhancement of the informational content in the image for certain applications (such as diagnostic review of X-rays, MRI scans, or other medical images) without affecting or altering the source document or its stored image.
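The distinction drawn above between one-bit conversion and adjustable viewing of an 8-bit grayscale image may be sketched as follows. The tiny sample page, the threshold of 128, the window limits, and the function names to_monochrome and apply_window are all hypothetical choices for this example. The sample contains a faint marking only eight gray levels darker than its background.

```python
def to_monochrome(pixels, threshold=128):
    """Reduce each 8-bit pixel to one bit: 1 (light) or 0 (dark)."""
    return [[1 if p >= threshold else 0 for p in row] for row in pixels]


def apply_window(pixels, low, high):
    """Return a new image with gray levels in [low, high] stretched to 0..255.

    Levels outside the window are clipped. The input image is not modified.
    """
    span = high - low
    return [
        [max(0, min(255, round((p - low) * 255 / span))) for p in row]
        for row in pixels
    ]


# Hypothetical stored image: background at 230, faint marking at 222.
page = [
    [230, 230, 230, 230],
    [230, 222, 222, 230],
    [230, 230, 230, 230],
]

mono = to_monochrome(page)             # marking and background both become 1
windowed = apply_window(page, 215, 235)

print("monochrome:", mono)
print("windowed:  ", windowed)
```

In this sketch the monochromatic conversion renders the marking indistinguishable from the background, while the windowed view separates the two by more than one hundred gray levels and leaves the stored image unchanged.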
In addition, the 8-bit grayscale electronic image will contain background details such as security paper patterns, watermarks, illegible color inks, markings that are faded or too faint to be perceived by viewing the original source document, as well as creases, smudges, stains, and other unique identifying details that assure far greater certainty when verifying the integrity and authenticity of the electronic image, far exceeding the currently accepted standards for duplicates of financial, business, and legal records. The use of 8-bit grayscale also permits the capture of electronic images from damaged or aged tangible source documents such as burnt papers or faded microfiche that could not be reproduced by other means, and which may be accomplished as if the original source documents were undamaged.
Because electronic images currently exist as bitmaps or digital arrays in semiconductor memory, they are transmitted, stored, and retrieved in digital form and these operations do not themselves introduce quantitative errors in the content of the electronic images. The most significant "errors" or losses in content generally result from intentional transformations occurring during processing steps such as digitizing a tangible document into an electronic image, or formatting an electronic image for integration into a program or as part of a file storage protocol.
It must be remembered that the acceptable threshold for qualitative integrity of electronic images as used in reference to the technology disclosed herein is several orders of magnitude greater than can be perceived by human vision when viewing a tangible document, and is at or near the limits of what can be practicably achieved by conventional representation devices such as laser printers or film recorders. Consequently, while the technology disclosed herein will normally be implemented in embodiments which provide less than the absolute or complete quantitative precision associated with conventional digital modalities for purely practical reasons, the qualitative integrity of documents created in or converted to the electronic image domain and subsequently transmitted, stored, and retrieved by the disclosed modality will exceed that currently accepted for document processing in the electronic image domain and yet permit substantial decreases in the time required to complete those processes.
The methods disclosed herein may be applied equally to documents in either the electronic image or electronic content domains. However, the use of this modality for processing electronic content will only be adopted if a sufficient threshold level of quantitative integrity (as defined by the particular application involved) can be consistently maintained or verified. The complexity or expense associated with assuring this requisite level of quantitative integrity for electronic content may be commercially prohibitive given the adequacy of conventional systems now used extensively for digital transmission, storage, and retrieval of electronic content, despite the significant differential in speeds at which the technologies would operate.
Furthermore, at present the vast majority of personal and business communication is conducted using documents that remain almost exclusively in the tangible or electronic image domains, and the immediate need for applying this modality to electronic images far supersedes the comparatively insignificant demand for accelerating the processing of documents in the electronic content domain. Application of this modality to documents created in or converted to the electronic image domain provides an effective precision that is functionally indistinguishable from complete qualitative integrity, and therefore substantially greater than basic levels of qualitative integrity now utilized for tangible and electronic image documents. In addition, the complexity and expense associated with this modality are no more than for existing technologies.
For those reasons, the remainder of this discussion will focus on documents in the tangible and electronic image domains, however it is understood that the modality may be readily applied to the electronic content domain if adapted or augmented to provide suitable assurances that acceptable quantitative precision can be consistently maintained or verified throughout the transmission, storage, and retrieval processes.
Since the focus of this discussion is on particular embodiments designed for the tangible and electronic image domains, it is presumed that any "data" embodied within a tangible or electronic image exists as a function of the document's content and form. Therefore, any data contained in a document is subject to comprehension and recognition by the visual inspection of presentations or representations of the image by humans, who may then manually transcribe or encode that data for use as electronic content, or by the application of artificial intelligence to recognize, interpret, and encode that data from within the image.
There have traditionally been four additional operations associated with transposing or bridging a document from one domain to another: scanning or capture (tangible to electronic image); recognition (electronic image to electronic content); rasterization or bitmapping (electronic content to electronic image); and output or marking (electronic image to tangible). These operations do not encompass two possible transitions which could occur directly between the tangible and electronic content domains; however, virtually all technology now in use relies on some intermediate transition through the electronic image domain.
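The four transitions listed above, and the routing of the two indirect transitions through the electronic image domain, can be sketched as a small lookup table. This is an illustrative sketch only: the names `Domain`, `TRANSITIONS`, and `route` are hypothetical and do not appear in the text, and each operation is labeled with just one of the two names the text gives it.

```python
from enum import Enum


class Domain(Enum):
    TANGIBLE = "tangible"
    IMAGE = "electronic image"
    CONTENT = "electronic content"


# (source domain, destination domain) -> operation named in the text
TRANSITIONS = {
    (Domain.TANGIBLE, Domain.IMAGE): "capture",
    (Domain.IMAGE, Domain.CONTENT): "recognition",
    (Domain.CONTENT, Domain.IMAGE): "bitmapping",
    (Domain.IMAGE, Domain.TANGIBLE): "output",
}


def route(src, dst):
    """Return the operations needed to move a document from src to dst."""
    if src == dst:
        return []
    if (src, dst) in TRANSITIONS:
        return [TRANSITIONS[(src, dst)]]
    # No direct transition exists, so pass through the electronic image domain.
    return [TRANSITIONS[(src, Domain.IMAGE)], TRANSITIONS[(Domain.IMAGE, dst)]]


print(route(Domain.TANGIBLE, Domain.CONTENT))
print(route(Domain.CONTENT, Domain.TANGIBLE))
```

Under these assumptions, moving a document between the tangible and electronic content domains always yields a two-step route with the electronic image domain in the middle.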
This terminology can be somewhat misleading. For example, a raster is conventionally defined in electronics as a uniform rectangular pattern of scanning lines having an aspect ratio determined by horizontal and vertical synchronization and timing (or blanking) pulses that is produced on a cathode-ray tube (CRT) in the absence of a modulated signal. In image processing, however, a raster usually means the display of the digital array associated with an electronic image on a raster device such as a monitor, which could as easily be displayed or projected directly as a bitmap using an LCD or LED device. Furthermore, the scanning lines in a raster have no relation to the process of scanning a tangible document in most devices that are called scanners, which conventionally incorporate line- or area-array CCD technology.
It may be readily appreciated that an electronic image exists as a digital array in memory and does not require being displayed as a raster or bitmap; such a presentation is merely an aid for human visualization and comprehension of the electronic image as it resides in memory. The display is therefore a "virtual" document and the bitmap or digital array is the true or "original" document. The process of transposing a document from the electronic content domain to the electronic image domain really constitutes mapping the image into a digital array in memory, thus the term "bitmapping" has been added to the conventional nomenclature for this transition.
Similarly, transposing a document from the tangible domain to the electronic image domain requires the same mapping of an image into a digital array in active memory, and could just as well be termed "bitmapping." In the field of document processing, where an operator works at a computer or workstation, the term "scanner" has traditionally been applied to a peripheral capture device which creates a digital array or bitmap of a tangible document, and that digital array is simply "dumped" or swapped into a segment of active memory within the computer. Conversely, if the peripheral capture device produces an analog output of sequential frames, the transition to a digital array or bitmap may be performed by the process commonly called "frame grabbing" either by the peripheral device or on board the computer. In this case, the term "capture" is utilized to include both the processes of scanning and frame grabbing where a digital array or bitmap of an electronic image is produced and resides in active memory.
As previously noted, the transition between any two domains almost always results in some transformation of the original document to a new or derivative document, whether or not that transformation is visually perceptible. Similarly, any presentation or representation of an electronic image either produces a virtual image or creates a new tangible document. Theoretically, the virtual image and the new or derivative document should be identified and treated as new documents having different informational content and form than the original document residing in memory as an electronic image.
Many factors affect the degree to which the informational content in a presentation, representation, or transformed image diverges from that of the original document. Because informational content is judged as a function of visual recognition, three factors have become basic to measuring the informational content of an image: kind, depth, and density.
Kind designates the classification of the image, and for purposes of this discussion may be monochromatic, grayscale, or color. These three kinds of images encompass most or all of the visually perceptible documents. At the same time, it should be remembered that there are other types of documents (and certain information within otherwise visible documents) that may only be perceived with some type of enabling technology. Infrared and ultraviolet represent two familiar examples where enabling technology produces images containing informational content that is not otherwise visually perceptible, but it should be remembered that true grayscale also contains large quantities of informational content that may be otherwise discarded in human visualization.
Depth is a digital measure of the quantity or "bits" of information associated with each informational bundle or picture element ("pixel"). The most frequently used depths in interactive document processing are one-bit (effectively monochrome), 4- and 8-bit for grayscale, or 8-, 24-, and 32-bit for color.
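The relationship between depth and the number of distinct values a pixel can take is simple arithmetic: a depth of n bits allows 2 to the power n values. The following is a minimal sketch; the function name `levels` is hypothetical.

```python
def levels(depth_bits):
    """Number of distinct values one pixel can hold at the given bit depth."""
    return 2 ** depth_bits


for depth in (1, 4, 8, 24):
    print(f"{depth}-bit: {levels(depth)} values per pixel")
```

This gives 2 values for a one-bit image, 16 for 4-bit grayscale, and 256 for 8-bit grayscale, which matches the figures used elsewhere in this discussion.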
Density is the physical spacing of informational bundles in pixels per unit measurement. Density is irrelevant to an electronic image, and only becomes a factor to consider when presenting or representing an image. Density is often interchanged with resolution, and different standards and references have developed for display resolution and printing resolution. However, resolution is really a function of image comprehension as determined in the visual frame of reference. When the term resolution is used in interactive document processing, it is being used as shorthand for the "absolute resolution" which is the minimum separation between pixels or informational bundles that may be distinguished or resolved.
Density and resolution are important terms in document processing because they permit operators to specify an acceptable level for representing or presenting a document. They are also easily confused when comparing representation resolution with presentation resolution. For example, referring to FIG. 8, one might specify outputting an electronic image such as a continuous black-to-white 8-bit grayscale gradient to tangible form on a 300 dots-per-inch (dpi) laser printer for one use, but a much higher resolution for another. These specifications are further complicated by the fact that most tangible output devices are monochromatic, and grayscales are simulated or approximated by applying a selected halftone screen to the image, or printing the image as a dither pattern to approximate levels of grayscale. The halftone screens are usually denoted by the frequency (number of lines per inch) and the angle of orientation. In any case, the true resolution of the output device remains constant while the effective density of the image changes, and the actual informational content of the outputted document decreases compared to the original electronic image. A four-bit grayscale gradient would therefore theoretically contain 16 shades of gray including black and white, but if printed at 300 dpi would show only about 12 gray levels due to the type of dithering pattern used and the processor's calculation of the optimal number of steps to create a smooth blend or transition between levels. As such, in the example recited above, the 8-bit grayscale gradient printed at 300 dpi resolution may result in approximately 58 visibly discernible gray levels or fewer (FIG. 8B) with the dots of the dithering pattern being very apparent, whereas the same gradient printed at 3360 dpi with a 150-line horizontal screen will produce the same number of gray levels (FIG. 8C) but at a much higher resolution.
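The simulation of grayscale on a monochromatic output device can be illustrated with a minimal sketch of ordered dithering. The 4x4 Bayer threshold matrix used here is an assumption chosen for illustration; the text does not identify the dither pattern or halftone screen actually used, so the level count this sketch produces is not expected to match the 12 or 58 levels cited above. The names `BAYER_4X4`, `dither_pixel`, and `cell_coverage` are hypothetical.

```python
# 4x4 Bayer ordered-dither matrix (an assumed pattern, not taken from the text).
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def dither_pixel(value, x, y):
    """Map an 8-bit gray value (0 = black, 255 = white) to a 1-bit dot.

    Returns 1 (white dot) when the normalized value exceeds the threshold
    stored for this position in the repeating 4x4 cell, otherwise 0.
    """
    threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16
    return 1 if value / 255 > threshold else 0


def cell_coverage(value):
    """Count the white dots produced in one 4x4 cell for a uniform gray value."""
    return sum(dither_pixel(value, x, y) for y in range(4) for x in range(4))


# All 256 input levels collapse onto the coverages one 4x4 cell can express.
distinct = sorted({cell_coverage(v) for v in range(256)})
print(len(distinct), "distinguishable levels from 256 input levels")
```

With this particular matrix, a 4x4 cell can express only 17 coverages (0 through 16 white dots), so an 8-bit gradient loses most of its 256 levels on output. A larger cell would express more levels at the cost of a coarser effective density, which is the trade-off described above.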
As such, specifying a higher resolution may achieve a great increase in informational content without increasing the grayscale depth, and increasing the grayscale depth may also greatly increase the informational content without requiring higher resolution. This may be compared with a true 8-bit image viewed on a monitor in which each "pixel" is displayed at one of 256 discrete gray levels, and which has a resolution on the order of 70-80 pixels per inch. It may be appreciated that increasing both resolution and grayscale depth will have a corresponding impact on informational content.
For purposes of creation and transformation in the electronic image domain, an image is usually treated as being composed of bitmaps, objects, paths, models, or renderings. Bitmaps are created and transformed by altering the characteristic value assigned to individual pixels within the bitmap. Objects and paths are transformed by altering either the fundamental definition of the object or path, or a characteristic attribute associated with that object or path. Attributes may be very simple or extremely complex, and attributes of paths may depend upon linkages and relationships to other paths and their attributes. Objects are generally self-contained. Models are the three dimensional equivalent of objects, but are composed of one or more assembled structural blocks. A rendering is essentially a complex bitmap created by applying attributes to a model, but which cannot be interactively transformed by altering the characteristic value of separate pixels.
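The difference between transforming a bitmap and transforming an object can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Rect` object, its attributes, and the `render` function are hypothetical and stand in for whatever object definitions a given program uses.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Rect:
    """A hypothetical self-contained object: a definition plus one attribute."""
    x: int
    y: int
    w: int
    h: int
    gray: int  # characteristic attribute, 0 (black) to 255 (white)


def render(rect, width, height, background=255):
    """Map the object into a digital array (a bitmap) of gray values."""
    return [
        [
            rect.gray
            if rect.x <= col < rect.x + rect.w and rect.y <= row < rect.y + rect.h
            else background
            for col in range(width)
        ]
        for row in range(height)
    ]


square = Rect(x=1, y=1, w=2, h=2, gray=0)
bitmap = render(square, 4, 4)

# Bitmap transformation: alter the value assigned to one individual pixel.
bitmap[0][0] = 128

# Object transformation: alter an attribute, then map the object again.
lighter = replace(square, gray=64)
relit = render(lighter, 4, 4)
```

Changing the attribute affects every pixel the object covers, while the bitmap edit touches exactly one pixel and leaves the object definition unaware of the change.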
Objects, paths, models, and many renderings are usually "device density" or "output resolution" dependent, meaning that they are treated within an interactive program as formulaic expressions each having a set depth but variable density, and when presented or represented they will adopt the highest density afforded by the capabilities of the presentation or representation device. For example, a simple arcuate path expressed in the PostScript language will have a shape, size, and a specified value within the range dictated by the image's grayscale depth. When output on a 300 dpi printer, the image will have the same basic shape and size as the electronic image, but the printer will utilize its 300 dpi density to provide the best possible approximation or effective resolution of the grayscale value and path contours defined by the formulaic expression. Output on a 1250 dpi printer will again have the same shape and size but higher density, meaning that the effective resolution of the grayscale level and path contours will more closely match the formulaic expression. Effective resolutions may be so low that losses in informational content are clearly perceptible, or so high that they exceed visible comprehension. The nature and use of the tangible document being represented or the virtual image being presented will dictate the preferred or acceptable effective resolution.
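The behavior described above, in which one formulaic expression adopts the density of whichever device represents it, can be sketched by sampling a filled disc of fixed physical size at the two printer densities mentioned. This is a simplified illustration, not a model of any actual page description language interpreter; the function names and the choice of a disc as the shape are assumptions.

```python
import math


def rasterize_disc(diameter_in, dpi):
    """Sample a filled disc of the given diameter (inches) at a device density.

    Returns a square grid of 0/1 values, one entry per device dot, where 1
    means the center of that dot lies inside the disc.
    """
    radius = diameter_in / 2
    side = round(diameter_in * dpi)  # device dots per side
    grid = []
    for row in range(side):
        y = (row + 0.5) / dpi - radius
        grid.append([
            1 if ((col + 0.5) / dpi - radius) ** 2 + y ** 2 <= radius ** 2 else 0
            for col in range(side)
        ])
    return grid


def covered_area(grid, dpi):
    """Physical area (square inches) covered by the marked dots."""
    return sum(map(sum, grid)) / dpi ** 2


DIAMETER = 0.5  # inches; identical for both devices
exact = math.pi * (DIAMETER / 2) ** 2
for dpi in (300, 1250):
    grid = rasterize_disc(DIAMETER, dpi)
    print(dpi, "dpi:", len(grid), "dots per side, area", covered_area(grid, dpi),
          "vs exact", exact)
```

The disc occupies the same half inch on both devices, but the higher-density device uses many more dots to trace the same contour.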
The three measures of the informational content in an electronic image are completely independent of the image's physical size. As a practical matter, limitations imposed by the processor speed, available memory, and storage medium in a document processing system will sometimes require reducing an electronic image's depth or density as its physical size increases. Kind, depth, and density may also be selectively manipulated to achieve a particular visible or perceptible result when an electronic image is transformed, represented, or presented.
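The practical limitation noted above can be made concrete with a rough storage estimate for an uncompressed bitmap. This is a sketch under simplifying assumptions: no compression, no file header, and no row padding. The function name and the letter-size page used in the example are illustrative choices, not figures from the text.

```python
def bitmap_bytes(width_in, height_in, density_ppi, depth_bits):
    """Approximate bytes needed to hold an uncompressed bitmap.

    width_in, height_in: physical size in inches
    density_ppi: pixels per inch in each direction
    depth_bits: bits per pixel
    """
    columns = round(width_in * density_ppi)
    rows = round(height_in * density_ppi)
    total_bits = columns * rows * depth_bits
    return (total_bits + 7) // 8  # round up to whole bytes


# An 8.5 x 11 inch page at 300 pixels per inch, at three depths.
for depth in (1, 8, 24):
    print(f"{depth}-bit: {bitmap_bytes(8.5, 11, 300, depth):,} bytes")
```

Because the pixel count grows with the product of width and height, doubling both physical dimensions at a fixed density and depth quadruples the storage requirement, which is why depth or density is sometimes reduced as physical size increases.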
Advances in both interactive and non-interactive document processing are evaluated according to five criteria: compatibility, transparency, decentralization, modularity, and operational capacity.
Compatibility refers to the capability of different technologies to utilize the informational content of a document. Compatibility is now limited to storing a document in one or more predefined formats, with interactive document processing programs having the ability to access information only from specific formats. Usually, the higher the level of a program the more formats in which it will store and retrieve a document. If a format or conversion is unavailable, all or a portion of the informational content of the document will be inaccessible. For the most part, compatibility in transmission is limited to modem and network protocols for electronic content and facsimile and network protocols for electronic images.
Transparency has two definitions. At one level, transparency is the movement of documents between domains without loss of informational content. Operational transparency is the ability of a user to employ a technology without conscious consideration of the inherent transformations produced by that technology. In interactive document processing, effective transparency can be defined as the transition from the tangible to electronic image domain without a visually perceptible loss of qualitative integrity, and as equivalent access to electronic image documents through transmission or retrieval processes independent of the location of the original document and without regard to intermediate transformations.
Decentralization refers to the ability of users of communicating systems to perform the same document processing operations on the same documents, and to have the same capabilities that are available to operators at a central document processing or coordination facility.
Modularity refers to the linking of single- or multi-function devices for document processing. Modularity increases the functions performed by devices, or increases the available linkages between devices, to optimize paths through which documents are processed. For example, merging a scanner with a printer to accomplish the functions of scanning, photocopying, facsimile reception, and printing is a higher order of modularity than having four separate devices performing the same four functions. Another facet of modularity is scalability, which permits the addition (or subtraction) of a redundant peripheral device to a system to increase (or decrease) a particular operational capacity of the system without requiring replacement of the complete system. For example, the addition of a SCSI hard drive to a personal computer permits the system to be scaled upwardly to increase its information storage capacity.
Operational capacity for most interactive document processing is determined by measured capabilities or benchmarks such as processor clock speeds (in megahertz), millions of instructions performed per second (MIPS), bit depth of semiconductor memory, access times for semiconductor memory (in nanoseconds), transmission rates (in baud), storage capacity (in megabytes), storage density, and seek and read/write speeds (in milliseconds to microseconds).
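As a rough illustration of how two of these benchmarks combine, the time to move a document over a link follows directly from its size and the transmission rate. This sketch assumes the rate is expressed in bits per second (which equals the baud rate only when each symbol carries one bit) and ignores protocol overhead and compression. The function name and the sample figures are hypothetical.

```python
def transmission_seconds(size_bytes, bits_per_second):
    """Idealized time to transmit size_bytes over a link of the given rate."""
    return size_bytes * 8 / bits_per_second


# Illustrative values only: a 1,051,875-byte image over a 9600 bit/s link.
print(transmission_seconds(1_051_875, 9600), "seconds")
```

Under these assumptions the transfer takes roughly fifteen minutes, which suggests why transmission rate is listed alongside storage capacity when document processing systems are compared.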