Within conventional information processing systems, information may be formatted by an application into a presentation format called a data stream. As utilized herein, a data stream is a collection of structured fields and is defined by the syntax and semantics of those structured fields. The presentation data stream may then be presented on a display, plotter or printer, stored on a storage medium, or electronically forwarded to another computer system where it may be stored, processed or presented by the local applications or devices. In an environment where the presentation data is to be shared by multiple applications and systems, or presented on several different presentation devices, it is a requirement that the resulting presentations look identical regardless of the device or application that processes them. A data stream that supports this requirement is called device independent. A device independent data stream assures that the information will be processable by all receiving applications and devices, with predictable results.
Another requirement for defining shared presentation data streams is to reduce the content to a small, common set. For example, text presentations may be represented in a vector graphics data stream using character string graphics controls. It is not necessary to have both a presentation text data stream and a presentation vector graphics data stream when it is possible to have a single data stream definition which supports both. The value of this combined definition is a reduction in the amount of code required by receiving applications and presentation devices for interpreting the presentation data stream. This can reduce development, maintenance and product costs.
Examples of the device independent, reduced content data stream are the International Standard Computer Graphics Metafile (CGM) and the IBM Graphics Object Content Architecture (GOCA). Implementations such as IBM's Presentation Manager and Adobe's PostScript demonstrate that the definition of text and raster image data can be accommodated in a graphical base. An application creates an instance of the data stream using a limited set of the available commands. The limitations used are based upon the generating application's focus. As pointed out above, a text application may use the graphics data stream controls to represent its output by limiting the set of controls used to those needed to represent the text (i.e., Character String controls).
However, data streams that are device independent and capable of reduced content presentation have disadvantages. These data streams tend to be the least common denominator of the functions supported by all existing applications or devices. Therefore, the function described in the data stream is limited to only those functions supported in common by all applications and devices. Such a data stream does not support device adaptability, which is the ability to support device-specific performance and functional enhancements. The result is limited throughput, reduced presentation flexibility and quality, or lower overall performance.
Consider the example used earlier of representing presentation text in a vector graphics data stream using character string graphics controls. It is evident that a text-only presentation page may be represented with these graphics controls. However, since it is represented in a graphics data stream, the receiver is not aware that it was produced by a text function. The receiver must process the data stream as graphics. The result will be an accurate representation of the text application's original presentation; however, there may be a performance overhead associated with processing these controls as graphics. For example, there are many printing devices which have both text and graphics modes. Because the limited text functions are simpler to process than the richer graphical functions, these printers are faster in text mode than in graphics mode. Without knowledge of the generator's reference, the appropriate mode cannot be selected.
It has been proposed that the receiving application or device interrogate the incoming presentation data stream to determine if certain limitations have been used in the generation of the presentation data stream which may be used to the advantage of the receiving application or device. For example, if a page of presentation data contains only character string controls, text mode may be assumed for the output of that page, thus enhancing performance. This approach could be further extended such that blocks of character string information contained in a graphics representation of a single page may be isolated and these blocks presented in the higher performance text mode. This exploratory approach is known as a heuristic method and exhibits the following disadvantages.
First, text information is not limited to just character data. For example, underscored text characters may be specified using graphical line drawing controls issued with the appropriate relationship to the character string controls. These and other limited combinations of graphical data by generating applications will complicate the heuristic method by requiring knowledge and an ability to handle multiple combinations of controls.
Second, because of the complexities of compound document presentations, there may be cases where the limitation on the use of the graphical data cannot be guaranteed, and the results of the heuristic method are not predictable. For example, one portion of a data stream may be limited to character string controls and therefore can be determined to be a text block. A related portion of the same data stream may have character string controls which have been layered over a graphical drawing. Since this latter portion contains character string controls as well as other graphics controls representing the background drawing, it does not meet the limited definition of the heuristic method and therefore will not be treated as text. This will not result in maximum performance, as some portion of the presentation which could be processed in text mode will not be. Also, differing font rotation and font substitution rules may exist between text and graphics modes in presentation devices, further invalidating the limited definition requirements of a heuristic method. This problem is further exacerbated when related character string data is presented in unrelated font faces or orientations. Therefore, there are valid combinations of graphics controls which will invalidate a heuristic approach. This will cause the heuristic algorithm to be inaccurate, or will increase the cost of the heuristic implementation to the point that it is not practical.
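The page-level heuristic and the two failure cases discussed above may be illustrated with a minimal sketch. The `Control` type, the control kind names (`char_string`, `line`, `area_fill`) and the `choose_mode` function are hypothetical and are used only for illustration; they are not taken from CGM, GOCA or any actual device.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Control:
    """One graphics control in a page (hypothetical model, not a real format)."""
    kind: str          # assumed kinds: "char_string", "line", "area_fill"
    payload: str = ""


def choose_mode(page: List[Control]) -> str:
    """Return "text" only when every control on the page is a character string."""
    if page and all(c.kind == "char_string" for c in page):
        return "text"
    return "graphics"


# A page produced by a text application using only character string controls.
text_page = [Control("char_string", "Dear customer,"),
             Control("char_string", "Thank you for your order.")]

# Underscored text: the underscore is drawn with a line control.
underscored_page = [Control("char_string", "Important"),
                    Control("line", "underscore")]

# Character strings layered over a background drawing.
layered_page = [Control("area_fill", "background drawing"),
                Control("char_string", "Caption")]

for name, page in [("text", text_page),
                   ("underscored", underscored_page),
                   ("layered", layered_page)]:
    print(name, "->", choose_mode(page))
```

In this sketch only the first page is classified as text. The underscored page and the layered page both fall back to graphics mode even though they contain text that a device could, in principle, present in its faster text mode. Handling them would require the heuristic to recognize further combinations of controls.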
Third, the heuristic approach requires that the presentation data stream be pre-processed in order to identify where generator limitations have been applied. This is only possible for devices which have page buffering capability to store the inbound presentation data stream while the heuristic algorithm is processed. Therefore, devices without page buffering cannot support the heuristic approach.
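The buffering requirement may be illustrated with a short sketch. It assumes a hypothetical page that arrives one control at a time; the control kind names and all function names are illustrative only. The sketch shows that a correct decision needs the whole page to be stored first, whereas a guess made from the first control alone can be wrong.

```python
from typing import Iterable, Iterator, List, Tuple

Control = Tuple[str, str]  # (kind, payload); the kinds are hypothetical names


def page_controls() -> Iterator[Control]:
    """Simulate an inbound page arriving one control at a time."""
    yield ("char_string", "Quarterly results")
    yield ("char_string", "Revenue rose in every region.")
    yield ("line", "rule under the last paragraph")


def decide_with_buffer(stream: Iterable[Control]) -> Tuple[str, int]:
    """Store the entire page, then choose a mode.

    Returns the chosen mode and the number of controls that had to be held.
    """
    buffered: List[Control] = list(stream)  # whole page is stored before output
    text_only = all(kind == "char_string" for kind, _ in buffered)
    return ("text" if text_only else "graphics"), len(buffered)


def guess_from_first_control(stream: Iterable[Control]) -> str:
    """Mimic a device with no page buffer that commits after one control."""
    first = next(iter(stream), None)
    if first is not None and first[0] == "char_string":
        return "text"
    return "graphics"


mode, held = decide_with_buffer(page_controls())
print("buffered decision:", mode, "after holding", held, "controls")
print("unbuffered guess:", guess_from_first_control(page_controls()))
```

Here the buffered decision is graphics mode, reached only after all three controls have been held, while the unbuffered guess selects text mode and would mishandle the trailing line control.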
In conclusion, the heuristic approach is complex, has questionable accuracy, and is not implementable in all device and application classes. What is needed is a technique by which an originating application can explicitly specify that presentation data, or segments of presentation data, were generated using a specific process. The resulting limited ordering of the presentation data stream contents could be processed by a device specific mode or function without loss of presentation integrity.