1. Field of the Invention
The present invention relates to an improved data processing system and, in particular, to a method and system for processing structured documents. Still more particularly, the present invention provides a method and system for pre-print processing of structured documents.
2. Description of Related Art
The desktop publishing industry developed as personal computers became common business tools during the 1980s. Desktop publishing software allowed businesses to use in-house personnel to create documents that appeared as if they had been produced by professional designers. The availability of laser printers, color printers, and color copiers along with personal computers fostered an environment in which businesses demanded better software and hardware to produce more complex and more colorful publications.
Early in the development of the desktop publishing industry, it was well known that the presentation space of a computer monitor had different characteristics than the presentation space of a computer printer, and this remains true. In other words, each physical device has its own resolution, color space, and rendering capabilities.
The creator of a document requires the ability to accurately lay out the document on a computer monitor. When attempting to lay out a document that will eventually be printed, if one chooses a particular esthetic design parameter for a given document based on its appeal on a computer monitor, the parameter may not be reproducible in the same manner on both a computer monitor and a computer printer because the monitor and the printer have different presentation capabilities. This contention between display capabilities and printer capabilities has been addressed by desktop publishing software through a variety of solutions, although every solution entails some type of tradeoff between advantages and disadvantages.
To evaluate these tradeoffs, the concept of WYSIWYG was developed as a measure of the ability of a document to be laid out in a “What You See Is What You Get” manner, i.e., what is seen on the screen is what will also be seen when printed. Word processing programs can be considered to be a type of desktop publishing application, and most word processing programs currently provide excellent document processing capabilities such that average users of personal computers do not need to worry about WYSIWYG issues.
Recently, Web browsers have become a primary tool for accessing published information. Enterprises and individuals put significant effort into designing Web pages that include graphics, animation, and esthetic layouts, and these Web pages are generally designed to be viewed on a computer screen. Many word processing programs are currently being updated to provide functionality to publish documents in the latest structured document formats, including markup language formats such as HTML (Hypertext Markup Language) and XML (eXtensible Markup Language), so that average users can easily publish documents on the World Wide Web.
Before the development of the World Wide Web, almost all complex publications were designed to be viewed in printed form. Typically, only persons who designed newsletters and other documents with complex layout requirements were concerned with WYSIWYG issues. With the widespread use of the World Wide Web and the growing number of tools to publish documents in an electronic, softcopy form, however, many documents are now being published that are intended to be viewed both in softcopy format on a computer display and in hardcopy format on a computer printer. Hence, many more people have become concerned with WYSIWYG issues, and the contention between display capabilities and printer capabilities has become a much more prominent issue.
In the context of Web browser applications and Web-based documents, the contention between the differences in displaying and printing a Web-based document is currently being solved through a variety of approaches. As one example, an author may create two versions of a Web-based document and then link the two versions together. A first Web page, coded in a markup language, is published within a Web site for general viewing on a computer display by a browser, but the first Web page contains a hyperlink to a second document that has been optimized for printing. The second document, most likely, is also coded in a markup language and is a text-only version of the first Web page, but the second document could possibly be a document in a native format that provides more control of the appearance of a hardcopy version of the document, such as an Adobe® PDF (Portable Document Format) file. This type of solution obviously complicates the publication process because an author potentially must generate two different documents for each Web page.
As another example of controlling different presentations of a document on a display versus a printer, style sheets have been promulgated for separating the style of presentation of a document from the content of a document. Different, media-specific style sheets can be applied against a document such that the document would contain the same content but appear differently when rendered on two different media. In a related manner, media-specific markup language tags could be used to code different elements within a markup language document such that the elements of the document are rendered differently upon different media. These solutions are rather flexible and could be based on open standards, and the author of a document retains control of the manner in which a document is displayed versus the manner in which it is printed because the control mechanisms are embedded within one or more documents. While it may be important in certain circumstances to ensure that an author retains control of the manner in which a document is presented, a person who is viewing the document has no control over desired changes in the manner in which the document is printed.
As can be seen from the above description, significant efforts have been directed to display-versus-print issues. However, most of these approaches are concerned with providing true WYSIWYG capabilities, i.e., printing a hardcopy of a document that is an accurate approximation of the softcopy of the same document as presented on a computer display. Some of these solutions recognize that some Web-based documents include objects that cannot be replicated in hardcopy versions of the documents, but these solutions still attempt to provide an accurate replication of the softcopy version of a document.
In some circumstances, though, a user does not want an accurate replication and would rather have a close approximation, thereby taking advantage of the fact that a printer can generate a hardcopy of a document that is slightly different from a softcopy of the same document. In very limited circumstances, an application may allow a user to select printing options such that the hardcopy of a document will deviate from the softcopy of the document in specific ways. For example, as shown in FIG. 1A, Netscape Navigator version 4.76 provides dialog box 10 with check boxes 12 and 14 for choosing an option to print all text in black or to print all lines in black, respectively. These options presumably increase the readability of colored text and the legibility of colored lines against color backgrounds, which may be apparent on a computer display but not on a hardcopy printout. As another example, as shown in FIG. 1B, Lotus WordPro version 9.5 provides dialog box 20 with check box 22 for choosing an option to print a document without pictures. This option presumably reduces the amount of time required to print a document by skipping the graphics contained within the document.
In addition, printer drivers have long allowed a user to select options that degrade or enhance the quality of a hardcopy. Most printer drivers allow a user to select discrete values within an output quality range, such as “Fast:Normal:Best” or “Draft:Better:Best”, and in response, the printer driver renders an image for delivery to the printing device in accordance with the selected option. In most instances, the printer driver varies the output quality by changing the resolution at which it renders graphics and characters. More importantly, the user generally chooses lower quality output for a temporal advantage because lower quality output generally prints much faster than the best quality output. For example, FIG. 1C shows dialog box 30 for a printer driver that contains radio button 32 for choosing “Faster Printing” as a “Print Quality” option.
An important point to notice about the issue of hardcopy output versus softcopy output is that the proliferation of electronic publishing has not decreased the use of paper and printers. Rather than moving toward a paperless office environment, it appears that the Web has helped to perpetuate the widespread reliance on paper. Most enterprises will continue to provide resources for physical output of documents at least until the power of desktop computers has been significantly enhanced to include more convenient and user-friendly input methods and document processing software.
The need for paper versions of softcopy documents has not abated. Meanwhile, solutions to display-versus-print issues have generally revolved around specific issues concerning WYSIWYG problems or concerning the provision of high quality output with both computer monitors and computer printers.
However, as noted above with respect to printer drivers, computer users do not always require high quality hardcopies, particularly when using a Web browser. In fact, many users make hardcopies of Web pages merely for temporary purposes or for approximate record-keeping purposes. For example, given the expansive nature of the World Wide Web, an active Web user might visit many Web sites per day, and it can be difficult to keep track of various Web sites. Although a user may bookmark a Web site, a user might generate a hardcopy version of a Web page as a short-term, physical reminder to revisit the Web page or for some other purpose. In other cases, a user might print out an entire Web page, which could result in the printing of several sheets of paper with many graphic objects, merely to capture a few paragraphs of text for record-keeping purposes or some other purpose.
In addition, some users are conscious of the fact that printing a document may cost on the order of a few cents to several cents per page based on the cost of paper, printer toner, printer ink, printer maintenance, and the original cost of the printer. The cost to print a color document is generally regarded as several cents per average page, and the cost to print a page, whether in color or in solitary black, rises significantly if the page has more content that covers more space on the sheet of paper, thereby requiring significantly more ink or toner. Given that Web-based documents are frequently filled with colorful text and colorful graphics, the cost to print a single page from a Web-based document, even if printed in solitary black, can be much higher than printing a page from a simple text document that has been produced by a word processing program. When these costs are considered over thousands of pages during a calendar year, significant savings could be realized by minimizing printing costs, yet current applications have not addressed the desire of some users to generate hardcopy versions of Web-based documents in a low cost manner.
Referring again to FIG. 1C, dialog box 30 contains checkbox 34 that represents a user-configurable print option, “EconoMode (Save Toner)”, to reduce the consumption of printer toner. While FIG. 1C provides one example of a printer driver that addresses the need to save printer toner, the method in that example merely reduces printer toner across an entire hardcopy without regard to the content of the document that is being printed. Although the reduction of toner across the hardcopy will be uniform, certain sections or objects on the hardcopy may be responsible for most of the consumption of toner or ink. The user who requests the hardcopy may not be interested in printing out some of these sections or objects, yet the user has no control over the inclusion of these sections or objects without editing the softcopy version of the document. While a user might be able to reduce printing costs by editing a document to simplify or reduce its content prior to printing the document, in the case of Web-based documents, browser applications are designed to provide viewing functionality with very limited editing functionality. Moreover, it would be counterproductive to require the user to exert any effort to edit a document if the user's purpose in doing so is to save time and money in printing costs, particularly over a large number of documents.
Therefore, it would be advantageous to provide a methodology that, if requested by a user, automatically modifies a document prior to printing the document, e.g., by reducing content within the document, for the specific purpose of reducing the consumption of physical resources associated with the printing process, thereby reducing printing costs.
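As a purely illustrative aid, the general idea of automatically reducing the content of a structured document prior to printing can be sketched as follows. This is a minimal sketch under stated assumptions and is not the method of the present invention: the names `SKIPPED_TAGS`, `ContentReducer`, and `reduce_for_print` are hypothetical, the choice of omitting only image elements is an assumption, and the sketch relies on the standard Python `html.parser` module.

```python
from html.parser import HTMLParser

# Hypothetical set of elements assumed to be costly to print (illustration only).
SKIPPED_TAGS = {"img"}


class ContentReducer(HTMLParser):
    """Re-emits an HTML document while omitting the tags in SKIPPED_TAGS.

    Comments and declarations are not re-emitted in this simplified sketch.
    """

    def __init__(self):
        # Keep character references unconverted so they can be re-emitted as-is.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in SKIPPED_TAGS:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>; emit the raw tag text once.
        if tag not in SKIPPED_TAGS:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in SKIPPED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def reduce_for_print(html_text):
    """Return a copy of html_text with the skipped elements removed."""
    reducer = ContentReducer()
    reducer.feed(html_text)
    reducer.close()
    return "".join(reducer.parts)


if __name__ == "__main__":
    page = '<p>Hello <img src="a.png"> world &amp; more</p>'
    print(reduce_for_print(page))
```

In this sketch the reduced document, rather than the original, would be handed to the printing path only when the user has requested it, so that the displayed softcopy remains unchanged.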
In the prior art, applications that request printouts of documents have had both limited abilities and limited purposes for changing the printed version of an electronic document. Typically, the printer driver is given the task of producing a printed version of a document that differs in some ways from the displayed version of the document. In limited circumstances, printer drivers have had the ability to produce a hardcopy in accordance with a print option that requests the conservation of printer toner. Hence, it would seem logical to provide a printer driver with functionality to reduce the content of a document prior to generating a hardcopy of the document.
However, a logical division exists between the duties of a printer driver and an application for which the printer driver prints a document. Printer drivers are concerned with accepting a document, preparing a print job for the document within the presentation space of the printing device, possibly rendering the document within that presentation space, and then transmitting the appropriate information to the printing device. In contrast, applications are concerned with presenting and possibly modifying an electronic version of a document and then requesting the printing of a hardcopy version of the document. It is widely assumed by users and application developers that any changes to the content of a document shall be performed only within the processes of an appropriate application and not within the processes of a printer driver.
Hence, it would not be appropriate to create and deploy a printer driver with built-in content-reduction functionality. Moreover, a user would probably not desire to automatically reduce the content of printed documents across multiple applications, as would occur with use of a printer driver, because some applications are specifically designed for generating documents in a WYSIWYG manner. As noted above, however, there is a specific need by some users to generate hardcopy versions of Web-based documents in a low cost manner.
Therefore, it would be advantageous to provide a method or system for allowing a user to set printer options such that hardcopies of documents being viewed within a Web-browsing environment are generated in a low cost manner. It would be particularly advantageous to provide pre-print processing of a structured document prior to generating a hardcopy of the document that has been requested by a user of a browser application.