The present invention is directed to a system and method for preparing and distributing an electronic version of a printed document that preserves the appearance of the printed pages and is further directed to a persistent electronic storage medium on which such an electronic version is distributed. In particular, the present invention is directed to an electronic Yellow Pages viewer and to an electronic billing system including a tear sheet, in which the appearance of a printed page of a Yellow Pages directory is preserved.
Telephone companies have long distributed Yellow Pages directories in printed and bound form, typically annually. Such directories are typically distributed free of charge, with revenue coming from the sale of advertisements. Some directory advertising is also sold for White Pages directories.
Each advertisement in a Yellow Pages directory can include only text or both text and graphics; the advertisements can vary in size from a few lines to a full page. The process of laying out the directory includes assigning each advertisement to a page and to a position on that page according to techniques such as those disclosed in U.S. Pat. No. 5,390,354 to de Heus et al. Such techniques generally have a goal of minimizing wasted space, which is normally not possible if the advertisements are simply arranged in a linear order and “poured” into each column.
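The wasted space that results from simply "pouring" advertisements into columns can be illustrated with a minimal sketch. It is not the technique of U.S. Pat. No. 5,390,354; the function names `pour` and `wasted_space`, the abstract height units, and the sample advertisement heights are all assumptions made for illustration only.

```python
def pour(ad_heights, column_height):
    """Place ads into columns strictly in the given order (naive pouring).

    An ad that does not fit in the space left in the current column
    starts a new column, and the leftover space is simply left empty.
    """
    columns, current, used = [], [], 0
    for height in ad_heights:
        if height > column_height:
            raise ValueError("advertisement is taller than a column")
        if used + height > column_height:
            columns.append(current)      # close the current column
            current, used = [], 0
        current.append(height)
        used += height
    if current:
        columns.append(current)
    return columns


def wasted_space(columns, column_height):
    """Total unused height over all columns, including the last one."""
    return sum(column_height - sum(col) for col in columns)


# Hypothetical ad heights, poured in their original linear order.
linear = pour([6, 5, 4, 5], column_height=10)
# The same ads after a layout step has reordered them.
reordered = pour([6, 4, 5, 5], column_height=10)

print(linear, wasted_space(linear, 10))
print(reordered, wasted_space(reordered, 10))
```

In this hypothetical case the linear order needs three columns, while the reordered sequence fills two columns completely, which is the kind of saving a layout technique aims for.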
Once the directory is laid out, printing data are generated to allow a printer to print the page. Typically, the printing data are in the Adobe PostScript page description language, and the graphics on each page are in encapsulated PostScript (EPS) format. PostScript is not optimized for file size; in fact, the printing data for a single page typically consume several megabytes, with the size varying with such factors as the complexity of the layout, the number of graphical elements, and the complexity of each graphical element.
It is expensive to print and distribute Yellow Pages directories to every telephone customer. In large organizations, the directories are easily mislaid. They also have to be recycled or otherwise disposed of.
To overcome those problems, various companies have provided electronic Yellow Pages directories, typically accessible over the Internet. One example is BigYellowSM, published by Bell Atlantic Electronic Commerce Services, Inc. A user accesses the directory through its home page, which includes a search form with text boxes to allow the user to search by any or all of the category, the business name, the city and the state. When the user enters a search, a CGI script searches a database, generates an HTML page of hits, and returns that HTML page to the user.
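The search flow described above (a form with optional fields, a database lookup, and an HTML page of hits) can be sketched roughly as follows. This is not the actual implementation of any existing directory service; the in-memory `SAMPLE_LISTINGS` list stands in for the database, and the names `search` and `render_hits`, the matching rule (case-insensitive substring), and the sample businesses are assumptions made for illustration.

```python
from html import escape

# Fields the hypothetical search form exposes.
FIELDS = ("category", "name", "city", "state")

# Hypothetical stand-in for the directory database.
SAMPLE_LISTINGS = [
    {"category": "Plumbers", "name": "Acme Plumbing",
     "city": "Arlington", "state": "VA"},
    {"category": "Plumbers", "name": "Best Pipes & Drains",
     "city": "Baltimore", "state": "MD"},
    {"category": "Florists", "name": "Acme Flowers",
     "city": "Arlington", "state": "VA"},
]


def search(listings, **criteria):
    """Return listings matching every non-blank criterion.

    Blank or missing fields are ignored, so the user may search by
    any or all of the category, name, city and state.
    """
    wanted = {
        field: value.strip().lower()
        for field, value in criteria.items()
        if field in FIELDS and value and value.strip()
    }
    return [
        row for row in listings
        if all(term in row[field].lower() for field, term in wanted.items())
    ]


def render_hits(hits):
    """Build a simple HTML page listing the hits, escaping all text."""
    items = "".join(
        f"<li>{escape(h['name'])}, {escape(h['city'])}, {escape(h['state'])}</li>"
        for h in hits
    )
    return f"<html><body><ul>{items}</ul></body></html>"


hits = search(SAMPLE_LISTINGS, category="plumbers", state="va")
print(render_hits(hits))
```

The resulting page is a plain list of matching entries, which illustrates why such an interface bears little resemblance to a printed directory page.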
Directories of that type can be accessed from any computer that can connect to the Internet and that can run a Web browser. However, such directories present an interface that is unfamiliar to many users, in that the interface bears no resemblance to a traditional bound Yellow Pages directory.
Simply providing users with the printer data would not be practical for several reasons. The size of the printer data makes distribution of the printer data burdensome on media such as CD-ROMs and out of the question over the Internet. Not all users are equipped to handle PostScript files. A desired page or range of pages would still have to be manually located and printed or otherwise imaged.
Similar issues present themselves in billing. An advertiser in a traditional bound Yellow Pages directory receives a bill that includes a tear sheet, which is the sheet from the directory on which that advertiser's entry appears. The tear sheet, to be of any use, must faithfully reproduce both the content and the layout of what will be printed. No satisfactory electronic replacement for the hard-copy tear sheet is known in the art. Without such an electronic replacement, the advantages of electronic billing, such as automated reconciliation of billing statements, are beyond reach. Also, while it would be useful to provide each advertiser with a tear sheet on which that advertiser's entry was highlighted, such highlighting on hard-copy tear sheets is impractical.
In a different field of endeavor, it is known to store bitmapped representations of the pages of printed documents in combination with an indexing scheme for accessing them. For example, U.S. Pat. No. 5,623,681 to Rivette et al. teaches a method and apparatus in which documents such as patents are stored in both text and image formats on a CD-ROM or the like. The text files are ASCII text representations of the documents, while the image files are bitmap files produced by scanning hard copies. The text and image files are analyzed to produce an “equivalent file” that formats the text with the same line numbers, line breaks, column numbers and column breaks as in the images. The equivalent file is then indexed. A user can display the equivalent file and the image file in side-by-side relationship with synchronization between the views so that the same portion of the document is displayed in both formats.
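The general idea of indexing text that carries the same column and line positions as the page image can be sketched as below. This is not the method of U.S. Pat. No. 5,623,681; the tuple layout `(column, line, text)`, the function name `build_index`, and the sample lines are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical "equivalent file": each entry keeps the column and line
# number that the text occupies in the scanned page image.
EQUIVALENT_LINES = [
    (1, 1, "A method and apparatus"),
    (1, 2, "for storing documents."),
    (2, 1, "The documents are indexed."),
]


def build_index(equivalent_lines):
    """Map each lowercased word to the (column, line) positions where it occurs."""
    index = defaultdict(list)
    for column, line_no, text in equivalent_lines:
        for raw in text.lower().split():
            word = raw.strip('.,;:"()')   # drop surrounding punctuation
            if word:
                index[word].append((column, line_no))
    return dict(index)


index = build_index(EQUIVALENT_LINES)
# Positions at which a viewer could scroll both the text view and the
# image view to show the same portion of the document.
print(index.get("documents"))
```

Because every hit carries a column and line position, a viewer could in principle use the same position to align the text view with the image view.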
The use of both text and bitmap representations of the pages allows easy access to a faithful representation of each page. However, the user must install special viewing software. Therefore, the publisher must provide such viewing software for as many operating systems as the relevant market requires. Also, the printing data used to generate each page are not readily available. Instead, a hard copy of each page must be scanned in to create the bitmap image, and the formatting information must be reconstructed from that bitmap image through OCR.