1. Field of the Invention
This invention relates to the field of information display technology. Specifically, this invention is a new and useful method, apparatus, system and computer program product configured to display structured information to a user on a computer controlled display device.
2. Background
Information is often presented in tabular or graphical form to show relationships present in the information. Information provided in tabular form is presented within the structure of rows and columns. An intersection of a row and column defines a cell that can contain information. This row-column format provides a two dimensional characterization of the information stored in each cell. Thus, the rows may be organized to "categorize" the information within the cells, and the columns organized to show the "type" of information contained within the cells. The terms "categorize" and "type" are simply convenient labels to identify different ways of classifying the information. Often the cells in the first row and first column of the table contain supplemental information that describes the category and type of the information stored in the remaining cells (the body cells). A typical table is illustrated in FIG. 1a. A table 101 has a header 103, a leftside 105, and a plurality of body cells 107. The information in the header 103 indicates the categories of the data within the body cells 107. Here the categories are "1990", "1991", and "1992". The information in the leftside 105 cells indicates the type of data within the body cells 107. Here the type of data is related to "California", "New York", and "Indiana". Hence, the value of the "Indiana" data for 1991 is 5 because the value contained in a cell 109 defined by the intersection of the row labeled as "Indiana" and the column labeled as "1991" is 5. The category and type information is supplemental to the primary information stored in the body cells of the table. Thus the information in the table is structured in that the information includes supplemental and primary information.
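The split between supplemental and primary information in table 101 can be sketched as a small data structure. This is an illustrative sketch only: the names `header`, `leftside`, `body`, and `lookup` are hypothetical, and every body value other than the Indiana/1991 value of 5 is a placeholder, since the text gives no other values.

```python
# Hypothetical model of table 101 in FIG. 1a.
# Only the Indiana/1991 value (5) comes from the text; all other body
# values are placeholders chosen for illustration.
header = ["1990", "1991", "1992"]                  # supplemental: categories (header 103)
leftside = ["California", "New York", "Indiana"]   # supplemental: types (leftside 105)
body = [
    [0, 0, 0],   # California (placeholder values)
    [0, 0, 0],   # New York (placeholder values)
    [0, 5, 0],   # Indiana: only the 1991 value is stated in the text
]

def lookup(row_label, column_label):
    """Return the body cell at the intersection of a labeled row and column."""
    return body[leftside.index(row_label)][header.index(column_label)]

print(lookup("Indiana", "1991"))
```

The lookup works only because the supplemental labels are available; the body values alone do not say which category or type they belong to.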
Graphs are also used to show informational relationships. In this case the information is presented to a user in a graphical form. FIG. 1b illustrates an example graph representing the information contained in FIG. 1a. A graph 121 includes an X-axis 123 and a Y-axis 125. For this graph 121, the Y-axis 125 indicates the magnitude of the values of the datapoints (here corresponding to the data stored in the body cells 107 of the table 101 in FIG. 1a). The X-axis 123 is labeled with the categories of the data. A legend 127 associates each of a plurality of bars 129 with a type of information. Thus the legend 127 shows that the data values for "Indiana" are associated with a plurality of bars 131, one for each category of data. As in the table 101, the information in a graph is structured and contains both supplemental and primary information.
A problem when displaying large tables and graphs is that the information to be displayed often cannot be completely presented (in a useful form) to a user in the drawing area available on a computer display. This problem is often addressed by scrolling and/or paging the information displayed. Paging replaces the currently displayed information with an adjacent "page" of information. Scrolling repetitively and incrementally moves the existing displayed information in a scroll direction and adds new information to the display such that the information appears to move smoothly on the display. Thus, when a table or graph is scrolled or paged to view portions of the information that would not fit in the drawing area of the display, supplemental information (such as a table header or a graphic axis label) is often moved out of the drawing area and out of the user's view. This requires the user to remember what supplemental information is associated with the information that is displayed. This becomes very difficult after a series of scroll or page operations.
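The loss of supplemental information during scrolling can be sketched with a simple model in which the drawing area shows a fixed number of rows. This is a simplified sketch, not a description of any particular application: `visible_rows`, the row labels, and the window height of five rows are all assumptions made for illustration.

```python
def visible_rows(rows, first_visible, window_height):
    """Return the rows that fit in the drawing area at a given scroll position."""
    return rows[first_visible:first_visible + window_height]

# Row 0 holds the supplemental information (the header); the rest are body rows.
rows = ["HEADER"] + [f"row {i}" for i in range(1, 21)]

top = visible_rows(rows, first_visible=0, window_height=5)
scrolled = visible_rows(rows, first_visible=3, window_height=5)

print(top)       # the header is still in view
print(scrolled)  # only body rows remain in view
print("HEADER" in top, "HEADER" in scrolled)
```

Once the first visible row moves past the header, nothing in the drawing area tells the user what the remaining rows represent.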
When tables are presented on paper, many applications print the supplemental information on each page when the table body extends across pages. This allows the reader to reference the categories and types of the data printed in the table's body cells on each page. Thus, the prior art knows of repeating table headers on each printed page of a multi-page table. This functionality is provided in many "what-you-see-is-what-you-get" (WYSIWYG) applications such as Adobe's FrameMaker® and Microsoft's Word® (see Using FrameMaker, part number 41-04699-00, page 6-11, from Frame Technology Corporation and Microsoft Word User's Guide, document number WB60460-0794, page 305). In Microsoft Word the supplemental information is first selected by the user and then flagged as a header (a header being classified as supplemental information). In FrameMaker, the table facility provides explicit header, footer, and title areas depending on which table format the user selects. These header, footer and title areas contain information supplemental to the primary information contained in the body cells of the table. These applications only display the supplemental information on a computer display when in a "page view" mode (thus duplicating the formatting of a page of paper on a computer display). The difficulty with this approach is that the supplemental information is presented for each page; thus, when multiple pages are presented on the display, the supplemental information is duplicated, occupying space otherwise available for the display of the primary information. Additionally, the supplemental information changes position on the display as the user scrolls across a page boundary. (For example, as header information at the top of one page is scrolled out of the drawing area, a duplicate of the header information may appear at the bottom of the drawing area, at the top of the new page that is scrolled into the display.)
Another approach, taken by Microsoft Excel®, provides a facility for segmenting a displayed spreadsheet and "freezing" the segments. Thus, the user can manually segment the spreadsheet and freeze the segmented panes such that supplemental information within a frozen pane is always displayed. This approach has only limited applicability when the table is not the entire document. Many documents also contain text and illustrations before, after and around a table. Further, some documents include tables within tables. For these types of general documents there is little utility in maintaining visibility of a single segment.
The above methods also involve an explicit user or page layout command to identify the supplemental information that is to be presented to the user. However for sorting purposes, Microsoft Excel automatically determines the header information (a limited type of supplemental information) of a table. Excel identifies column labels by comparing the characteristics of the information in the first row of data with the information in subsequent rows. This process is briefly described in the Microsoft Office User's Guide, version 5.0, copyright 1993-1994, document no. XL57926-0694, page 388.
Generally, these WYSIWYG applications do not provide a mechanism for displaying supplemental information (such as axis labels, titles, or legends) from a graph. However, graphs share some display characteristics with tables. For example graphs can be larger than the drawing area available to display them. Thus, supplemental information that exists on a graph is often scrolled off the drawing area of a computer driven display.
Although paper-based WYSIWYG applications provide a way of displaying supplemental information on tangible media, these applications do not provide a satisfactory solution to the problem of maintaining supplemental information on a limited drawing area of a computer controlled display device. The fundamental concept of a WYSIWYG application is that the author of a document sees a true representation on the computer display of how the document will look when actually printed. Thus the data describing a WYSIWYG document (the data in accordance with a specific document markup language) is designed to create the same image regardless of the device used to present the document (printer and display devices). However, some applications are designed to present information to a user primarily through a computer controlled display device instead of on paper. Unlike paper, which has a limited number of standard sizes, computer displays are available in a wide range of sizes and resolutions. Thus the data used to define a presentation of information on a wide variety of computer displays generally is less constrained than the data used to define a WYSIWYG document. That is, the information to be displayed is described in a manner appropriate to the device intended to be used to present the information. In the case of computer display devices, the document layout process is often delegated to an application program executing in a computer controlling the user's display. The information provided by an internet server to an internet client often has this characteristic. In particular, the world wide web (WWW) generally uses documents described according to the hypertext markup language (HTML) specification briefly discussed below.
FIG. 2a illustrates a prior art display of a document, containing a large table, in a limited drawing area. Each of a plurality of screen images 201, 203, and 205 has a user selectable control area used as a scroll control 207. This scroll control 207 is used to position the display of data within the document. In the first screen image 201 a top edge 209 of a table 211 is displayed below a line of text 213 and a horizontal rule 215. The cells that comprise a first row 217 of the table 211 contain supplemental information that categorizes the primary information contained in a plurality of body cells 219 of the table 211. The scroll control 207 of the screen image 201 contains a thumb 221. This thumb 221 allows a user to position the document in the display. The position of the thumb 221 in the scroll control 207 also indicates that the beginning of the document is displayed. In contrast, the position of a thumb 223 in the scroll control 207 on the screen image 203 indicates that the document has been repositioned (scrolled, paged or otherwise). Because the document has been repositioned, the first row of the table (the supplemental information) is no longer presented on the display. Thus, only the primary information contained in the body cells 225 is displayed. Finally, the screen image 205 shows that the position of a thumb 227 in the scroll control 207 indicates the document has been repositioned to its end. The thumbs 221, 223, and 227 are all the same selectable control area but positioned differently to indicate the position of the document in the drawing area.
FIG. 2b also illustrates the prior art method to scroll the context of a document. A screen image 231 contains a first drawing area 233, a second drawing area 235 and a table drawing area 237 containing a table 239 that is partially displayed. To help indicate the position of the table 239 in the drawing area 237, the table 239 contains supplemental information and the letters "A" through "Z". The screen image 231 is only large enough to display the top of the table 239, thus only letters "A" through "G" are displayed. The table 239 is partially displayed because the table drawing area 237 is too small to contain the entire table 239. Thus, the portion of the table 239 that exists below the table drawing area 237 at a bottom boundary 241 is not displayed. The table 239 also includes a supplemental information 243 located at the top of the table. The screen image 231 includes a selectable control area 245 containing a thumb 247 that both allows a user to scroll the display by manipulating the thumb 247 on the selectable control area 245 and indicates the scrolling position of the document by the position of the thumb 247 on the selectable control area 245.
FIGS. 2c and 2d illustrate the appearance of the screen 231 display when presenting different portions of the document. FIG. 2c illustrates the appearance of the screen 231 after the document has been scrolled upwards but not so far as to display the end of the document. Now the drawing areas 233, 235, and 237 of FIG. 2b have been shifted upwards. The table 239 displayed in the drawing area 237 extends above a top boundary 249 and, as in FIG. 2b, extends below the bottom boundary 241. The supplemental information 243 displayed in FIG. 2b is not displayed in FIG. 2c because it has been scrolled above the top boundary 249. The primary data of the table 239 now displays "B" through "J" in the table drawing area 237. The new scroll position is indicated by the position of a thumb 251 on the selectable control area 245. Thumbs 247 and 251 are the same selectable control area, but in different positions in FIGS. 2b and 2c indicating different scroll positions. The scroll operation scrolls both the context containing the table and the table 239.
FIG. 2d illustrates the appearance of the screen 231 after the document is positioned at its end. Now the drawing areas 233 and 235 of FIG. 2b have been shifted completely off the display device. The table drawing area 237 now contains the last portion of the table information (here the "W" through "Z"). The table 239 extends above the top boundary 249. The end of the table (the bottom edge of the table) is at a bottom edge 253. Again the position of a thumb 255 on the selectable control area 245 indicates the current scroll position.
FIGS. 2a-d indicate some of the problems with scrolling a document having tabular information within a context. These problems are, among others, that supplemental information associated with the table can be scrolled off the display; and that information displayed in the context containing the tabular information can also be scrolled off the display. These and similar problems also apply to documents containing a graph.
World Wide Web
The WWW is a massive hypertext system that a user accesses using a WWW browser application executing on a computer--an information access apparatus. The WWW browser apparatus communicates with, and is a client of, information provider apparatus such as server computers each executing server applications capable of communicating with the client browser application. These clients obtain information and services, in the form of web pages, from the server. These web pages are identified by unique universal resource locators (URL) and are usually specified using a markup language--generally a version of the hypertext markup language (HTML) briefly described below.
The background of the WWW, WWW browser applications, and URLs is well described by reference to the first chapter of Instant HTML Web Pages, by Wayne Ause, Ziff-Davis Press, ISBN 1-56276-363-6, copyright 1995, pages 1-15, hereby incorporated by reference as illustrative of the prior art.
FIG. 3 illustrates how a plurality of computers implement a client-server information access system. One skilled in the art will understand that the invention does not depend on the existence of a client-server information access mechanism because information to be displayed to a user often resides on the same computer that accesses the information. An information client system 301 communicates over a network such as the internet 303 to a plurality of information server systems 305. The client system 301 encapsulates requests for services and information within an applicable internet protocol and passes the encapsulated requests to the internet 303 as indicated by an arrow 307. The internet 303 routes these requests to each of the plurality of information server systems 305 addressed within the request as indicated by a plurality of arrows 309. Each of the plurality of addressed information server systems 305 responds to the client system 301 with responses appropriate to the service or information requested by the client system 301. Once the client system 301 receives this information, the information is presented to the user by an application program (for example, a WWW browser) executing on the client computer.
HTML is used to describe hypertext documents that can be presented to a user by an application. The application processes HTML data to generate an image that can be displayed to a user on a computer display or tangible page. Unlike page description languages, such as PostScript, the "page" layout of HTML documents is dependent on the drawing area used to display the HTML. Thus, HTML is used to describe hypertext documents that are portable from one computing platform to another and that do not need WYSIWYG functionality. The HTML concept is that of a scrolling page that can be resized as desired by the user. Thus, HTML based applications do not strive to achieve WYSIWYG functionality, but rather they strive to appropriately present information in drawing areas of different sizes and resolutions. Thus, an application that displays HTML data will use whatever drawing area is available to render the HTML to best fit that given drawing area. To perform this function the application will automatically wrap lines, adjust the width and height of table cells and perform other drawing area dependent operations to best display the HTML document in the given drawing area.
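The dependence of layout on the drawing area can be illustrated with line wrapping, one of the drawing area dependent operations mentioned above. This is a minimal sketch that uses a width measured in characters as a stand-in for the width of a drawing area; the `layout` function and the chosen widths are assumptions for illustration, and an actual browser lays out text using font metrics and pixel dimensions.

```python
import textwrap

def layout(text, width_in_chars):
    """Wrap text to fit a drawing area whose width is given in characters."""
    return textwrap.wrap(text, width=width_in_chars)

text = ("The HTML concept is that of a scrolling page "
        "that can be resized as desired by the user.")

narrow = layout(text, 30)   # a small drawing area
wide = layout(text, 60)     # a larger drawing area

for line in narrow:
    print(line)
print("---")
for line in wide:
    print(line)
```

The same source text produces a different number of lines for each width, which is why content that fits in one drawing area may need to be scrolled in another.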
HTML 2.0 is described in RFC1866 and can be found on the WWW at: "http://www.cis.ohio-state.edu/htbin/rfcl/rfc1866.html". HTML 2.0 does not provide for tables. However, HTML variants have provided table support. In particular, table support is described in HTML 3.2, found at "http://www.w3.org/pub/WWW/MarkUP/Wilbur/features.html", in RFC1942, found at "http://www.cis.ohio-state.edu/htbin/rfc/rfc1942.html", and in the Netscape 1.1 table specification, found at "http://home.netscape.com/assist/net_sites/tables.html".
HTML versions 1 and 2 did not provide support for tables or graphs (although graphs could be presented as an image). However, HTML table capability was included in version 1.1 of the Netscape browser. The HTML 3.0 proposal, which included extensive table support, was abandoned by Jul. 9, 1996, and discussion initiated regarding HTML 3.2. HTML 3.2 retreated from the HTML 3.0 generalized table specification and proposed a much simpler version. However, RFC1942 requested discussion and suggestions regarding a more extensive HTML table specification. Thus, the capability of table features for current and future versions of HTML is unclear. However, current HTML usage places table header information within <TH></TH> tags.
Regardless of what standard is eventually adopted, those creating HTML will often not specify supplemental information for tables or graphs for reasons that extend from limited awareness of HTML to limited time to create an HTML document. Thus, WWW browsers that can automatically detect, extract and display supplemental information from tables and graphs are preferred over those limited to detecting and displaying only explicit supplemental information such as that delimited by <TH></TH> tags.
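The limitation of relying on explicit header markup can be sketched as follows. This is an illustrative sketch using the `html.parser` module of the Python standard library; the class `HeaderCellCollector`, the function `explicit_headers`, and both sample tables are hypothetical, and the body values are placeholders.

```python
from html.parser import HTMLParser

class HeaderCellCollector(HTMLParser):
    """Collect the text of cells explicitly marked with <TH></TH> tags."""

    def __init__(self):
        super().__init__()
        self.in_th = False
        self.headers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag == "th":
            self.in_th = True
            self.headers.append("")

    def handle_endtag(self, tag):
        if tag == "th":
            self.in_th = False

    def handle_data(self, data):
        if self.in_th:
            self.headers[-1] += data

def explicit_headers(html):
    """Return the text of every explicitly marked header cell in the markup."""
    collector = HeaderCellCollector()
    collector.feed(html)
    collector.close()
    return [text.strip() for text in collector.headers]

# Hypothetical markup: the author marked the header row with <TH> tags.
with_th = ("<TABLE><TR><TH>1990</TH><TH>1991</TH><TH>1992</TH></TR>"
           "<TR><TD>1</TD><TD>5</TD><TD>2</TD></TR></TABLE>")

# Hypothetical markup: the same table, but every cell is a plain <TD>.
without_th = ("<TABLE><TR><TD>1990</TD><TD>1991</TD><TD>1992</TD></TR>"
              "<TR><TD>1</TD><TD>5</TD><TD>2</TD></TR></TABLE>")

print(explicit_headers(with_th))
print(explicit_headers(without_th))
```

The second table yields no headers even though its first row plainly serves as one, which is the case in which detection without explicit markup becomes necessary.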
Supplemental information is generally provided at the edges of a table, that is, in the boundary cells: the cells in the top and bottom rows and in the left and right columns. This supplemental information is generally different (thus distinguishable) from the primary information contained in the body of the table. Where the primary information may be numerical, the supplemental data may be text; where the primary information is text, the supplemental information is generally formatted differently so as to stand out from the primary information. In a graph, the supplemental information is generally positioned along an axis and/or floating outside the bounds of the graph (such as a legend or title) and is also distinguishable from the primary information. These characteristics of tabular and graphical supplemental information lend themselves to heuristic detection, classification and extraction of the supplemental data from a table or graph when the supplemental data is not explicitly defined.
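One such heuristic, the contrast between textual boundary cells and numerical body cells, can be sketched as follows. This is a deliberately simple sketch and not the method of any particular application: the functions `is_number` and `detect_supplemental` and both sample tables are hypothetical, only the top row and left column are examined, and the body values other than the Indiana/1991 value of 5 are placeholders.

```python
def is_number(cell):
    """Return True if the cell text parses as a number."""
    try:
        float(cell)
        return True
    except ValueError:
        return False

def detect_supplemental(table):
    """Guess whether the top row and left column hold supplemental information.

    Returns (top_row_is_supplemental, left_column_is_supplemental).
    Assumes the table is a list of rows of strings with at least two rows
    and two columns.
    """
    top_row = table[0][1:]
    left_column = [row[0] for row in table[1:]]
    body = [cell for row in table[1:] for cell in row[1:]]
    body_is_numeric = bool(body) and all(is_number(cell) for cell in body)
    top = body_is_numeric and not any(is_number(cell) for cell in top_row)
    left = body_is_numeric and not any(is_number(cell) for cell in left_column)
    return top, left

# Hypothetical table with text labels on both boundaries.
labeled = [
    ["", "Sales", "Returns"],
    ["California", "1", "2"],
    ["Indiana", "5", "3"],
]

# Table shaped like FIG. 1a: the column labels are years, which parse as numbers.
years = [
    ["", "1990", "1991", "1992"],
    ["California", "0", "0", "0"],
    ["New York", "0", "0", "0"],
    ["Indiana", "0", "5", "0"],
]

print(detect_supplemental(labeled))
print(detect_supplemental(years))
```

The second call does not flag the row of years, because those labels are themselves numerical. A type-based test alone is therefore insufficient for such tables, and other distinguishing characteristics, such as formatting differences, would also need to be considered.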
Although the invention applies to information display apparatus in general, WWW browser applications executing on a computer are representative of the technology. As such, the following describes the invention within the context of a preferred embodiment of a WWW browser application. However, one skilled in the art will understand that the invention generally can be applied to applications and apparatus that present tabular or graphical data (or other data having a supplemental part and a primary part) to a user on a computer controlled display device. Further, one skilled in the art will understand that the problem exists for any large amount of tabular or graphical data that is presented to a user on a limited size display. Thus, for example but without limitation, spreadsheets, word processing, and presentation applications have a need satisfied by the invention.