The present invention generally pertains to a method for storing non-HTML information related to a table of data in an HTML document so that the table of data can be displayed in a browser program, and more specifically, to a method for enabling an HTML document to contain all the information related to a table of data that is required by a spreadsheet application to manipulate that table of data.
With the widespread use of the Internet and of corporate or business intranets, it is becoming increasingly common to translate data from application-specific file formats into HTML file formats, to enable the data to be readily transmitted as a web page and viewed with browser programs. HTML documents or files have thus become the universally accepted format for sharing data "on-line."
An HTML document includes a hierarchical set of markup elements; most elements have a start tag, followed by content, followed by an end tag. The content is typically a combination of text and nested markup elements. Tags, which are enclosed in angle brackets ('<' and '>'), indicate how the document is structured and how to display the document, i.e., its format. There are tags for markup elements such as titles and headers, for text attributes such as bold and italic, for lists, for paragraph boundaries, for links to other documents or other parts of the same document, for graphic images, for non-displayed comments, and for many other features. Further details regarding HTML may be found in reference books such as "HTML For Dummies," by Ed Tittel and Steve James (1996).
The following lines of HTML briefly illustrate how the language is used:
Here we start a new paragraph <P>.
Some words are <B>bold</B>, others are <I>italic</I>.
The viewer of the document will see:
Here we start a new paragraph.
Some words are bold, others are italic.
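As a rough illustration of how the tags in such an example are separated from the text the viewer sees, the following Python sketch uses the standard library's html.parser module. The class name VisibleText and the function visible_text are assumptions made for this sketch; an actual browser also applies the bold and italic styling, which is not modeled here.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects character data and records which start tags were seen."""

    def __init__(self):
        super().__init__()
        self.text = []   # pieces of text between tags
        self.tags = []   # start tag names, lowercased by HTMLParser

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def visible_text(markup: str) -> str:
    """Return the markup with all tags removed."""
    parser = VisibleText()
    parser.feed(markup)
    parser.close()  # flush any text still buffered at the end
    return "".join(parser.text)


markup = "Some words are <B>bold</B>, others are <I>italic</I>."
print(visible_text(markup))
```

The tags carry the formatting instructions, while only the character data between them reaches the viewer as text.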
As noted above, a user who wishes to retrieve and display an HTML document generally uses a Web browser program. Two of the more popular Web browser programs are NAVIGATOR™ from Netscape Communications Corp. of Mountain View, Calif., and INTERNET EXPLORER™ from Microsoft Corporation of Redmond, Wash. The primary functionality of web browsers is directed to finding, retrieving, and displaying documents. A browser program is generally not intended for word processing or data manipulation of the information contained within an HTML document, but can display documents or data generated by word processing or spreadsheet applications, once converted into an appropriate HTML compatible format.
A wide variety of data may be shared among different users in a network environment using HTML. Typical HTML documents include images, text, and data. HTML documents can be created using programs specifically designed for that purpose, such as Microsoft Corporation's FRONTPAGE™ Web Page publishing program. Additionally, some applications, such as Microsoft Corporation's WORD™ word processing program, allow a user to save a text document as an HTML document. Microsoft Corporation's EXCEL 97™ spreadsheet program also enables a user to save a data table or chart created in a workbook as an HTML file.
A characteristic of many applications is the use of specific formatting of data in a manner that is unique to some functionality of the application. Generally, some or all of this type of information is lost when the data are translated into HTML. If the data being translated is a table generated by a spreadsheet application, HTML has been very useful in enabling computer network users to view the table with a browser program. However, in the past, users have not been able to manipulate the data presented in a table, since browser programs have not supported such functionality. In addition, the table in the HTML document could not be restored to the original spreadsheet application, since not all of the information originally associated with the table when it was created with the spreadsheet application would have been retained in the HTML document.
To ensure that a data table translated into HTML might thereafter be available for use and manipulation in the spreadsheet program, it has been necessary to save both a file in the original spreadsheet format, and the HTML document in which the table from the spreadsheet was inserted. Often, a table created in a spreadsheet contains information that changes regularly (monthly sales reports, year-to-date profit figures, etc.), and these changes need to be entered in the HTML document to be available for view with a browser program. To accomplish this task, it was previously necessary to manage and update both files, namely the spreadsheet file and the HTML file. To improve the efficiency and productivity with which this task is accomplished, it would be desirable to include the functionality of a table created in a spreadsheet with the table after it is exported to an HTML document from the spreadsheet application, so that only the HTML file need be maintained. In this way, the table could be reintroduced into the spreadsheet program from the HTML document with all its original formatting and functionality intact, and network users would always have access to the most current table.
When a data table is created in a spreadsheet application, some of the information associated with the table and its functionality has no equivalent in HTML. For example, a formula relating to the manipulation of the data in a cell of the spreadsheet will not be readily conveyed by HTML. It would be desirable to include such information in an HTML document, so that the information is ignored when the HTML document is viewed by a browser program, yet is available to be used by the spreadsheet application if the table is imported from the HTML document and opened in the spreadsheet program.
Some of the information associated with a data table created in a spreadsheet program is related to similar information in HTML, but the information is used and stored differently in HTML than in the spreadsheet program format. An example of this arises in connection with the formatting and layout of a data table. Often spreadsheets employ formatting functionality that does not have a direct correspondence with formatting of the data table in HTML. In a spreadsheet program, a user can apply number formatting that displays a value in red if the value of the data in a cell is negative. A user can also arbitrarily change the color of a font for a cell to red. When a data table is translated into an HTML document, all of the cells with data shown as red will be displayed in HTML with a red font. However, HTML does not have the ability to associate the number formatting with a cell, or to distinguish those cells with values displayed in red because they are negative from cells in which the font is arbitrarily chosen to be red. If a cell value changes from negative to positive, the value in the cell should no longer be displayed in a red font, but HTML cannot make that determination, since it does not provide a way to save the number formatting rule for negative value data. It would be desirable to preserve such spreadsheet-specific functional formatting information within an HTML document, so that the information is available if the table is exported from the HTML document back into the spreadsheet program.
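The loss of information described above can be sketched in a few lines of Python. The Cell class, its red_if_negative flag (a stand-in for a number-format rule), and the helper functions are all hypothetical names chosen for this illustration; they do not correspond to any particular spreadsheet program's data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cell:
    value: float
    font_color: Optional[str] = None   # color chosen directly by the user
    red_if_negative: bool = False      # stand-in for a number-format rule


def displayed_color(cell: Cell) -> str:
    """Color the spreadsheet program would show for this cell."""
    if cell.red_if_negative and cell.value < 0:
        return "red"
    return cell.font_color or "black"


def to_plain_html(cell: Cell) -> str:
    """Export only what plain HTML can express: the resulting color."""
    return f'<td><font color="{displayed_color(cell)}">{cell.value}</font></td>'


rule_cell = Cell(-5.0, red_if_negative=True)   # red because it is negative
manual_cell = Cell(-5.0, font_color="red")     # red because the user chose red

# Both cells export to identical markup, so the reason for "red" is lost.
print(to_plain_html(rule_cell) == to_plain_html(manual_cell))

# Inside the spreadsheet, the two cells diverge once the value turns positive.
rule_cell.value = 5.0
manual_cell.value = 5.0
print(displayed_color(rule_cell), displayed_color(manual_cell))
```

Because the exported markup is the same for both cells, nothing in plain HTML allows an importer to tell which cell should stop being red when its value becomes positive.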
Currently, no prior art technique exists that enables virtually all function and formatting information to be preserved when a data table is exported from a spreadsheet into an HTML document, so that the information can be employed by the spreadsheet program if it becomes necessary to "round trip" the information back into a spreadsheet format file. The preservation of the functionality and formatting information associated with the spreadsheet formatted file when a data table is exported into an HTML document should not adversely impact the functionality of the HTML document to view the data in a browser program.
In accord with the present invention, a method is defined for saving data having a format and functionality specific to a parent spreadsheet program into a hypertext markup language (HTML) format, such that the data are viewable by a browser program, and that all formatting information that was originally associated with the data within the parent spreadsheet program is also included within the HTML format, so that the data in the HTML format can be reintroduced into the parent spreadsheet program without loss of the format and functionality that the data previously exhibited in the parent spreadsheet program.
The method includes first enabling a user to select the data to be translated from within the parent spreadsheet program, and then incorporating the data from the parent spreadsheet program into an HTML document. Incorporating the data into the HTML document is accomplished by using HTML to represent data that will be displayed in a browser program, using Cascading Style Sheets (CSS) to represent cell-level properties of the data, using Extensible Markup Language (XML) to represent information required for proper functionality in the parent spreadsheet program, but which is not required for the display of the data in a browser program, and saving the representations of the data in HTML, CSS, and XML in at least one file.
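A minimal Python sketch of this division of labor follows. It assumes a single output file, a single shared CSS class, and hypothetical XML element and attribute names (SheetData, Formula, row, col, text); none of these names is taken from the description above, and how a given browser treats the xml element is not modeled.

```python
from html import escape


def build_document(rows, formulas):
    """rows: lists of displayed values; formulas: {(row, col): formula text}."""
    # CSS: a cell-level property (here, alignment) shared by every cell.
    css = "<style>\n.cell { text-align: right; }\n</style>"

    # XML: spreadsheet-only information. Element names are hypothetical, and
    # the formula is kept in an attribute so it is not part of the page text.
    xml_items = "\n".join(
        f'<Formula row="{r}" col="{c}" text="{escape(text)}"/>'
        for (r, c), text in sorted(formulas.items())
    )
    xml = f"<xml>\n<SheetData>\n{xml_items}\n</SheetData>\n</xml>"

    # HTML: the values a browser program displays.
    body_rows = "\n".join(
        "<tr>"
        + "".join(f'<td class="cell">{escape(str(v))}</td>' for v in row)
        + "</tr>"
        for row in rows
    )
    return (
        "<html>\n<head>\n" + css + "\n" + xml + "\n</head>\n"
        "<body>\n<table>\n" + body_rows + "\n</table>\n</body>\n</html>\n"
    )


doc = build_document([[10, 20, 30]], {(0, 2): "=A1+B1"})
print(doc)
```

The displayed value 30 appears in the table, while the formula that produced it travels alongside in the header for the spreadsheet program to read back.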
In one embodiment, the step of incorporating the data from the parent spreadsheet program into an HTML document further includes generating multiple files to define the HTML document. When the spreadsheet data includes multiple data tables, the method includes generating a separate file for each data table. A frame set is generated that includes a navigation file, which has links to each file representing a different data table.
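The multiple-file embodiment can be sketched as follows, assuming one file per data table, a navigation file, and a frame set file that ties them together. The file names (sheet001.htm, nav.htm, book.htm), the frame names, and the frame layout are hypothetical choices made for this illustration.

```python
from html import escape


def build_frame_set(tables):
    """tables: {sheet name: table markup}. Returns {file name: file content}."""
    files = {}
    links = []
    for index, (name, table_html) in enumerate(tables.items(), start=1):
        file_name = f"sheet{index:03d}.htm"   # hypothetical naming scheme
        files[file_name] = f"<html><body>{table_html}</body></html>"
        # Each link loads its table into the frame named "content".
        links.append(f'<a href="{file_name}" target="content">{escape(name)}</a>')

    # Navigation file: one link per data table.
    files["nav.htm"] = "<html><body>" + " | ".join(links) + "</body></html>"

    # Frame set: the first table on top, the navigation file below it.
    first = "sheet001.htm" if tables else ""
    files["book.htm"] = (
        '<html><frameset rows="*,40">'
        f'<frame name="content" src="{first}">'
        '<frame name="nav" src="nav.htm">'
        "</frameset></html>"
    )
    return files


files = build_frame_set({
    "Sales": "<table><tr><td>1</td></tr></table>",
    "Costs": "<table><tr><td>2</td></tr></table>",
})
for file_name in sorted(files):
    print(file_name)
```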
In another embodiment, the method includes generating at least one supporting file. A supporting file can include information required for proper functionality in the parent spreadsheet program, but which is not required for the display of the data in a browser program, in either a binary format or an XML format. The supporting file can include image data in an image file. An additional type of supporting file is a CSS file that includes cell-level properties of the data.
In one embodiment, the XML used to represent information required for proper functionality in the parent spreadsheet program can define document-level properties of the data in the parent spreadsheet program, or the parameters of the last data sort that was executed in the parent spreadsheet program.
The XML and CSS used to incorporate the data into the HTML document can be included as separate linked files, or can be included within the header section of the HTML document. PivotTable data saved as XML-Data is always saved as a separate file. CSS are used to define cell-level properties including fonts, backgrounds, colors, number formatting, borders, the alignment of the data within the cell, etc. One embodiment includes the step of creating a new CSS property if a cell property of the spreadsheet data does not correspond to an existing CSS property. The MSO-Ignore property is one example of a newly created property.
In one embodiment, HTML is used to represent cell-level properties instead of using CSS, when that property is unique to a small number of cells, and using HTML would reduce the amount of code required. In another embodiment, to ensure that the appearance of the data displayed in a browser program will match the appearance of the data as displayed in the parent spreadsheet program, the HTML formatting has to be changed from the parent spreadsheet program formatting. When this is done, the original formatting of the data in the parent spreadsheet program is also incorporated into the HTML document, such that the original formatting is ignored by a browser program, but available to be used by the parent spreadsheet program to recreate the original formatting when the data from the HTML document is reintroduced into the parent spreadsheet program. This can be accomplished by including the MSO-Ignore property with the changed formatting; so when the data from the HTML document is reintroduced into the parent spreadsheet program the changed formatting indicated by the MSO-Ignore property is ignored by the parent spreadsheet program, and the original formatting information incorporated into the HTML document is used instead. If the original formatting is a cell-level property, that original formatting is incorporated into the HTML document using CSS. If the original formatting information is document-level or range-level formatting or functionality, that original information is incorporated into the HTML document using XML.
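One way to picture the import side of this mechanism is the Python sketch below. It assumes, purely for illustration, that the MSO-Ignore property names the style property whose browser-oriented value should be skipped, and that the original value is stored under a hypothetical "x-original-" prefix; the actual encoding of the preserved formatting is not specified in the text above.

```python
def parse_style(style: str) -> dict:
    """Split a CSS declaration list into a {property: value} dictionary."""
    decls = {}
    for part in style.split(";"):
        if ":" in part:
            name, value = part.split(":", 1)
            decls[name.strip().lower()] = value.strip()
    return decls


def style_for_spreadsheet(style: str) -> dict:
    """Return the formatting the spreadsheet program would restore."""
    decls = parse_style(style)
    ignored = decls.pop("mso-ignore", None)  # property changed for browsers
    prefix = "x-original-"                   # hypothetical preservation prefix
    result = {}
    for name, value in decls.items():
        if name == ignored:
            continue                         # browser-only formatting: skip
        if name.startswith(prefix):
            result[name[len(prefix):]] = value   # restore the original value
        else:
            result[name] = value
    return result


style = "text-align:right; mso-ignore:text-align; x-original-text-align:general; color:blue"
print(style_for_spreadsheet(style))
```

A browser program would simply apply the declarations it understands (here, right alignment and blue text), while the importer discards the marked declaration and reinstates the preserved one.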
An alternate method for saving data created in a spreadsheet program into an HTML document, so that the data can be displayed in a browser program, and so that all the formatting information required for full functionality of the data within the spreadsheet program is preserved within the HTML document, regardless of whether that information is required for the display of the data in a browser program, is also provided. The steps include first enabling a user to select the data within the spreadsheet program to be saved; and then separating the data into groups. A first group is data that is not displayable by a browser program but which is required for some functionality related to the data within the spreadsheet program, a second group is data that is displayable by a browser program with an appearance that is substantially identical to that of the data when displayed by the spreadsheet program, and a third group is data that is displayable by a browser program, but requiring a formatting change so the appearance of the data when displayed by a browser program will be substantially identical to that of the data when displayed by the spreadsheet program.
The first group of data are incorporated into the HTML document using XML, so that the first group of data are ignored when the HTML document is displayed by a browser program, but are preserved and thus available to be used to reintroduce the first group of data from the HTML document into the spreadsheet program.
The second group of data are incorporated into the HTML document using HTML, such that cell-level properties of the data are incorporated into the HTML document using CSS, when an amount of code required to incorporate the second group of data into the HTML document can be reduced by the use of CSS. Alternately, cell-level properties of the data are incorporated into the HTML document using HTML tags or attributes, when an amount of code required to incorporate the second group of data into the HTML document can be reduced by the use of HTML instead of CSS.
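The choice between CSS and HTML on the basis of code size can be approximated by counting characters. The Python sketch below is one hedged way to do so; the cost functions, the class name "r", and the example property are assumptions made for the illustration, not a description of how any spreadsheet program actually makes the decision.

```python
def css_cost(rule: str, class_name: str, cell_count: int) -> int:
    """Characters needed for one shared CSS rule plus a class on each cell."""
    rule_text = f".{class_name} {{{rule}}}"   # e.g. ".r {text-align:right}"
    per_cell = f' class="{class_name}"'
    return len(rule_text) + cell_count * len(per_cell)


def html_cost(attribute: str, cell_count: int) -> int:
    """Characters needed to repeat an HTML attribute on each cell."""
    return cell_count * len(f" {attribute}")


def choose_encoding(rule: str, class_name: str, attribute: str, cell_count: int) -> str:
    """Pick whichever representation needs less markup."""
    if css_cost(rule, class_name, cell_count) < html_cost(attribute, cell_count):
        return "css"
    return "html"


for count in (1, 10):
    print(count, choose_encoding("text-align:right", "r", 'align="right"', count))
```

Under these assumptions the fixed cost of the CSS rule is only recovered once several cells share the property, which matches the idea that a property unique to a small number of cells is cheaper to express directly in HTML.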
The third group of data are incorporated into the HTML document using HTML such that the formatting change required from a first format associated with the display of the data by the spreadsheet program to a second format associated with display of the data by a browser program includes a marker associated with the formatting change. Information related to the first format is incorporated into the HTML document, such that the first format information is ignored when the HTML document is displayed by a browser program, but is preserved and thus available to be used to reintroduce the third group of data from the HTML document into the spreadsheet program and to recreate the first format. Cell-level properties of the data are incorporated into the HTML document using CSS, when an amount of code required to incorporate the third group of data into the HTML document can be reduced by the use of CSS. Cell-level properties of the data are incorporated into the HTML document using HTML tags or attributes, when an amount of code required to incorporate the third group of data into the HTML document can be reduced by the use of HTML instead of CSS.
The final step of the alternate method is saving the HTML document incorporating the first group of data, the second group of data and the third group of data in at least one file, said at least one file including the HTML document.
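The separation into three groups that underlies the alternate method can be sketched in Python as follows. The Group enumeration, the dictionary keys displayable and needs_format_change, and the sample items are hypothetical; the sketch only shows the classification step, not the HTML, CSS, and XML generation that follows it.

```python
from enum import Enum


class Group(Enum):
    SPREADSHEET_ONLY = 1      # first group: not displayable, stored as XML
    DISPLAY_AS_IS = 2         # second group: displayable without change
    DISPLAY_WITH_CHANGE = 3   # third group: displayable after a format change


def classify(item: dict) -> Group:
    """Assign one piece of spreadsheet information to its group."""
    if not item["displayable"]:
        return Group.SPREADSHEET_ONLY
    if item["needs_format_change"]:
        return Group.DISPLAY_WITH_CHANGE
    return Group.DISPLAY_AS_IS


def partition(items):
    """Return {Group: [items]} with every group present, even if empty."""
    groups = {group: [] for group in Group}
    for item in items:
        groups[classify(item)].append(item)
    return groups


items = [
    {"name": "formula", "displayable": False, "needs_format_change": False},
    {"name": "bold font", "displayable": True, "needs_format_change": False},
    {"name": "number format", "displayable": True, "needs_format_change": True},
]
for group, members in partition(items).items():
    print(group.name, [m["name"] for m in members])
```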
Another aspect of the present invention is directed to an article of manufacture that includes a medium in which machine instructions are stored that cause a computer to implement functions generally consistent with the steps of the method discussed above.
A still further aspect of the present invention is directed to a system for enabling an HTML document to support both the display of data parented in a spreadsheet program in a browser program as well as the opening of that data, with its original formatting and functionality intact, in the spreadsheet program. This system includes a memory in which a plurality of machine instructions are stored, a display, and a processor coupled to the memory and the display. The processor executes the machine instructions to implement the spreadsheet program with functions that are generally consistent with the steps of the method discussed above.