The present invention generally pertains to a method for translating data produced in a first program into a format required for use in a second program, and more specifically, to a method for ensuring that data translated from a program used to create the data into a hypertext markup language (HTML) format retain formatting and functionality during the translation.
A characteristic of many application programs is that they produce data in a format that is specific to some functionality of the application program. Loss of functionality in data translated from one format to another will not always be immediately apparent. For example, in Microsoft Corporation's EXCEL™ spreadsheet program, a red color font may be applied to a number in a cell to indicate that the value is negative, in accord with defined Number Formatting Rules. However, a value may be shown in a red color font in another cell of the spreadsheet simply because a user arbitrarily selected red formatting for the font in that cell. If this spreadsheet is then translated to HTML format under prior art methods, the functional formatting indicative of a negative value will be lost. Although any values that were in a red font in the spreadsheet cells will also be in a red font in the translated data within an HTML document, the HTML document will not associate a negative value with the red font in those cells where this functionality existed in the original spreadsheet. If it becomes necessary to transfer the translated data from the HTML document back into a spreadsheet, the loss of the negative value functionality in those cells where the red font was originally used to indicate a negative value will cause the resulting spreadsheet to inaccurately represent the data. Understanding how this problem might be addressed requires a brief discussion of the use and format of HTML documents.
With the widespread use of the Internet and of corporate or business intranets, it is becoming increasingly common to translate data from application-specific file formats into HTML file formats, to enable the data to be readily transmitted over a network and viewed in browser programs. HTML documents or files have thus become the universally accepted format for sharing data "on-line." An on-line information system typically includes a server computer system that makes information available so that client computer systems can access the information. The server and client computer systems are usually connected in either a local area or a wide area private intranet system, or via the public Internet. A unique uniform resource locator (URL) is associated with each HTML document, enabling client computer systems to request a specific HTML document from a server computer system.
An HTML document includes a hierarchical set of markup elements; most elements have a start tag, followed by content, followed by an end tag. The content is typically a combination of text and nested markup elements. Tags, which are enclosed in angle brackets ('<' and '>'), indicate how the document is structured and how to display the document, i.e., its format. There are tags for markup elements such as titles and headers, for text attributes such as bold and italic, for lists, for paragraph boundaries, for links to other documents or other parts of the same document, for graphic images, for non-displayed comments, and for many other features. Further details regarding HTML may be found in reference books such as "HTML For Dummies," by Ed Tittel and Steve James (1996).
The following lines of HTML briefly illustrate how the language is used:
Here we start a new paragraph <P>.
Some words are <B>bold</B>, others are <I>italic</I>.
The viewer of the document will see:
Here we start a new paragraph.
Some words are bold, others are italic.
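The relationship between the markup and what the viewer sees can be illustrated with a minimal sketch that discards the tags and keeps only the text content. It uses the `HTMLParser` class from the Python standard library; the helper names `TextOnly` and `visible_text` are illustrative assumptions and are not part of the method described here. A real browser also applies the formatting that each tag requests, which this sketch does not attempt.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects the text content of a document and drops every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for text between tags; tags themselves are not recorded.
        self.parts.append(data)


def visible_text(markup: str) -> str:
    """Return the characters a viewer would read, without any formatting."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


line = "Some words are <B>bold</B>, others are <I>italic</I>."
print(visible_text(line))
```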
As noted above, a user who wishes to retrieve and display an HTML document generally uses a Web browser program. Two popular Web browser programs are NAVIGATOR™ from Netscape Communications Corp. of Mountain View, Calif., and INTERNET EXPLORER from Microsoft Corporation of Redmond, Wash. The primary functionality of Web browsers is directed to finding, retrieving, and displaying documents. A browser is generally not intended for word processing or data manipulation of the information contained within an HTML document, but can display documents or data generated by word processing or spreadsheet applications, once converted into an appropriate HTML-compatible format.
A wide variety of data may be shared among different users in a network environment using HTML. Typical HTML documents include images, text, and data. HTML documents can be created using programs specifically designed for that purpose, such as Microsoft Corporation's FRONTPAGE™ Web Page publishing program. Additionally, some applications, such as Microsoft Corporation's WORD 97™ word processing program, allow a user to save a text document as an HTML document. Microsoft Corporation's EXCEL 97™ spreadsheet program also enables a user to save a data table or chart created in a workbook as an HTML file.
As noted above, a characteristic of many application programs is specific formatting of data that is unique to some functionality of that application. It would be desirable to persist application-specific formatting information when translating data from one file format to a different file format, such as the HTML format, so that the data could be reintroduced into the original application with all its original formatting and functionality intact.
In addition to the functionality related to indicating negative numbers with a red font noted above, there are additional EXCEL spreadsheet functions that may be similarly adversely affected by translation into another format such as HTML. In EXCEL 97, text that exceeds the width of a column may be displayed as spilling into an adjacent cell to the right, without actually being merged into that adjacent cell. HTML does not support this function, and will instead treat the text displayed across two cells as being merged into the two cells. It would be desirable that when such "spilled text" is translated into HTML, the "spilled text" formatting information is retained in the HTML file, even if it is not supported in HTML, so that if the HTML file is translated back into the parent EXCEL spreadsheet application, the "spilled text" is correctly formatted as being associated with only one cell.
Currently, when a data table is translated into HTML, any empty cells or white space in the data table require a significant amount of coding to be correctly represented in the HTML document. Data tables can often have a significant amount of such empty space. Thus, a table containing relatively little data, yet many empty cells, can generate a relatively large HTML document when translated into HTML format. Clearly, it would be desirable to improve the speed with which the translated data table inserted in an HTML document is rendered by a browser and to reduce the amount of HTML coding data (file size) required to define empty cells in a block of cells imported into a web page, while at the same time ensuring that the spreadsheet-specific formatting is retained if it becomes necessary to reintroduce the data table from the HTML document back into an EXCEL spreadsheet.
Another aspect of translating a data table from a spreadsheet file format into HTML format is that the appearance of a data table as displayed in the HTML document is often less aesthetic than the appearance of the data table in its parent application. This loss of appearance quality is primarily due to the positioning of the text in the data tables as displayed in HTML. Generally, the text is displayed immediately next to a border of a cell, often making the text difficult to read, because it runs into the contents of the adjacent cell. It would be desirable to be able to add padding (empty spaces) so that text in data tables is easier to read when translating a data table from a spreadsheet application file format into HTML format, and it would be further desirable that such padding not interfere with the appearance of the data table if it is translated back into its parent spreadsheet application from the HTML format.
In accord with the present invention, a method is defined for translating data having a format and functionality specific to a parent application into a different application format, such that any formatting changes required to translate the data into the different application format are reversible, enabling the data to be reintroduced into the parent application without loss of the format and functionality that the data previously exhibited in the parent application. The method includes the step of enabling a user to select the data to be translated, from within the parent application. The data selected are translated from the parent application format into the different application format. Included within the translated data is a marker associated with each formatting change required to resolve any conflicts between a first format and first functionality available in the parent application, and a second format and second functionality available in the different application. The data translated into the different application format are then included within a file of the different application. Subsequently, a user is enabled to reintroduce the data that were translated, into the parent application. Wherever a marker denoting a formatting change is found in the data being reintroduced, that formatting change is ignored by the parent application; however, the first format and first functionality are reapplied to the data reintroduced into the parent application.
The step of incorporating the data that were translated into the file of the different application provides for including the first format and first functionality information within that file, but in a way that the first format and first functionality information is ignored by the different application. The step of enabling a user to reintroduce the data into the parent application includes the step of using the first format and first functionality information to recreate the original formatting of the data.
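The round trip described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the claimed implementation: the `TranslatedCell` record, the `translate` and `reintroduce` functions, and the property names used in the example are all hypothetical. The sketch separates formatting that both applications understand, formatting changes made only to resolve a conflict (each of which is marked), and parent-application information carried along for later use.

```python
from dataclasses import dataclass, field


@dataclass
class TranslatedCell:
    text: str
    # Formatting that the different application actually renders.
    display_props: dict = field(default_factory=dict)
    # Names of formatting changes flagged by a marker (to be ignored on return).
    marked: set = field(default_factory=set)
    # First format and functionality information, ignored by the different application.
    parent_props: dict = field(default_factory=dict)


def translate(text, parent_props, shared_props, workaround_props):
    """Translate one cell; every workaround change is recorded as marked."""
    display = dict(shared_props)
    display.update(workaround_props)
    return TranslatedCell(text, display, set(workaround_props), dict(parent_props))


def reintroduce(cell):
    """Return the text and formatting as the parent application would restore them."""
    restored = {k: v for k, v in cell.display_props.items() if k not in cell.marked}
    restored.update(cell.parent_props)  # reapply the original format and functionality
    return cell.text, restored


cell = translate(
    "-12.50",
    parent_props={"number_rule": "negative-red"},  # hypothetical rule name
    shared_props={"font-weight": "bold"},
    workaround_props={"padding-left": "1px"},      # added only for display
)
print(reintroduce(cell))
```

In this sketch the padding never reaches the parent application, while the number rule survives the trip even though the different application never acts on it.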
The parent application preferably comprises a spreadsheet program, in which case, the data that are translated are selected from a spreadsheet and include either a data table or a chart. However, other types of parent applications are contemplated.
Also, in a preferred embodiment of the present invention, the different application format includes components of HTML. Any formatting changes are then made using at least one of the group including HTML tags, attributes, and cascading style sheet (CSS) properties.
The marker associated with any formatting change preferably comprises a CSS property. The CSS property comprising the marker is then preferably an MSO-Ignore property.
A formatting change may include adding padding so that the data are more aesthetically displayed in a browser. The conflict between the format and functionality originally associated with the data in the parent application and the format and functionality available in the different application may arise due to spilled text, and if so, the formatting change preferably comprises using a colspan attribute.
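As an illustration of the spilled text case, the following Python sketch writes a table row in which a cell whose text spills to the right is given a colspan attribute, together with a marker so that the span can be disregarded when the data are read back. The text names the MSO-Ignore property, but the value syntax `mso-ignore: colspan` used here is an assumption, as are the function names `render_spill_row` and `span_on_import` and the input layout.

```python
from html import escape

MARKER = "mso-ignore"  # property named in the text; the value syntax below is assumed


def render_spill_row(cells):
    """cells: list of (text, spill), where spill is the number of empty
    neighbouring columns to the right that the text visually runs into."""
    out = []
    skip = 0
    for text, spill in cells:
        if skip:  # this column is covered by the previous cell's span
            skip -= 1
            continue
        if spill > 0:
            out.append(
                f'<td colspan="{spill + 1}" style="{MARKER}: colspan">{escape(text)}</td>'
            )
            skip = spill
        else:
            out.append(f"<td>{escape(text)}</td>")
    return "<tr>" + "".join(out) + "</tr>"


def span_on_import(colspan, style):
    """Columns the cell occupies once the marker has been honoured."""
    ignored = set()
    for decl in style.split(";"):
        name, _, value = decl.partition(":")
        if name.strip().lower() == MARKER:
            ignored.add(value.strip())
    return 1 if "colspan" in ignored else colspan


print(render_spill_row([("A long heading", 1), ("", 0), ("42", 0)]))
```

A browser would show the heading across two columns, while `span_on_import` reports a single column for the same cell, which corresponds to text associated with only one cell in the parent application.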
When a conflict between the format and functionality originally associated with the data in the parent application and the format and functionality available in the different application arises due to a formatting command in the parent application that is specific to a function of the data, the method further provides for including the formatting command within the data being translated into the different application format in a form, such that the formatting command is ignored by the different application. The formatting command is then available to recreate the original formatting of the data if a user reintroduces the data into the parent application.
In one preferred form of the invention, the formatting command is defined by a user of the parent application. In one instance, the formatting command changes the appearance of the data in the parent application if the data change between negative and positive values.
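The negative value example can be sketched as follows. Browsers skip CSS properties they do not recognize, so a formatting command can be carried in the style of a cell without affecting display. The property name `x-parent-format` and the rule token `negative-red` are hypothetical names chosen for this sketch, as are the two functions; the text does not specify the form actually used.

```python
from typing import Optional

FORMAT_PROP = "x-parent-format"  # hypothetical property; unknown to browsers, so ignored


def export_cell(value: float, rule: Optional[str] = None,
                user_color: Optional[str] = None) -> str:
    """Write one table cell, carrying the parent formatting command if any."""
    decls = []
    if rule == "negative-red":
        decls.append(f"{FORMAT_PROP}: {rule}")  # functional formatting, preserved
        if value < 0:
            decls.append("color: red")          # what the browser actually shows
    elif user_color:
        decls.append(f"color: {user_color}")    # arbitrary user choice, no rule
    attr = f' style="{"; ".join(decls)}"' if decls else ""
    return f"<td{attr}>{value}</td>"


def import_rule(style: str) -> Optional[str]:
    """Recover the formatting command from a style string, if one was stored."""
    for decl in style.split(";"):
        name, _, value = decl.partition(":")
        if name.strip().lower() == FORMAT_PROP:
            return value.strip()
    return None


print(export_cell(-5.0, rule="negative-red"))
print(export_cell(7.0, user_color="red"))
```

Both cells may appear red in a browser, but only the first yields a rule from `import_rule`, so the parent application can tell functional red formatting from an arbitrary color choice.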
The step of translating the data may further include the step of identifying any contiguous empty cells in the data selected from the spreadsheet. The HTML required to define the empty cells is then abbreviated by using a colspan attribute and/or a rowspan attribute.
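The abbreviation of empty cells can be sketched for a single row as follows. Runs of contiguous empty cells are collapsed into one cell carrying a colspan attribute; the marker value `mso-ignore: colspan` and the function name `render_sparse_row` are assumptions made for illustration, and the analogous handling of rowspan across rows is omitted for brevity.

```python
from html import escape
from itertools import groupby


def render_sparse_row(cells):
    """cells: list of strings, where "" denotes an empty cell."""
    parts = []
    for is_empty, run in groupby(cells, key=lambda c: c == ""):
        run = list(run)
        if is_empty and len(run) > 1:
            # One cell stands in for the whole run of empty cells; the
            # (assumed) marker lets an importer split it back apart.
            parts.append(
                f'<td colspan="{len(run)}" style="mso-ignore: colspan"></td>'
            )
        elif is_empty:
            parts.append("<td></td>")
        else:
            parts.extend(f"<td>{escape(c)}</td>" for c in run)
    return "<tr>" + "".join(parts) + "</tr>"


print(render_sparse_row(["a", "", "", "", "b"]))
```

For a row with many empty cells, the output contains one element per run rather than one per cell, which is the source of the file size reduction discussed above.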
It is also possible that a conflict between the format and functionality originally associated with the data in the parent application and the format and functionality available in the different application arises due to data that include either a shape or an image.
Another aspect of the present invention is directed to an article of manufacture that includes a medium in which machine instructions are stored that cause a computer to implement functions generally consistent with the steps of the method discussed above.
A still further aspect of the present invention is directed to a system for enabling formatting and functionality information to be persisted when data having a first format used by a parent application are translated into a second format, different from the first format, required of another application. The data translated can be returned to the parent application without loss of any format and any functionality that the data had in the first format. This system includes a memory in which a plurality of machine instructions are stored, a display, and a processor coupled to the memory and the display. The processor executes the machine instructions to implement functions that are generally consistent with the steps of the method discussed above.