This application relates in general to Internet web pages, and in particular to a visual web authoring tool.
In order to have complete control over their web site designs, web authors have been creating web sites by writing HTML source in text editors. Recently, visual HTML editors have been introduced with the intention of improving web authoring productivity. These visual editors have not found much success with professional web authors, since they completely rewrite the HTML source document when the document is loaded into the visual editor. Note that HTML authors prefer to be able to edit code by hand even while using a visual editor, or after the code has been edited visually by someone else.
FIG. 1 depicts the operations of a prior art editor on HTML document 101. The HTML document 101 includes HTML mark up tags, such as paragraph tags 102 and bold tags 103. Note that these tags are by way of example only as other tags exist, such as size, color, font, etc. These tags indicate to the browser the manner in which to display the text 104 or other objects to the browser user. The editor begins by loading the document into memory, and parsing the document 101 into an internal data structure 105, which may be a tree structure. The nodes in the tree 105 represent the HTML tags, while the children 107 of the nodes are either other nodes or text. The editor then displays or renders the tree to the editor user as it would be viewed on a browser. This view 108 is known as the rendered view or WYSIWYG (what you see is what you get) view. The text that is marked with bold tags 103 is displayed in bold format 109.
The editor allows a user to edit the HTML document as displayed in the rendered view 108. For example, suppose the user edits the document to remove the bold format from the text. The user selects the rendered word bold 109, and selects the unbold button 110. The editor then changes the tree 105 by discarding the bold (b) node and making the text "bold" a child of the paragraph (p) node 106. The editor then renders the tree into the WYSIWYG view 108 with the text unbolded 111. Note that the tree 105 is stored in memory and is not viewed by the user. At the conclusion of editing, the user is prompted as to whether to save the changes. If the user decides to save the changes, the editor regenerates the HTML document 113 from the tree 105.
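The prior art edit cycle described above can be sketched in a few lines of Python. This is a minimal illustration only, not the code of any actual editor: the `Node` class and the `unbold` and `regenerate` functions are hypothetical names, and the tree is built by hand rather than parsed from document 101.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    """A tag node; its children are other nodes or plain text."""
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


def unbold(node: Node) -> None:
    """Discard every b node and make its children children of the parent."""
    new_children: List[Union[Node, str]] = []
    for child in node.children:
        if isinstance(child, Node):
            unbold(child)
            if child.tag == "b":
                new_children.extend(child.children)
                continue
        new_children.append(child)
    node.children = new_children


def regenerate(node: Node) -> str:
    """Rebuild HTML from the tree alone; the original source is not consulted."""
    inner = "".join(
        c if isinstance(c, str) else regenerate(c) for c in node.children
    )
    return f"<{node.tag}>{inner}</{node.tag}>"


# Hand-built stand-in for tree 105: a p node with text and a b node.
tree = Node("p", ["This is ", Node("b", ["bold"]), " text."])
unbold(tree)
result = regenerate(tree)
```

Because `regenerate` works only from the tree, any formatting the author used in the source document is unavailable when the document is written back out.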
The prior art approach as depicted in FIG. 1 has several problems. FIG. 2 depicts the problem of preservation of format. As shown in FIG. 2, the HTML document 201, as created by the author, does not have each paragraph end with a </p>. During editing, the document 201 would be parsed into tree 202. At the conclusion of editing, the HTML document 203 would be regenerated. However, during regeneration, the editor would reform the document with the </p> at the end of each paragraph. The editor does not track the format of the tags by the author. Thus, when the editor reforms the document, the editor makes assumptions as to the use of tags. Therefore, whereas the original document lacked the ending </p> tags, the reformed document has the ending </p> tags. Moreover, the editor places the opening and ending paragraph tags on separate lines. Note that this problem arises because the editor does not preserve any of the formatting information from the original document. Consequently, HTML authors do not have control over their documents, as the document that they created will appear different from the edited, reformatted document. This makes reading, reviewing, and making further changes to the document difficult, since the document does not appear to be the same document as that created by the author.
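The loss of format can be reproduced with a small round trip. The following Python sketch uses the standard library `html.parser` module as a stand-in for the prior art parser; the `ParagraphCollector` class and the `round_trip` function are hypothetical, and the output conventions (closing tags added, tags on separate lines) are assumptions chosen to match the behavior described for FIG. 2.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Keeps only the text of each paragraph; all source formatting is lost."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            # A new <p> implicitly ends the previous paragraph.
            self.paragraphs.append("")
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def round_trip(source: str) -> str:
    """Parse, then regenerate using the editor's own conventions."""
    parser = ParagraphCollector()
    parser.feed(source)
    parser.close()
    lines = []
    for text in parser.paragraphs:
        lines += ["<p>", text.strip(), "</p>"]
    return "\n".join(lines)


original = "<p>First paragraph.\n<p>Second paragraph.\n"
regenerated = round_trip(original)
```

Here `regenerated` contains a `</p>` for each paragraph and places each tag on its own line, although the author wrote neither.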
FIG. 3 depicts another problem with the prior art approach shown in FIG. 1, the preservation of structure. As shown in FIG. 3, the author has bolded several paragraphs by placing the appropriate tags so as to surround the desired paragraphs 301. However, the HTML language standard only allows nested tags to appear in a particular order, e.g. block tags must surround character tags. Thus, the bold tag cannot be placed around a paragraph tag, but rather should be placed inside the paragraph tags. Note that HTML browsers would tolerate such an error, and would render the web page in a correct manner. However, the editor would correct the error while parsing the document 301 into the tree 302. Note that tree 302 corresponds to the source document as it would be parsed if the editor did not correct the error. The editor would remove the bold node from the parent node position of the paragraph nodes, and instead create new bold nodes as child nodes of the paragraph nodes 303. During regeneration, the reformatted document 304 would have multiple bold tags that are located within the paragraph tags, instead of a single set of bold tags surrounding the paragraph tags. Thus, upon a subsequent edit, if the author desires to remove the bold tags, then the author will have to remove each of the newly created tags from within each paragraph, instead of removing only a single set of bold tags.
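The structural correction described for FIG. 3 can be sketched as a tree rewrite. This Python fragment is illustrative only: the `Node` class, the `correct` and `serialize` functions, and the two tag sets are hypothetical, and only the p and b tags are modeled.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Node:
    tag: str
    children: List[Union["Node", str]] = field(default_factory=list)


BLOCK_TAGS = {"p"}       # assumption: only p is modeled as a block tag
CHARACTER_TAGS = {"b"}   # assumption: only b is modeled as a character tag


def correct(node: Union[Node, str]) -> List[Union[Node, str]]:
    """Return the nodes that replace `node` after correction."""
    if isinstance(node, str):
        return [node]
    kids = [c for child in node.children for c in correct(child)]
    wraps_block = any(isinstance(k, Node) and k.tag in BLOCK_TAGS for k in kids)
    if node.tag in CHARACTER_TAGS and wraps_block:
        out: List[Union[Node, str]] = []
        for k in kids:
            if isinstance(k, Node) and k.tag in BLOCK_TAGS:
                # Push a new character node inside each block node.
                out.append(Node(k.tag, [Node(node.tag, k.children)]))
            else:
                out.append(Node(node.tag, [k]))
        return out
    return [Node(node.tag, kids)]


def serialize(node: Union[Node, str]) -> str:
    if isinstance(node, str):
        return node
    inner = "".join(serialize(c) for c in node.children)
    return f"<{node.tag}>{inner}</{node.tag}>"


# One bold node surrounding two paragraphs, as the author wrote it.
authored = Node("b", [Node("p", ["One"]), Node("p", ["Two"])])
corrected = "".join(serialize(n) for n in correct(authored))
```

The single bold node of the authored tree becomes one bold node per paragraph, which is why a later removal of the bold format requires one edit per paragraph.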
FIG. 4A depicts another aspect of the problem shown in FIG. 3, i.e. the structure of the original document is not well preserved. In FIG. 4A, overlapping tags 401 are corrected by the editor, so that the tags are properly nested 402 from the inside out. FIG. 4B depicts a list item that is not in a list 403. In this case, the editor will place UL tags 404 around the item. Thus, again the author has lost control over the created document, which has been altered by the editor.
FIG. 5 depicts a prior art editor that maintains a copy 502 of the original document 501 during a portion of the processing. After the editor loads and parses the document 501 into tree 503, the editor maintains copy 502. However, when edited 504, the document 502 is reformatted and restructured.
FIG. 6 depicts the problem of modal editing of the prior art editor of FIG. 1. Some prior art editors do not allow for editing of the source document. In other words, all editing is performed in the visual editor, and the source document is hidden from the author. Thus, the only mode of editing is in the visual editor. Other prior art editors allow editing of the source document, but the editing is modal: either the source document or the rendered version can be edited, but not both at the same time. In switching back and forth between the two modes 601, 602, the document would be reformatted and restructured such that it would be unrecognizable by the author.
Therefore, there is a need in the art for an editor that allows web authors to edit HTML visually while preserving the HTML source document. This would preserve the structure and the format of the HTML document, while allowing modeless editing.
These and other objects, features and technical advantages are achieved by a system and method which allows web authors to edit HTML visually while preserving the HTML source document. Thus, the structure and format of the HTML document are preserved, and modeless editing is permitted. The invention preserves the source document exactly as it is written when it is opened in the visual editor. Moreover, when an edit is made with the invention, only a portion of the HTML source document around that edit is updated, rather than rewriting the whole HTML source document. Furthermore, when an edit is made, the new HTML source code in the edited range is output in a format that is specified by the user.
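Updating only the portion of the source around an edit can be pictured as a splice on the source text. The Python sketch below is a simplified illustration under the assumption that the edited range is known as a pair of character offsets; the `apply_edit` function is a hypothetical name.

```python
def apply_edit(source: str, start: int, end: int, replacement: str) -> str:
    """Replace source[start:end] and leave every other character untouched."""
    return source[:start] + replacement + source[end:]


# The author's source omits closing </p> tags; that choice must survive.
source = "<p>This is <b>bold</b> text.\n<p>Second paragraph.\n"

# Assumed edited range: the bold element, located here by a plain search.
start = source.index("<b>")
end = source.index("</b>") + len("</b>")

edited = apply_edit(source, start, end, "bold")
```

Everything outside the edited range, including the missing `</p>` tags and the line breaks, is carried over character for character.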
In order to preserve the format of the document, format information is stored in the parsed tree. Thus, each node in the tree includes information on the format of the text and objects of the node. Note that formatting information may also be stored in the text in the tree. Any edits on that particular node will result in changing the information of that node only (if possible). Other nodes will not be reformatted unless necessary. Moreover, the edited node will be reformatted according to the user's preferences. For example, the user's preferences may specify line breaks before each paragraph. Thus, when the node is reformatted, the editor will place line breaks before each paragraph.
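One way to picture format information stored in the tree is for each node to keep its exact source text alongside its content. The Python sketch below is an assumption-laden illustration, not the stored representation used by the invention: the `Paragraph` and `Preferences` classes, the `dirty` flag, and the choice to emit a closing tag for edited nodes are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Paragraph:
    raw: str             # the node's source text exactly as authored
    text: str            # the node's content
    dirty: bool = False  # set when the node has been edited


@dataclass
class Preferences:
    line_break_before_paragraph: bool = True


def serialize(paragraphs: List[Paragraph], prefs: Preferences) -> str:
    out = []
    for p in paragraphs:
        if not p.dirty:
            # Untouched nodes are written back exactly as authored.
            out.append(p.raw)
            continue
        # Edited nodes are reformatted according to the user's preferences.
        prefix = "\n" if prefs.line_break_before_paragraph else ""
        out.append(f"{prefix}<p>{p.text}</p>")
    return "".join(out)


doc = [
    Paragraph(raw="<p>First paragraph.", text="First paragraph."),
    Paragraph(raw="<p>Second paragraph.", text="Second paragraph."),
]
doc[1].text = "Second paragraph, edited."
doc[1].dirty = True
output = serialize(doc, Preferences())
```

Only the edited second paragraph picks up the preferred line break; the first paragraph is emitted exactly as the author wrote it.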
In order to preserve the structure of the document, invalid HTML structures are maintained and not corrected (unless the user so specifies). The invention will either support the invalid structure by reflecting such structure in the parsed tree, and thus allow for editing of the structure, or the invention will not support such a structure. With unsupported structures, the authors are offered a choice. Either the invalid and unsupported structure may be maintained, and thus remain un-editable, or the structure may be corrected, and thus made editable. Those invalid, unsupported structures that the author has chosen to maintain are represented in the tree as invalid nodes. Note that these invalid, unsupported structures may be manually deleted by the user in the visual editor, thus making their document valid again. Further note that the user can choose to have all correctable invalid HTML be automatically rewritten (not preserved) to make it fully valid. The invention supports most common types of invalid structures, such as text markup tags around block tags (e.g. bold around paragraph), content directly in a UL or OL list, an LI tag outside of a list, A tags inside other A tags, etc. Moreover, the invention also maintains structure while editing, as the structure of the document is only minimally modified during editing, i.e. only the nodes affected by the edits are restructured, and the remainder of the document is unmodified. Thus, the structure of the document is maintained.
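The handling of invalid structures described above amounts to a small decision procedure, sketched below in Python. The names `TreeNode`, `Choice`, and `load_structure` are hypothetical, the string labels for the kinds of structure are illustrative, and "other" is a placeholder for any unsupported structure, since the text does not enumerate which structures are unsupported.

```python
from dataclasses import dataclass
from enum import Enum


class Choice(Enum):
    MAINTAIN = "maintain"  # keep the invalid structure as written
    CORRECT = "correct"    # rewrite the structure so that it is valid


@dataclass
class TreeNode:
    source: str
    valid: bool
    editable: bool


# Supported invalid structures, labeled after the examples in the text.
SUPPORTED_INVALID = {
    "markup tag around block tag",
    "content directly in list",
    "LI outside list",
    "A inside A",
}


def load_structure(kind: str, source: str, corrected: str,
                   choice: Choice) -> TreeNode:
    if kind == "valid":
        return TreeNode(source, valid=True, editable=True)
    if kind in SUPPORTED_INVALID:
        # Reflected in the tree as written, and still editable.
        return TreeNode(source, valid=False, editable=True)
    if choice is Choice.MAINTAIN:
        # Unsupported structure kept as an invalid, un-editable node.
        return TreeNode(source, valid=False, editable=False)
    # Unsupported structure corrected, and thus made editable.
    return TreeNode(corrected, valid=True, editable=True)


supported = load_structure("markup tag around block tag",
                           "<b><p>One</p></b>", "<p><b>One</b></p>",
                           Choice.MAINTAIN)
kept = load_structure("other", "<original>", "<corrected>", Choice.MAINTAIN)
fixed = load_structure("other", "<original>", "<corrected>", Choice.CORRECT)
```

In this sketch the source text of a structure changes only on the last path, where the author has explicitly asked for correction.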
The invention also supports modeless editing. The copy of the source document and/or the rendered window may be edited, and the changes made in one are reflected in the other. Moreover, both the source document and the rendered window may be displayed to the author simultaneously. Selections and edits in one will be reflected in the other. Note that changes made in the rendered window appear immediately in the source document window, while changes in the source document window appear in the rendered window after clicking out of the source document.
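The synchronization between the two windows can be sketched as follows. This Python fragment models only the timing rule stated above and nothing else; the `ModelessEditor` class and its method names are hypothetical, and the rendered window is represented simply by the source text it currently reflects.

```python
class ModelessEditor:
    """Two simultaneously visible views over one document."""

    def __init__(self, source: str) -> None:
        self.source_view = source    # text shown in the source window
        self.rendered_from = source  # source text the rendered window reflects

    def edit_rendered(self, new_source: str) -> None:
        # A change made in the rendered window appears in the source
        # window immediately.
        self.source_view = new_source
        self.rendered_from = new_source

    def type_in_source(self, new_source: str) -> None:
        # A change made in the source window is not yet rendered.
        self.source_view = new_source

    def source_lost_focus(self) -> None:
        # Clicking out of the source window updates the rendered window.
        self.rendered_from = self.source_view


editor = ModelessEditor("<p>Hello")
editor.type_in_source("<p>Hello, world")
pending = editor.rendered_from      # still the old text
editor.source_lost_focus()
synced = editor.rendered_from       # now matches the source window
editor.edit_rendered("<p>Hello, <b>world</b>")
```

Neither view is ever locked while the other is in use, which is the difference from the modal prior art editors.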
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.