FIG. 1 shows an environment in which a data processing application 100 is executed so as to edit a structured document by processing documents containing structured data 102. The data processing application 100 is exemplary and can generally be described as processing structured data 102 expressed in a markup language so as to transform the structured data 102 using a solution module 104 to produce transformed information. During the process, the structured data can be presented as a rendering of a visual surface 106 (also referred to herein as a document view 106) on an output device. An editing user 108 interacts with the visual surface 106, as indicated by arrow 110, using, for instance, a keyboard 112, a mouse device 114, or some other input device. The visual surface 106 can constitute the presentation of an electronic form having data entry fields associated with the structured data 102. In this case, the interaction 110 of the editing user 108 can involve filling information into existing data entry fields of the electronic form, inserting and filling in new fields (as in table rows), or deleting or substituting regions of the editing surface that represent data subtrees.
The structured data 102 is expressed in a markup language. By way of example, and not by way of limitation, the markup language can be Extensible Markup Language (XML). Accordingly, the structured data 102 is hereinafter referred to as an XML document 102. XML, which is documented as a W3C Standard set forth in Paoli et al., 1998, W3C recommendation, enables developers to create customized tags that describe the meaning of data, as opposed to the presentation of data.
The environment in which the data processing application 100 operates includes an Extensible Stylesheet Language Transformations (XSLT) processor that translates an XML document 102 into the visual surface 106. The visual surface 106 can also comprise another XML document, or a document expressed in a presentation-oriented markup language, such as Hypertext Markup Language (HTML). XML provides tags that represent the data contained in a document. In contrast, presentation-oriented languages, such as HTML, provide tags that convey the visual appearance of a document. Accordingly, these technologies complement each other; XML allows information to be efficiently transferred and processed, while HTML allows information to be presented for display.
XSLT itself uses an XML syntax. The XSLT processor performs its translation function by making reference to one or more XSLT stylesheets. The XSLT stylesheets contain a collection of rules for mapping elements in the XML document 102 to the visual surface 106 or document view 106. To perform this function, XSLT defines its operands through XPath. XPath is a general-purpose query language for addressing and filtering the elements and text of XML documents. XPath expressions can address parts of an XML document, and can manipulate strings, numbers, and booleans, etc. In the context of the XSLT processor, XPath expressions can be used to select a portion of the XML document 102 that matches a prescribed match pattern, and then perform some translation operation on that portion using a rule provided in the XSLT stylesheets. XML, XSLT, and XPath are described at length in their governing specifications provided by the World Wide Web Consortium (W3C).
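By way of illustration only, the addressing and filtering role of XPath can be sketched in Python. The sample document and its element names are hypothetical, and the standard-library module used here supports only a limited subset of XPath rather than the full language available to an XSLT processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element and attribute names are illustrative only.
XML_SOURCE = """
<library>
  <book year="1998"><title>XML Basics</title><author>Ann</author></book>
  <book year="2003"><title>Forms in Practice</title><author>Bob</author></book>
</library>
"""

root = ET.fromstring(XML_SOURCE)

# Address every <title> element that is a child of a <book> element.
titles = [t.text for t in root.findall("./book/title")]

# Filter with a predicate: only books whose year attribute equals "2003".
recent = [b.find("title").text for b in root.findall("./book[@year='2003']")]

print(titles)
print(recent)
```

In an XSLT stylesheet, comparable expressions would appear in match patterns and select attributes, and the selected portion would then be handed to a template rule.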
The XML document 102 is composed of XML elements, each of which includes a start tag (such as <author>), an end tag (such as </author>), and information between the two tags (which is referred to as the content of the element). An element may include name-value pairs (referred to as attributes) related by an equal sign (such as MONTH=“May”). The elements in the XML document 102 have a hierarchical relationship to each other that can be represented as a data tree 116. The elements in the data tree 116 are also commonly referred to as “nodes.” All elements are nodes, but the converse is not true: as used herein, attributes, attribute values, and text content are also nodes. Data tree 116 is also referred to as a tree t having nodes n.
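The node terminology above can be illustrated with a short Python sketch that walks a small document and lists its element, attribute, and text nodes. The sample document and the `walk` helper are hypothetical and serve only to show that elements are one kind of node among several.

```python
import xml.etree.ElementTree as ET

# Hypothetical document reusing the <author> and MONTH examples from the text.
doc = ET.fromstring('<report MONTH="May"><author>Ann</author><author>Bob</author></report>')

def walk(element, depth=0):
    """Yield (depth, kind, value) for element, attribute, and text nodes."""
    yield depth, "element", element.tag
    for name, value in element.attrib.items():
        yield depth + 1, "attribute", f"{name}={value}"
    if element.text and element.text.strip():
        yield depth + 1, "text", element.text.strip()
    for child in element:
        yield from walk(child, depth + 1)

nodes = list(walk(doc))
for depth, kind, value in nodes:
    print("  " * depth + f"{kind}: {value}")
```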
A so-called XML schema (not illustrated in FIG. 1) is a particular XML language that provides a syntactic description of an XML structure, for instance XML document 102 and its corresponding data tree 116. If an XML structure is an instance of the schema that it refers to, it is said to be valid according to that schema. Stated otherwise, nodes in the schema are defined using syntactic constructs. For instance, constructs can be used to group nodes that do not have a common explicit parent (e.g., repeating sequences of nodes).
The solution module 104 includes a data mapping module 118. The purpose of the data mapping module 118 is to map the structured data 102 to the visual surface/document view 106. The data mapping module 118 can perform this task using so-called stylesheets, such as stylesheets written using XSLT. XSLT maps the structured data 102 to a format appropriate for presentation, such as HTML, Extensible Hypertext Markup Language (XHTML), etc. In other words, documents expressed in XML include tags that are particularly tailored to convey the meaning of the data in the documents. The XSLT conversion converts the XML documents into another markup language in which the tags pertain to the visual presentation of the information contained in the documents. (To facilitate discussion, the following description assumes the use of HTML to render the documents; however, other presentation-oriented markup languages can be used to render the documents.) Because HTML is a markup language, it can be conceptualized as a view tree 120 that includes a hierarchical organization of nodes, as in the case of data tree 116.
The schema for data tree 116 can have a node n that represents a table or a repeating field, where the node n can correspond to many nodes in the actual data for data tree 116, as well as many nodes in a form template 130 displayed on the visual surface 106. By way of example, the schema can define a format for one (1) date field as to what the date should look like and can define that many dates can be entered in succession for that one date field. Accordingly, there can be a corresponding number of nodes for the dates in the data for the data source as well as in the form template 130 for the visual surface 106. The data source, however, will have only one (1) node for the date field. The reader is referred to the World Wide Web Consortium's specifications for background information regarding XML and XSLT. Arrow 126 represents mapping of information from tree t having nodes n within the data tree 116 to information in the view tree 120.
A view mapping module 122 enables nodes in the data tree 116 to be mapped to corresponding nodes in the view tree 120, and vice versa. The mapping of nodes in the view tree 120 to nodes in the data tree 116 allows the solution module 104 to correlate editing operations performed on the visual surface/document view 106 with corresponding nodes in the underlying structured data 102. This allows the solution module 104 to store information entered by the editing user 108 at appropriate locations within the structured data 102 during an editing session. Arrow 124 represents the mapping of information in the view tree 120 back to associated information in the data tree 116.
By way of broad overview, the view mapping module 122 provides mapping between the visual surface/document view 106 and the XML document 102 by adding annotations to the view tree 120 used to render the visual surface/document view 106. These annotations serve as references which point back to specific locations in the data tree 116. FIG. 1 represents the annotation of the visual surface/document view 106 by showing an annotated HTML document 128 being output from the solution module 104.
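The annotation idea can be sketched as follows. This is a minimal illustration that uses plain Python in place of an XSLT transformation; the attribute name `data-node-path`, the sample data, and the `render_annotated` helper are all assumptions made for the example and are not taken from the described implementation.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data tree with two leaf fields.
DATA = ET.fromstring("<trip><traveler>Ann</traveler><city>Oslo</city></trip>")

def render_annotated(root):
    """Render each child element as an <input> annotated with its data-tree path."""
    lines = ["<form>"]
    for child in root:
        path = escape(f"/{root.tag}/{child.tag}")   # reference back into the data tree
        value = escape(child.text or "")
        lines.append(f'  <input data-node-path="{path}" value="{value}">')
    lines.append("</form>")
    return "\n".join(lines)

html_view = render_annotated(DATA)
print(html_view)
```

Each rendered field carries a reference that identifies the data node it came from, which is what later allows an edit made on the view to be routed to the right location in the data.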
The visual surface/document view 106 itself has an appearance that is determined by both the information contained in the XML document 102 and the effects of the XSLT transformation provided by the data mapping module 118. In the case of electronic forms, the visual surface/document view 106 typically includes a hierarchical structure which is related to the hierarchical structure in the XML document 102. For instance, the exemplary electronic form template 130 includes multiple sections pertaining to different topics that reflect the topics in the XML document 102. (However, it is not necessary to have a one-to-one direct correspondence between the organization of the XML document 102 and the organization of the visual surface/document view 106; in other words, the transformation of the XML document 102 to the visual surface/document view 106 is generally considered non-isomorphic). Each section in the exemplary electronic form template 130 can include one or more data entry fields for receiving input from the editing user 108, such as data entry field 132. The data entry fields are also referred to herein as “editing controls.” Different graphical components can be used to implement the editing controls, including text boxes, drop-down list boxes, list boxes, option buttons (also referred to as radio buttons), check boxes, and so on. FIG. 5a, to be described below, provides examples of the visual appearance of an electronic form template as it is being used by an editing user to enter and/or edit data via the data entry fields thereon.
Path 134 generally represents the routing of information entered via the electronic form template 130 back to the XML document 102. In other words, the data entry fields in the electronic form template 130 (such as data entry field 132) are associated with respective nodes in the data tree 116. Entry of information via electronic form template 130 will therefore prompt the solution module 104 to route such information to appropriate storage locations in the data tree 116. Again, the linking between the electronic form template 130 and the XML document 102 is provided by the view mapping module 122.
The functionality provided by the solution module 104 is defined, in part, by a solution file, such as exemplary solution file 136 stored in storage 138. The solution file 136 essentially constitutes an electronic form template, providing all of the semantic information required to transform the XML document 102 into the visual surface/document view 106. Different XML documents may have been created by, or otherwise refer to, different electronic form templates. Accordingly, different XML documents may have different solution files associated therewith. Various techniques can be used to retrieve a solution file that is associated with a particular XML document. For instance, an appropriate solution file can be retrieved based on URN (Uniform Resource Name) or URL (Uniform Resource Locator) information contained in the header of an input XML document. That header information links the input document to a corresponding solution file. A storage 140 represents an archive for storing one or more XML documents created by, or otherwise associated with, respective solution files.
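One way such a header lookup might work is sketched below. The processing-instruction name `form-solution`, its `href` pseudo-attribute, and the example URL are hypothetical; the text above states only that URN or URL information in the header links the document to its solution file.

```python
from xml.dom import minidom

# Hypothetical input document; the header format is an assumption for illustration.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?form-solution href="http://example.com/forms/travel.xsn"?>'
    "<trip><city>Oslo</city></trip>"
)

def find_solution_href(xml_text, pi_target="form-solution"):
    """Return the href of the first matching processing instruction, or None."""
    document = minidom.parseString(xml_text)
    for node in document.childNodes:
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE and node.target == pi_target:
            # Processing-instruction data is free text, so parse href="..." by hand.
            for part in node.data.split():
                key, _, value = part.partition("=")
                if key == "href":
                    return value.strip("\"'")
    return None

print(find_solution_href(SOURCE))
```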
FIG. 2 shows an exemplary composition of the solution file 136. As shown there, the solution file 136 contains a collection of files (202, 204, 206, 208, and 210) that together provide semantic information used, in part, to implement the solution module 104. This collection of files can be packaged together. In one exemplary implementation, this collection of files is referred to using an extension ‘.xsn’. A form definition file 202, also called a manifest file, forms the centerpiece of the collection. The form definition file 202 contains information about all of the other files in the solution module 104. A design component is used when an electronic form is being created so that the form can contain various editing controls, including text boxes, drop-down list boxes, list boxes, option buttons (also referred to as radio buttons), check boxes, and so on. Some of these controls may be included in the form definition file 202. This file 202 is assigned the exemplary extension ‘.xsf’.
A schema file 204 is used to constrain and validate the XML document 102. This file is assigned the exemplary extension ‘.xsd’. View files 206 are used to transform the XML document 102 for presentation as views (visual surfaces 106). These files are used to implement the data mapping module 118 discussed in connection with FIG. 1. There can be multiple view files 206 corresponding to multiple possible views (i.e., visual surfaces 106) that the editing user 108 can select from. The view files 206 are assigned the exemplary extension ‘.xsl’. A default data file 208 contains default data that can be initially displayed in the view when an editing user 108 first opens the electronic form and has not yet begun to edit the fields. This file 208 is assigned the exemplary extension ‘.xml’. Finally, business logic files 210 provide programming code used to implement specific editing behavior, data validation, event handlers, control of data flow, and other features. Such programs can be written in any kind of language, such as the JScript® or VBSCRIPT scripting languages. In this case, these files are assigned the exemplary extensions ‘.js’ or ‘.vb’ (for JScript® and VBSCRIPT scripting languages, respectively).
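The roles of the files in the collection can be summarized in a short Python sketch that groups file names by the exemplary extensions given above. The file names and the `classify` helper are hypothetical; only the extension-to-role pairing comes from the description.

```python
from pathlib import PurePosixPath

# Exemplary extensions from the description, mapped to the role of each file.
ROLE_BY_EXTENSION = {
    ".xsf": "form definition (manifest)",
    ".xsd": "schema",
    ".xsl": "view",
    ".xml": "default data",
    ".js": "business logic",
    ".vb": "business logic",
}

def classify(filenames):
    """Group file names by the role implied by their extension."""
    roles = {}
    for name in filenames:
        extension = PurePosixPath(name).suffix.lower()
        role = ROLE_BY_EXTENSION.get(extension, "unknown")
        roles.setdefault(role, []).append(name)
    return roles

# Hypothetical contents of one packaged solution.
package = ["manifest.xsf", "trip.xsd", "summary.xsl", "detail.xsl", "template.xml", "rules.js"]
print(classify(package))
```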
FIG. 3 shows an exemplary architecture 300 for an electronic forms application that can be used to both create and fill out an electronic form. The architecture 300 includes a solution design component 302 for building a solution corresponding to a data file for which the electronic form can be used, an XML runtime component 304 to enter and view data in the electronic form, and optionally one or more exemplary XML solutions 306. Each of the components of the architecture 300 will now be discussed.
The solution design component 302 of the architecture 300, such as is seen at reference numeral 302 in FIG. 3, allows a solution to be built. The solution design component 302 provides a user interface (UI) to handle all the design requirements for common XML solutions. The result of the solution design component 302 is the set of files that represent a corresponding XML solution file 136. The structure of the XML solution file 136 declaratively defines the output of the solution design component 302. Included in the solution design component 302 are an XSL editor and solution builder 310. Any script editor can be used to edit business logic script used in the electronic form. The supporting files 312 communicate with one or more application files 308 that are useful in building the XML solution file 136 for an XML document 102.
In one implementation, the solution design component 302 provides a WYSIWYG forms designer and editor based on XML standards that can be used for generic XML schemas. As such, XSL editor and solution builder 310 need not be characterized as including an XML editor. Moreover, notepad 314 and support files 312 need not be present.
The runtime component 304 includes an editor frame 320 that includes XML editing 322. The XML editing 322 includes capabilities for an Instantiated Content Model (ICM). The ICM, as previously disclosed, allows for a minimized expression of all of the possible portions of the XML fragments that can be inserted or deleted when the electronic form is being filled out by the editing user 108. This minimized expression in turn reduces the size of the solution infrastructure 324, discussed below, which in turn improves the performance of the rendering of the electronic form. The XML editing 322, in conjunction with the instantiated content model, enables the editing user 108 to validly fill out the electronic form without latency induced by the size of the solution infrastructure 324.
In addition to the foregoing, the editor frame 320 bidirectionally communicates with the solution infrastructure 324, such as XML solution 302 seen in FIG. 3. Each of the solution infrastructure 324 and the XML store 316 bidirectionally communicates with one or more XML documents 330. Additionally, the solution infrastructure 324 communicates with the one or more application files 308. As seen in FIG. 2, the XML document 102 points to the solution file 136 that should process the XML document 102 on a computing device (e.g., a personal computer). When the editing user 108 uses the computing device to navigate to the XML document 102, the solution infrastructure 324 loads the required solution file 136. If needed, the solution file 136 handles any contextual user interfaces (UI), runs business logic associated with the XML document 102 (e.g., business logic 210), and enforces security for all operations of the computing device.
The XML solution infrastructure 324 allows the editing user 108 of the computing device to access various XML data sources on the computing device, in an intranet, as well as on an extranet or the World Wide Web. Given the foregoing, XML Documents 330 can be displayed and edited using the XML Editing 322 of the editor frame 320.
Various exemplary solution files 340 can be provided to the editing user 108 of the computing device as part of the architecture 300, where the editing user 108 would like to see sample or exemplary solutions from which the user can learn about the data processing application 100. Exemplary solution files 340 can provide the editing user 108 with a guide for customizing electronic forms and for building new solutions based on the exemplary solutions.
FIG. 4 shows an overview of an exemplary apparatus 400 for implementing the data processing application 100 shown in FIG. 1. The apparatus 400 includes a computer 402 that contains one or more processing units 404 and memory 406. Among other information, the memory 406 can store an operating system 408 and the above-described data processing application 100, identified in FIG. 4 as a forms application 410. The forms application 410 can include data files 412 for storing the structured XML document 102, and a solution module 414. The solution module 414 comprises logic that specifies the appearance and behavior of the visual surface 106 as was described in connection with FIG. 1. The logic provided by solution module 414 is, in turn, determined by a solution file (such as a solution file 136 composed of the files shown in FIGS. 1-2). The computer 402 is coupled to a collection of input devices 416, including the keyboard 112, mouse device 114, as well as other input devices 418. The computer 402 is also coupled to a display device 420.
In one exemplary implementation, the forms application 410 includes a design mode and an editing mode. The design mode presents a design UI 422 on the display device 420 for interaction with a designing user 424. The editing mode presents an editing UI 426 on the display device 420 for interaction with the editing user 108. In the design mode, the forms application 410 creates an electronic form 428, or modifies the structure of the electronic form 428 in a way that affects its basic schema. In other words, the design operation produces the solution file 136 that furnishes the electronic form 428. In the editing mode, the editing user 108 uses the electronic form 428 for its intended purpose—that is, by entering information into the electronic form 428 for a business-related purpose or other purpose.
In the design mode, the forms application 410 can be configured to depict the electronic form 428 under development using a split-screen display technique. More specifically, a forms view portion 430 of the design UI 422 is devoted to a depiction of the normal appearance of the electronic form 428. A data source view portion 432 of the visual surface is devoted to displaying a hierarchical tree 434 that conveys the organization of data fields in the electronic form 428.
An exemplary design UI 422 can allocate the visual surface 106 into the forms view portion 430 and the data source view portion 432. As described above, the forms view portion 430 contains a depiction of the normal appearance of the electronic form 428, in this case an exemplary form template 500a seen in FIG. 5a. The electronic form 428 can include a plurality of data entry fields. The data source view portion 432 includes the hierarchical tree 434 showing the nested layout of the data entry fields presented in the electronic form.
The forms application 410 provides multiple techniques for creating the electronic form. According to one technique, the electronic form can be created from scratch by building the electronic form from successively selected editing controls. In another technique, the electronic form can be created based on any pre-existing .xsd schema document (e.g., see schema 204 in FIG. 2) loaded into the forms application 410. The .xsd schema is an XML file that defines the structure and content type of the XML files that are associated with it. In another technique, the electronic form can be created based on an XML document or file. The forms application 410 will then create a schema based on the information in the input XML document or file. In another technique, the electronic form can be created based on a database schema. In this case, the forms application 410 will extract the schema of the data and convert that record set to an XML representation. Still other techniques can be used to create electronic forms.
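The technique of deriving a schema from an input XML document can be illustrated with a rough Python sketch. It records only which child element names appear under each element name and which of them repeat; a real schema generator would also infer data types, ordering, and occurrence constraints. The `infer_structure` helper and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def infer_structure(element, schema=None):
    """Record, per element name, its child names and which children repeat."""
    if schema is None:
        schema = {}
    entry = schema.setdefault(element.tag, {"children": set(), "repeating": set()})
    counts = Counter(child.tag for child in element)
    for tag, count in counts.items():
        entry["children"].add(tag)
        if count > 1:
            entry["repeating"].add(tag)   # seen more than once under one parent
    for child in element:
        infer_structure(child, schema)
    return schema

# Hypothetical input document with a repeating <date> field.
sample = ET.fromstring(
    "<trips><trip><date>1</date><date>2</date><city>Oslo</city></trip></trips>"
)
structure = infer_structure(sample)
print(structure["trip"])
```

Keying the result by element name alone is a simplification; elements that share a name but appear in different contexts would be merged.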
Once a form has been created, its design (and associated schema) can be further modified. For example, the forms application 410 allows the designing user 424 to modify existing editing controls used in the electronic form, or add additional editing controls.
The creation of the electronic form also creates an associated solution file. The solution file effectively forms a template that can be archived and subsequently used in a business (or other) environment. FIG. 5a demonstrates an exemplary use of the exemplary electronic form 500a after it has been created in the design mode of operation of the forms application 410. More specifically, FIG. 5a shows the presentation of the exemplary electronic form 500a in the editing mode of operation of the forms application 410. In this case, the editing user 108 can enter data into the data entry fields in the editing UI 426. For instance, the editing user 108 can enter text at reference numeral 502 into a text field 510a. The editing user 108 can select a particular part of the exemplary electronic form 500a in a conventional manner, such as by pointing to and clicking on a particular field in the exemplary electronic form using the mouse device 114.
Data entry fields in the electronic form are mapped to underlying structured XML document 102—in this case, an XML document 520a which is represented as a tree t having a plurality of nodes ni. This mapping is achieved via annotations added to the HTML document used to render the exemplary electronic form 500a. More specifically, the annotations act as references which point to particular parts of the XML document 520a associated with the data entry fields in the exemplary electronic form 500a. Through this mechanism, the data entered by the editing user 108 is routed back to the XML document 520a and stored in its data structure at appropriate locations. This mapping functionality is represented in FIG. 5a by the arrow 518a. 
One exemplary implementation includes a method that applies an XSLT stylesheet to an XML document to create an HTML view. At least some of the HTML elements in the HTML view are associated with a specifically named attribute. The HTML elements that are associated with the specifically named attribute have respective corresponding XML nodes in the XML document, where the location of each XML node in the XML document is determined by the value of the specifically named attribute. Once edits to the HTML elements associated with the specifically named attribute have been received in an interactive session with an editing user, the received edits are saved back into the nodes in the XML document that respectively correspond to the HTML elements associated with the specifically named attribute.
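The final step of this method, saving received edits back into the corresponding XML nodes, can be sketched in Python as follows. The sample data, the form in which edits arrive (a mapping from attribute value to new text), and the `save_edits` helper are assumptions made for illustration; the paths use the limited XPath subset supported by the standard library and are written relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical underlying XML document.
data = ET.fromstring("<trip><traveler>Ann</traveler><city>Oslo</city></trip>")

# Edits received from the view: value of the named attribute -> new text.
edits = {"./city": "Bergen", "./traveler": "Ann Lee"}

def save_edits(root, edits_by_path):
    """Write each edited value into the node located by its attribute value."""
    for path, new_text in edits_by_path.items():
        node = root.find(path)
        if node is None:
            raise KeyError(f"no data node for annotation {path!r}")
        node.text = new_text

save_edits(data, edits)
print(ET.tostring(data, encoding="unicode"))
```

Raising an error for an attribute value that locates no node is a design choice of this sketch; it makes a stale reference visible instead of silently discarding the edit.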
Electronic form 500a is displayed in the editing UI 528 by the forms application 410 so that an editing user 108 can enter data into the depicted data entry fields of a corresponding data entry screen. The data entry fields 504a, 506a, 508a, 510a, 512a, and 514a on the data entry screen are being used to collect information. Information is kept in a schema associated with the underlying structured XML document 102 represented by the XML document 520a as to what will be considered to be valid data that can be entered into the data entry fields for the electronic form 500a. Once validated, these data are then subjected to a mapping operation 518a for entry into the XML document 520a. Business logic for validation of the data being entered can be quite varied and can be stored so as to be associated as definitions for the electronic form 500a (i.e., in FIG. 2, see form definition (.XSF) 202 and business logic file 210 for storage of validation criteria).
Each data entry field has a corresponding place in the XML document 520a seen in FIG. 5a. The data entry fields 504a, 506a, 508a, 510a, 512a, and 514a respectively correspond to the nodes 504b, 506b, 508b, 510b, 512b, and 514b in the XML document 520a. Repetitive sets of data can be entered for each field 506a, where each such data set can be represented by the data entry fields 508a-512a, and where each field 506a can have any number of data sets, from zero upward. In this case, these data sets are represented in the XML document 520a by nodes 508b-512b(1-I), where from 1 to “I” different data sets can be provided for each field 506a.
The exemplary electronic form template 500a is designed around a data source, which here is XML document 520a, also referred to herein as a data tree 520a. After designing the electronic form template 500a, a form designer may wish to modify the data source. Referring now to FIGS. 5a-5b, the form designer may wish to modify data tree 520a, which can be understood as a tree t having nodes n that is to be modified into a tree t′ having nodes n′, as seen by a data tree 520b. A reason for such modifications can be that changes are needed after the data source 520a has been developed, where the data source definition (e.g., the schema for data source 520a) changes during development. Also, a first version of form template 500a that is designed against a first version of a data source 520a may need to be changed so as to create a second version of the form template 500b that is designed against a second version of that data source 520b. In these cases, it is desirable to reuse the previous forms design work by using the first version of the form template 500a as a starting place to design the second version of the form template 500b that is designed against the second version of that data source 520b. 
Since there are a large number of dependencies between various pieces of information between the form template and the data tree (e.g., data source), it is difficult to change the data source into a new or modified data source while making accurate and precise changes that properly correspond to the form template. Examples of such dependencies include, but are not limited to, rules for validating data entered into fields of the form template, rules for binding a data entry field to a corresponding field in the data source, business logic, the promotion of properties from a data entry field to other data entry fields, initial formatting of data in a data entry field, etc. It would be an advantage in the art to provide an operation by which both an original data source and a new or modified data source can be specified. The operation would then detect changes between the original data source and the new or modified data source. The detected changes would then be used to make corresponding updates to dependencies in the form template. As a result, a correspondingly new or modified form template would be produced.
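The detection step of such an operation can be sketched in Python under simplifying assumptions. Here the original data source (tree t) and the modified data source (tree t′) are compared by the set of element paths each contains, and form bindings that point at removed paths are flagged. The sample trees, the `bindings` mapping, and the helper names are hypothetical, and the comparison does not recognize a renamed or moved node as such; it reports a removal and an addition.

```python
import xml.etree.ElementTree as ET

def node_paths(element, prefix=""):
    """Return the set of element paths found in a data tree."""
    path = f"{prefix}/{element.tag}"
    paths = {path}
    for child in element:
        paths |= node_paths(child, path)
    return paths

def diff_sources(original, modified):
    """Report paths removed from, and added to, the original data source."""
    before, after = node_paths(original), node_paths(modified)
    return {"removed": sorted(before - after), "added": sorted(after - before)}

# Hypothetical tree t and tree t' (original and modified data sources).
t = ET.fromstring("<trip><traveler/><city/></trip>")
t_prime = ET.fromstring("<trip><traveler/><destination/><cost/></trip>")
changes = diff_sources(t, t_prime)

# Hypothetical bindings from data entry fields to data source paths.
bindings = {"nameBox": "/trip/traveler", "cityBox": "/trip/city"}
stale = sorted(field for field, path in bindings.items() if path in changes["removed"])

print(changes)
print(stale)
```

The list of stale bindings is the kind of information that could then drive corresponding updates to the dependencies in the form template.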