Extensible Markup Language (XML) is increasingly the preferred format for transferring data. XML is a tag-based hierarchical language that is extremely rich in terms of the data that it can represent. For example, XML can represent data spanning the spectrum from semi-structured data (such as one would find in a word-processing document) to generally structured data (such as that contained in a table). XML is well suited to many types of communication, including business-to-business and client-to-server communication. For more on XML, XSLT (Extensible Stylesheet Language Transformations), and XSD (schemas), the reader is referred to the following documents, which are the work of, and available from, the W3C (World Wide Web Consortium): XML Schema Part 2: Datatypes; Extensible Markup Language (XML) 1.0 Second Edition specification; XML Schema Part 1: Structures; and XSL Transformations (XSLT) Version 1.0.
Before data can be transferred, however, it must first be collected. Electronic forms are commonly used to collect data. Electronic forms collect data through data-entry fields, each of which typically allows a user to enter data. Once the data is received, it can be stored in an XML data file. The data from a particular data-entry field typically is stored in a particular node of the XML data file.
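As a minimal sketch of the mapping just described, the following Python fragment stores each data-entry field in its own node of an XML data file. The function name `form_to_xml`, the root element name `form`, and the sample field names are illustrative assumptions, not part of any particular electronic-form product.

```python
import xml.etree.ElementTree as ET


def form_to_xml(fields: dict[str, str], root_name: str = "form") -> str:
    """Serialize form fields so that each field becomes one child node.

    Hypothetical helper: field names are assumed to be valid XML element names.
    """
    root = ET.Element(root_name)
    for name, value in fields.items():
        node = ET.SubElement(root, name)  # one node per data-entry field
        node.text = value
    return ET.tostring(root, encoding="unicode")


# Two illustrative data-entry fields and the resulting XML data file content.
xml_text = form_to_xml({"name": "Ada", "quantity": "3"})
print(xml_text)  # <form><name>Ada</name><quantity>3</quantity></form>
```

Note that nothing in this step checks the values; whatever the user typed is written to the node as-is.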
Users often enter invalid data into data-entry fields, however. Invalid data, when stored in a data file, can misinform people and cause unexpected behavior in software relying on the data file. Because of this, businesses and individuals expend extensive time and effort to prevent invalid data from making its way into XML data files.
One way to help prevent invalid data from corrupting an XML data file is to validate the data before the data file is saved or submitted. By validating the data file before it is saved or submitted, invalid data can be corrected before it is permanently stored in the data file or used by another application. Validation typically is performed when a user attempts to submit or save the entire form, and is thus performed on a group of individual data fields at one time.
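Submit-time validation of this kind can be sketched as follows. This is an illustration under stated assumptions rather than any specific validation application: the `validate_on_submit` function, the rule table, and the sample fields are all hypothetical. The point it shows is that every field is checked in one pass and the result is a flat list of messages, separate from the fields themselves.

```python
from typing import Callable

# A rule takes the field's text and reports whether it is acceptable.
Rule = Callable[[str], bool]


def validate_on_submit(fields: dict[str, str], rules: dict[str, Rule]) -> list[str]:
    """Check all fields at once, as is done when the whole form is saved or submitted."""
    errors = []
    for name, is_valid in rules.items():
        value = fields.get(name, "")
        if not is_valid(value):
            errors.append(f"{name}: invalid value {value!r}")
    return errors


# Illustrative rules: quantity must be digits, email must contain "@".
rules = {"quantity": str.isdigit, "email": lambda v: "@" in v}
fields = {"quantity": "three", "email": "ada.example.org"}

errors = validate_on_submit(fields, rules)
for message in errors:
    print(message)
```

The user sees the collected messages only after finishing the entire form, which is the timing the following paragraphs take issue with.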
One of the problems with this manner of validating data is that the user receives a list of errors disconnected from the data-entry fields from which the errors arise. These errors may be difficult to relate back to the data-entry fields in the electronic form, requiring users to hunt through the data-entry fields to find which error from the list relates to which data-entry field.
Another problem with this manner is that even after the user determines which error from the list relates to which data-entry field, the user may have to expend a lot of effort to fix the error if the error notification is received well after the user has moved on. Assume, for example, that the user has entered data from a 400-page source document into ninety-three data-entry fields. Assume also that once finished, the user attempts to save or submit the electronic form. A validation application then notifies the user of sixteen errors. After finding that the first error relates to the eleventh data-entry field out of ninety-three, the user will have to go back through the 400-page document to find the data that he or she was supposed to correctly enter into the eleventh data-entry field. This manner of validation can require extensive hunting through large or numerous source documents to fix old errors, wasting users' time.
Even worse, the validation application may return only the first of many errors. For this type of validation application, a user has to go back and fix the first error and then re-save or re-submit. If there are many errors in the electronic form, as is often the case, the user must go back and fix each one separately before re-saving or re-submitting to find the next error. Even with only a few errors, this process can take considerable time.
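The cost of a first-error-only validator can be made concrete with a small simulation. The sketch below is hypothetical throughout (`first_error`, `submissions_needed`, and the sample data are assumptions for illustration) and assumes that each correction the user supplies is itself valid.

```python
from typing import Callable, Optional

Rule = Callable[[str], bool]


def first_error(fields: dict[str, str], rules: dict[str, Rule]) -> Optional[str]:
    """Return the name of the first invalid field only, or None if all pass."""
    for name, is_valid in rules.items():
        if not is_valid(fields.get(name, "")):
            return name
    return None


def submissions_needed(
    fields: dict[str, str], rules: dict[str, Rule], corrections: dict[str, str]
) -> int:
    """Count submit attempts when each attempt reveals at most one error."""
    current = dict(fields)
    # At most one failed attempt per rule, plus the final successful one.
    for attempt in range(1, len(rules) + 2):
        bad = first_error(current, rules)
        if bad is None:
            return attempt
        current[bad] = corrections[bad]  # user fixes that one field, then resubmits
    raise ValueError("corrections did not make the form valid")


rules = {"quantity": str.isdigit, "year": str.isdigit, "email": lambda v: "@" in v}
fields = {"quantity": "three", "year": "MMIII", "email": "ada.example.org"}
corrections = {"quantity": "3", "year": "2003", "email": "ada@example.org"}

attempts = submissions_needed(fields, rules, corrections)
print(attempts)  # 4: three failed submissions, then one that succeeds
```

With n invalid fields, the user makes n + 1 submissions, and when validation runs on a server each of those is a separate round trip.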
Another problem with this process is that if the user submits the electronic form to a server, it taxes the server. A server can be slowed down by having to validate electronic forms, reducing a server's ability to perform other important tasks.
In addition to these problems, the current way of validating data for structured data files can let some undesired data through. While this can sometimes be prevented, doing so can require extensive time and sophisticated programming abilities.
For these reasons, validation of data for XML data files can consume much of a data-entry user's time and can tax servers. In addition, unless a skilled programmer expends considerable effort, significant amounts of undesired data can get through.