1. Technical Field
This invention relates to categorizing data in an XSD document, and parsing the data based upon the categories assigned thereto. More specifically, the invention relates to managing data in a related XML document by creating separate XML documents, with the separate XML documents being directly related to the categorized data in the XSD document.
2. Description of the Prior Art
As more information becomes available online, automated tools for publishing information in a variety of formats become increasingly important. One commonly employed tool for imposing structure on information is the Extensible Markup Language, also known as XML. XML is a flexible way to create common information formats and share both the format and the data on the World Wide Web, intranets, and elsewhere. It is a human-readable way of describing structured data. For example, computer makers might agree on a standard or common way to describe the information about a computer product (processor speed, memory size, and so forth) and then describe the product information format with XML. Such a standard way of describing data would enable a user to send an intelligent agent (a program) to each computer maker's Web site, gather data, and then make a valid comparison. XML can be used by any individual, group of individuals, or company that wants to share information in a consistent fashion.
XML is similar to the language of today's Web pages, the Hypertext Markup Language (HTML). Both XML and HTML contain markup symbols to describe the contents of a page or file. HTML, however, describes the content of a Web page (mainly text and graphic images) only in terms of how it is to be displayed and interacted with. For example, the letter “p” placed within markup tags starts a new paragraph. XML, by contrast, describes the content in terms of what data is being described. More specifically, XML allows designers to create their own customized tags, enabling the definition, transmission, validation, and interpretation of data between applications. In another example, the word “phonenum” placed within markup tags could indicate that the data that follows is a phone number. This means that an XML file can be processed purely as data by a program, or it can be stored with similar data on another computer, or, like an HTML file, it can be displayed. For example, depending on how the application in the receiving computer is to handle the phone number, the phone number could be stored, displayed, or dialed.
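By way of illustration only, the following minimal sketch shows how a program might process an XML file purely as data by locating the “phonenum” tag described above. The sample document, the “contact” and “name” tags, and the function name are assumptions made for this example and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the "phonenum" tag comes from the text above.
SAMPLE = "<contact><name>Ada</name><phonenum>555-0100</phonenum></contact>"


def extract_phone_numbers(xml_text):
    """Return the text content of every <phonenum> element in the document."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so nested <phonenum> elements are found too.
    return [element.text for element in root.iter("phonenum")]


phones = extract_phone_numbers(SAMPLE)
print(phones)
```

What the receiving application then does with the extracted values (storing, displaying, or dialing them) is independent of the markup itself.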
XML is “extensible” because, unlike HTML, the markup symbols are unlimited and self-defining. XML is actually a simpler and easier-to-use subset of the Standard Generalized Markup Language (SGML), the standard for how to create a document structure. It is expected that HTML and XML will be used together in many Web applications. XML markup, for example, may appear within an HTML page.
XML Schema Definition, known as XSD, specifies how to formally describe the elements in an Extensible Markup Language (XML) document. XSD is written in XML. This description can be used to verify that each item of content in a document adheres to the description of the element in which the content is to be placed. In general, a schema is an abstract representation of an object's characteristics and relationship to other objects. An XML schema represents the interrelationship between the attributes and elements of an XML object, such as a document or a portion of a document. To create a schema for a document, you analyze its structure, defining each structural element as you encounter it. For example, within a schema for a document describing a Web site, you would define a Web site element, a Web page element, and other elements that describe possible content divisions within any page on that site. Just as in XML and HTML, elements are defined within a set of tags.
There is a growing need for applications to process and maintain semi-structured hierarchical data governed by a flexible data model. XML is the technology of choice for meeting this need. XML is often used with XSD, wherein XSD is used to validate XML documents, enforce a certain structure, and validate individual data elements in the XML document. However, with the advent of globalization, there is also a growing need to maintain localized values for data in a plurality of languages. One solution known in the art is to store localized values side-by-side in the original XML document. Storing values side-by-side increases the size of the document, which affects the performance associated with processing the document, and also requires changes to the format of the data. Accordingly, this solution bloats the original document with the localized values and modifies the structure of the original document.
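The side-by-side approach may be sketched as follows. This is an illustrative assumption of how such a document might look, not a layout taken from any particular prior art system: the “product” and “label” tags, the use of the xml:lang attribute to distinguish languages, and the helper names are all hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the predefined xml: prefix under this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical original document holding a single, non-localized value.
ORIGINAL = "<product><label>Processor speed</label></product>"

# Hypothetical side-by-side layout: one <label> per language in the same document.
SIDE_BY_SIDE = (
    "<product>"
    '<label xml:lang="en">Processor speed</label>'
    '<label xml:lang="fr">Vitesse du processeur</label>'
    '<label xml:lang="de">Prozessorgeschwindigkeit</label>'
    "</product>"
)


def label_for(xml_text, lang):
    """Return the <label> text tagged with the given language, or None."""
    root = ET.fromstring(xml_text)
    for element in root.findall("label"):
        if element.get(XML_LANG) == lang:
            return element.text
    return None


def growth_factor(original, localized):
    """Ratio of the localized document size to the original size, in bytes."""
    return len(localized.encode("utf-8")) / len(original.encode("utf-8"))


print(label_for(SIDE_BY_SIDE, "fr"))
print(growth_factor(ORIGINAL, SIDE_BY_SIDE))
```

In this sketch the document grows with every added language, and a consumer written against the original single-label format must be changed to select among several labels, which illustrates both drawbacks noted above.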
Another known solution for addressing maintenance of a plurality of localized values is to assign unique identifiers to the data values in the XML document and to use a separate file, also known as a flat file, to store each identifier and its associated value. The flat file is not an XML structured document. Therefore, the values stored in the flat file cannot be validated using XSD validation. Accordingly, there are limitations associated with the prior art solutions for addressing assignment and storage of a plurality of localized values in an XML structured document.
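The flat file approach may be sketched as follows, under stated assumptions: the key=value line format, the “id” attribute, the identifier names, and the helper functions are hypothetical and chosen only to illustrate the idea of resolving identifiers against a separate, non-XML file.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document that carries identifiers instead of display values.
DOCUMENT = '<product><label id="lbl.cpu"/></product>'

# Hypothetical flat file for one language: plain key=value lines, not XML.
FLAT_FILE_FR = "lbl.cpu=Vitesse du processeur\nlbl.mem=Taille\n"


def load_flat_file(text):
    """Parse key=value lines into a dictionary, skipping blank lines."""
    values = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_labels(xml_text, values):
    """Map each <label> identifier in the document to its flat file value."""
    root = ET.fromstring(xml_text)
    return {
        element.get("id"): values.get(element.get("id"))
        for element in root.iter("label")
    }


resolved = resolve_labels(DOCUMENT, load_flat_file(FLAT_FILE_FR))
print(resolved)
```

Because the flat file in this sketch is plain text, nothing checks its entries against a schema; a missing or malformed entry surfaces only at lookup time, here as a value of None.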
Therefore, there is a need for a solution that maintains a plurality of localized values in an XML structured document while mitigating the bloating of the original document with localized values. Because the original document is an XML document, the solution needs to maintain the benefits associated with the XML structure while addressing the need for the localized values.