Extensible Markup Language (XML) is a common data language employed for various applications, such as website development and other applications typically designed for the Internet. Generally, XML is considered a markup language for documents containing structured information. Structured information includes both content (words, pictures, and so forth) and some indication of what role that content plays (for example, content in a section heading has a different meaning from content in a footnote, which in turn differs from content in a figure caption or a database table, and so forth). Almost all documents have some structure. Thus, a markup language such as XML provides a mechanism to identify structures in a document, where the XML specification defines a standard way to add markup to documents. A related technology is XML Schema Definition (XSD), an XML-based language that defines validation rules for XML files. Because XSD is XML-based, XSD statements are themselves written in XML files. Since XSD defines validation rules for XML files, it can be utilized to replace Document Type Definitions (DTD), which is another language for defining XML validation rules.
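To make the point that XSD statements are themselves XML concrete, the following minimal Python sketch parses a small schema with an ordinary XML parser and lists the elements it declares. The schema, the sample document, and the helper names (declared_element_names, child_tags) are hypothetical and chosen only for illustration. The Python standard library does not validate documents against XSD, so the sketch only reads both texts and does not perform validation.

```python
import xml.etree.ElementTree as ET

# Namespace that identifies XSD vocabulary (xs:schema, xs:element, ...).
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema: a <note> element with two string children.
SCHEMA_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical XML document that the schema above is meant to describe.
DOCUMENT_TEXT = "<note><heading>Reminder</heading><body>Call the lab.</body></note>"


def declared_element_names(schema_text):
    """Return the names of all xs:element declarations, in document order."""
    root = ET.fromstring(schema_text)
    return [element.get("name") for element in root.iter(XS + "element")]


def child_tags(document_text):
    """Return the tags of the direct children of the document's root element."""
    return [child.tag for child in ET.fromstring(document_text)]
```

The same parser handles both texts, which is the sense in which an XSD file is simply another XML file with a reserved vocabulary.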
Since the structure of XML files and XSD definitions is defined by textual data and statements, tools for manipulating such languages have not developed along the same path as tools for traditional code-based models of source code development. For instance, code-based models typically operate with object classes, and tools have developed over time around those classes to create desired software functionality. Although XML and XSD type declarations may have some similarity to code-based models and class structures, the differences are such that XML/XSD tools have, over the last several years, developed along a different path and offer different types of functionality than code-based tools. One area where this difference is especially apparent is in how files are operated upon in the XML/XSD development environment, where files are processed according to a “one-file-at-a-time” format, which presents substantial challenges to developers.
In one area where such challenges are encountered, a large number of XML schemas contain multiple XSD files. A collection of XSD files that defines a single XML schema is referred to as a schema set, where the larger the domain described by the schema, the larger its schema set. For example, an HL7 schema includes multiple schema sets, which can have hundreds or thousands of XSD files. As noted above, the tools that developers employ to work with schemas operate on only one file at a time. This makes schema set operations either impossible or very difficult to achieve.
To illustrate the single-file operation and processing problem, consider searching for a string in a schema set containing a large number of files. First, the user needs to know all the files in the set. To achieve this, the user would generally start with the top file in the set and then recursively traverse its “include” and “import” statements. Then, the user would have to either search each file individually or perform a bulk “find in files” operation. Searching files individually is very time-consuming, especially for large schemas such as HL7. Performing a bulk “find in files” operation is also not trivial, since the files can be located in multiple folders, on multiple machines, or in multiple Internet locations.
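The manual procedure described above can be sketched in Python as follows. This is a minimal illustration, not a description of any existing tool: the function names collect_schema_set and search_schema_set are hypothetical, and the sketch assumes that every schemaLocation attribute is a relative or absolute path on the local file system. Files located on other machines or at Internet locations, which the text identifies as part of the difficulty, are not handled, and a referenced file that does not exist raises an error.

```python
import os
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
# Top-level XSD statements that pull another schema file into the set.
REFERENCE_TAGS = {XS + "include", XS + "import", XS + "redefine"}


def collect_schema_set(top_file):
    """Return a sorted list of every XSD file reachable from top_file.

    Assumes each schemaLocation is a local path, resolved relative to
    the directory of the file that contains the reference.
    """
    seen = set()
    pending = [os.path.abspath(top_file)]
    while pending:
        path = pending.pop()
        if path in seen:
            continue  # already visited; also guards against circular references
        seen.add(path)
        root = ET.parse(path).getroot()
        for child in root:
            location = child.get("schemaLocation")
            if child.tag in REFERENCE_TAGS and location:
                base = os.path.dirname(path)
                pending.append(os.path.normpath(os.path.join(base, location)))
    return sorted(seen)


def search_schema_set(top_file, needle):
    """Return (path, line_number, line_text) for each line containing needle."""
    hits = []
    for path in collect_schema_set(top_file):
        with open(path, encoding="utf-8") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    hits.append((path, number, line.strip()))
    return hits
```

Even under these simplifying assumptions, the search must first discover the set by opening every file in it, which is the step that single-file tools leave to the user.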