1. Technical Field
Present invention embodiments relate to Extensible Markup Language (XML) schema files, and more specifically, to decomposing a set of XML schema files into subsets. Each subset contains a root schema file and zero or more additional schema files that provide information for that root schema file, directly or indirectly.
2. Discussion of the Related Art
XML schemas are widely used to define standard document types for storing and exchanging information. An XML schema specifies a type of XML document (e.g., by constraining the content and attributes of allowed elements). Several languages exist for expressing XML schemas, including Document Type Definitions (DTD) and XML Schema Definitions (XSD).
Industry standards are often distributed as a number of XML schemas packaged in a single zip file. Some of these standards contain hundreds of XSD and Web Services Description Language (WSDL) schema files. When a user wants to import those schema files into an application and create XML parsing or composing jobs based on the imported schemas, the user first has to discover the interrelationships among the schema files. Importing the entire zip file often results in a type conflict, or in an invalid schema type because one type definition overwrites another. This outcome is common to many industry standard schemas (e.g., ACORD, IRS Tax schema).
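The overwriting problem described above can be illustrated with a minimal sketch, which is not part of any described embodiment. It assumes the schema files are available as in-memory strings and flags a global type as a candidate conflict when the same namespace and name pair is defined in more than one file. The function name `find_type_collisions`, the sample file names, and the `urn:example:tax` namespace are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSD_NS = "http://www.w3.org/2001/XMLSchema"
TYPE_TAGS = {f"{{{XSD_NS}}}complexType", f"{{{XSD_NS}}}simpleType"}


def find_type_collisions(schemas):
    """Map (targetNamespace, type name) to the files defining it, keeping
    only names that are defined in more than one file."""
    definitions = defaultdict(list)
    for file_name, text in schemas.items():
        root = ET.fromstring(text)
        namespace = root.get("targetNamespace", "")
        # Only global (top-level) type definitions can collide across files.
        for child in root:
            if child.tag in TYPE_TAGS and child.get("name"):
                definitions[(namespace, child.get("name"))].append(file_name)
    return {key: files for key, files in definitions.items() if len(files) > 1}


# Hypothetical sample data: two files define the same qualified type name.
SAMPLE = {
    "a.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
             'targetNamespace="urn:example:tax">'
             '<xs:complexType name="AddressType"/></xs:schema>',
    "b.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
             'targetNamespace="urn:example:tax">'
             '<xs:simpleType name="AddressType">'
             '<xs:restriction base="xs:string"/></xs:simpleType></xs:schema>',
    "c.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
             'targetNamespace="urn:example:tax">'
             '<xs:complexType name="FilerType"/></xs:schema>',
}

collisions = find_type_collisions(SAMPLE)
```

A reported collision is only a candidate: whether it causes an actual conflict depends on whether the two files end up in the same imported set.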
The current practice for discovering the relationships among schema files is to use editor tools to find the XML elements “include”, “import”, and “redefine” in the schema files and then determine the relationships between the schema files manually. This approach is practicable when the schema files are simple and few. However, industry standards can be complex and can contain many XML schema files. Furthermore, a number of industry standards reuse a qualified name for different elements representing different structures. Duplicate names in different XSD files can lead to invalid schema libraries that cannot be used for XML job designs.
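For illustration only, the manual search for “include”, “import”, and “redefine” elements can be approximated by a short script. The sketch below assumes XSD files held as in-memory strings and records the `schemaLocation` of each top-level reference element. The names `find_references` and `build_reference_map`, the sample file names, and the `urn:example:party` namespace are hypothetical, and the sketch does not resolve relative paths or handle WSDL files.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
REFERENCE_TAGS = {
    f"{{{XSD_NS}}}{name}" for name in ("include", "import", "redefine")
}


def find_references(xsd_text):
    """Return schemaLocation values of top-level include/import/redefine
    elements, in document order."""
    root = ET.fromstring(xsd_text)
    return [
        child.get("schemaLocation")
        for child in root
        if child.tag in REFERENCE_TAGS and child.get("schemaLocation")
    ]


def build_reference_map(schemas):
    """Map each schema file name to the files it references directly."""
    return {name: find_references(text) for name, text in schemas.items()}


# Hypothetical sample data standing in for the contents of a schema package.
SAMPLE = {
    "order.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">'
                 '<xs:include schemaLocation="common.xsd"/>'
                 '<xs:import namespace="urn:example:party" '
                 'schemaLocation="party.xsd"/>'
                 '</xs:schema>',
    "party.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
                 'targetNamespace="urn:example:party">'
                 '<xs:redefine schemaLocation="common.xsd"/>'
                 '</xs:schema>',
    "common.xsd": '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>',
}

reference_map = build_reference_map(SAMPLE)
```

The resulting map lists only direct references; indirect relationships would still have to be derived by following the map from file to file.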