1. Field
The present disclosure relates generally to data processing and computing systems and, more particularly, to a method and system for extracting structural information from a data file, e.g., a metadata definition from an XML file.
2. Description of the Related Art
XML (Extensible Markup Language) is a markup language for documents containing structured information. Structured information contains both content (e.g., words, pictures, etc.) and some indication of the role that content plays. For example, content in a section heading has a different meaning from content in a footnote, which in turn has a different meaning from content in a figure caption or in a database table. Almost all documents have some structure. A markup language is a mechanism for identifying structures in a document. The XML specification defines a standard way to add markup to documents.
XML is fast becoming the key language for information exchange over the web. XML, together with XSD (XML Schema Definition), is self-describing and platform independent. Most Fortune™ 500 companies already use XML for automatic processing of their invoices, billing, accounts, inventory, automatic replenishment, and data movement. As applications are increasingly designed to depend on XML, it is becoming essential to extract XML metadata (i.e., structural information concerning data stored within XML files) in order to replicate the metadata in other types of data structures.
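By way of illustration only, the notion of XML metadata described above may be sketched in a few lines of Python using the standard xml.etree.ElementTree module. The sample invoice document, the function name extract_structure, and the path-keyed result layout are hypothetical and chosen purely for illustration; this is a minimal sketch of what "structural information" can mean, not the method of the present disclosure.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only to illustrate the idea.
SAMPLE_XML = """
<invoice id="1001" currency="USD">
  <customer>
    <name>Acme Corp</name>
  </customer>
  <item sku="A-1"><quantity>2</quantity></item>
  <item sku="B-7"><quantity>5</quantity></item>
</invoice>
"""


def extract_structure(xml_text):
    """Return a dict mapping each element path to its attribute and child names.

    Only structure is recorded; text content (the data itself) is ignored.
    """
    root = ET.fromstring(xml_text.strip())
    structure = {}

    def visit(element, path):
        # Repeated elements (e.g., several <item> entries) share one path entry.
        entry = structure.setdefault(
            path, {"attributes": set(), "children": set()}
        )
        entry["attributes"].update(element.attrib.keys())
        for child in element:
            entry["children"].add(child.tag)
            visit(child, path + "/" + child.tag)

    visit(root, "/" + root.tag)
    return structure


if __name__ == "__main__":
    for path, info in sorted(extract_structure(SAMPLE_XML).items()):
        print(path, sorted(info["attributes"]), sorted(info["children"]))
```

In this sketch, the two item elements collapse into a single /invoice/item entry, which is the kind of structural summary that could then be replicated in another data structure such as a database table definition.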
Therefore, a need exists for techniques for extracting structural information from a data file, e.g., an XML file. A further need exists for techniques for automatic extraction of XML metadata.