Extensible Markup Language (hereinafter referred to as XML) is designed to transport and store data. XML has gained importance as a standard for data encoding and exchange over the Internet. With the increase in XML applications such as e-business transactions and XML middleware systems, effective and efficient delivery of XML data has also become important. Further, in recent years XML has gained popularity for representing semi-structured data, as more and more businesses store and manage data in textual, image and multimedia formats. These businesses include the healthcare, scientific data management and analysis, pharmaceutical and retail industries.
Analysis of XML data has gained importance for business analytics across a variety of industries, where it supports business decisions and strategies such as forecasting, prediction, trend analysis and resource management.
Normally, mining or analyzing XML data involves pre-processing and post-processing steps. One common pre-processing step is to convert the XML data to a relational data format, and subsequently use conventional data analytics tools to gain insights into the XML data.
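As an illustration of this conversion step, the following is a minimal sketch that flattens a small XML document into rows and loads them into a relational table, after which an ordinary SQL query can be run. The element and attribute names (`orders`, `order`, `item`, `sku`, `qty`), the table name `order_items`, and the helper functions are hypothetical and chosen only for this example; real XML data would need its own mapping.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; the schema is an assumption for illustration.
XML_DOC = """
<orders>
  <order id="1" customer="alice">
    <item sku="A1" qty="2"/>
    <item sku="B7" qty="1"/>
  </order>
  <order id="2" customer="bob">
    <item sku="A1" qty="5"/>
  </order>
</orders>
"""


def xml_to_rows(xml_text):
    """Flatten nested order/item elements into one flat row per item."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for order in root.findall("order"):
        for item in order.findall("item"):
            rows.append((
                int(order.get("id")),
                order.get("customer"),
                item.get("sku"),
                int(item.get("qty")),
            ))
    return rows


def load_rows(rows):
    """Load the flattened rows into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE order_items "
        "(order_id INTEGER, customer TEXT, sku TEXT, qty INTEGER)"
    )
    conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


rows = xml_to_rows(XML_DOC)
conn = load_rows(rows)

# Once the data is relational, a conventional SQL aggregate can be applied.
totals = conn.execute(
    "SELECT sku, SUM(qty) FROM order_items GROUP BY sku ORDER BY sku"
).fetchall()
print(totals)
```

The sketch shows why this approach counts as pre-processing: the nesting of the XML has to be mapped onto flat rows before any conventional tool can be used.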
Another conventional way to mine XML data is to use an XML mining tool directly on the XML data. In such methodologies, ‘XQuery’ is used as a mechanism to mine association rules from XML data. XQuery is a query and functional programming language designed to query collections of XML data. XQuery 1.0 was developed by the XML Query working group of the World Wide Web Consortium (W3C), which is the main international standards organization for the World Wide Web (abbreviated WWW or W3).
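To make the idea of mining association rules directly from XML concrete, the following is a rough sketch written in Python rather than XQuery. It reads transactions straight from the XML tree, without a relational conversion, and derives simple one-to-one rules with their support and confidence. The document layout (`transactions`, `transaction`, `item`), the thresholds, and the function name `mine_pair_rules` are assumptions made for this example and do not represent any particular XQuery-based tool.

```python
import xml.etree.ElementTree as ET
from itertools import permutations

# Hypothetical transaction data; the schema is an assumption for illustration.
XML_TRANSACTIONS = """
<transactions>
  <transaction><item>bread</item><item>milk</item></transaction>
  <transaction><item>bread</item><item>butter</item></transaction>
  <transaction><item>bread</item><item>milk</item><item>butter</item></transaction>
  <transaction><item>milk</item></transaction>
</transactions>
"""


def mine_pair_rules(xml_text, min_support=0.5, min_confidence=0.6):
    """Return rules (a, b, support, confidence) meaning 'a implies b'."""
    root = ET.fromstring(xml_text.strip())
    # Each transaction becomes a set of item names read directly from the XML.
    baskets = [
        {item.text for item in t.findall("item")}
        for t in root.findall("transaction")
    ]
    n = len(baskets)
    items = sorted(set().union(*baskets))
    rules = []
    for a, b in permutations(items, 2):
        count_a = sum(1 for basket in baskets if a in basket)
        count_ab = sum(1 for basket in baskets if a in basket and b in basket)
        support = count_ab / n            # fraction of baskets with both items
        confidence = count_ab / count_a   # fraction of a-baskets that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


rules = mine_pair_rules(XML_TRANSACTIONS)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

This brute-force enumeration of item pairs is only meant to show the support and confidence calculations; it does not scale to large item sets the way a dedicated mining algorithm would.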