1. Technical Field
The present disclosure relates to data processing systems and more specifically to comparing very large XML data.
2. Related Art
XML (eXtensible Markup Language) refers to a language/specification used for describing information/data. XML enables users to specify desired characteristics of the data being described in the form of associated elements (typically containing corresponding start and end tags) and/or attributes. Thus, XML data generally contains elements (or attributes) set equal to corresponding desired data values to describe the data.
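As a purely illustrative sketch of the elements and attributes noted above, the following Python fragment parses a small piece of XML data and lists each element with its attributes and data value. The element names (order, customer, item), the attribute names (id, sku, quantity) and the helper function describe() are assumptions chosen only for illustration and do not come from the present disclosure.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data: element and attribute names are illustrative only.
SAMPLE = """<order id="1001">
  <customer>Alice</customer>
  <item sku="A-7" quantity="2">Widget</item>
</order>"""


def describe(xml_text):
    """Return (tag, attributes, text) for every element in document order."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, dict(el.attrib), (el.text or "").strip())
        for el in root.iter()
    ]


if __name__ == "__main__":
    for tag, attrs, text in describe(SAMPLE):
        print(tag, attrs, repr(text))
```

Here the start tag and end tag of each element enclose the data value being described (for example, "Widget"), while attributes such as sku and quantity carry additional characteristics of that data.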
The user may specify any desired elements and attributes (and combination thereof) for describing the data. Such a feature is in sharp contrast to other markup languages such as HTML (hypertext markup language), where the elements/attributes are generally pre-defined, and a user is enabled only to associate the data being described with the desired pre-defined elements/attributes.
There is often a need to compare XML data, for example, to determine whether two versions/copies of XML data match each other. Comparing XML data presents challenges at least due to factors such as optional elements/attributes, the absence of set limits on the number of levels of indentation, and the amount of data that may be associated with individual elements/attributes.
The challenges are compounded as the size of the XML data becomes very large.
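One of the challenges noted above may be appreciated from the following minimal sketch, which is not the approach of the present disclosure. It assumes that both copies of the XML data fit in memory, which is precisely the assumption that fails for very large XML data. The function names same_element() and xml_equal() and the sample data V1, V2 and V3 are hypothetical. The sketch ignores text that follows a child element's end tag and treats the order of child elements as significant.

```python
import xml.etree.ElementTree as ET


def same_element(a, b):
    """Recursively compare tag, attributes, trimmed text, and ordered children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if len(a) != len(b):  # differing number of child elements
        return False
    return all(same_element(x, y) for x, y in zip(a, b))


def xml_equal(text_a, text_b):
    """Parse both documents fully into memory and compare their trees."""
    return same_element(ET.fromstring(text_a), ET.fromstring(text_b))


# Hypothetical sample data: V2 differs from V1 only in attribute order and
# indentation, whereas V3 differs in an actual data value.
V1 = '<order id="1" status="new"><item>Widget</item></order>'
V2 = '<order status="new" id="1">\n  <item>Widget</item>\n</order>'
V3 = '<order id="1" status="new"><item>Gadget</item></order>'

if __name__ == "__main__":
    print(V1 == V2)           # False: the raw text differs
    print(xml_equal(V1, V2))  # True: the described data is the same
    print(xml_equal(V1, V3))  # False: a data value differs
```

A character-by-character comparison reports V1 and V2 as different even though they describe the same data, while the structural comparison must hold both parsed trees in memory at once, a cost that grows with the size of the XML data.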
In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements. The drawing in which an element first appears is indicated by the leftmost digit(s) in the corresponding reference number.