Semi-structured documents do not conform to a formal structure such as a relational schema, but they do contain tags or other markers that separate semantic elements and enforce hierarchies of records and fields. Extensible Markup Language (XML) documents and JavaScript Object Notation (JSON) documents are examples of semi-structured documents. Different query tools are available for semi-structured document databases. For example, XML Path Language (XPath) is a query language for selecting nodes within an XML document. Nevertheless, many database administrators and users prefer the traditional relational database model and its popular query language, Structured Query Language (SQL). Others prefer processing in a triple store. A triple store is a database optimized for the storage and retrieval of triples, each commonly expressed as a subject, a predicate and an object.
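By way of illustration only, the following minimal sketch shows node selection within an XML document using the limited XPath subset supported by the Python standard library module xml.etree.ElementTree. The sample document and its element and attribute names (library, book, title, year, id) are hypothetical and are not drawn from any particular database.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; all element and attribute names are illustrative.
XML_TEXT = """
<library>
  <book id="b1"><title>Dune</title><year>1965</year></book>
  <book id="b2"><title>Neuromancer</title><year>1984</year></book>
</library>
"""

root = ET.fromstring(XML_TEXT)

# Select every <title> node that is a child of a <book> child of the root.
titles = [node.text for node in root.findall("./book/title")]

# Select the <title> of the <book> whose id attribute equals "b2".
title_b2 = root.find("./book[@id='b2']/title").text

print(titles)
print(title_b2)
```

The path expressions address nodes by their position in the tag hierarchy rather than by table and column, which is the principal difference from a relational query.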
As with a relational database, information is stored in a triple store and retrieved via a query language, such as SPARQL. SPARQL is a query language for Resource Description Framework (RDF) data, standardized by the RDF Data Access Working Group of the World Wide Web Consortium. SPARQL is a recursive acronym for SPARQL Protocol and RDF Query Language. A SPARQL query may comprise triple patterns, conjunctions, disjunctions and optional patterns.
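By way of illustration only, the following minimal sketch shows how a conjunction of triple patterns may be matched against a small in-memory set of triples. It is not a SPARQL implementation and uses no RDF library. The names TRIPLES, match_pattern and query, the sample data, and the convention that a term beginning with "?" denotes a variable are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Binding = Dict[str, str]        # variable name -> bound value

# Hypothetical sample data, not taken from any real store.
TRIPLES = {
    ("b1", "title", "Dune"),
    ("b1", "year", "1965"),
    ("b2", "title", "Neuromancer"),
}


def match_pattern(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend binding so that pattern matches triple, or return None on conflict."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant term must match exactly.
            return None
    return result


def query(patterns: Iterable[Triple], triples: Iterable[Triple]) -> List[Binding]:
    """Return every binding that satisfies all patterns (a conjunction)."""
    triples = list(triples)
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        extended_bindings: List[Binding] = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


# Find each book that has both a title and a year.
rows = query([("?book", "title", "?title"), ("?book", "year", "?year")], TRIPLES)
print(rows)
```

In this sketch, the shared variable ?book joins the two patterns, so only subjects having both a title triple and a year triple are returned. Disjunctions and optional patterns are omitted for brevity.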
In view of the foregoing, it would be desirable to provide users with multiple query format options for data in a semi-structured database. Unfortunately, extracting relational data and triples from a semi-structured document is not straightforward. Therefore, there is a need for tools that extract such data to support multiple query formats in connection with a semi-structured document database.