Technical Field
The present invention relates generally to information processing and, in particular, to automatic attribute structural variation detection for a Not only Structured Query Language (NoSQL) database.
Description of the Related Art
Unlike semi-structured data, such as eXtensible Markup Language (XML) which is always associated with XML Schema Definition (XSD), document stores using JavaScript Object Notation (JSON) are without explicit metadata and lack of schema enforcement through the provided Application Programming Interfaces (APIs). Thus, there is a need for an effective mechanism to discover metadata from document stores in a post-processing fashion.
However, retrieving metadata from a document store is a challenging task. As there is no constraint to guarantee only one object type in one collection (equivalent to a table in relational databases), a collection may include records corresponding to more than one object type whereas a relational table persists a single object type with only one schema. Moreover, schemas (i.e., sets of attributes) of the same object type in a collection might also vary because of attribute sparseness in Not only Structured Query Language (NoSQL) databases as well as data model evolutions caused by highly interactive adoption and removal of features.