Computer-based data systems, such as relational database management systems, typically organize data sets according to a fixed structure of tables and relationships. The structure may be described using an ontology, embodied in a database schema, comprising a data model that is used to represent the structure and reason about objects in the structure.
An ontology is a fixed data structure that defines a set of tables and the relationships between those tables. An ontology therefore defines the concepts and relationships that represent the content and structure of a data set, and is embodied as a database schema. The ontology of a database is normally fixed at the time that the database is created. Any change to the ontology represented by the schema is extremely disruptive to the database system and may require an administrator to intervene to modify existing tables or relationships, or to create new ones.
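By way of illustration only, the following minimal sketch shows how a schema fixed at creation time rejects data that does not fit it until the schema itself is altered. It uses Python's standard sqlite3 module with an in-memory database. The table names (machine, part), the column names, and the helper functions are hypothetical and are not taken from any particular system.

```python
import sqlite3


def build_fixed_schema(conn):
    """Create two hypothetical tables and one relationship, fixed at creation time."""
    conn.execute("CREATE TABLE machine (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE part ("
        "id INTEGER PRIMARY KEY, "
        "machine_id INTEGER NOT NULL REFERENCES machine(id), "
        "label TEXT NOT NULL)"
    )


def column_names(conn, table):
    """Return the column names the schema currently defines for a table."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def insert_with_location(conn):
    """Try to store an attribute ('location') that the original schema lacks."""
    try:
        conn.execute(
            "INSERT INTO machine (id, name, location) VALUES (1, 'press', 'hall A')"
        )
        return True
    except sqlite3.OperationalError:
        # SQLite reports an unknown column as an OperationalError.
        return False


conn = sqlite3.connect(":memory:")
build_fixed_schema(conn)

# The fixed schema has no 'location' column, so the insert is rejected.
rejected = not insert_with_location(conn)

# The schema must be changed explicitly before the new attribute can be stored.
conn.execute("ALTER TABLE machine ADD COLUMN location TEXT")
accepted = insert_with_location(conn)
```

In this sketch the explicit ALTER TABLE statement stands in for the administrator intervention described above; in a production system such a change may also involve migrating existing rows and updating dependent queries.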
However, the volumes of data in high-scale data sets are cumbersome to store, manage, and curate in a structure described using an ontology. High-scale data sets may include sensor data received from one or more sensors. Sensor data is generally presented as a time series, such that each data point from a sensor includes a timestamp as well as metadata that may indicate the source of the data point (e.g., an identifier of the sensor). Sensor data is generally collected in real time over extended periods. As a result, high-scale data sets may comprise continually growing volumes of data that require vast storage space.
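By way of illustration only, a sensor data point of the kind described above may be sketched as a small record holding a timestamp, a source identifier, and a measured value. The class name SensorReading, its field names, and the helper functions below are hypothetical, and the sampling figures used with them are illustrative assumptions rather than measurements.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SensorReading:
    """One data point: when it was taken, which sensor produced it, and its value."""
    timestamp: datetime
    sensor_id: str  # metadata identifying the source of the data point
    value: float


def to_time_series(readings):
    """Group readings by sensor and order each group by timestamp."""
    series = {}
    for reading in sorted(readings, key=lambda r: r.timestamp):
        series.setdefault(reading.sensor_id, []).append(
            (reading.timestamp, reading.value)
        )
    return series


def estimated_points(sensor_count, samples_per_second, days):
    """Estimate how many data points accumulate under a constant sampling rate."""
    seconds_per_day = 86_400
    return sensor_count * samples_per_second * seconds_per_day * days


readings = [
    SensorReading(datetime(2024, 1, 1, 0, 0, 2, tzinfo=timezone.utc), "s1", 20.5),
    SensorReading(datetime(2024, 1, 1, 0, 0, 1, tzinfo=timezone.utc), "s1", 20.1),
    SensorReading(datetime(2024, 1, 1, 0, 0, 1, tzinfo=timezone.utc), "s2", 7.0),
]
series = to_time_series(readings)
```

Under the assumed figures of 100 sensors each sampled 10 times per second, estimated_points(100, 10, 365) gives 31,536,000,000 data points per year, which indicates how quickly such a data set may grow.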
The inflexibility of a typical database ontology therefore presents unique technical challenges, both when curating an ontology to meet particular specifications and requirements and when incorporating high-scale data sets.