Embodiments of the present invention relate to a method and various devices and systems for processing an input query.
Predictive and prescriptive data analytics have attracted increasing interest in recent years. One application is diagnostics of technical systems such as production plants, power generation facilities, etc. In such applications, an operator wants an advanced diagnostic system to issue a warning prior to a critical situation (prediction), and to give advice on what should be done in order to prevent the predicted critical event (prescription).
Typically, technical systems as described above can be characterized by two complementary types of data: static and dynamic. Static data, such as the compositional structure of the system, its history, operational parameters, relation to other systems, etc., are typically stored in databases and change relatively rarely over time. In contrast, dynamic data such as sensor measurements, event data generated by control units, and external signals such as control or pricing information typically vary at a much higher frequency. For such data, storage in a database system is an option; however, to efficiently and effectively realize early warning systems with low latency, it is beneficial to (also) process streaming data, which are generated by the producing entities (e.g., measurement devices, sensors, etc.) themselves.
A stream may be considered as data of any sort being (e.g., serially or successively) supplied by a data source, e.g., a sensor. In this regard, the stream may have a time-stamped relation and it may not be defined by a particular beginning and/or end.
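The notion of a stream described above may be illustrated by the following minimal Python sketch. It is not part of any described embodiment; the names Reading and sensor_stream, the sensor identifier "temp-1" and the synthetic values are assumptions chosen purely for illustration. The sketch models a data source that successively supplies time-stamped items and has no defined end, so that a consumer can only ever inspect a finite portion of it.

```python
import itertools
from typing import Iterator, NamedTuple


class Reading(NamedTuple):
    """One time-stamped item of a stream (hypothetical structure)."""
    timestamp: float  # time of the measurement, in seconds
    sensor_id: str
    value: float


def sensor_stream(sensor_id: str, start: float = 0.0, step: float = 1.0) -> Iterator[Reading]:
    """Successively yield time-stamped readings; the stream has no defined end."""
    for i in itertools.count():
        # Synthetic value: a real source would read from a measurement device.
        yield Reading(timestamp=start + i * step, sensor_id=sensor_id, value=20.0 + 0.5 * i)


# A consumer can only take a finite slice of the unbounded stream.
first_three = list(itertools.islice(sensor_stream("temp-1"), 3))
for reading in first_three:
    print(reading)
```

Using a generator here reflects the assumption that items are processed as they arrive, rather than being stored first and queried afterwards.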
In most cases, the amount of data available is huge and goes beyond what could be processed by a human operator. Moreover, even static information is not stored in a single place, e.g., a single data silo, but distributed among several databases (or, for dynamic data, streams) with varying structures and access methods.
Experts operating and maintaining such technical systems are not aware of the data storage landscape, but need easy access to the heterogeneous, distributed data.
Ontology-Based Data Access (or OBDA) is a method to provide end-users with data access in terms of their domain terminology, by internally translating domain questions into queries over the distributed data sources. The Optique project (www.optique-project.eu) aims to bring this idea to industry-scale reality.
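The translation idea underlying OBDA may be sketched as follows. This is a simplified illustration under stated assumptions and not the approach of the Optique project or of any particular OBDA system: the two in-memory SQLite databases, their table and column names, the MAPPINGS dictionary and the function answer are all hypothetical. The sketch shows a single domain term being translated into source-specific queries over two databases with different structures, with the results combined for the end-user.

```python
import sqlite3

# Two hypothetical data sources with different schemas.
plant_a = sqlite3.connect(":memory:")
plant_a.execute("CREATE TABLE turbines (name TEXT)")
plant_a.executemany("INSERT INTO turbines VALUES (?)", [("T-100",), ("T-200",)])

plant_b = sqlite3.connect(":memory:")
plant_b.execute("CREATE TABLE equipment (label TEXT, kind TEXT)")
plant_b.executemany(
    "INSERT INTO equipment VALUES (?, ?)",
    [("GT-7", "turbine"), ("P-3", "pump")],
)

# Hypothetical mappings: each domain term is associated with one query per source.
MAPPINGS = {
    "Turbine": [
        (plant_a, "SELECT name FROM turbines"),
        (plant_b, "SELECT label FROM equipment WHERE kind = 'turbine'"),
    ],
}


def answer(term: str) -> list[str]:
    """Translate a domain term into source queries and merge the results."""
    results: list[str] = []
    for connection, sql in MAPPINGS[term]:
        results.extend(row[0] for row in connection.execute(sql))
    return sorted(results)


turbines = answer("Turbine")
print(turbines)
```

In this sketch the end-user only supplies the term "Turbine" and does not need to know which databases exist or how they are structured; a full OBDA system would derive the source queries from an ontology and declarative mappings rather than from a hand-written dictionary.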
Existing solutions to OBDA only handle data stored in databases and cannot cope with streaming data. Hence, streaming data must first be stored in a database, which leads to a delay and thus to suboptimal reaction times compared to direct processing of data streams.
SPARQL is an RDF query language, i.e., a query language for databases, able to retrieve and manipulate data stored in Resource Description Framework format (see, e.g., http://en.wikipedia.org/wiki/Sparql). STARQL is a temporal extension of SPARQL (see, e.g., “Deliverable D5.1: A Semantics for Temporal and Stream-Based Query Answering in an OBDA Context”, http://www.optique-project.eu/wp-content/uploads/2013/10/D5.1.pdf).