This invention relates to a digital device, a method and a computer program according to the appended claims. More specifically, the invention relates to the field of performing searches, particularly natural language searches, on structured data, such as relational databases or the Semantic Web.
The World Wide Web (WWW) stores and administers information on the Internet in an essentially unstructured way. To overcome this deficiency, increasing efforts aim at structuring or classifying the information on the Internet. These efforts are led by the World Wide Web Consortium (W3C), see http://www.w3.org/, a major standards-setting body for the World Wide Web. Their aim is to create a Semantic Web of (linked) data in which data is structured and queried using a common data model, see http://www.w3.org/standards/semanticweb/. One descriptive structured data model is the Resource Description Framework (RDF), see http://w3.org/TR/2004/REC-rdf-concepts-20040210/, which expresses information as statements in the form of subject/predicate/object triples made about instances of specific classes. Resources on the Semantic Web are commonly identified by Uniform Resource Identifiers (URIs). A query language for RDF data is SPARQL (http://w3.org/TR/2008/REC-rdf-sparql-query-20080115/), which introduced the concept of graph pattern matching.
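By way of illustration only, the subject/predicate/object triple model and the idea of graph pattern matching may be sketched as follows. The sketch is a simplified in-memory matcher, not a SPARQL implementation; the example.org URIs, the data values and the names TRIPLES, match_triple and match_patterns are hypothetical and chosen solely for this example.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Bindings = Dict[str, str]       # variable name -> bound value

EX = "http://example.org/"      # hypothetical namespace, for illustration only

# Hypothetical statements; a real knowledge source would hold actual RDF data.
TRIPLES: List[Triple] = [
    (EX + "CityA", EX + "isCapitalOf", EX + "CountryX"),
    (EX + "CityB", EX + "isCapitalOf", EX + "CountryY"),
    (EX + "CityA", EX + "population", "1000000"),
]


def is_var(term: str) -> bool:
    """Terms starting with '?' are treated as variables, as in SPARQL syntax."""
    return term.startswith("?")


def match_triple(pattern: Triple, triple: Triple, bindings: Bindings) -> Optional[Bindings]:
    """Try to match one pattern against one triple, extending the bindings."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if result.get(p, t) != t:
                return None
            result[p] = t
        elif p != t:
            # A constant term must be identical to the triple's term.
            return None
    return result


def match_patterns(patterns: List[Triple], triples: List[Triple],
                   bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield every set of bindings that satisfies all patterns together."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match_triple(first, triple, bindings)
        if extended is not None:
            yield from match_patterns(rest, triples, extended)


# "Which city is the capital of CountryX, and what is its population?"
query: List[Triple] = [
    ("?city", EX + "isCapitalOf", EX + "CountryX"),
    ("?city", EX + "population", "?pop"),
]
results = list(match_patterns(query, TRIPLES))
```

In this sketch the shared variable ?city joins the two patterns, so only a resource that satisfies both statements is returned. An actual SPARQL engine additionally supports features such as optional patterns, filters and named graphs, none of which are modelled here.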
The growing amount of structured data available on the Web makes it increasingly necessary to bridge the gap between the informal natural language of human users and the formal query languages of structured databases. Natural Language Interfaces (NLIs) address this need by shifting to a machine, in an automated way, the tasks of parsing a natural language query, generating an appropriate database query and processing the results.
Increasing attention is paid to the issue of portability, i.e. the flexibility of NLIs with respect to their vocabularies, which are employed to parse a query into a logical representation, and their knowledge sources, which represent the data corpus for retrieving search results. In this context, a promising approach is an NLI that retrieves data from multiple knowledge sources covering a variety of possible domains. Ideally, a distributed NLI would recognize what the query searches for and contact one or multiple appropriate knowledge sources in order to retrieve an answer. Furthermore, it would be preferable to employ knowledge sources for an NLI independently of their underlying database management systems.