The present invention relates to a database, and more specifically, to a method and apparatus for generating a mapping rule for converting relational data into RDF format data.
The semantic web is a concept proposed in 1998 by Tim Berners-Lee, the father of the World Wide Web. The core of the semantic web is to give computers the ability to understand the relationships between data in a document, so that they can automatically process information on the semantic web. The vision of the semantic web is to publish and link global data. The semantic web describes data using RDF (Resource Description Framework). The basic ideas of RDF are: (1) all objects (specific or abstract, existing or non-existing) that can be identified on the Web are called “resources”; (2) a URI (Uniform Resource Identifier) is used to identify each resource; (3) properties and property values are used to describe resources. The basic structure of any expression in RDF is a collection of triples, each triple consisting of a subject, a predicate and an object. The subject corresponds to the resource and can be anything that has a URI, e.g., http://dbpedia.org/resource/China; the predicate corresponds to the property and is a resource that has a name, e.g., author, firstname; the object corresponds to the property value and can be a character string or another resource, e.g., david or http://dbpedia.org/resource/United_States.
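As a minimal illustration of the triple structure described above, the following Python sketch models a triple as a plain record and prints it in an N-Triples-like form. The example.org URIs, the `Triple` type and the helper names are hypothetical and chosen only for illustration; the sketch does not escape special characters in literals and is not a complete RDF implementation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str    # a resource, identified by a URI
    predicate: str  # a property, also identified by a URI
    obj: str        # a property value: a string literal or another resource URI


def is_uri(term: str) -> bool:
    # Simplifying assumption: anything starting with http(s):// is a URI.
    return term.startswith(("http://", "https://"))


def to_ntriples(t: Triple) -> str:
    # URIs go in angle brackets, string literals in double quotes.
    obj = f"<{t.obj}>" if is_uri(t.obj) else f'"{t.obj}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Hypothetical data: the object is a literal in the first triple
# and another resource in the second.
triples = [
    Triple("http://example.org/book/1",
           "http://example.org/vocab/author",
           "david"),
    Triple("http://example.org/book/1",
           "http://example.org/vocab/country",
           "http://dbpedia.org/resource/China"),
]

for t in triples:
    print(to_ntriples(t))
```

The two records show the two kinds of object mentioned in the text: a character string and a reference to another resource.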
With the rapid development of the semantic web, more and more data providers and Web application developers publish data in RDF format and link it with other data sources to form a huge linked data network. For example, Wikipedia is published as DBpedia; the IMDB dataset and the GeoSpacial dataset are published in RDF format as well. To date, the linked data comprises 61 billion triples in total.
This need is not limited to the public Web. With the coming of Enterprise 2.0, more and more data inside enterprises urgently need to be linked with data on the Web in order to build better applications and services. However, existing data, especially data inside enterprises, mostly reside in relational databases, and thus a tool is needed to publish relational data as RDF data. At present, tools exist for publishing relational data as RDF data. D2R, for example, is the most widely used tool; it comprises a D2R server, a D2RQ engine and a D2RQ mapping language, wherein the main function of the D2RQ mapping language is to define mapping rules for converting relational data into RDF format. However, the URIs automatically generated based on such mapping rules are meaningless and cannot express the features of the relational data. The mapping rules must be modified manually before the features of the relational data can be expressed. A complex relational database generally involves hundreds of mapping rules, so modifying them requires a great deal of tedious manual labor.
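The difference between a meaningless and a descriptive URI can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes that an automatically generated URI is built from a primary key value, and the base URI, the table and column names, and both helper functions are hypothetical. It is not D2RQ mapping syntax.

```python
from urllib.parse import quote

# Hypothetical base URI for published resources.
BASE = "http://example.org/resource/"


def uri_from_key(table: str, row: dict, key_column: str) -> str:
    """Build a URI from the primary key (assumed default behaviour)."""
    return f"{BASE}{table}/{row[key_column]}"


def uri_from_label(table: str, row: dict, label_column: str) -> str:
    """Build a URI from a descriptive column chosen by hand."""
    label = str(row[label_column]).strip().replace(" ", "_")
    # Percent-encode any characters that are not safe in a URI path segment.
    return f"{BASE}{table}/{quote(label)}"


# Hypothetical row of a relational table named "country".
row = {"id": 42, "name": "United States"}

print(uri_from_key("country", row, "id"))      # identifies the row but says nothing about it
print(uri_from_label("country", row, "name"))  # reflects a feature of the data
```

Choosing the descriptive column by hand for every table is the kind of manual modification that becomes tedious when a database involves hundreds of mapping rules.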