A knowledge base (KB) is a structured, organized, and comprehensive collection of knowledge that is easy to operate and use in knowledge engineering. It is a set of interlinked knowledge fragments that are stored, organized, managed, and used in computer storage in one or more knowledge representation forms, according to the requirements of question answering in one or more fields.
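As an illustration only, the notion of interlinked knowledge fragments can be sketched with subject-predicate-object triples, one common knowledge representation form. The entity names, the predicate names, and the `match` helper below are hypothetical and are not taken from any particular knowledge base.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical knowledge fragments stored as (subject, predicate, object).
# Fragments are interlinked because the object of one triple ("China")
# is the subject of another.
TRIPLES = {
    ("Beijing", "capital_of", "China"),
    ("Beijing", "type", "City"),
    ("China", "located_in", "Asia"),
}


def match(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Follow a link between two fragments: the country whose capital is
# Beijing, and then the region in which that country is located.
country = match(subject="Beijing", predicate="capital_of")[0][2]
region = match(subject=country, predicate="located_in")[0][2]
```

Here `country` and `region` are obtained by chaining two lookups, which is the kind of simple inference that linked fragments make possible.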
Currently, a large number of knowledge resources and knowledge communities have emerged on the Internet, for example, Wikipedia, Baidu Encyclopedia (http://baike.baidu.com/), and Interactive Encyclopedia (http://www.baike.com/). From these knowledge resources, researchers have mined large-scale knowledge bases centering on entities and entity relations. In addition, there are also knowledge bases for specific fields, for example, weather knowledge bases and food knowledge bases.
The building of knowledge bases has evolved from manual addition, by human experts or through collective intelligence, to automatic acquisition across the entire Internet by using machine learning and information extraction technologies. Early knowledge bases, for example, WordNet, CYC, CCD, HowNet, and Encyclopedia of China, were built manually by experts. However, with the development of information technologies, the disadvantages of conventional manually built knowledge bases, such as small scale, a small amount of knowledge, and slow updates, have gradually been exposed. In addition, a deterministic knowledge framework built by experts cannot satisfy the requirements of large-scale computing in the noisy environment of the Internet. This is one of the reasons why the CYC project ultimately failed. With the rapid development of Web 2.0, a large number of web knowledge bases based on collective intelligence, including Wikipedia, Baidu Encyclopedia, and Interactive Encyclopedia, have emerged. Based on these network resources, many automatic and semi-automatic knowledge base building methods have been used to build large-scale usable knowledge bases, such as YAGO, DBpedia, and Freebase.
Question answering systems may be built on these knowledge bases. Compared with retrieval-based question answering systems, knowledge base-based question answering systems may have lower question coverage because the scale of a knowledge base is limited, but they may have certain inference capabilities. In addition, within a limited field, they may achieve higher accuracy. Therefore, a number of knowledge base-based question answering systems have been developed; some have become independent applications, and some are used as enhanced functions of an existing product, for example, Siri of Apple and Knowledge Graph of Google.
A question answering system does not require a user to break down a question into keywords. Instead, the question is submitted in natural language. After the question answering system processes the question, an answer corresponding to the question is quickly retrieved from a knowledge base or the Internet, and the answer itself, instead of a related web page, is returned directly to the user. Therefore, the question answering system greatly reduces the difficulty of use for the user, and is more convenient and efficient than conventional search technologies such as keyword search and semantic search.
Evaluation campaigns on question answering over linked data (QALD) have promoted the development of question answering systems. An objective of QALD is to convert a natural language question into a structured SPARQL Protocol and RDF Query Language (SPARQL) query over large-scale structured linked data, where RDF stands for Resource Description Framework, thereby establishing a friendly natural language query interface. Converting a natural language question into structured SPARQL depends on conversion rules for a knowledge base. However, in current question answering systems, all conversion rules are configured manually, which not only consumes a great deal of labor but also results in poor field extensibility.
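For illustration only, a manually configured conversion rule of the kind described above can be sketched as a question pattern paired with a SPARQL template. The rule set, the `ex:` namespace, the predicate names, and the `to_sparql` helper are all hypothetical and do not reproduce the rules of any actual question answering system.

```python
import re
from typing import Optional

# Hypothetical namespace declarations shared by every template.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX ex: <http://example.org/>\n"
)

# Hand-written rules: each pairs a question pattern with a SPARQL template.
# Every new question form, and every new field, needs another manual entry.
RULES = [
    (
        re.compile(r"^what is the capital of (?P<entity>.+?)\??$", re.IGNORECASE),
        'SELECT ?answer WHERE {{ ?s rdfs:label "{entity}"@en . ?s ex:capital ?answer . }}',
    ),
    (
        re.compile(r"^who wrote (?P<entity>.+?)\??$", re.IGNORECASE),
        'SELECT ?answer WHERE {{ ?s rdfs:label "{entity}"@en . ?s ex:author ?answer . }}',
    ),
]


def to_sparql(question: str) -> Optional[str]:
    """Return a SPARQL query for the first matching rule, or None."""
    text = question.strip()
    for pattern, template in RULES:
        found = pattern.match(text)
        if found:
            # Escape backslashes and double quotes so the entity stays
            # inside the SPARQL string literal.
            entity = found.group("entity").replace("\\", "\\\\").replace('"', '\\"')
            return PREFIXES + template.format(entity=entity)
    return None


query = to_sparql("What is the capital of France?")
unsupported = to_sparql("How tall is Mount Everest?")  # no rule matches
```

A question that no rule anticipates yields `None`, which reflects the limitation noted above: coverage grows only as fast as rules are written by hand.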