The field of information extraction aims to extract the information content of natural language text (usually in the form of entities, relations, and events) into a database, which provides a more useful representation than the raw text. Many traditional approaches use either rules or supervised classifiers trained on labeled examples. Both approaches are generally used to extract a handful of predetermined relations or events, and both face challenges when scaled to systems that extract many hundreds or thousands of types of relations and events. Open information extraction systems can extract large numbers of relations, but the resulting extractions are often not standardized; as a result, paraphrases of the same information are often represented in different ways.
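The standardization problem can be made concrete with a small sketch. The triples, the relation phrases, and the hand-written `CANONICAL` mapping below are all hypothetical and chosen only for illustration; they are not the output of any particular open information extraction system.

```python
# Hypothetical open-IE style output: each relation is kept as a surface phrase,
# so three paraphrases of one fact yield three different-looking triples.
open_ie_triples = [
    ("Marie Curie", "was born in", "Warsaw"),
    ("Marie Curie", "is a native of", "Warsaw"),
    ("Warsaw", "is the birthplace of", "Marie Curie"),
]

# Hypothetical hand-written mapping from a surface phrase to a canonical
# relation name, plus a flag saying whether the arguments must be swapped.
CANONICAL = {
    "was born in": ("born_in", False),
    "is a native of": ("born_in", False),
    "is the birthplace of": ("born_in", True),
}


def distinct_relation_phrases(triples):
    """Return the set of surface relation phrases used in the triples."""
    return {phrase for _, phrase, _ in triples}


def normalize(triples):
    """Map each triple to a canonical (subject, relation, object) fact."""
    facts = set()
    for subj, phrase, obj in triples:
        relation, swapped = CANONICAL[phrase]
        if swapped:
            subj, obj = obj, subj
        facts.add((subj, relation, obj))
    return facts


if __name__ == "__main__":
    print(distinct_relation_phrases(open_ie_triples))
    print(normalize(open_ie_triples))
```

Without normalization, a query for one phrasing misses the other two; after normalization, the three triples collapse into a single fact. A hand-written mapping like this one only works for a handful of relations, which is the scaling difficulty described above.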
Many question answering systems are based on information retrieval instead of information extraction. Rather than directly querying a database for answers, they search text for candidate answer passages and rank them according to various criteria. This approach requires much more processing at query time than a simple database lookup, and it faces complications when a question calls for numerous answers or pieces of information rather than a single answer. Because this approach lacks an explicit representation of knowledge, it also does not easily support many of the other capabilities of a database: the knowledge cannot easily be browsed, explored, combined, analyzed, reasoned about, or transferred.