The present invention relates generally to the field of question answering systems, and more particularly to document corpora for question answering systems.
Question answering is a form of information retrieval and natural language processing. Question answering systems automatically answer questions posed by humans in a natural language. A question answering system may be closed-domain, where the system answers or attempts to answer questions within a specific domain, or topic, such as medicine or baseball. Alternatively, a question answering system may be open-domain, where the system answers or attempts to answer questions dealing with any topic. Question answering systems may answer questions by accessing a structured or unstructured collection of natural language documents, known as a document corpus. Computer programmers continue to face difficulties when building document corpora and when determining whether the document corpus for a given question answering system is sufficient.