In the field of artificially intelligent computer systems capable of answering questions posed in natural language, cognitive question answering (QA) systems (such as the IBM Watson™ artificially intelligent computer system or other natural language question answering systems) process questions posed in natural language to determine answers and associated confidence scores based on knowledge acquired by the QA system. In operation, users submit one or more questions through a front-end application user interface (UI) or application programming interface (API) to the QA system, where the questions are processed to generate answers that are returned to the user(s). The QA system generates answers from an ingested knowledge base which can come from a variety of sources, including publicly available information and/or proprietary information stored on one or more servers, Internet forums, message boards, or other online discussion sites where people can hold conversations in the form of posted messages. Using the ingested information, the QA system can formulate answers using a natural language process to provide answers with associated evidence and confidence measures. However, the quality of an answer depends on the information contained in the knowledge base corpus, so it is possible that not all responses will have high confidence measures, and some may not even contain the right answer because the relevant content in the knowledge base corpus is insufficient or nonexistent. With traditional QA systems, there is no mechanism in place to determine whether the ingested corpus has the relevant content when the QA system responds with a very low confidence answer or cannot find the right answer. Nor are traditional QA systems able to identify and ingest new content based on user interactions to provide a good overall experience, except through use of a laborious manual process whereby a domain expert reviews and selects documents for ingestion into a corpus.
Attempts to automate the ingestion of new information into the corpus, such as by using a machine or bot to post to a forum, have proven difficult due to misconceptions as to the purpose of the forum, concerns about the presence of a bot on the forum, and/or perceptions that the posts are part of an astroturfing operation. As a result, the existing solutions for efficiently identifying and ingesting content into a corpus are extremely difficult at a practical level.