The marriage of Natural Language Processing (NLP) with Information Retrieval (IR) has long been desired by researchers. NLP techniques are used heavily in question answering (QA) systems; indeed, QA has been a playground for Artificial Intelligence since the 1960s. Ontologies, syntactic parsing and information extraction are commonly found in a good QA system.
Although QA has been successful in domain-specific areas and on small document collections such as those of TREC, large-scale open-domain QA is still very difficult because current NLP techniques are too expensive for massive databases like the Internet. Therefore, some commercial systems resort to simplified NLP, where, for example, an attempt is made to map user queries (questions) to previously hand-picked question/answer sets. Thus, the performance of such systems is limited by the coverage of their QA database.
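To make this limitation concrete, the following is a minimal sketch of the mapping idea, not a description of any particular commercial system. It assumes token-overlap (Jaccard) similarity as the matching rule; the names `QA_SET`, `tokens`, `jaccard` and `answer`, the example data, and the threshold of 0.5 are all hypothetical choices made for illustration.

```python
import re

# Hypothetical hand-picked question/answer set (illustrative data only).
QA_SET = [
    ("what is the capital of france", "Paris"),
    ("who wrote hamlet", "William Shakespeare"),
]

def tokens(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def answer(query, qa_set=QA_SET, threshold=0.5):
    """Return the answer of the best-matching stored question,
    or None when no stored question is similar enough (assumed threshold)."""
    q = tokens(query)
    best_score, best_answer = 0.0, None
    for question, ans in qa_set:
        score = jaccard(q, tokens(question))
        if score > best_score:
            best_score, best_answer = score, ans
    return best_answer if best_score >= threshold else None
```

Under these assumptions, a query that closely matches a stored question is answered, whereas any query outside the stored set returns nothing, however well-formed it is.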
Recently, a fast convolutional neural network approach for semantic extraction called SENNA has been described, which achieves state-of-the-art performance for PropBank labeling while running hundreds of times faster than competing methods.