The present application relates generally to an improved data processing apparatus and method and, more specifically, to mechanisms for extracting semantic relationships from table structures in electronic documents.
Natural language processing (NLP) systems, question and answer creation (Q&A) systems, and the like analyze the textual content of electronic documents to perform their various functions. For example, the Q&A system known as Watson™, available from International Business Machines (IBM) Corporation of Armonk, N.Y., analyzes unstructured textual content of electronic documents to answer questions and derive conclusions from the textual content.
While these systems work well on textual content, knowledge and information are often presented or captured in table structures in electronic documents. NLP and Q&A systems cannot adequately process such table structures to glean the information and knowledge they present.