The present invention relates generally to the field of information retrieval, and more particularly to semantically decomposing a single document into multiple documents for parallel processing.
Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing. Automated information retrieval systems are used to reduce what has been called “information overload.” Information retrieval systems may be used to provide access to books, journals, and other documents. Web search engines are the most visible information retrieval applications.
Natural language processing is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. As such, natural language processing is related to the area of human-computer interaction. Many challenges in natural language processing involve natural language understanding, that is, enabling computers to derive meaning from human or natural language input; others involve natural language generation.
An automated information system is an assembly of computer hardware, software, firmware, or any combination of these, configured to accomplish specific information-handling operations, such as communication, computation, dissemination, processing, and storage of information. Included are computers, word processing systems, networks, or other electronic information handling systems, and associated equipment. Management information systems are a common example of automated information systems.