Embodiments of the present disclosure relate to image processing and, more particularly, to a system and method for semantic textual information recognition.
A document is typically defined as a collection of primitive elements that are drawn on a page at defined locations. The content of the document may be saved in various formats. The heterogeneous nature of document content presents challenges when various elements need to be extracted, edited, combined, or processed.
A user can view the content of the document on a document viewing device. Such a document viewing device has no knowledge of the intended structure of the document. For example, a table is displayed as a series of lines and/or rectangles with text between the lines, which the human viewer recognizes as a table. However, the document viewing device displaying the document has no indication that the text groupings are related to each other. This lack of structural knowledge makes it difficult to analyse the document using machines or other automated means.
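By way of illustration only, the following minimal Python sketch shows how a small table may appear to a viewing device as a flat list of drawing primitives. The class names `Line` and `Text`, the helper `texts_in_draw_order`, the coordinates, and the cell contents are hypothetical and are not drawn from any particular document format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """A straight stroke between two points on the page (hypothetical primitive)."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass(frozen=True)
class Text:
    """A run of characters placed at a page position (hypothetical primitive)."""
    x: float
    y: float
    content: str


# A 2x2 "table" as the viewer receives it: only strokes and positioned text.
page = [
    # horizontal rules
    Line(0, 0, 200, 0), Line(0, 20, 200, 20), Line(0, 40, 200, 40),
    # vertical rules
    Line(0, 0, 0, 40), Line(100, 0, 100, 40), Line(200, 0, 200, 40),
    # cell text, with no field linking a value to its header
    Text(10, 10, "Name"), Text(110, 10, "Age"),
    Text(10, 30, "Alice"), Text(110, 30, "30"),
]


def texts_in_draw_order(primitives):
    """Return the text runs exactly as stored: a flat sequence of strings."""
    return [p.content for p in primitives if isinstance(p, Text)]


print(texts_in_draw_order(page))
```

Nothing in `page` records that "Alice" belongs under "Name" or that the strokes form cells; any such relationship would have to be inferred from the geometry.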
Hence, there is a need for an improved and reliable automated system for semantic textual information recognition to address the aforementioned issues.