The use of forms for capturing and disseminating information has become ubiquitous. Often these forms have not been digitized and exist only in hard copy. Even when forms have been digitized and converted to an electronic format, they may support interaction only on a specific type of electronic device, such as a personal computer, and may not be accessible on mobile devices. An adaptive form is an electronic form that automatically adapts for viewing and input on a multitude of devices with disparate form factors, such as personal computers, tablets, and smartphones.
Businesses and governments are undergoing a digital transformation in which mobile is the primary digital strategy for all new offerings. The trend toward digital technology is driven by a host of compelling business and revenue incentives. Accordingly, organizations are required both to digitize and to provide a multi-channel experience. However, many existing account enrollment and service request processes remain paper-based. Currently, to implement digital adaptive form technology, businesses must hire form/content authors to manually replicate current experiences and build mobile-ready experiences field by field, which is time-consuming and expensive and requires information technology (“IT”) skills.
The elements in a form are typically arranged in a hierarchy. For example, the document is the top-level element. Beneath the document there may be sections, which make up the next level in the hierarchy, and so on.
Fields are another vital structural element of a form. A field may comprise a combination of a widget and a caption. Widgets are areas of a form that facilitate and prompt the entry of information by a user. Each widget may have a caption associated with it. A caption is a piece of textual or other signaling information that may assist a user in providing input in a widget. Examples of widgets may include sections and choice groups. A choice group is a group of items that allows a user to select one or more items via checkboxes or radio buttons. Tables are another example of structural elements and may further comprise column headers, row headers, and the actual widgets in which a user may fill in information. In addition, a form will typically contain text sections that are constructed of paragraphs, text lines, and words. Images may also be embedded in a form.
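The hierarchy described above can be pictured as a small tree of typed nodes. The following Python sketch is illustrative only: the class names (`Document`, `Section`, `Field`, `Widget`, `ChoiceGroup`), the widget kinds, and the sample form are hypothetical and are not taken from any particular form product. Tables, text sections, and images are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Widget:
    kind: str  # hypothetical kinds: "text_input", "checkbox", "radio"


@dataclass
class Field:
    widget: Widget                  # area where the user enters information
    caption: Optional[str] = None   # text that guides the user


@dataclass
class ChoiceGroup:
    caption: str
    options: List[Field]
    multi_select: bool = False      # checkboxes if True, radio buttons if False


@dataclass
class Section:
    title: str
    children: List[Union["Section", Field, ChoiceGroup]] = field(default_factory=list)


@dataclass
class Document:
    sections: List[Section] = field(default_factory=list)


def outline(node, depth=0):
    """Return (depth, label) pairs for the tree in top-down order."""
    if isinstance(node, Document):
        rows, kids = [(depth, "document")], node.sections
    elif isinstance(node, Section):
        rows, kids = [(depth, f"section: {node.title}")], node.children
    elif isinstance(node, ChoiceGroup):
        rows, kids = [(depth, f"choice group: {node.caption}")], node.options
    else:  # Field: a leaf pairing a caption with a widget
        rows, kids = [(depth, f"field: {node.caption} [{node.widget.kind}]")], []
    for kid in kids:
        rows.extend(outline(kid, depth + 1))
    return rows


# A made-up enrollment form used only to exercise the structure.
form = Document(sections=[
    Section("Applicant", children=[
        Field(Widget("text_input"), caption="Full name"),
        ChoiceGroup("Account type", options=[
            Field(Widget("radio"), caption="Checking"),
            Field(Widget("radio"), caption="Savings"),
        ]),
    ]),
])

for level, label in outline(form):
    print("  " * level + label)
```

Representing the form as a tree keeps each caption attached to its widget and each widget attached to its enclosing section, which is the kind of structural information that an adaptive form would presumably need in order to reflow content for a different form factor.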
One of the main problems in rapidly converting paper forms to adaptive forms is identifying the structure and semantics of form documents from an image or image-like format. Once the form structure is extracted and its hierarchical properties are captured, this structural information may be utilized for various purposes, such as creating an electronic adaptive form.
Machine learning and deep neural networks (“DNNs”) have been applied to document structure extraction. However, due to the computational costs (e.g., memory demands and limits on efficient information propagation) of working with high-resolution images, known methods for applying DNNs to document structure extraction from an image require the use of lower-resolution input images. Therefore, an input image provided to a DNN for structure extraction is typically first down-sampled from a higher-resolution image. While the use of lower-resolution document images may address the practical issue of computational cost in form identification and extraction, it also imposes significant limitations on a DNN's ability to extract very fine structure in a document. Thus, there is a need for techniques for extracting document structure from a high-resolution document image using machine learning and DNNs that can be performed in a computationally efficient and tractable manner.
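The trade-off described above can be illustrated with a small, self-contained Python sketch. Everything in it is an assumption chosen for illustration: the page size (US letter scanned at 300 DPI), the 32-bit floating-point storage, the down-sampling factor of 4, and the use of plain pixel subsampling. Practical pipelines usually down-sample by interpolation or area averaging, which tends to fade thin strokes rather than delete them outright, but the loss of fine detail is the same in kind.

```python
def float32_megabytes(width, height, channels=1):
    """Memory for one float32 image or feature map of the given size, in MB."""
    return width * height * channels * 4 / 1e6


def downsample(image, factor):
    """Keep every `factor`-th pixel along both axes (simple subsampling)."""
    return [row[::factor] for row in image[::factor]]


def page_with_rule(width, height, rule_x):
    """Blank page (0 = background) with a 1-pixel-wide vertical rule (1 = ink)."""
    return [[1 if x == rule_x else 0 for x in range(width)] for _ in range(height)]


# Assumed scan: 8.5 in x 11 in at 300 DPI gives 2550 x 3300 pixels.
full_res_mb = float32_megabytes(2550, 3300)
# Reducing each side by a factor of 4 cuts the pixel count by roughly 16x.
low_res_mb = float32_megabytes(2550 // 4, 3300 // 4)
print(f"full resolution: {full_res_mb:.2f} MB per single-channel map")
print(f"down-sampled:    {low_res_mb:.2f} MB per single-channel map")

# Toy 16 x 16 page with a thin rule at column 5, such as a table border.
page = page_with_rule(16, 16, rule_x=5)
small = downsample(page, 4)  # keeps columns 0, 4, 8, 12 only

ink_before = sum(map(sum, page))
ink_after = sum(map(sum, small))
print(f"ink pixels before: {ink_before}, after: {ink_after}")
```

In this toy case the rule sits between the sampled columns, so it is absent from the down-sampled page even though the memory cost has dropped by about a factor of 16. A network that sees only the smaller image has no way to recover that border.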