To manage book fixed layout documents in an information system, each fixed layout document must be structured so that its structured information can be obtained and used to form corresponding directory data. Currently, the directory data formed for a fixed layout document may include only the name string and the initial position of each directory entry. It includes neither the end position of each directory entry nor the specific area of the document in which the name of each directory entry is located.
In this regard, the current structured information of the fixed layout document is incomplete, and thus the specific initial and end positions corresponding to each directory entry cannot be determined from the current directory data. Consequently, a specific directory entry of the fixed layout document cannot be read in a layout mode and in a streaming mode, so diversified reading demands are not satisfied.
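The gap described above can be illustrated with a minimal sketch of a directory entry record. The class name `DirectoryEntry`, its field names, the use of a page index as the position, and the rectangle format for the name area are all hypothetical choices made for illustration. They are not taken from any particular document format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DirectoryEntry:
    """Hypothetical record for one directory entry of a fixed layout document."""
    name: str                      # name string of the entry
    start: int                     # initial position (assumed here to be a page index)
    end: Optional[int] = None      # end position; absent in current directory data
    # Area where the entry name appears, assumed to be (x0, y0, x1, y1)
    name_area: Optional[Tuple[float, float, float, float]] = None


def is_complete(entry: DirectoryEntry) -> bool:
    """Return True only if the entry carries both an end position and a name area."""
    return entry.end is not None and entry.name_area is not None


# Current directory data: only the name and the initial position are known.
current = DirectoryEntry(name="Chapter 1", start=5)

# More complete structured information: end position and name area are also known.
complete = DirectoryEntry(name="Chapter 1", start=5, end=12,
                          name_area=(72.0, 90.0, 300.0, 110.0))
```

In this sketch, `is_complete(current)` is false because the end position and name area are missing, which corresponds to the incomplete directory data described above.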
Accordingly, in the process of structuring each fixed layout document, more complete structured information of each fixed layout document needs to be obtained. Generally, a fixed layout document is structured manually, that is, the contents of each directory entry in the fixed layout document are read and analyzed by a person in order to obtain the required structured information. When a large number of fixed layout documents must be managed, errors are inevitable because of human limitations, such as limited mental or physical capacity. Furthermore, manual processing is relatively slow. Therefore, both the accuracy and the speed of obtaining the structured information are adversely affected.