1. Technical Field
The present disclosure relates generally to electronic document management, and more particularly, to batch generating links to documents automatically without user intervention based on document name and page content matching.
2. Related Art
The creation, distribution, and management of information are core functions of business. Information or content can be presented in a variety of different ways, including word processing documents, spreadsheets, graphics, photographs, engineering drawings, architectural plans, and so forth. In electronic form, these are generally referred to as documents, and may be generated and manipulated by computer software applications that are specific thereto. The workflows of creating, reviewing, and/or editing electronic documents have evolved to accommodate the specific requirements of various fields, though the need for a device-independent, resolution-independent file format led to the widespread adoption of the Portable Document Format (PDF), amongst other competing formats. Accordingly, different platforms having a wide variety of operating systems, application programs, and processing and graphic display capabilities can be accommodated regardless of the particulars of the workflow.
The PDF standard is a combination of a number of technologies, including a simplified PostScript interpreter subsystem, a font embedding subsystem, and a storage subsystem. As those having skill in the art will recognize, PostScript is a page description language for generating the layout and the graphics of a document. Further, per the requirements of the PDF storage subsystem, all elements of the document, including text, vector graphics, and raster (bitmap) graphics, collectively referred to herein as graphic elements, are encapsulated into a single file. The graphic elements are not encoded to a specific operating system, software application, or hardware, but are designed to be rendered in the same manner regardless of the specificities relating to the system writing or reading such data. The cross-platform capability of PDF aided in its widespread adoption, and PDF is now a de facto document exchange standard. Although originally proprietary, PDF has been released as an open standard published by the International Organization for Standardization (ISO) as ISO 32000-1:2008. Currently, PDF is utilized to encode a wide variety of document types, including those composed largely of text, and those composed largely of vector and raster graphics. Because of its versatility and universality, files in the PDF format are often preferred over more particularized file formats of specific applications.
In technical fields such as engineering and architecture, one project typically involves multiple aspects with numerous professionals spanning a wide range of disciplines. The planning documents, e.g., drawings, are specific to each discipline. For example, in a building construction project, there may be one set of plans for the structural aspect, while there may be another set of plans for the heating/ventilation/air conditioning (HVAC) aspect, and another set of plans for plumbing, another set for electrical, etc. A high level of detail is necessary in the planning documents to accurately convey the specifications of the project so that it can be correctly implemented. Although the ability to zoom in and zoom out of an electronic document alleviates this issue to a certain degree, the size and the amount of information contained in any one page must nevertheless remain manageable while retaining all the necessary detail so that viewing, editing, and annotating do not require complicated inputs/interface manipulations.
In many cases, it is adequate to have the entirety of the planning document stored in a single document, though separated into multiple pages. That single document may be stored as a single file on a hierarchical file system that is organized according to directories and subdirectories. Sharing amongst participating personnel, as well as updating and maintaining the single file, are thus greatly simplified. This storage arrangement may be suitable when the document is of minimal length and size. Opening larger files tends to be more time consuming, and any sort of manipulation and committing of the changes more heavily taxes the computer system. Furthermore, e-mail servers typically limit the size of the attachments that can be sent and received, so auxiliary file sharing services are needed for larger files. Although there are collaboration systems that allow for concurrent changes to be made to a file, with a conventional file system, a given file is locked for editing by a single user at a time. If the document is being heavily edited and reviewed, the lack of simultaneous access is problematic.
Thus, for complex documents, discrete sections thereof may be separated into separate files. Keeping such separate files organized can be challenging, as the file naming convention dictated by the hierarchical file system is oftentimes the sole modality by which any particular document can be identified in a vast repository of files. Furthermore, with most engineering and architectural planning documents, there are extensive cross-references from one section to another, and where there are multiple files in a set of documents, extensive cross-references from one file to another. Creating and managing such cross-references so that a document or a section of a document can be immediately accessed upon command is a time-consuming manual procedure that requires not only the ability to perform the technical aspects of this task, but also an understanding of the subject matter so that meaningful cross-references can be created.
Accordingly, there is a need in the art for automatic batch generation of links to documents without user intervention based on document name and page content matching.