Today, increasing emphasis is placed on the dissemination of information in electronic form. Virtually all business-related paper documents are generated from a computer application and sent to a printer or a fax delivery vehicle. While present-day technology has resulted in a vast amount of data and information being disseminated by electronic networks, such as computer networks, the Internet, and the like, such electronic communications do not provide the flexibility and utility still associated with hard paper copies of the same information. Some still prefer receiving printed information in hard copy form, which can be read, studied, and readily saved for future reference.
Unfortunately, the dissemination of printed material is cumbersome and slow compared to the electronic transfer of information and data. Further, once information is in printed form, it is difficult to return the information and data content to an electronic form. A receiving party must either manually read and extract the relevant data content for entry into another downstream computer system, or machine scan and image the documents for manual or automated conversion (for example, through the use of OCR-type technology) to electronic data, which can then be readily transferred. However, both manual and automated conversion and transfer are time consuming, susceptible to error, and often result in loss of portions of the original data content. In addition, OCR-type forms extraction of data from images is expensive and requires specific controls on the source documents to be successful. Where unconstrained forms are prevalent, OCR forms processing will still require a tremendous amount of manual intervention.
In environments where electronic files are used, businesses still encounter difficulties in using such files to transact business. In many cases, a collection of different types of electronic files will comprise the transaction. One of the deficiencies of the presently known systems is the lack of a single “container” that can hold various objects to better allow companies to interact, or that can act as a general-purpose electronic information delivery vehicle. Therefore, what is needed is a process that allows a disparate collection of different objects, whether electronic or paper converted into electronic format, to be provided to a business or even an end user in a fashion that promotes ease of use and automation.
Accordingly, there is a need in industry to permit paper documents to be converted into electronic format while also retaining the relevant index data information. Further, it is desirable that this electronic format act as a standardized “container” for delivering information, such as documents, data, and images, via email, the Internet, on disk, on CD, or in any other electronic form. The standardized container ideally would, among other functions, reproduce a copy of the original document that was converted. This standardized container would further be adaptable to a changing business environment: it would provide similar benefits both where the source of the initial content is predominantly paper documents and, as the industry shifts, where the content is increasingly electronic. The standardized container would also provide information about its contents, commonly known as “meta-data.” This meta-data would be provided in a manner that facilitates the development of browser applications as well as automation. Therefore, the standardized container would support any spectrum of content and present that content to a user or system in a fashion that promotes its use and automation. Further, a system using this container would include delivery methods for supplying both customers and business partners with the information they need, when they need it. In other words, a delivery object that supports both ad hoc requests for information and large-volume transactions is needed.
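The requirements above (one file, mixed object types, retained index data, and meta-data readable by a program) can be made concrete with a minimal sketch. It is an illustration only, not a description of any particular system: the use of a ZIP archive as the carrier, the file name `manifest.xml`, the element names, and the functions `build_container` and `read_metadata` are all assumptions introduced here.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

MANIFEST_NAME = "manifest.xml"  # hypothetical name, not taken from the source


def build_container(objects, index_data):
    """Pack objects plus a meta-data manifest into one in-memory file.

    objects: dict mapping object name -> (mime_type, payload bytes)
    index_data: dict of index fields retained from the original document
    """
    root = ET.Element("container")
    index = ET.SubElement(root, "index")
    for key, value in index_data.items():
        ET.SubElement(index, "field", name=key).text = value
    contents = ET.SubElement(root, "contents")
    for name, (mime_type, payload) in objects.items():
        ET.SubElement(contents, "object", name=name, type=mime_type,
                      size=str(len(payload)))

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The manifest travels inside the same file as the objects it describes.
        archive.writestr(MANIFEST_NAME, ET.tostring(root, encoding="utf-8"))
        for name, (_, payload) in objects.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def read_metadata(container_bytes):
    """Return (index fields, object descriptions) without opening any object."""
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as archive:
        root = ET.fromstring(archive.read(MANIFEST_NAME))
    index = {field.get("name"): field.text for field in root.find("index")}
    contents = [dict(obj.attrib) for obj in root.find("contents")]
    return index, contents
```

A receiving program could call `read_metadata` to decide how to route or display each object before opening any of them, which is the kind of automation the meta-data is meant to enable.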
Examples of Known Document or Information Delivery Vehicles:
Adobe Acrobat:
One example of a presently available document delivery system is Adobe's Acrobat. Acrobat has its origin in the desktop publishing world as Adobe's electronic alternative to the printed page. Earlier, Adobe created one of the standard formats for preparing documents for printing (PostScript), and Acrobat was a natural extension of that concept. The Acrobat system allows a document to be converted into a standard format (a PDF file) that can be viewed on nearly any computer running the Acrobat Reader, without concern for having a particular display type or special fonts installed. The key point is that the original document is converted and is therefore no longer in its original format. Consequently, Acrobat only provides a means to manipulate a rendering that replaces a printed piece of paper. Recent versions of Acrobat provide some features, such as annotation, that allow readers to “mark up” an Acrobat document. But beyond this, Acrobat files generally are static documents. One cannot edit or modify the document using the original source application that created it and then return the result to the Acrobat package. The Acrobat tools offer very limited capabilities to make additions or changes to an existing Acrobat file. In addition, the Acrobat format focuses on the document, but not on the data content presented in the document. It is difficult, if not impossible, for an automated system to extract useful information from such a document for use in another application.
Acrobat in essence emulates a printer. The user operates in the native application and, instead of printing a hard copy paper output, selects the Acrobat device. The system then converts the document into an Adobe proprietary format. While this allows the document to be viewed and printed on any computer running the proprietary Adobe software, such as Acrobat Reader, it limits the document's use to viewing and printing. In other words, one cannot reopen the document in its native application. Most importantly, one cannot gain direct access to the data content of the source document. For example, an Acrobat file generated from an Excel spreadsheet provides only an electronic copy of what can be compared to a printed piece of paper. One cannot edit the spreadsheet formulas or recalculate the spreadsheet in any way. Most business documents contain a considerable amount of fundamental data, and the reader of the document is interested in an easy way to extract that data. Acrobat will allow the reader to look at an electronic representation of the paper, but it does not facilitate automated extraction of that data. While Acrobat may allow different documents to be joined within a single Acrobat file, access to the source file or application that created each document is lost. The Acrobat system is thus a document delivery vehicle and not a data delivery vehicle. It also does not support easily obtainable meta-data about the contents of an Acrobat file. In addition, it does not support both data and document content.
Electronic Data Interchange (EDI):
Another existing standard for delivering information between businesses is EDI. This is an accepted standard method of delivering a common type of transaction, such as invoice information, between businesses. The largest shortcoming of EDI is that it is limited to data only. Documents, images, and other object types are not supported. Further, a collection of different object types may not be created within a single EDI transaction. Therefore, its use is very limited, and it cannot support environments that involve complex business transactions or end-user requests.
ZIP Files:
An example of a packaging technique is a ZIP file. This type of file may contain many different object types, including an EDI, Word, Acrobat, image, or any other type of file, and it may be delivered using virtually any electronic delivery method. However, it lacks a means to support automation. While newer ZIP viewers permit a file to be individually selected and launched in the native application that created it, the basic nature of a ZIP file does not include a standard method or set of rules by which these objects are inserted into the ZIP file. Again, meta-data about the contents of a ZIP file is not easily obtainable, which prohibits automation. Therefore, while a ZIP file may be used as a container that is delivered between business partners, it requires the business partners to first create specific rules in order to permit automation. In almost all cases, true automation is not possible due to this lack of standards.
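The point about meta-data can be seen by asking a generic ZIP reader what it knows about an archive. The short sketch below uses Python's standard `zipfile` module; the member names `scan001.tif` and `data.txt` and the helper `describe_zip` are made up for illustration.

```python
import io
import zipfile


def describe_zip(archive_bytes):
    """List what a generic ZIP reader can learn about each member."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]


# Build a small archive in memory holding two unrelated objects.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("scan001.tif", b"II*\x00")
    archive.writestr("data.txt", b"1001,9.50")

listing = describe_zip(buffer.getvalue())
print(listing)
```

The listing gives names and sizes, but nothing states that one member is the scanned image of an invoice and the other is its extracted data, or how the two relate. A receiving system can only infer this from naming rules agreed upon in advance by the business partners.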
Email:
While email allows for multiple attachments and permits the recipient to open each file using the native application that created it, email suffers from several shortcomings. First, as with ZIP, there are no specific rules in place as to how the attachments are added. Therefore, automation is not possible unless very specific rules are developed between specific business partners. Next, meta-data about the contents is not easily obtainable in a consistent fashion, further prohibiting automation. Next, the sequence in which attachments are added and processed will affect their ability to be automated. In addition, the attachments are provided as individual files; therefore, if the email must be processed by an outside system, there is no efficient method to keep the attachments together for further downstream processing.
HTML:
While standard browsers are available for viewing HTML, there is no means for a collection of objects to be contained within a single file. As a result, individual files must be separately provided on disk in a specific sequence to work. Further, automation is not easily facilitated due to the flexibility of HTML. An example of this flexibility is that Internet Web sites use HTML as the basis for their design, yet all Web sites look different. Therefore, business partners must again mutually agree on the HTML content and the relevant file locations in order to automate transactions using HTML. Again, meta-data about the content is not easily extracted in a consistent format, further prohibiting automation.
XML:
XML represents the closest approximation of a universal delivery format. It may contain multiple objects of different types, has rules to define content and data elements, can contain meta-data, and can be used to facilitate automation. However, there is no standard browser for interacting with an XML file, nor is there a standard packaging method for delivering a complete XML package. Therefore, while XML holds promise, and our tools and technologies use XML extensively, it lacks a delivery vehicle that may be employed in widespread use.
As illustrated above, while many pieces of the required technology exist, no presently available system combines these technologies to provide the delivery vehicle described above. Accordingly, a system is needed to meet the requirements of a different set of business tasks: document imaging, conversion of paper documents to electronic format while retaining data content, work flow, sharing of complex files, and EDI. To meet these requirements, such a system must be designed with a different focus and with different features than any system presently available. The needed system must expand the capabilities of presently available systems for the electronic delivery of documents and related information.
Additionally, the new system should allow documents to be kept in their native format. For example, one should be able to take an Excel document out of the delivery system, open it in Excel, make changes, and put the changed document back into the delivery system. The system could also provide for the direct viewing and display of many standard formats to further simplify user interaction. Additionally, there is a need for a document delivery system that can support dynamic, changing documents, and include end-user tools to allow for easy additions and changes.
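The round trip described above (take an object out in its native format, change it, and put it back) can be sketched as follows. This is only an illustration under stated assumptions: a ZIP archive stands in for the delivery container, and `replace_object` is a hypothetical helper. Python's `zipfile` module does not replace a member in place, so the sketch writes a new archive.

```python
import io
import zipfile


def replace_object(container_bytes, name, new_payload):
    """Return a new archive in which member `name` carries new_payload.

    All other members are copied unchanged and keep their original order.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if name not in src.namelist():
            raise KeyError(name)
        for info in src.infolist():
            if info.filename == name:
                payload = new_payload  # the edited native-format file
            else:
                payload = src.read(info.filename)
            dst.writestr(info.filename, payload)
    return out.getvalue()
```

In use, the caller would read the spreadsheet bytes from the container, save them to disk for editing in the native application, and then pass the edited bytes to `replace_object`. Because the object is stored in its native format, no conversion step is involved in either direction.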