Electronic data interchange (EDI) is the electronic exchange of business documents, such as purchase orders and invoices. EDI is widely used to facilitate business processes. For instance, documents can be sent and received immediately, and mailing costs are reduced. In addition, there is less paperwork, and improved accuracy as the result of avoiding manual data entry.
Conventional EDI is accomplished over existing communication lines between trading partners' computers. The transfer can be direct if the computers are linked directly, or asynchronous if the connection is a transient one. However, direct networks are difficult to maintain with larger numbers of trading partners. More commonly, a third-party "value-added network" (VAN) is used. A VAN maintains an electronic mailbox for each trading partner that can be accessed by the other partners.
To ensure that both the sender and the recipient format data in a common, mutually agreed way, EDI takes place pursuant to a variety of established EDI standards. Examples of such standards (and their maintenance organizations) are the ANSI X12 standard (developed by the American National Standards Institute's Accredited Standards Committee's X12 group), the UN-EDIFACT standard (Electronic Data Interchange for Administration, Commerce, and Transport, a United Nations standard based on ANSI X12 and the Trade Data Interchange standards used in Europe), the Uniform Communications Standards ("UCS"), and TDCC (developed by the Transportation Data Coordinating Committee).
Regardless of the particular standard, all are designed around an electronic representation of a paper business document. A unique identifier code is assigned for each type of business document. For example, in the ANSI X12 standard an invoice is referred to as document number X12.2, with a transaction set identification code of 810.
The basic unit of information on the electronic document is the "data element." For the invoice, each item being invoiced would be represented by a data element. Data elements can be grouped into compound data elements, and data elements and/or compound data elements may be grouped into data segments. Data segments can be grouped into loops, and loops and/or data segments form the business document.
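The nesting described above (data elements, compound data elements, data segments, loops) can be sketched as a small set of Python types. This is only an illustration: the class names, the "*" and ":" separators, and the segment identifiers "ITM" and "DSC" are assumptions made for the example and are not taken from any particular EDI standard.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class DataElement:
    value: str


@dataclass
class CompoundElement:
    components: List[DataElement]


@dataclass
class Segment:
    segment_id: str
    elements: List[Union[DataElement, CompoundElement]]


@dataclass
class Loop:
    # A loop groups data segments and, possibly, nested loops.
    members: List[Union["Segment", "Loop"]] = field(default_factory=list)


def parse_segment(raw: str, element_sep: str = "*", component_sep: str = ":") -> Segment:
    """Split one delimited segment into its identifier and its elements."""
    segment_id, *parts = raw.split(element_sep)
    elements: List[Union[DataElement, CompoundElement]] = []
    for part in parts:
        if component_sep in part:
            # A part containing the component separator is a compound element.
            components = [DataElement(c) for c in part.split(component_sep)]
            elements.append(CompoundElement(components))
        else:
            elements.append(DataElement(part))
    return Segment(segment_id, elements)


# Hypothetical line-item loop: an item segment followed by a description segment.
item_loop = Loop([
    parse_segment("ITM*1*10*EA*2.50"),
    parse_segment("DSC*F*WIDGET:BLUE"),
])
```

Here the second element of the hypothetical "DSC" segment is parsed as a compound data element with two components, while every element of the "ITM" segment is a simple data element.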
The standards define whether data segments are mandatory, optional, or conditional and indicate whether, how many times, and in what order a particular data segment can be repeated. For each electronic document, a field definition table exists. For each data segment, the field definition table includes a key field identifier string to indicate the data elements to be included in the data segment, the sequence of the elements, whether each element is mandatory, optional, or conditional, and the form of each element in terms of the number of characters and whether the characters are numeric or alphabetic. Similarly, field definition tables include data element identifier strings to describe individual data elements. Element identifier strings define an element's name, a reference designator, a data dictionary reference number specifying the location in a data dictionary where information on the data element can be found, a requirement designator (either mandatory, optional, or conditional), a type (such as numeric, decimal, or alphanumeric), and a length (minimum and maximum number of characters). A data element dictionary gives the content and meaning for each data element.
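As a rough illustration of how a data element identifier string might be applied, the following Python sketch models one entry with the attributes listed above (name, reference designator, data dictionary reference number, requirement designator, type, and length) and checks a value against it. The field names, the one- and two-letter codes, and the dictionary number 9001 are hypothetical choices for this example; conditional requirements are not evaluated because they depend on other elements in the segment.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ElementDef:
    name: str
    ref_designator: str   # position of the element within its segment
    dictionary_ref: int   # data dictionary reference number
    requirement: str      # "M" mandatory, "O" optional, "C" conditional (illustrative codes)
    data_type: str        # "N" numeric, "R" decimal, "AN" alphanumeric (illustrative codes)
    min_len: int
    max_len: int


def check_element(defn: ElementDef, value: Optional[str]) -> List[str]:
    """Return a list of problems found; an empty list means the value is acceptable."""
    if value is None or value == "":
        if defn.requirement == "M":
            return [f"{defn.ref_designator}: mandatory element missing"]
        return []  # optional or conditional elements may be absent in this sketch
    errors = []
    if not defn.min_len <= len(value) <= defn.max_len:
        errors.append(f"{defn.ref_designator}: length {len(value)} out of range")
    if defn.data_type == "N" and not re.fullmatch(r"-?\d+", value):
        errors.append(f"{defn.ref_designator}: not a numeric value")
    if defn.data_type == "R" and not re.fullmatch(r"-?\d+(\.\d+)?", value):
        errors.append(f"{defn.ref_designator}: not a decimal value")
    return errors


# Hypothetical definition for a quantity element.
quantity = ElementDef("Quantity", "ITM02", 9001, "M", "R", 1, 10)

ok = check_element(quantity, "10")
missing = check_element(quantity, "")
bad_type = check_element(quantity, "ten")
```

In this sketch, `ok` is an empty list, while `missing` and `bad_type` each contain one message naming the reference designator of the offending element.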
In general, trading partners seldom employ the same database programs and computers, and accessing and storage formats differ significantly. A structured query language (ANSI SQL) has evolved to unify the accessing method. SQL provides a standard method of data accessing independent of specific database storage formats, but this does not address structure and data differences. SQL provides a useful mechanism to illustrate the functionality of the invention and a common point of reference and understanding. FIG. 14 is a high level schematic of the steps involved in SQL processing. SQL begins with a human operator 1 or automated process creating a script of SQL instructions 2 intended to retrieve or store information in database 7. The script is executed by a conventional SQL interpreter engine 3, which in turn initiates a series of database access requests 4. The database system 5 receives and processes the requests and physically accesses the data 6 from database 7. The data is fed back to the SQL interpreter engine 3 which builds a response 8 to the SQL instructions. The resultant output is properly formatted in SQL format 10 or is input to an application 9, depending on the intended usage. It is presumed for purposes of the present patent application that the reader is familiar with the ANSI SQL programming standards and language, and its several popular commercial implementations. To help illustrate these mechanisms, FIG. 15 is a chart showing the SQL language syntax components (at top), as well as an example of a simple SQL language script for inserting data into a table.
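The flow described above (a script of SQL instructions, an interpreter engine, database access, and a response) can be tried with Python's built-in sqlite3 module, as in the following minimal sketch. It assumes an in-memory SQLite database, whose SQL dialect is close to but not identical with ANSI SQL, and the table and column names are hypothetical.

```python
import sqlite3

# A script of SQL instructions that stores information in the database.
script = """
CREATE TABLE invoice_item (
    item_no     INTEGER,
    description TEXT,
    quantity    INTEGER,
    unit_price  REAL
);
INSERT INTO invoice_item VALUES (1, 'Widget', 10, 2.50);
INSERT INTO invoice_item VALUES (2, 'Gasket', 4, 0.75);
"""

# The SQLite library plays the roles of interpreter engine and database system.
conn = sqlite3.connect(":memory:")
conn.executescript(script)

# A retrieval request; the engine builds the response as a list of rows.
rows = conn.execute(
    "SELECT item_no, description, quantity * unit_price "
    "FROM invoice_item ORDER BY item_no"
).fetchall()
conn.close()
```

The resulting `rows` list holds one tuple per invoice item and can be printed or handed to an application, corresponding to the two output paths described above.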
Accessing of data via SQL methods does not address remote transfers of data from one computer to another. More is needed before the data becomes useful in this context. The data must be translated from the format of one computer system to the other so that the latter can process it accurately. Originally, translation software was developed to support a variety of private system formats. The private systems were employed by larger companies and were custom written to allow them to exchange electronic documents with selected trading partners.
Eventually, more general translation programs, such as TDCC (used in the transportation industry) and UCS (used in the grocery industry), were developed. These programs adopted a more flexible approach to translating documents from a variety of different systems into recognizable form. Such translators operated by extracting fixed data files from the original dataset. They manipulated the data twice: once to extract the necessary incoming data, and once to create the outgoing information and place it in the recipient data system. In extracting the inbound data, the translators converted the variable length data elements and segments in the dataset to a fixed length format that could be processed by traditional batch applications. In this scheme, there was no flexibility to manipulate data, rearrange data, or make up for errors in the placement of data. Often, data structure clashes would occur between the data formats (a structure clash occurs when the way that the information has been distributed within the local system does not match the way it is earmarked for eventual use). These programs also required highly specific transmission standards, limiting the formats by which datasets could be transmitted. Most often, the sender and receiver were required to contract in advance for a tailored software program dedicated to mapping between their two types of datasets. For instance, typical modern computer systems use a combination of SQL, xBase, and Btree based database storage systems. Exchanging data between these systems has required the use of various flatfile formats. Four types are commonly used: flatfile records, comma delimited files, EDI formats, and SQL syntax statements (the last of these incurs a very high overhead in terms of message size, and so is usually restricted to local transfers and backing up of data only).
In any of the four cases, each time a new sender or receiver was added to the client list, a new translation program would need to be added to the arsenal. Of course, this becomes expensive.
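The conversion of variable length data elements into a fixed length record, as performed by the earlier translators, can be sketched in Python as follows. The field layout, the widths, the "*" separator, and the "ITM" segment identifier are assumptions made for illustration only; an actual translator would be tailored to a specific pair of trading partners.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed-width layout: (field name, width in characters).
LAYOUT: List[Tuple[str, int]] = [
    ("item_no", 4),
    ("quantity", 6),
    ("unit", 2),
    ("price", 8),
]


def to_fixed_record(segment: str, sep: str = "*") -> str:
    """Convert one delimited segment into a space-padded fixed-length record."""
    values = segment.split(sep)[1:]  # drop the segment identifier
    if len(values) != len(LAYOUT):
        raise ValueError(f"expected {len(LAYOUT)} elements, got {len(values)}")
    padded = []
    for (name, width), value in zip(LAYOUT, values):
        if len(value) > width:
            raise ValueError(f"{name}: value longer than {width} characters")
        padded.append(value.ljust(width))
    return "".join(padded)


def from_fixed_record(record: str) -> Dict[str, str]:
    """Slice a fixed-length record back into named fields by position."""
    fields: Dict[str, str] = {}
    pos = 0
    for name, width in LAYOUT:
        fields[name] = record[pos:pos + width].rstrip()
        pos += width
    return fields


record = to_fixed_record("ITM*1*10*EA*2.50")
fields = from_fixed_record(record)
```

Because fields are identified purely by position, a segment whose elements arrive in a different order is either rejected or silently stored under the wrong field names, which illustrates the inflexibility noted above.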
Previous attempts at improving the translation process have resulted in solutions that are still cumbersome, use inefficient computer programming language syntax and structures, and still require programming staff to implement. For instance, U.S. Pat. No. 5,202,977 discloses a language based EDI translation system that receives data, executes a script to translate the data, and then transmits the data. The translation is accomplished by a script which creates two separate trees, or linked lists of storage nodes (see column 22, lines 18-26). The script populates one tree with the input data. Then, a series of assignment statements is executed to populate the second tree with the desired output data according to the desired output format. For the reasons explained above, the fixed scripts can be cumbersome, especially for complex looping datasets. Moreover, there is no flexibility to accommodate a new data structure. In the current state of the art, existing translation systems will abort when they encounter unfamiliar data conditions. Therefore, if a new trading partner comes along, a new script must be custom written before EDI can take place.
It would be greatly advantageous to streamline the translation system, and to reduce it to a flexible and efficient form capable of handling diverse datasets and incongruities in the data without custom programming support, and with built-in ability to handle unfamiliar conditions.