For speed of communications and cost effectiveness, individuals, businesses, and other organizations frequently exchange electronic data through e-mail, the Internet, and other networks and systems. Companies increasingly rely on third-party applications on the Internet to accomplish a wide range of intended purposes, often involving the exchange of electronic documents. XML (Extensible Markup Language) is a specification widely used for such exchanges of electronic documents.
For example, business entities such as insurance agents often use third-party clearing houses to obtain information from carriers such as insurance companies. XML documents may be used for the exchange of this information among the parties. FIG. 1 illustrates an operating environment where insurance agent 1 at server 3 160 and insurance agent 2 at server 4 170 may employ a clearing house at server 2 150 to request insurance policy information for insurance carrier 1 at server 1 100 and insurance carrier 2 at server 5 180.
Server 100 can communicate with servers 150, 160, 170, and 180 via a wired or wireless link 142, a wired or wireless network 130, and wired or wireless links 144, 145, 146, and 148. The servers 100, 150, 160, 170, and 180 may be personal computers or larger computerized systems or combinations of systems.
The network 130 may be the Internet, a private LAN (Local Area Network), a wireless network, a TCP/IP (Transmission Control Protocol/Internet Protocol) network, or other communications system, and can comprise multiple elements such as gateways, routers, and switches. Links 142, 144, 145, 146, and 148 use technology appropriate for communications with network 130.
For example, insurance agent 1 at server 3 160 may use the clearing house at server 2 150 to request information about the rate for insurance coverage for a potential client with certain characteristics from insurance carrier 1 at server 1 100. The clearing house at server 2 150 prepares an XML document 200 comprising a request for insurance rates for a client with those characteristics and sends XML document 200 to server 1 100.
Business entities may also submit requests directly to carriers without using a clearing house.
Such request documents for the insurance industry, from whatever source, are often created in the ACORD format, which is a type of XML format developed by the Association for Cooperative Operations Research and Development (ACORD), a nonprofit insurance association whose mission is to facilitate the development and use of standards for the insurance, reinsurance, and related financial services industries. In this specification, an XML document in ACORD format is referred to as an ACORD document.
Validation
Before processing an XML document 200, a carrier such as insurance carrier 1 at server 1 100 typically must validate that XML document 200 complies with that carrier's rules. For example, a carrier may require that a policy's identification number contain no more than seven digits. If the field for the policy's identification number in XML document 200 contains eight digits, the carrier will typically invalidate the request and return it to the sender for error correction and resubmission.
Prior Techniques for Validation
The task of validating the data contained within an ACORD document, which may contain hundreds of lines of code, can be quite complicated. Schema languages such as XML Schema can be used to validate part of the structure of an ACORD document, but the capabilities of existing schema languages have significant limitations. In addition, validating the actual data content adds another layer of complexity that has so far required non-trivial manual coding effort.
The use of rules engines to perform these kinds of validations presents significant advantages over the techniques mentioned above. A rule is a specification of a condition and one or more consequences of that condition. This model lends itself well to performing many checks on data, as is done in validation. Because conditions can be expressions of arbitrary complexity, the limitations of existing schema languages can be easily overcome.
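The condition-and-consequence model described above can be illustrated with a minimal Java sketch. The class names SimpleRule and SimpleRuleRunner are hypothetical and do not come from any real rules engine, and each rule carries a single consequence for brevity. The example rule encodes the seven-digit policy number limit mentioned earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration only; not the API of any real rules engine.
final class SimpleRule<T> {
    final String name;
    final Predicate<T> condition;          // when this holds for the data...
    final Function<T, String> consequence; // ...this produces the outcome

    SimpleRule(String name, Predicate<T> condition, Function<T, String> consequence) {
        this.name = name;
        this.condition = condition;
        this.consequence = consequence;
    }
}

final class SimpleRuleRunner {
    // Evaluates every rule against the data and collects the consequences
    // of the rules whose condition holds.
    static <T> List<String> run(List<SimpleRule<T>> rules, T data) {
        List<String> results = new ArrayList<String>();
        for (SimpleRule<T> rule : rules) {
            if (rule.condition.test(data)) {
                results.add(rule.consequence.apply(data));
            }
        }
        return results;
    }

    // Example rule set: a policy number longer than seven characters is an error.
    static List<SimpleRule<String>> policyNumberRules() {
        List<SimpleRule<String>> rules = new ArrayList<SimpleRule<String>>();
        rules.add(new SimpleRule<String>(
                "Policy Number Too Long",
                policyNumber -> policyNumber.length() > 7,
                policyNumber -> "TOO_LONG"));
        return rules;
    }
}
```

Calling SimpleRuleRunner.run(SimpleRuleRunner.policyNumberRules(), "12345678") would collect the single consequence "TOO_LONG", while a seven-digit number would produce an empty list.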
An important advantage of rules engines is that their rules can be adjusted and new rules introduced without the need for anything in the rules engines to be recompiled. As input, rules engines take data files representing sets of rules and interpret them against a set of data, which is a very efficient form for validation.
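To illustrate why rules held as data can change without recompilation, the following sketch reads rule definitions from text at run time. The "<field>.maxLength=<n>" format and the class name DataDrivenLengthRules are assumptions made for this example; real rules engines use far richer rule languages.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical illustration: rules are plain data, so editing the rule text
// changes the validation behavior without recompiling this class.
final class DataDrivenLengthRules {
    private static final String SUFFIX = ".maxLength";

    // Parses rule definitions of the assumed form "<field>.maxLength=<n>".
    static Properties load(String ruleText) {
        Properties rules = new Properties();
        try {
            rules.load(new StringReader(ruleText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rules;
    }

    // Applies every length rule to the supplied field values and returns
    // one error string per violated rule, in alphabetical key order.
    static List<String> validate(Properties rules, Map<String, String> fields) {
        List<String> errors = new ArrayList<String>();
        for (String key : new TreeSet<String>(rules.stringPropertyNames())) {
            if (!key.endsWith(SUFFIX)) {
                continue; // not a length rule; ignored in this sketch
            }
            String field = key.substring(0, key.length() - SUFFIX.length());
            int max = Integer.parseInt(rules.getProperty(key).trim());
            String value = fields.get(field);
            if (value != null && value.length() > max) {
                errors.add(field + ":TOO_LONG");
            }
        }
        return errors;
    }
}
```

With the rule text "PolicyNumber.maxLength=7", an eight-digit policy number is reported as an error; changing the text to "PolicyNumber.maxLength=8" accepts the same value with no change to the compiled code.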
However, a rules engine alone still does not process an ACORD document as efficiently as possible. One reason for this inefficiency is that a coding idiom must be supplied before a rule can work with the actual data. For example, consider the following XML fragment:
    <XXX>
      <YYY>
        abc
      </YYY>
    </XXX>
To get to the data item “abc”, a programmer could construct an XPath query such as “XXX/YYY/child::text()”. But then the programmer would have to go into the resulting node list, get the single node representing the text node, and get the node value before validation of the data could begin. Using Jaxen, an open source tool that supports XPath queries on XML documents, the Java code to do this would look like the following:

    DOMXPath xpath = new DOMXPath("XXX/YYY/child::text()");
    List nodes = xpath.selectNodes(document);
    Node node = (Node) nodes.get(0);
    String nodeValue = node.getNodeValue();
In other words, after the four lines given above, the data “abc” is finally obtainable in the variable “nodeValue”.
To make this code easier to use, a programmer could wrap it in a subroutine that takes a DOM Node and an XPath expression and returns the string value of the corresponding text node. However, that subroutine would need to be written somewhere other than in the rules themselves, and a class would need to be defined on which to specify the method. Even then, the result of this effort would not fit easily with the way rules are written, as the following example shows:

    <rule name="Policy Number Too Long">
        <parameter identifier="policy">
            <groovy:class>org.w3c.dom.Node</groovy:class>
        </parameter>
        <groovy:condition>XPathUtil.getString(document,
            "//ACORD/YYY/child::text()").length() &gt; 7</groovy:condition>
        <groovy:consequence>
            drools.assertObject(new com.webify.ValidationError("TOO_LONG"))
        </groovy:consequence>
    </rule>
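For concreteness, a helper of the kind such a rule calls might be sketched as follows. The source names XPathUtil.getString but does not show its body, so everything here is an assumption: the sketch uses the JDK's built-in javax.xml.xpath API instead of Jaxen so that it is self-contained, it trims surrounding whitespace from the text node, and the parse method is a hypothetical convenience added only to make the example runnable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Assumed shape of the helper; the original document does not define it.
final class XPathUtil {
    // Returns the trimmed value of the first node selected by the XPath
    // expression, or null when the expression selects nothing.
    static String getString(Node context, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, context, XPathConstants.NODESET);
            if (nodes.getLength() == 0) {
                return null;
            }
            String value = nodes.item(0).getNodeValue();
            return value == null ? null : value.trim();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expression, e);
        }
    }

    // Hypothetical convenience for the example: parses XML text into a DOM Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Given a document parsed from the fragment shown earlier, XPathUtil.getString(document, "XXX/YYY/child::text()") would yield the string abc. Even with such a helper in place, each rule condition still has to spell out the full call.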
The major disadvantage of this technique is that it does not fit easily with the way rules are written. Every time a programmer wants to access a data item in an ACORD document with this technique, he or she must write a Java-style expression, so the same lengthy character sequence appears in the condition of every rule.
Such a process is unnecessarily laborious, time consuming, and expensive. Therefore, there is a need for a method and system that more efficiently validate that ACORD documents comply with carriers' rules.