Information is generally represented in a lexicon or language that is sufficiently rich to allow both valid and invalid content to be expressed. For example, it is possible to use the Roman alphabet to write a correct English sentence, but it is also possible to string together English words that do not obey the semantic or syntactic rules of any language, or to string together letters in a manner that is completely unintelligible. The languages in which computer data is expressed are no exception: it is possible to write computer data that is not valid according to some set of rules.
In computer systems, much data is expressed in a hierarchical manner, such as in the form of an eXtensible Markup Language (XML) message. An XML message is generally expected to conform to a schema, which essentially defines the proper syntax of a class of messages. For example, a type of message may be an “address,” and the schema for an address may require that an address include a street name, a city, a state, and a zip code. However, even a message that obeys the schema may be invalid for some substantive reason. For example, any combination of data that purports to be a street name, city, state, and zip code would satisfy the schema, but the address may still be invalid if, say, the state element is not the name of a state of the United States, or if the zip code specified does not match the city/state combination.
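The distinction between satisfying a schema and being substantively valid can be illustrated with a minimal sketch. The element names, the helper functions `satisfies_schema` and `is_substantively_valid`, and the small lookup tables below are all hypothetical and chosen only for illustration; the structural check is a simplified stand-in for real schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names that the illustrative "address" schema requires.
REQUIRED_FIELDS = ("street", "city", "state", "zip")

# Illustrative subset only; a real check would list every state.
KNOWN_STATES = {"Washington", "Oregon", "California"}

# Hypothetical lookup table mapping a zip code to its city/state combination.
ZIP_TO_CITY_STATE = {"98052": ("Redmond", "Washington")}


def satisfies_schema(xml_text):
    """Structural check: an <address> root with every required element non-empty."""
    root = ET.fromstring(xml_text)
    if root.tag != "address":
        return False
    return all((root.findtext(name) or "").strip() for name in REQUIRED_FIELDS)


def is_substantively_valid(xml_text):
    """Substantive check: known state, and zip code matching the city/state."""
    root = ET.fromstring(xml_text)
    city = (root.findtext("city") or "").strip()
    state = (root.findtext("state") or "").strip()
    zip_code = (root.findtext("zip") or "").strip()
    if state not in KNOWN_STATES:
        return False
    return ZIP_TO_CITY_STATE.get(zip_code) == (city, state)


good = ("<address><street>1 Main St</street><city>Redmond</city>"
        "<state>Washington</state><zip>98052</zip></address>")
bad = ("<address><street>1 Main St</street><city>Redmond</city>"
       "<state>Atlantis</state><zip>98052</zip></address>")
```

Under these assumptions, both messages satisfy the structural check, but only the first passes the substantive check, since "Atlantis" is not the name of a state.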
The traditional way to perform such validation is through brute-force, message-specific code. The validation procedure for each message class must be written separately, and there is no way to modify the procedure's behavior without modifying the class code itself. The problem with this technique is that any change to the substantive requirements for validation requires a change to the source code. Not only is such a change cumbersome, but in some cases it may also require that the source code be distributed to the public, so that a consumer of the validation procedure can make custom modifications to the procedure. Such distribution of source code may be undesirable.
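As a rough illustration of the traditional technique, the following sketch shows a message class whose substantive validation rules are written directly into the class code. The class name `AddressMessage`, its fields, and the particular rules are hypothetical and serve only to show why changing a rule means changing the source.

```python
class AddressMessage:
    """Hypothetical message class with validation hard-coded into the class."""

    # Illustrative subset only; the rule set is fixed at the time the code is written.
    _VALID_STATES = frozenset({"Washington", "Oregon"})

    def __init__(self, street, city, state, zip_code):
        self.street = street
        self.city = city
        self.state = state
        self.zip_code = zip_code

    def validate(self):
        # Rule 1: the state must be one of the names compiled into the class.
        if self.state not in self._VALID_STATES:
            return False
        # Rule 2: the zip code must be exactly five digits.
        if not (len(self.zip_code) == 5 and self.zip_code.isdigit()):
            return False
        return True


five_digit = AddressMessage("1 Main St", "Redmond", "Washington", "98052")
nine_digit = AddressMessage("1 Main St", "Redmond", "Washington", "98052-1234")
```

In this sketch, a consumer who wishes to accept nine-digit zip codes, or to recognize additional state names, has no means of doing so other than editing `validate()` or `_VALID_STATES` in the source code of the class.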
In view of the foregoing, there is a need for a system that overcomes the drawbacks of the prior art.