In a computer or other data processing system, text is typically processed using a standard encoding scheme (e.g., ASCII or Unicode) to represent each of the individual characters (e.g., a letter or a number) in a word or a number. An entire word or number, or group of words or numbers, is typically represented by a set or string of characters in a standard encoding scheme.
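As a brief illustration of the character-string representation described above, the following minimal Python sketch shows how a word and a number may each be held as a string of individually encoded characters. The helper names `code_points` and `utf8_bytes` and the sample values are hypothetical and are chosen only for illustration.

```python
def code_points(text: str) -> list[int]:
    """Return the Unicode code point of each character in the string."""
    return [ord(ch) for ch in text]


def utf8_bytes(text: str) -> bytes:
    """Encode the string as UTF-8; for ASCII-only text this matches ASCII."""
    return text.encode("utf-8")


# Hypothetical sample values: a street-name word and a house number.
word = "Main"
number = "475"

print(code_points(word))    # [77, 97, 105, 110]
print(code_points(number))  # [52, 55, 53]
print(utf8_bytes(word))     # b'Main'
```

Note that the number is stored as the characters "4", "7", and "5" rather than as a numeric value, which is why a number in an address can be affected by the same kinds of character-level errors as a word.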
In an item delivery environment, character strings are employed to represent information related to items that need to be delivered, such as a piece of mail or a package. In particular, a delivery address indicating the location to which an item is to be delivered may be represented by a character string, or set of character strings. The delivery address may come from various sources: it may be read from the surface of a delivery item by an OCR system; it may come from an electronic mailing list; it may be scanned in from a paper mailing list; etc.
Regardless of the source, a word or number, and its equivalent computer representation, may contain an error. Errors may take the form of misspellings, typographical errors, incorrect information, incorrect words, transposed numbers, misread characters, etc. Such errors are often introduced when a word or number is entered into a computer file by a human typist, optical character recognition (OCR) system, Scantron reader, speech recognition system, etc.
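To make the character-level nature of such errors concrete, the following Python sketch compares an intended string with an erroneous one and reports where they differ. It is only an illustration under stated assumptions: the function names and sample strings are hypothetical, the intended string is assumed to be known (which is generally not the case in practice), and only equal-length strings are handled.

```python
def differing_positions(intended: str, entered: str) -> list[int]:
    """Return the indices at which two equal-length strings differ."""
    if len(intended) != len(entered):
        raise ValueError("this sketch only compares strings of equal length")
    return [i for i, (a, b) in enumerate(zip(intended, entered)) if a != b]


def is_adjacent_transposition(intended: str, entered: str) -> bool:
    """Return True if the strings differ only by two swapped neighboring characters."""
    if len(intended) != len(entered):
        return False
    diffs = differing_positions(intended, entered)
    if len(diffs) != 2 or diffs[1] != diffs[0] + 1:
        return False
    i, j = diffs
    return intended[i] == entered[j] and intended[j] == entered[i]


# Hypothetical examples: a number with transposed digits, and a word in which
# the letter "I" was misread as the digit "1".
print(is_adjacent_transposition("20260", "20620"))  # True
print(differing_positions("MAIN", "MA1N"))          # [2]
print(is_adjacent_transposition("MAIN", "MA1N"))    # False
```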
Depending upon the end use of the computer representation of the word or number, it may be important to correct such errors. For example, delivery services strive to correct errors in the words and numbers of an address because returning a delivery item to a sender is very costly and causes sender dissatisfaction. At the same time, to deliver an item, a governmental delivery service, such as the U.S. Postal Service® (USPS®), is legally required to determine with a specified minimum degree of certainty that the digital representation of the address used to direct the delivery of an item is the valid and intended address for delivery. Other delivery services may have similar commercial requirements because, in general, all delivery services strive to avoid delivering items to the wrong address or returning items to the sender.
In addition to directing items for delivery, address information may be used for other purposes that require low error rates in address validation and correction processes. For example, the USPS® uses address information to determine whether a customer has filed a change-of-address (“COA”) order with the USPS® and to automatically forward a delivery item to the customer's new address when appropriate. Other delivery services may have similar systems and abilities. Other application areas, such as medical services, security services, and financial services, to name a few, also benefit from address information correction. These areas likewise require a high degree of certainty that the words and numbers in a digital representation, such as a character string, are the valid and intended interpretations, and that any corrections are accurate.
One example of a source of addresses that require validation and correction is a mailing list. Organizations typically use mailing lists containing the names and addresses of individuals interested in the organizations' products or services to send material to multiple recipients. Such mailing lists are typically kept in a computer-readable form, such as a text file or a database file. An organization may provide a mailing list to a delivery service, such as the USPS®, for use in sending, for example, newsletters, periodicals, or advertising to the individuals on the mailing list. Organizations wish to avoid wasting materials and money by sending material to invalid or incorrect addresses contained in their mailing lists.
It is worth noting that accurate mailing lists are valuable in their own right. For some organizations, such as specialized niche publications or charitable groups, their mailing lists may be revenue-generating assets. There are even mailing list brokers that help organizations maximize the value of their mailing lists by renting or selling them. The value of a mailing list is enhanced when the addresses on it are valid and error-free.
Accordingly, it is desirable to develop systems and methods that recognize errors in digital representations of address information and accurately correct such errors. For many applications, it is also desirable to validate and correct address information quickly.