The present invention relates to automatic directory assistance. In particular, the present invention relates to systems and methods for automatically pre-processing entries contained in an informational database used by an automated attendant.
In recent years, automated attendants have become very popular. Many individuals or organizations use automated attendants to automatically provide information to callers and/or to route incoming calls. An example of an automated attendant is an automated directory assistant that automatically provides a telephone number, address, etc. for a business or an individual in response to a user's request.
Typically, a user places a call and reaches an automated directory assistant (e.g., an Interactive Voice Response (IVR) system) that prompts the user for desired information and searches an informational database (e.g., a white pages listings database) for the requested information. The user enters the request, for example, the name of a business or individual, via a keyboard, keypad, or spoken input. The automated attendant searches for a match in the informational database based on the user's input and may output a voice-synthesized result if a match is found.
When offering automated directory assistance, the informational database may be used for two purposes. One purpose may be to create vocabularies and grammars for the speech recognition engine that recognizes the caller's request and a search engine that searches for a match. The other purpose may be to generate a speech-synthesized output of the requested listing to the caller.
The information or listings contained in these informational databases may contain abbreviations, acronyms, errors, or other deviations that may prevent the search engine from recognizing a listing and may prevent the speech synthesizer from pronouncing the listing so that it is understood by the caller. For example, the system may not be able to recognize or pronounce the abbreviation “CLD HARBR SPRNG” to mean “Cold Harbor Springs.” In another example, the speech recognition engine may not understand a caller's request if the caller uses the abbreviation “N-C-double A” to mean “N-C-A-A.”
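The kind of abbreviation problem described above can be illustrated with a minimal Python sketch. The lookup table `ABBREVIATIONS` and the function `expand_listing` are hypothetical names introduced here for illustration only; they are not part of the described system, and a real directory would need a far larger, field-specific table.

```python
from typing import List, Tuple

# Hypothetical abbreviation table (an assumption for illustration);
# it covers only the example abbreviation from the text.
ABBREVIATIONS = {
    "CLD": "Cold",
    "HARBR": "Harbor",
    "SPRNG": "Springs",
}


def expand_listing(raw: str) -> Tuple[str, List[str]]:
    """Expand known abbreviations token by token.

    Returns the expanded text and the list of tokens that were not
    found in the table, so a caller can tell how complete the
    expansion was.
    """
    expanded = []
    unresolved = []
    for token in raw.split():
        replacement = ABBREVIATIONS.get(token.upper())
        if replacement is None:
            # Unknown token: keep it unchanged and report it.
            unresolved.append(token)
            expanded.append(token)
        else:
            expanded.append(replacement)
    return " ".join(expanded), unresolved
```

Calling `expand_listing("CLD HARBR SPRNG")` yields the pronounceable form together with an empty list of unresolved tokens; any token missing from the table is passed through unchanged and reported, which is one possible signal that the entry still needs attention.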
Additionally, directory listings are typically optimized for visual presentation, not for conversation. Thus, the word order is often reversed and acronyms are used extensively. Such deviations may further prevent the listing from being recognized. For example, the listing “Smith Joe S., MD” may not be recognized if the caller says “Doctor Joe S. Smith.”
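As a rough illustration of the gap between the visual and the spoken form of a listing, the following Python sketch reorders a surname-first entry into conversational order. The names `TITLE_SUFFIXES` and `spoken_form` are hypothetical, and the sketch assumes a single-token surname followed by given names and an optional comma-separated suffix; real listings are considerably less regular.

```python
# Hypothetical mapping from written suffixes to spoken titles
# (an assumption for illustration; only the example from the text).
TITLE_SUFFIXES = {"MD": "Doctor"}


def spoken_form(listing: str) -> str:
    """Convert 'Surname Given..., SUFFIX' into spoken word order.

    Assumes the first token is the surname. A suffix that is not in
    TITLE_SUFFIXES is dropped from the spoken form.
    """
    name_part, _, suffix = listing.partition(",")
    tokens = name_part.split()
    if len(tokens) < 2:
        # Nothing to reorder.
        return listing.strip()
    surname, given = tokens[0], tokens[1:]
    spoken = given + [surname]
    title = TITLE_SUFFIXES.get(suffix.strip().upper())
    if title:
        spoken.insert(0, title)
    return " ".join(spoken)
```

With these assumptions, `spoken_form("Smith Joe S., MD")` produces the form a caller would be expected to say. Multi-word surnames or business names would defeat the first-token assumption, which is one reason a confidence measure is useful before changing an entry automatically.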
Such deviations in the listings database and/or in the way callers may pronounce a requested listing may prevent the caller's request for information from being completed automatically or may delay its completion.
One approach to solving this problem involves having an operator personally inspect each database entry and fine-tune each listing. This conventional technique can be impractical when hundreds of thousands or even millions of listings are involved and are also in a continual state of flux, as is the case with telephone directory listings. Additionally, errors, abbreviations, acronyms, etc. may require the intervention of an operator, which can delay the process and prevent the complete automation that is desirable.
Embodiments of the present invention concern a method and system for pre-processing entries in directory listings. An automated attendant or automated directory listings assistant may use the pre-processed entries. A first directory listings including one or more fields may be received. The one or more fields may be populated with entries including one or more symbol strings. A second directory listings including one or more fields may be received. The one or more fields of the second directory listings may be populated with entries including one or more symbol strings. Entries in the one or more fields of the first directory listings may be correlated with entries in the corresponding one or more fields of the second directory listings. Entries, in the one or more fields of the first directory listings, which do not correlate with entries in the corresponding one or more fields of the second directory listings may be identified. The identified entries may be processed using a rule set corresponding to the field in which the entry is located. Based on the rule set, a corresponding confidence level for the processed entries may be determined. The processed entries having the corresponding confidence level meeting or exceeding a threshold may be automatically modified. The automatically modified entries may be outputted for processing. In alternative embodiments of the present invention, the processed entries having the corresponding confidence level below the threshold may be marked for operator confirmation.
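The flow summarized above (correlate two listings, process non-correlating entries with a field-specific rule set, then split on a confidence threshold) can be sketched in Python as follows. This is only one possible reading under stated assumptions: correlation is taken to be a case-insensitive exact match within the same field, each rule set is modeled as a single function returning a candidate and a confidence, and the names `preprocess`, `expand_street`, and the threshold value 0.8 are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A rule set is modeled (as an assumption) as a function that returns
# a candidate replacement and a confidence between 0 and 1.
Rule = Callable[[str], Tuple[str, float]]
Record = Dict[str, str]
Result = Tuple[int, str, str]  # (record index, field, entry text)


def expand_street(entry: str) -> Tuple[str, float]:
    """Hypothetical rule for a street field with made-up confidences."""
    table = {"ST": "Street", "AVE": "Avenue"}
    tokens = entry.split()
    hits = sum(1 for t in tokens if t.upper() in table)
    candidate = " ".join(table.get(t.upper(), t) for t in tokens)
    return candidate, (0.9 if hits else 0.2)


def preprocess(
    first: List[Record],
    second: List[Record],
    rule_sets: Dict[str, Rule],
    threshold: float = 0.8,
) -> Tuple[List[Result], List[Result]]:
    """Return (automatically modified entries, entries marked for review)."""
    # Index the second listings by field for correlation.
    reference: Dict[str, set] = {}
    for record in second:
        for field, entry in record.items():
            reference.setdefault(field, set()).add(entry.lower())

    modified: List[Result] = []
    needs_review: List[Result] = []
    for index, record in enumerate(first):
        for field, entry in record.items():
            if entry.lower() in reference.get(field, set()):
                continue  # entry correlates; leave it alone
            rule = rule_sets.get(field)
            if rule is None:
                # No rule set for this field: defer to an operator.
                needs_review.append((index, field, entry))
                continue
            candidate, confidence = rule(entry)
            if confidence >= threshold:
                modified.append((index, field, candidate))
            else:
                needs_review.append((index, field, entry))
    return modified, needs_review
```

A short usage example under the same assumptions:

```python
first = [
    {"name": "Joe Smith", "street": "12 Main ST"},
    {"name": "Ann Lee", "street": "4 Elm Rd"},
]
second = [
    {"name": "joe smith", "street": "12 Main Street"},
    {"name": "Ann Lee", "street": "4 Elm Road"},
]
modified, needs_review = preprocess(first, second, {"street": expand_street})
```

Here the first street entry is expanded with a confidence above the threshold and lands in `modified`, while the second is not covered by the hypothetical rule and is marked for operator confirmation.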