1. Field of the Invention
This invention relates generally to parsing a text string and, more specifically, to identifying a telephone number within the text string.
2. Description of Related Art
An emerging class of mobile device combines the functionality of a PDA (Personal Digital Assistant) and a cellular telephone. Conventional cellular telephone applications only had to handle telephone numbers entered into pre-defined fields. In contrast, applications for these mobile devices use more robust data processing to provide numerous functions in addition to dialing telephone numbers.
One problem with handling multiple data types is how to accurately identify telephone numbers within the data. Telephone numbers appear in a text string with varying formats and spacing, and often are not accompanied by an embedded identifier. Furthermore, many number structures that are not telephone numbers, such as IP addresses and dates, closely resemble them. Consequently, applications attempting to identify telephone numbers often generate false positives.
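The false-positive problem can be illustrated with a minimal sketch. The pattern below is a hypothetical naive rule (any run of seven or more digits with optional single separators), not a rule taken from any particular application, and the names `NAIVE_PHONE` and `naive_find` are assumptions made for illustration only.

```python
import re

# Hypothetical naive rule (an assumption, not from the source): a run of
# 7 or more digits, each optionally preceded by one space, dot, hyphen,
# or slash.
NAIVE_PHONE = re.compile(r"\d(?:[ .\-/]?\d){6,}")


def naive_find(text):
    """Return every substring the naive rule treats as a telephone number."""
    return [m.group(0) for m in NAIVE_PHONE.finditer(text)]


samples = [
    "Call 650-555-0123 tonight",    # a telephone number
    "Server is at 192.168.100.25",  # an IP address
    "Due on 12/31/2005",            # a date
]

for s in samples:
    print(s, "->", naive_find(s))
```

Under this rule all three samples yield a match, although only the first contains a telephone number. The IP address and the date are exactly the kind of false positive described above.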
A related problem in identifying telephone numbers is that many different number formats exist, particularly for international telephone numbers. Some international telephone numbers can also include several combinations of characters other than digits. Often, it is difficult even to ascertain where a telephone number begins and ends.
Another problem is that different applications have varying tolerance levels for recognizing telephone numbers. An application such as a web browser is likely to display text strings from a content provider that adheres more closely to conventional formatting. However, an application such as SMS (Short Message Service) receives telephone numbers in SMS messages that are authored by individuals using less formality. A single, inflexible set of rules does not address these varying levels of formality well. On one hand, low tolerance rules in an informal environment such as SMS messaging will miss many telephone numbers and cause inconvenience to the user. On the other hand, high tolerance rules in a formal environment will similarly inconvenience the user with many false positives.
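The tolerance trade-off can be sketched with two hypothetical rule sets. The tolerance names `"low"` and `"high"`, the patterns, and the sample strings below are all assumptions chosen for illustration; they do not represent the rules of any actual application.

```python
import re

# Hypothetical rule sets; names and patterns are illustrative assumptions.
PATTERNS = {
    # Low tolerance: only the conventional NNN-NNN-NNNN or
    # (NNN) NNN-NNNN layouts, not embedded in a longer digit run.
    "low": re.compile(r"(?<!\d)(?:\(\d{3}\) |\d{3}-)\d{3}-\d{4}(?!\d)"),
    # High tolerance: any run of 7 or more digits, each optionally
    # preceded by one space, dot, hyphen, or slash.
    "high": re.compile(r"\d(?:[ .\-/]?\d){6,}"),
}


def find_numbers(text, tolerance):
    """Return candidate telephone numbers found under the given tolerance."""
    return [m.group(0) for m in PATTERNS[tolerance].finditer(text)]


sms = "call me on 650 5550123 l8r"                          # informal text
web = "Version 10.0.2005.1 released; support: 650-555-0123"  # formal text

for name, text in (("sms", sms), ("web", web)):
    for tolerance in ("low", "high"):
        print(name, tolerance, find_numbers(text, tolerance))
```

In this sketch the low tolerance rule finds nothing in the informal message, while the high tolerance rule reports the version string in the formal text as a telephone number alongside the real one. Neither fixed rule set suits both environments.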
What is needed is a robust parsing application that solves the above problems in identifying telephone numbers. The solution should identify telephone numbers within a text string that can contain other text, multiple data types, and/or multiple number formatting types. The solution should also provide flexibility with varying tolerance levels.