1. Field of the Invention
The present invention relates in general to a method for comparing and discriminating the similarity between phonetic transcriptions of a foreign word, and more particularly to a method for comparing the similarity between various phonetic transcriptions of a specific foreign word on the basis of an English pronunciation similarity comparison algorithm of the kind generally used in the English-speaking world.
2. Description of the Prior Art
As exchanges with foreign countries have increased, phonetic transcriptions of many foreign words have come into use in Korean documents. Most of these phonetic transcriptions represent proper nouns or technical terms originally expressed in English. In scientific and technological fields in particular, writers commonly have no choice but to employ phonetic transcriptions, because no Korean translation exists for such English technical terms.
However, phonetic transcriptions of foreign words vary considerably from one writer to another, which makes it difficult to retrieve Korean document texts on the basis of such phonetic transcriptions. For example, three Korean phonetic transcriptions, “”, “” and “”, may all be used for the English technical term “digital”. Among these Korean phonetic transcriptions, “” has been proposed as a standard, but “” has actually been used more frequently and, occasionally, “” has been used according to individual preference.
Because various Korean phonetic transcriptions may be used for the same foreign word, as mentioned above, documents containing such phonetic transcriptions may often fail to be retrieved unless the diversity of the phonetic transcriptions is taken into account in document retrieval. In order to overcome this problem, a method has been proposed for grouping the various Korean phonetic transcriptions that express the same foreign word into an equivalence class, indexing documents by that equivalence class and automatically expanding a word query to the whole class [see: Jeong, K. S., Kwon, Y. H., and Myaeng, S. H., “The Effect of a Proper Handling of Foreign and English Words in Retrieving Korean Text”, In Proceedings of the 2nd International Workshop on Information Retrieval with Asian Languages (IRAL '97), 1997].
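The equivalence-class approach described above can be illustrated with a minimal sketch. It is not the implementation of the cited work; the function names `build_index` and `search`, the class label, and the romanized variant strings are hypothetical placeholders chosen for illustration, and the equivalence classes are assumed to be supplied in advance.

```python
from collections import defaultdict


def build_index(docs, classes):
    """Index documents by equivalence class (hypothetical helper).

    docs:    dict mapping a document id to its whitespace-separated text
    classes: dict mapping a class label to the set of variant
             transcriptions assumed to express the same foreign word
    """
    # Map every variant transcription to the label of its class.
    variant_to_class = {
        variant: label
        for label, variants in classes.items()
        for variant in variants
    }
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            # Variants are indexed under their class; other tokens under themselves.
            if token in variant_to_class:
                key = ("class", variant_to_class[token])
            else:
                key = ("term", token)
            index[key].add(doc_id)
    return index, variant_to_class


def search(term, index, variant_to_class):
    """Expand a query term to its equivalence class, then look it up."""
    if term in variant_to_class:
        key = ("class", variant_to_class[term])
    else:
        key = ("term", term)
    return sorted(index.get(key, set()))
```

With this arrangement, a query using any one variant retrieves documents that contain any other variant of the same class, which is the effect of the automatic query expansion described above.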
The creation of such a phonetic transcription equivalence class requires a method for determining whether two given phonetic transcriptions are derived from the same foreign word, namely, a method for comparing the similarity between the two phonetic transcriptions.
Such a phonetic transcription similarity comparison method is also essential to approximate search of a database of phonetic transcriptions (foreign words). For example, the similarity comparison method may be usefully applied to searching for firm names or trademarks that are words of foreign origin.
However, no method for comparing the similarity between Korean phonetic transcriptions has been developed until now, because Korean words are spelled using the same phonetic symbols as their pronunciations, so that the spelling is in clear correspondence with the pronunciation. For this reason, it is very inconvenient for the user to retrieve and manage data on the basis of phonetic transcriptions of foreign words.