1. Field of the Invention
Methods and systems consistent with the present invention relate to string matching and, more particularly, to string matching used for processing a plurality of strings that are written in different manners but share the same meaning.
2. Description of the Related Art
In application services that use metadata of multimedia files (e.g., MPEG Audio Layer-3 (MP3) files), multimedia data often needs to be classified according to information included in the metadata, for example, the name of the artist who produced the multimedia data or the genre of the multimedia data, and the classification results then need to be displayed to users.
Such metadata may include expressions written in various languages and/or special characters, such as spaces (‘ ’) and hyphens (‘-’). In the case of metadata produced by an ordinary user, strings which share the same meaning but are written in different languages and/or in different forms may be mistakenly treated as having different meanings.
For example, in an application program for MP3 players, music files can be classified according to the names of singers. In this case, a plurality of strings all of which refer to the same singer, for example, the Korean singer Lee Mija, may be mistaken as referring to different singers depending on various factors. These factors include whether the strings are written in Korean or English, how they are spelled (e.g., ‘Lee Miza’ vs. ‘Lee Mija’), whether words in the strings are each separated by a space (e.g., ‘Lee Mija’ vs. ‘Lee Mi Ja’), and whether words in the strings are hyphenated (e.g., ‘Lee Mija’ vs. ‘Lee Mi-Ja’). As a result, such misclassification causes inconvenience and imposes restrictions on the development of various application services.
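By way of illustration only, the problem described above can be sketched in Python. The helper name `strip_separators` and the sample list are hypothetical and are not taken from any particular application. The sketch shows that grouping by exact string comparison treats every variant as a different singer, and that merely removing spaces and hyphens merges some variants but still leaves the spelling variant separate. It does not address strings written in different languages.

```python
import re

# Hypothetical artist-name strings as they might appear in user-produced metadata.
variants = ["Lee Mija", "Lee Mi Ja", "Lee Mi-Ja", "Lee Miza"]


def strip_separators(name: str) -> str:
    """Hypothetical helper: lowercase the name and drop spaces and hyphens."""
    return re.sub(r"[\s\-]+", "", name).lower()


# Grouping by exact string comparison: each variant forms its own group.
exact_groups = set(variants)

# Grouping after removing spaces and hyphens: the spacing and hyphenation
# variants collapse into one key, but the spelling variant does not.
stripped_groups = {strip_separators(v) for v in variants}

print(len(exact_groups))
print(sorted(stripped_groups))
```

In this sketch, separator removal alone cannot recognize ‘Lee Miza’ and ‘Lee Mija’ as the same singer, which is the kind of mismatch the related art leaves unresolved.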