1. Field of the Invention
Methods and systems consistent with the present invention relate to string matching, and more particularly, to string matching used for processing a plurality of strings that are written in different manners but share substantially the same meaning.
2. Description of the Related Art
In related art application services using metadata of multimedia files (e.g., MPEG Audio Layer-3 (MP3) files), multimedia data often needs to be classified according to information included in the metadata, for example, according to the names of artists who have produced the multimedia data or the genre of the multimedia data. Then, the classification results need to be displayed to users.
Such metadata may include expressions written in various languages, or special characters, such as spaces (‘ ’) and hyphens (‘-’). In the case of metadata produced by an ordinary user, strings which share the same meaning but are written in different languages, or are written in the same language but in different forms, may be mistaken as having different meanings.
For example, an application program for a related art MP3 player can classify music files according to the names of singers. In this case, a plurality of strings which all refer to the same singer, for example the Korean singer Lee Mija, may be mistaken as referring to different singers depending on whether they are written in Korean or in English, on how they are spelled (e.g., ‘Lee Miza’ vs. ‘Lee Mija’), on whether the words in the strings are separated by spaces (e.g., ‘Lee Mija’ vs. ‘Lee Mi Ja’), and on whether the words in the strings are hyphenated (e.g., ‘Lee Mija’ vs. ‘Lee Mi-Ja’). Such mismatches cause inconvenience to users and impose restrictions on the development of various application services.
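The mismatch described above can be illustrated with a minimal Python sketch. It is offered only as an illustration of the related art problem and is not taken from any actual MP3 player application; the track titles and the function names `group_by_exact_artist` and `strip_separators` are hypothetical. The sketch classifies tracks by comparing artist strings exactly, and then shows that merely removing spaces and hyphens does not resolve spelling variants.

```python
from collections import defaultdict

def group_by_exact_artist(tracks):
    """Classify tracks by artist using exact string comparison (hypothetical helper)."""
    groups = defaultdict(list)
    for title, artist in tracks:
        groups[artist].append(title)
    return dict(groups)

def strip_separators(name):
    """Remove spaces and hyphens and fold case (a naive, assumed normalization)."""
    return name.replace(" ", "").replace("-", "").casefold()

# Hypothetical metadata: four tracks by the same singer, written four ways.
tracks = [
    ("Track A", "Lee Mija"),
    ("Track B", "Lee Mi Ja"),
    ("Track C", "Lee Mi-Ja"),
    ("Track D", "Lee Miza"),
]

# Exact matching treats every written form as a different singer.
exact_groups = group_by_exact_artist(tracks)
print(len(exact_groups))  # 4

# Stripping separators merges the spacing and hyphenation variants,
# but the spelling variant 'Lee Miza' still forms its own key.
normalized_keys = {strip_separators(artist) for _, artist in tracks}
print(sorted(normalized_keys))  # ['leemija', 'leemiza']
```

In this sketch, a name written in Korean script would likewise yield a separate key, since neither exact comparison nor separator removal relates strings across languages.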