Foreign name matching is an important practical problem in information retrieval and integration. Names transliterated or translated from foreign languages often exhibit a large number of orthographic variations. Integrating data sources that contain foreign names, or searching for a foreign name, may therefore require intelligent name matching: the process of determining whether different names are likely to correspond to the same entity.
There has been prior work on approximate string matching and searching algorithms, most of which addresses the use of edit distance in searching for approximate names. There has also been work on record linkage. Further, adaptive work on merging names and database records attempts to learn a probabilistic edit distance with affine gaps for name matching. However, edit distance in these approaches is defined in terms of single characters, which is unlikely to work well in general cross-cultural name matching.
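To make the single-character limitation concrete, the following is a minimal sketch of the classic unit-cost (Levenshtein) edit distance. It is not the learned, affine-gap variant mentioned above. The function name `edit_distance` and the example name pairs are illustrative assumptions, not taken from the prior work.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance between two strings.

    Each edit operation (insert, delete, substitute) acts on exactly
    one character and costs 1. Illustrative sketch only.
    """
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            substitution_cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,                      # delete ca
                curr[j - 1] + 1,                  # insert cb
                prev[j - 1] + substitution_cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


# Hypothetical spelling variants of the same names.
print(edit_distance("Mohammed", "Muhammad"))  # 2
print(edit_distance("Xiao", "Hsiao"))         # 2
```

Because every operation touches a single character, a systematic respelling such as "X" versus "Hs" is charged as two unrelated edits, and on a short name that is a large fraction of its length. This suggests why a purely character-level distance may separate variants that a reader familiar with both romanization conventions would treat as equivalent.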