With the popularization of the Internet, the amount of content available online has increased dramatically, and searching that content by keyword has become widespread. Non-English-speaking users in particular represent the fastest-growing group of new Internet users; they need to obtain information not only from sources written in their native languages but also from large collections of multilingual documents. Meanwhile, technologies for globalizing Internet applications provide a unified methodology for building multilingual Web sites that serve visitors from around the world.
Most users prefer to search the Web in their mother tongue, or find it difficult to express keywords in other languages. For example, many users in non-English-speaking countries have difficulty expressing keywords in English, the language most commonly used for Internet content. As a result, these users can find only limited or relatively localized information with the current content-matching approach. To solve this problem, translation-based approaches have been proposed. These methods use a translation engine that translates user queries into different languages and then submits the translated queries to different search engines. The drawbacks of these solutions are obvious. First, machine translation is not as accurate as human translation, and some terms are difficult to translate into a target-language form that search engines can understand. Second, translation-based solutions are difficult to scale effectively and at low cost, because every query must first be captured and translated before it is submitted. A large volume of queries therefore places a heavy load on the translation engine.
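For illustration only, the translation-based approach described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular system: `CountingTranslator`, `TOY_DICTIONARY`, `make_engine`, and `cross_language_search` are hypothetical names, the translation engine is replaced by a tiny lookup table, and the search engines are stubs. The call counter is there only to show that every query passes through the translation engine once per target language before it can be submitted.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a machine translation engine: a tiny lookup table.
TOY_DICTIONARY: Dict[tuple, Dict[str, str]] = {
    ("zh", "en"): {"天气": "weather"},
    ("zh", "fr"): {"天气": "météo"},
}


class CountingTranslator:
    """Toy translation engine that records how many requests it has served."""

    def __init__(self) -> None:
        self.calls = 0

    def translate(self, query: str, source: str, target: str) -> str:
        self.calls += 1
        table = TOY_DICTIONARY.get((source, target), {})
        # A term the engine cannot translate is passed through unchanged,
        # so the target-language search engine is unlikely to match it.
        return table.get(query, query)


def make_engine(lang: str) -> Callable[[str], List[str]]:
    """Build a stub search engine for one language (assumption: no real search)."""

    def search(query: str) -> List[str]:
        return [f"[{lang}] results for '{query}'"]

    return search


def cross_language_search(
    query: str,
    source: str,
    engines: Dict[str, Callable[[str], List[str]]],
    translator: CountingTranslator,
) -> Dict[str, List[str]]:
    """Translate the query into each engine's language, then submit it."""
    results: Dict[str, List[str]] = {}
    for lang, search in engines.items():
        if lang == source:
            translated = query  # no translation needed for the user's own language
        else:
            translated = translator.translate(query, source, lang)
        results[lang] = search(translated)
    return results


translator = CountingTranslator()
engines = {lang: make_engine(lang) for lang in ("zh", "en", "fr")}
results = cross_language_search("天气", "zh", engines, translator)
```

In this sketch a single query against three engines costs two translation requests, so the load on the translation engine grows with the number of queries multiplied by the number of target languages, which is the scaling drawback noted above.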