Many organisations whose trade extends abroad desire documentation in numerous languages in order to provide the greatest possible coverage in the international marketplace. Modern communication systems such as the Internet and satellite networks span almost every corner of the globe and require ever-increasing amounts of high-quality natural language translation work in order to achieve full understanding between a myriad of different cultures.
As a rule of thumb, an expert human translator can translate approximately 300 words per hour, although this figure may vary according to the difficulties encountered with a particular language pair. Coping with all the global translation needs of modern-day life through manpower alone would require a huge number of translators. Clearly some assistance for human translators is needed in order for them to keep up with constantly evolving requirements and updates for countless web pages, company brochures, government documents, and press articles, to name but a few areas of application.
With the ability to process vast amounts of information, computers naturally lend themselves to tackling the problem by way of machine translation. Various pure machine translators exist which can translate many thousands of words in a matter of seconds, but the quality of their output cannot be guaranteed. Human input can be introduced at some stage of the machine translation process to provide the desired level of translation quality. Bridging the gap between purely human and purely machine translation are machine-assisted translation methods, in which the burden is shared between a human translator and a computer; the human translator in such cases is sometimes referred to as a computational linguist.
It is estimated that only a third of current Internet users are native English speakers. By 2010 it is expected that this fraction will fall to a quarter, so the need to write with international audiences in mind is becoming increasingly important. Global organisations should provide communications which are consistent, regardless of which market they address or which language they communicate in, whether these communications are in the form of technical documentation, web pages or marketing collateral. Variations in such communications around the globe could cause confusion or mislead the public, which could lead to devaluation of brands or markets.
U.S. Pat. No. 6,047,299 describes a document composition supporting method and system for supporting document editing or translation using an electronic terminology dictionary. The dictionary is composed such that terms in standard expression and the corresponding terms in alternative spelling/expression are registered in association with each other. The terminology dictionary is used to search an inputted document for terms that match either the terms in standard expression or the terms in alternative spelling/expression registered in the dictionary. For terminological standardisation of the document, the terms in the document that match terms in alternative spelling/expression registered in the terminology dictionary are replaced by the corresponding terms in standard expression.
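By way of illustration only, the terminological standardisation described above can be sketched in Python as follows. This is a minimal sketch and not the method of the cited patent: the dictionary contents, the name `standardise` and the case-insensitive whole-term matching are all assumptions made for the example.

```python
import re

# Hypothetical terminology dictionary: each alternative spelling/expression
# (lower case) is registered against its term in standard expression.
TERMINOLOGY = {
    "e-mail": "email",
    "electronic mail": "email",
    "web site": "website",
}


def standardise(text, terminology):
    """Replace registered alternative terms in text with their standard terms."""
    if not terminology:
        return text
    # Try longer alternatives first so multi-word terms win over shorter ones.
    alternatives = sorted(terminology, key=len, reverse=True)
    # Match whole terms only, ignoring case (an assumption of this sketch).
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(a) for a in alternatives) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: terminology[m.group(1).lower()], text)
```

For example, `standardise("Send an E-mail via our web site.", TERMINOLOGY)` yields a sentence in which both alternative terms have been replaced by their standard forms. A production system would also need to handle inflected forms and capitalisation of the replacement, which this sketch ignores.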
International patent application no. WO 02/29622 A1 describes a dynamic machine editing system incorporating a dynamic rules database. The dynamic database of editing rules helps to automate the editing of already-translated documents to better reflect the nuances of language content and meaning, and especially the use of nomenclature that is culture- and/or industry-specific. An initial set of editing rules is deployed in the database and used to edit machine-translated documents. Manual changes, which are subsequently made to the machine-edited documents by a human editor, are recorded, and that data is used to form updates or additions to the initial editing rules.
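The feedback loop described above, in which manual changes are recorded and turned into new editing rules, might be sketched as follows. The cited application does not specify how rules are derived in the passage summarised here, so the word-level comparison using Python's standard `difflib` module, and the names `learn_rules` and `apply_rules`, are assumptions made purely for illustration.

```python
import difflib


def learn_rules(machine_edited, human_edited, rules):
    """Record word-level substitutions made by the human editor as rules."""
    a = machine_edited.split()
    b = human_edited.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only substitutions become rules; insertions and deletions are
        # ignored in this simplified sketch.
        if tag == "replace":
            rules[" ".join(a[i1:i2])] = " ".join(b[j1:j2])
    return rules


def apply_rules(text, rules):
    """Apply each stored rule to text (naive substring replacement)."""
    for old, new in rules.items():
        text = text.replace(old, new)
    return text
```

Starting from an empty rule set, comparing the machine-edited sentence "the lorry driver checked the bonnet" with the human-corrected "the truck driver checked the hood" records two substitutions, which can then be applied automatically to later documents. A realistic system would need context-sensitive rules and some means of resolving conflicting corrections.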
Japanese patent application no. JP 2005/107597 describes a translation support system that uses an example database to search for example sentences that match or are similar to an input sentence. A translation server which stores a number of bilingual sentence examples determines the similarity between pre-stored and input sentences based on the ratio of words in the stored sentences which match words in the input sentence. The results are then displayed to a translator for selection of a suitable pre-stored sentence whereby to assist with translation of the input sentence.
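A word-ratio similarity measure of the kind described above can be sketched as follows. This is one plausible reading of the description rather than the cited system itself: whitespace tokenisation, lower-casing, and the names `word_match_ratio` and `rank_examples` are assumptions of the example.

```python
def word_match_ratio(stored, query):
    """Fraction of words in the stored sentence that also occur in the query."""
    stored_words = stored.lower().split()
    query_words = set(query.lower().split())
    if not stored_words:
        return 0.0
    matches = sum(1 for word in stored_words if word in query_words)
    return matches / len(stored_words)


def rank_examples(examples, query, top_n=3):
    """Return the top_n (score, source, target) examples, best match first.

    examples is a list of (source_sentence, target_sentence) pairs.
    """
    scored = [(word_match_ratio(src, query), src, tgt) for src, tgt in examples]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]
```

With the input sentence "the printer is out of ink", a stored example "the printer is out of paper" scores 5/6, since five of its six words appear in the input, whereas "please restart the computer" scores 1/4. The ranked list would then be presented to the translator, who selects the most suitable stored sentence.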
Machine-assisted translation methods, such as those described in International patent application WO 2006/016171 A2 filed by the present applicants, still require considerable time on the part of the computational linguists involved. Any assistance that can be given to the computational linguists in their work is therefore desirable, as this will lead to reductions in associated overall translation costs.
There is thus a need for a quick, efficient, easy-to-use and consistent machine-assisted natural language translation system which reduces the burden on computational linguists.