The present invention is directed to a system and method for at least partially translating data and facilitating the completion of the translation process. More in particular, the present invention is directed to a system and method for translating data which includes a source of data to be translated, a network connected to the source of data, a translation source connected to the network, and a portal system connected to the network for retrieving the data to be translated, and at least partially translating that data.
The system and method translates data by combining translation memory and machine translation, and in particular example based machine translation (EBMT).
Still further, the system and method stores source language sentences and target language sentences in the translation memory regardless of whether the sentences are matched to corresponding sentences in the other language.
Currently, there exist individual translation memory tools for use on a translator's workstation. Such translation aids analyze documents on a word-by-word basis, treating each new document as a totally new project. Such systems suffer from reduced efficiency in that they fail to take into account redundancies found in a document and similarities of one document to a previously translated document, and they provide no means to enable team collaboration or to involve the customer in the translation process.
There is therefore a need to provide a centralized translation database from which prior translations can be utilized to at least partially translate new documents to be translated. There is further a need to involve the translation customer in an iterative process, with intermediate approvals of translation work being obtained as a translation project progresses.
In an era where businesses are able to take advantage of a worldwide marketplace utilizing a global computer network (the Internet), it is important that such businesses are able to easily solicit business in multiple languages. Therefore, many businesses desire to have their Web pages translated into multiple languages, so that they are able to solicit business in many different markets. A system that can upload a Web page and duplicate it in multiple languages is therefore highly desirable. Further, as much of the language of any one Web page is similar to that of other Web pages, it is further desirable to make use of the translations of previously translated Web pages to aid in the translation of other Web pages. By such an arrangement, the present invention reduces the workload of translators, whether the material being translated is a Web page or another document.
Further, it would be advantageous if a system and method could be devised that did not rely on a translation memory containing pairs of sentences in both source and target languages. It would also be advantageous to have a translation system and method that minimized the reliance on a human translator to correct a translation generated by machine translation (MT).
While reducing the workload of translators by making use of translations of previously translated documents and materials is advantageous, it is desirable to further reduce the workload of translators by implementing a system whereby a machine generated translation, in a target language, of a source sentence is compared to a database of human generated target sentences. In this manner, if a matching human generated target sentence is found, the human generated target sentence can be used instead of the machine generated sentence, since the human generated target sentence is more likely to be a well-formed sentence than the machine generated sentence.
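The comparison described above can be illustrated with a minimal sketch. It is not taken from the specification: the function names, the use of a character-level similarity ratio from Python's standard `difflib` module, and the 0.8 threshold are all assumptions made for illustration only. Any other similarity measure could be used in their place.

```python
from difflib import SequenceMatcher


def best_human_match(mt_sentence, human_targets, threshold=0.8):
    """Return the stored human sentence most similar to the MT output,
    or None if no candidate reaches the (assumed) similarity threshold."""
    best, best_score = None, 0.0
    for candidate in human_targets:
        # Character-level similarity in [0, 1]; case is ignored.
        score = SequenceMatcher(None, mt_sentence.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def choose_translation(mt_sentence, human_targets, threshold=0.8):
    """Prefer a human generated target sentence when one is close enough;
    otherwise fall back to the machine generated sentence."""
    match = best_human_match(mt_sentence, human_targets, threshold)
    return match if match is not None else mt_sentence


if __name__ == "__main__":
    # Hypothetical database of human generated target sentences.
    human_targets = ["The cat sits on the mat.", "Please close the door."]
    print(choose_translation("The cat sit on mat.", human_targets))
    print(choose_translation("Quantum flux capacitor engaged.", human_targets))
```

In this sketch the first call is expected to return the stored human sentence, because the machine output is a near match, while the second falls back to the machine output, because nothing in the database is similar enough.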
Example based machine translation (EBMT) is a more language independent approach than machine translation. Example based machine translation works on units of data smaller than the sentences utilized in machine translation. Example based machine translation uses a bilingual corpus to align not only sentences, but also phrases or even words from source language to target language. If a target sentence matching a source sentence is not found, a target sentence might be built from phrases that have already been translated in different sentences stored in the translation memory. While in a well-defined domain example based machine translation can retrieve correct terms and phrases, it has trouble generating well-formed sentences.
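The idea of building a target sentence from previously translated phrases can be sketched as follows. This is an illustrative simplification, not the method of the specification: the phrase table, the greedy longest-match strategy, and the angle-bracket marker for untranslated words are all hypothetical choices.

```python
def assemble_from_phrases(source_tokens, phrase_table):
    """Greedily cover the source tokens with the longest known phrases
    and concatenate their stored translations.

    phrase_table maps a tuple of source tokens to a target phrase
    (a hypothetical stand-in for aligned phrases in a translation memory).
    """
    output = []
    i = 0
    while i < len(source_tokens):
        # Try the longest span starting at position i first.
        for j in range(len(source_tokens), i, -1):
            key = tuple(source_tokens[i:j])
            if key in phrase_table:
                output.append(phrase_table[key])
                i = j
                break
        else:
            # No stored phrase covers this token; mark it as untranslated.
            output.append("<" + source_tokens[i] + ">")
            i += 1
    return " ".join(output)


if __name__ == "__main__":
    # Hypothetical aligned phrases drawn from earlier translations.
    phrase_table = {
        ("guten", "morgen"): "good morning",
        ("herr",): "mr",
    }
    print(assemble_from_phrases("guten morgen herr schmidt".split(), phrase_table))
    # prints: good morning mr <schmidt>
```

Because the translated phrases are simply concatenated in source order, the result can contain correct terms and still fail to be a well-formed target sentence, which is the weakness noted above.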