1. Field of the Invention
The present invention relates to a system and process for improving the productivity and efficiency of human translators of natural languages.
2. Statement of the Problem
Human translators of literary and technical texts increasingly are presented with source-language documents in electronic form for translation into a target language.
At present, human translators principally process such texts in one of three ways: printing the text and translating from the paper copy using a word processor on a computer; using cut-and-paste methods to insert the translation in place of the source-language text on the display device of the computer, the newly translated text then being saved to disk and subsequently printed or otherwise processed; or using machine-assisted translation programs referred to as translation memories.
Producing the target-language version from a printed source-language version is time-consuming, since the translator must divide his or her attention between the paper and the screen, perform possibly extensive reformatting, scroll back and forth in the document on the screen to move forward or to edit previous mistakes, shuffle the various pages of the printed copy as text is processed, and repeatedly determine where on the page the next segment of text to be translated is located, with the consequent possibility of skipping text.
Using translation memories to aid in the production of the target-language version requires the purchase of additional software and the learning of new work habits, and often results in very slow response because of the complexity of the translation memory files.
However, the process of translation by human translators, when working in a subject area with which they are familiar, essentially becomes a word-processing task, since the amount of research and lookup is limited. Translation of highly technical material is done by human translators for documents like patents, research papers, technical specifications, contracts, bid documents, legislation, and the like, where the experience and education of the translator are unmatched.
Furthermore, translators with access to always-on, high-speed Internet connections now have the ability, using one or more very fast search tools, to use the Internet as a knowledge base or as free-form on-line content. This content essentially comprises all sites in the source language in question, together with sites in both the source and target languages in question, such as on-line glossaries. An unknown word or phrase can thus be looked up on the Internet more quickly than it can be looked up in a printed dictionary, which places even greater emphasis on the translator's ability to process words and deal with the other clerical tasks involved.
Moreover, the fairly recent inclusion in word processors of a high-level programming language that operates within the context of the word processor, and on the files created by such word processor, has made the present invention possible. The present invention allows human translators to focus solely and exclusively on the translation of the text, without having to move papers, keep track of place, run other programs, perform formatting, or remember operating instructions. It is not necessary to add another program to do this, since the process operates within the context of the word processor itself, although other embodiments are possible.
Experienced human translators develop pattern-matching abilities that are used extensively in the translation process. Moreover, most state-of-the-art word processors in which the various documents that move across international borders would be composed include a shorthand feature that translators use as a built-in, high-speed translation memory, whereby the translator mentally associates the text on the screen or page in the source language with its corresponding translation, and then makes the further association from the translation to an abbreviated, shorthand version of the corresponding target text that has been previously stored and which abbreviated version can be retrieved using a keyboard shortcut or other command. Words, phrases, sentences, and paragraphs can be recalled and inserted into the text almost instantly.
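The shorthand feature described above can be pictured as a lookup table from abbreviations to previously stored target-language text. The following is a minimal sketch only, not the feature of any particular word processor; the table contents and the names SHORTHAND and expand_shorthand are hypothetical and chosen purely for illustration.

```python
from typing import Mapping

# Hypothetical shorthand table: abbreviation -> previously stored target text.
SHORTHAND = {
    "pi": "the present invention",
    "wrt": "with respect to",
}


def expand_shorthand(typed: str, table: Mapping[str, str]) -> str:
    """Replace each whitespace-delimited token found in the table
    with its stored expansion; leave all other tokens unchanged."""
    return " ".join(table.get(token, token) for token in typed.split())
```

In this sketch the translator would type the abbreviation and the stored phrase would be substituted in its place, which is one simple way the recall of words, phrases, sentences, or paragraphs could be modeled.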
The present invention leaves the translator free to translate and takes over the task of presenting units of source text for translation, allowing insertion of the corresponding translation, and moving through the document.
The problem solved by the present invention is the problem presented in having to move papers, mark place, select text, remove text, possibly reformat text, and scroll back and forth in a document. The present invention obviates the need to work from paper when the source-language text is available in electronic form; performs the functions of selection and removal of text; overcomes the need to possibly reformat text; automatically scrolls through a document; helps, using comparison of source- and target-language texts, to prevent the translator from skipping text during the translation process; and reduces the physically stressful use of a mouse to a minimum. The process uses little or no overhead in computing resources.
Translation prior art primarily focuses on machine translation or machine-assisted translation. Machine translation has the drawback that, when the source-language text has not been produced according to the rules of a constrained grammar, it is often very difficult to prove, using automated means and methods exclusively, that the translation of such source-language text into a target language is correct; such proof could require the skills of one or more human translators. Translation memories require additional expenditures and often produce files of such size as to cause system delays.
A prior art approach is to utilize machine-translation processes that attempt, using statistical processes, databases of examples, knowledge bases, tree structures, and other processes, to directly translate text in a source language into a target language, regardless of the complexity of the source text. Systems such as this are of limited usefulness to an experienced translator. For an experienced translator, using a machine-translation system shifts the focus of the translator's work from translating to proofreading, i.e. making sure that the translation produced by the system captures the thought, intent, symbolism, etc., embodied in the source text.
Other prior art approaches utilize interactive machine-translation approaches that attempt, using statistical processes, databases of examples, knowledge bases, tree structures, and other processes, to directly translate text in a source language into a target language, regardless of the complexity of the source text, and that allow a user to edit the translations produced by the machine. Systems using this approach are of limited use to experienced translators since they require the purchase of additional software, the learning of sometimes complex procedures, and delays while the system attempts to process sentences or text that an experienced translator would handle without hesitation.
Certain prior art systems utilize network-based machine-translation approaches that provide user-to-user translations of a plurality of source languages into a plurality of target languages for users of a network. These approaches are used to connect one user of a network directly to another user of a network, through a translation process available to the users, when the users write or speak in different languages. Human translators are therefore bypassed completely, and these approaches are of very limited usefulness to an experienced human translator in a production environment.
Still other prior art approaches utilize machine-translation arrangements that attempt to handle specific cases that are difficult for more broadly focused machine-translation systems to deal with, such as formatting information that the system finds confusing, unrecognized words such as proper nouns, or polysemous words. Systems such as these are helpful to an experienced human translator insofar as the translator uses a machine-translation system, as discussed previously.
Machine-translation approaches applied to speech are also known in the art. These approaches use machine-translation methods to provide speaker-to-speaker translations using various combinations of voice recognition, machine translation, speech synthesis, and addressing methods for routing the spoken signals and the corresponding translations thereof to and from the various users of a speech communications system. Systems such as these are of limited usefulness to human translators in a production environment.
Specialized machine-translation devices that use machine-translation or translation-memory approaches in the context of a device, such as a handheld device or hearing aid, exist in the art. Given the very limited vocabulary of these devices and the fact that they are in many cases intended for use by people without knowledge of a given target language, they are of little use to an experienced human translator.
Interlingua approaches also exist in the art. These approaches attempt to provide a complete translation from source to target using computing means and processes with limited user interaction, relying on preparation of the source text before translation to constrain it to a subset of source-language constructs, and then on representation of the source text in a symbolic, intermediate language that can be more easily translated by machine into a plurality of target languages. Since these approaches essentially seek to be a replacement for the human translator and require prohibitively expensive hardware and software, they are of little or no use to the individual translator.
Translation-memory approaches that use memory to store frequently used words and phrases, which can then be recalled by the user for insertion into documents being translated, are known in the art. Such approaches are useful to human translators in a production environment but require additional software and learning.
Translation-memory schemes also exist in the prior art that use memory to store source text and its translation, to be recalled in specific cases such as signage, announcements in public places, localization of computer software, and administration of distributed networks. These approaches are not useful to human translators working in a production environment.
Translation-memory approaches applied to speech that use source text and associated translations stored in memory for decomposing and translating speech input and then synthesizing speech output in a target language are also known in the art. These approaches are not useful to human translators working in a production environment.
Translation-memory devices that are used to display the translation into a target language of words and phrases in a source language stored in a memory are known in the art. These devices are generally too narrow in coverage to be useful to professional translators working in a production environment.
The present invention overcomes the shortcomings of the prior art. The present invention takes an approach different from the approach taken in the prior art, in that the present invention considers a word-processing document file as being little different from a loosely structured database file, by considering each paragraph in the document as a record, and then processing the file record by record (paragraph by paragraph), whereby the invention performs the tasks of retrieving records and presenting each record for processing until all records have been processed. The present invention eliminates the need to handle pages of source-language text during the translation process, helps the translator to avoid skipping text by comparing source- and target-language texts, and moves paragraph by paragraph through the document. The present invention removes the burden of having to mark place in a document, performs at a much greater speed than any human translator could perform the same processes, and reduces the use of the mouse or keyboard to navigate through a document and the attendant physical stress to the user.
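The record-by-record treatment described above can be illustrated with a short sketch. It is offered only as an illustration under stated assumptions, not as the implementation of the invention: paragraphs are assumed to be separated by newline characters, the translator's input is stood in for by a caller-supplied function, and the skip check (an empty target, or a target identical to its source) is an assumed stand-in for the comparison of source- and target-language texts, whose details are not given here. The names split_records and process_records are hypothetical.

```python
from typing import Callable, List, Tuple


def split_records(document: str) -> List[str]:
    """Treat each non-empty paragraph (newline-delimited) as one record."""
    return [p for p in document.split("\n") if p.strip()]


def process_records(
    document: str, translate: Callable[[str], str]
) -> Tuple[str, List[int]]:
    """Present each record in turn, collect its translation, and flag
    records that look skipped (assumed check: empty or unchanged text)."""
    translated: List[str] = []
    flagged: List[int] = []
    for index, source in enumerate(split_records(document)):
        target = translate(source)  # stands in for the human translator
        if not target.strip() or target.strip() == source.strip():
            flagged.append(index)  # possible skipped paragraph
        translated.append(target)
    return "\n".join(translated), flagged
```

Under these assumptions, calling process_records with a document and a function that returns each paragraph's translation yields the assembled target text together with the positions of any paragraphs that appear to have been left untranslated.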