1. Field of the Invention
The present invention relates generally to language translation and, more particularly, to translation-memory tools.
2. Related Art
Technology-driven industries have increasingly relied upon translation, localization and related services to bring products to the global markets. The need to quickly and efficiently create foreign language versions of products has increased dramatically as global competition increases, upgrades are developed and released more frequently, and the time in which products become obsolete decreases.
Historically, the expertise of language translators, engineers and publishers has been utilized to translate a document from a source to a target language. More recently, advances in computer software and hardware have enabled the growth of processor-based language translation tools. Traditionally, two types of language translation tools generally have been available: machine-translation tools and translation-memory tools (also referred to herein as translation memory systems).
Generally, machine-translation tools use natural language translation techniques to perform language translation; accordingly, they are also referred to as natural language translation tools. Machine-translation tools perform in-depth morphological, grammatical, syntactic and some semantic analysis of text written in a source language. The machine-translation tool then attempts to render the source text in a target language using extensive glossaries and a complex set of linguistic rules. However, despite the many types of machine-translation tools that so far have been developed, there are a number of limitations that have prevented machine-translation tools from being fully successful.
First, machine-translation systems are expensive to set up, operate and maintain. Furthermore, machine translation typically performs below publication-grade translation, even when operating under optimal conditions. As a result, machine translation has proven to be effective only when used to translate very controlled input text. However, providing such controlled input is time consuming and expensive, since it generally requires careful planning.
Translation memory tools are software programs that recycle existing translations provided by a human translator-operator. Conventional translation memory tools generally utilize well-known text search and replace methodologies to perform language translation. For each file in a group of files, referred to herein as a project, translation memory tools contain a database of text strings that are to be translated. The user-operator searches for a particular string throughout a text file and, for each occurrence, replaces the found string with a translated text. Generally, the translation memory tools are utilized when the input files include text having substantial duplication of text strings, such as in technical texts, or when upgrades are performed.
Although translation memory tools overcome the computational burdens of machine translation tools, there are a number of problems with translation memory tools that compromise their effectiveness in today's rapidly changing global markets. One such drawback to conventional translation memory tools is that typically they are completely manual; that is, the operator must provide all of the target language translations. Unfortunately, the time involved in providing such translations is extensive, making it difficult to translate a document efficiently and cost-effectively. To reduce this burden, some conventional translation memory tools provide techniques to address multiple occurrences of a given text string in the file being translated. However, these systems still require the operator to manually address each occurrence of the text string. In addition, the operator must perform the same functions in each of the files in a project.
Another drawback to conventional translation memory tools is that the integrity of the translation is dependent upon each operator entry. This drawback makes such systems sensitive to inconsistent translations provided by the same user-operator over time as well as by different translator-operators. Furthermore, this drawback often yields a translated text which is either incorrect, misleading or at least inconsistent with itself.
What is needed, therefore, is a translation memory tool that accurately translates text quickly and efficiently and is not sensitive to variations in the source language or to different operators.
To overcome these and other drawbacks of conventional language translation systems, the present invention, in one embodiment, is a propagator for a translation memory system. The propagator propagates an externally produced translation of a first occurrence of a word, phrase, or other group of characters to be translated (referred to herein as a “translatable source segment”) in a first source file to at least one occurrence of a corresponding target segment in one or more target files of a target project. The corresponding target segment corresponds with a second occurrence of the translatable source segment in any of the one or more source files. The source files may have the same file format, or two or more of the source files may have different file formats. Also, two or more source files may have character formats that are different from each other, and/or they may be written in different languages. In one implementation, the propagator propagates the externally produced translation to all corresponding target segments in the target project that correspond with the translatable source segment. Also, the propagator may propagate the externally produced translation to user-selected corresponding target segments in the target project that correspond with the translatable source segment.
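By way of illustration only, the propagation described above may be sketched as follows. The project layout (a mapping from file name to an ordered list of segments), the function name propagate, and the sample file names are assumptions made for this sketch and are not taken from the described system.

```python
from typing import Dict, List, Optional

# Assumed layout: a project maps each file name to an ordered list of segments.
SourceProject = Dict[str, List[str]]
TargetProject = Dict[str, List[Optional[str]]]


def propagate(source: SourceProject, target: TargetProject,
              segment: str, translation: str) -> int:
    """Copy `translation` into every target slot whose source segment equals `segment`.

    Returns the number of target segments that were filled.
    """
    filled = 0
    for file_name, segments in source.items():
        for index, text in enumerate(segments):
            if text == segment:
                target[file_name][index] = translation
                filled += 1
    return filled


# Two source files of different formats that share the segment "Save".
source = {
    "menu.rc": ["Open", "Save", "Close"],
    "help.html": ["Save", "Print"],
}
# Target files start out untranslated (None in every slot).
target = {name: [None] * len(segs) for name, segs in source.items()}

# A translator supplies one translation; it reaches both occurrences.
count = propagate(source, target, "Save", "Enregistrer")
```

In this sketch a single externally produced translation fills the corresponding target segment in each file, regardless of which file contained the first occurrence.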
The externally produced language translation may be generated by a human translator, or by a machine translation tool. In one embodiment, the externally produced language translation is derived from a legacy file. Also, the externally produced language translation may be derived from a corresponding target segment that corresponds with the translatable source segment.
In one embodiment, the invention is a translation memory system for translating translatable source segments in one or more source files of a source project. The translation memory system includes a propagator that propagates an externally produced translation of a first occurrence of a translatable source segment in a first source file to at least one occurrence of a corresponding target segment in one or more target files of a target project. The corresponding target segment corresponds with a second occurrence of the translatable source segment in any of the one or more source files.
In one embodiment, the invention is a translation memory system for translating translatable source segments in one or more source files of a source project and storing the translated translatable source segments in corresponding target segments of one or more target files of a target project. The translation memory system includes a propagator that associates source-target pairs in a source-target database. Each source-target pair includes a translatable source segment and a corresponding target segment. The corresponding target segment corresponds with the translatable source segment, and all translatable source segments of the associated source-target pairs are sufficiently similar to each other that they may be translated by the same externally produced translation. The translatable source segments may be sufficiently similar because they are the same, or they may be morphologically similar. In one implementation the propagator propagates the externally produced translation of the translatable source segment to each corresponding target segment that is paired with the translatable source segment in one or more of the source-target pairs.
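One possible way to associate source-target pairs whose source segments are sufficiently similar is sketched below. The similarity test used here (ignoring letter case and extra whitespace) is merely a simple stand-in chosen for illustration; it is not the morphological comparison of the described system, and the names SourceTargetPair, similarity_key and associate are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SourceTargetPair:
    source: str
    target: Optional[str] = None


def similarity_key(segment: str) -> str:
    # Assumption: segments differing only in case or whitespace share a translation.
    return " ".join(segment.split()).casefold()


def associate(pairs: List[SourceTargetPair]) -> Dict[str, List[SourceTargetPair]]:
    """Group pairs whose source segments map to the same similarity key."""
    groups: Dict[str, List[SourceTargetPair]] = defaultdict(list)
    for pair in pairs:
        groups[similarity_key(pair.source)].append(pair)
    return dict(groups)


def propagate(group: List[SourceTargetPair], translation: str) -> None:
    """Write one translation into the target of every pair in an associated group."""
    for pair in group:
        pair.target = translation


pairs = [
    SourceTargetPair("File not found"),
    SourceTargetPair("file  not found"),
    SourceTargetPair("Cancel"),
]
groups = associate(pairs)
propagate(groups["file not found"], "Fichier introuvable")
```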
In one implementation, the associated source-target pairs are associated by a pointer that points from each associated source-target pair to a page in a pair-occurrence pointer book. Each page of the book has a second set of one or more pointers that each point back to one associated source-target pair. The translatable source segments of each associated source-target pair pointed to by each of the second set of pointers of each page are sufficiently similar to each other that they may be translated by the same externally produced translation. In one aspect, each page is separately addressable by, and/or distributable to, two or more user-translators. Advantageously, each page thus may be separately provided to, and operated upon by, groups of users in differing locations using groups of computers that need not be linked.
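The two-way pointer arrangement described above may be pictured with the following minimal sketch, in which list indices play the role of pointers. The class names Pair and Page, the function build_pointer_book, and the use of exact string equality as the similarity test are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pair:
    source: str
    target: Optional[str] = None
    page: Optional[int] = None  # forward pointer: index of this pair's page in the book


@dataclass
class Page:
    # Second set of pointers: indices pointing back to the associated pairs.
    back_pointers: List[int] = field(default_factory=list)


def build_pointer_book(pairs: List[Pair]) -> List[Page]:
    """Create one page per distinct source segment and link pairs and pages both ways."""
    book: List[Page] = []
    page_of_source: Dict[str, int] = {}
    for index, pair in enumerate(pairs):
        if pair.source not in page_of_source:
            page_of_source[pair.source] = len(book)
            book.append(Page())
        pair.page = page_of_source[pair.source]
        book[pair.page].back_pointers.append(index)
    return book


pairs = [Pair("Save"), Pair("Open"), Pair("Save"), Pair("Save")]
book = build_pointer_book(pairs)
```

Because each page carries its own complete list of back pointers, a page in this sketch could be handed to a translator independently of the other pages.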
In one implementation, each source-target pair is stored in a record of the source-target database. Each record may have an indicator indicating whether an externally produced translation of the translatable source segment of the record is to be propagated to the corresponding target segment of the record. Also, each record may have a pointer to a page in a pair-occurrence pointer book, wherein each page has a second set of one or more pointers that each point back to one of the associated source-target-pair records. The translatable source segments of each associated source-target-pair record pointed to by each of the second set of pointers are sufficiently similar to each other that they may be translated by the same externally produced translation.
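A record carrying a propagation indicator might look like the following sketch. The field name propagate and the function propagate_to_records are hypothetical, and the pointer to the pair-occurrence pointer book is omitted here for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    source: str
    target: Optional[str] = None
    propagate: bool = True  # indicator: may a propagated translation overwrite target?


def propagate_to_records(records: List[Record], source: str, translation: str) -> int:
    """Fill the target of each matching record whose indicator is enabled."""
    updated = 0
    for record in records:
        if record.source == source and record.propagate:
            record.target = translation
            updated += 1
    return updated


records = [
    Record("OK"),
    Record("OK", target="D'accord", propagate=False),  # translator opted out
    Record("Cancel"),
]
updated = propagate_to_records(records, "OK", "Valider")
```

In this sketch the record whose indicator is disabled keeps its existing target segment, while the other matching record receives the propagated translation.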
In one implementation, the propagator includes a leverager that matches at least one legacy translatable source segment in a legacy file with at least one translatable source segment in the source-target database. The leverager also generates a corresponding target segment that corresponds with the translatable source segment, such that the generated corresponding target segment is a copy of a legacy corresponding target segment that corresponds with the at least one legacy translatable source segment.
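The leveraging step may be sketched as follows, under the assumptions that the legacy file has already been reduced to a mapping from legacy source segments to legacy target segments, that matching is by exact equality, and that only empty target segments are filled. The function name leverage and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Optional


def leverage(database: List[Dict[str, Optional[str]]], legacy: Dict[str, str]) -> int:
    """Copy legacy target segments into matching, still-untranslated database pairs."""
    reused = 0
    for pair in database:
        legacy_target = legacy.get(pair["source"])
        # Assumption: an existing target is never overwritten by legacy material.
        if legacy_target is not None and pair["target"] is None:
            pair["target"] = legacy_target
            reused += 1
    return reused


# Translations recovered from an earlier release of the product.
legacy = {"Open": "Ouvrir", "Close": "Fermer"}

database = [
    {"source": "Open", "target": None},
    {"source": "Print", "target": None},
    {"source": "Close", "target": "Clore"},
]
reused = leverage(database, legacy)
```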
The propagator may also include an inconsistent translation resolver and propagation flag setter that searches each source-target pair pointed to by each pointer in pages of a pair-occurrence pointer book. A conflict is identified when one corresponding target segment of a source-target pair pointed to by a first pointer of one page is different than another corresponding target segment of another source-target pair pointed to by another pointer of the same page, provided, in one implementation, that user-determined propagation indicators associated with such source target pairs are enabled. The propagation indicator indicates if an externally produced translation of the translatable source segment should be propagated to the corresponding target segment.
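Conflict detection over the pages of a pointer book may be sketched as shown below. Each page is represented simply as a list of indices into the pair list; the treatment of untranslated target segments (they are ignored rather than counted as conflicting) is an assumption of this sketch, as are the names Pair and find_conflicts.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class Pair(NamedTuple):
    source: str
    target: Optional[str]
    propagate: bool  # user-determined propagation indicator


def find_conflicts(pairs: List[Pair], book: List[List[int]]) -> Dict[int, Set[str]]:
    """Return, per page number, the differing targets found among enabled pairs."""
    conflicts: Dict[int, Set[str]] = {}
    for page_number, back_pointers in enumerate(book):
        targets = {
            pairs[i].target
            for i in back_pointers
            if pairs[i].propagate and pairs[i].target is not None
        }
        if len(targets) > 1:
            conflicts[page_number] = targets
    return conflicts


pairs = [
    Pair("Save", "Enregistrer", True),
    Pair("Save", "Sauvegarder", True),
    Pair("Open", "Ouvrir", True),
    Pair("Open", "Ouverture", False),  # indicator disabled, so not a conflict
]
book = [[0, 1], [2, 3]]
conflicts = find_conflicts(pairs, book)
```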
In one embodiment, the invention is a method for translating translatable source segments in one or more source files of a source project to generate corresponding target segments of one or more target files of a target project. The method includes: (1) associating at least one source-target pair in a source-target database (wherein each source-target pair includes a translatable source segment and a corresponding target segment, the corresponding target segment corresponds with the translatable source segment, and all translatable source segments of the associated at least one source-target pairs are sufficiently similar to each other that they are translated by a same externally produced translation); and (2) propagating an externally produced translation of the translatable source segment to each corresponding target segment that is paired with the translatable source segment in one or more of the source-target pairs. In one implementation, step (1) of the method includes: (a) generating a pointer to associate the associated source-target pairs, such that the pointer points from each associated source-target pair to a page in a pair-occurrence pointer book; and (b) generating for each page one or more of a second set of pointers that each point back to one associated source-target pair, such that the translatable source segments of each associated source-target pair pointed to by each of the second set of pointers of each page are sufficiently similar to each other that they are translated by a same externally produced translation. Each page may be separately addressable by, and/or distributable to, two or more user-translators.
In one embodiment, the invention is a computer system having at least one processor and a propagator that cooperates with the at least one processor to propagate an externally produced translation of a first occurrence of a translatable source segment in a source file to at least one occurrence of a corresponding target segment in one or more target files of a target project. The corresponding target segment corresponds with a second occurrence of the translatable source segment in any of the source files.
In one embodiment, the invention is a computer system having at least one processor and a translation memory system. The translation memory system cooperates with the processor to translate translatable source segments in one or more source files of a source project and store the translated translatable source segments in corresponding target segments of one or more target files of a target project. The translation memory system includes a propagator that associates source-target pairs in a source-target database, wherein each source-target pair includes a translatable source segment and a corresponding target segment. The corresponding target segment corresponds with the translatable source segment, and all translatable source segments of the associated source-target pairs are sufficiently similar to each other that they are translated by a same externally produced translation.
In one embodiment, the invention is storage media that contains software that, when executed on an appropriate computing system, performs a method for translating translatable source segments in one or more source files of a source project to generate corresponding target segments of one or more target files of a target project. The method includes: (1) associating at least one source-target pair in a source-target database, wherein each source-target pair includes a translatable source segment and a corresponding target segment (the corresponding target segment corresponds with the translatable source segment, and all translatable source segments of the associated source-target pairs are sufficiently similar to each other that they are translated by a same externally produced translation); and (2) propagating the externally produced translation of the translatable source segment to each corresponding target segment that is paired with the translatable source segment in one or more of the source-target pairs.
In one embodiment, the invention is a computer program product for use with an appropriate computing system. The computer program product includes a computer usable medium having embodied therein computer readable program code method steps for translating translatable source segments in one or more source files of a source project to generate corresponding target segments of one or more target files of a target project. The method steps include: (1) associating at least one source-target pair in a source-target database, wherein each source-target pair includes a translatable source segment and a corresponding target segment (the corresponding target segment corresponds with the translatable source segment, and all translatable source segments of the associated source-target pairs are sufficiently similar to each other that they are translated by a same externally produced translation); and (2) propagating the externally produced translation of the translatable source segment to each corresponding target segment that is paired with the translatable source segment in one or more of the source-target pairs.
In a further embodiment, the invention is a method for translating translatable source segments in one or more source files of a source project to generate corresponding target segments of one or more target files of a target project. The method includes: (1) associating translatable source segments extracted from at least one source file with corresponding target segments extracted from at least one target file having the same format, such association being based upon commonality of attributes of the segments; and (2) propagating an externally produced translation of a first occurrence of a translatable source segment to at least one occurrence of a corresponding target segment, wherein the corresponding target segment is associated in step (1) with the translatable source segment.
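Step (1) of this further embodiment may be illustrated by the sketch below, which pairs source and target segments that share the same attributes. The particular attributes chosen (a resource identifier and a segment kind) and the names Segment and align_by_attributes are assumptions for illustration; the attributes actually available depend on the file format.

```python
from typing import Dict, List, NamedTuple, Tuple


class Segment(NamedTuple):
    resource_id: str  # hypothetical attribute, e.g. a dialog control identifier
    kind: str         # hypothetical attribute, e.g. "button" or "label"
    text: str


def align_by_attributes(source_segments: List[Segment],
                        target_segments: List[Segment]) -> List[Tuple[Segment, Segment]]:
    """Pair each source segment with the target segment having the same attributes."""
    by_attributes: Dict[Tuple[str, str], Segment] = {
        (t.resource_id, t.kind): t for t in target_segments
    }
    aligned: List[Tuple[Segment, Segment]] = []
    for s in source_segments:
        match = by_attributes.get((s.resource_id, s.kind))
        if match is not None:
            aligned.append((s, match))
    return aligned


source_segments = [
    Segment("IDOK", "button", "OK"),
    Segment("IDCANCEL", "button", "Cancel"),
    Segment("IDC_TITLE", "label", "Settings"),
]
target_segments = [
    Segment("IDCANCEL", "button", "Annuler"),
    Segment("IDOK", "button", "Valider"),
]
aligned = align_by_attributes(source_segments, target_segments)
```

Here the association does not depend on the order of segments in either file; a source segment with no attribute match in the target file is simply left unpaired.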
Advantageously, the present invention enables the translation memory system to essentially recycle existing translations (performed by human or machine translators) in projects where substantial duplication exists and when upgrades are performed. Advantageously, the invention may operate upon any known, or to be developed, type of file, file format, or character format.