The invention relates to computer-assisted language translation systems and methods.
The translation of a document from one language into another often is performed by a linguist (or translator). Recently, computer systems have been used to assist linguists in translating documents. For example, some computer systems include a translation memory configured to assist in the translation of portions of a document based upon previously translated documents. A translation memory is a database that collects translations as they are performed along with the original language documents on which the collected translations are based. When one or more portions of a document being translated match portions of a previously translated document, corresponding portions of the counterpart translation language document may be used to assist a linguist in translating the document. Translation memory systems increase the translation efficiency of linguists by enabling the linguist to avoid re-translating the portions of a document that have been previously translated.
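The translation memory idea described above can be illustrated with a minimal sketch. It is not the implementation of any particular system: the class name `TranslationMemory`, its methods, the normalization rule, and the sample sentences are all assumptions made for illustration, and only exact (whitespace- and case-insensitive) segment matches are handled.

```python
class TranslationMemory:
    """Toy translation memory: maps source segments to stored translations."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _normalize(segment):
        # Assumed matching rule: collapse whitespace and ignore case.
        return " ".join(segment.split()).lower()

    def add(self, source_segment, translated_segment):
        """Record a translation as it is performed."""
        self._entries[self._normalize(source_segment)] = translated_segment

    def lookup(self, source_segment):
        """Return a previously stored translation, or None if there is no match."""
        return self._entries.get(self._normalize(source_segment))


tm = TranslationMemory()
tm.add("Press the power button.", "Appuyez sur le bouton d'alimentation.")

hit = tm.lookup("press the  power button.")   # matches after normalization
miss = tm.lookup("Open the lid.")             # never translated before
```

A linguist would be offered `hit` as a suggestion and would translate the segment behind `miss` from scratch.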
Each document to be translated contains text to be translated and formatting codes that control the way in which the text is formatted. Some prior translation memory systems separate document text from document formatting codes for matching purposes, and present only the document text fragments to linguists for translation. Such an approach, however, may result in the presentation of text fragments that lack the context that would be provided if the text were formatted properly.
The invention features systems and methods of assisting a translation of an original document from an original language into a translation language that provide enhanced opportunities to leverage previously translated documents and that provide linguists with the context needed to improve the efficiency and quality of the resulting translations. In addition, the invention features a network-based document management system that provides enhanced document and project management functionality.
In one aspect of the invention, a format structure of the original document is extracted as a tree structure of one or more nodes identifying text and formatting codes in the original document.
In another aspect of the invention, the original document is stored on a server coupled to a network, a remote user may display selected portions of the original document on a remote network terminal, and the remote user may create a translation language document on the server.
Embodiments may include one or more of the following features.
The original document format structure may be extracted by establishing parent-child relationships among formatting code nodes and text nodes, wherein a parent node identifies a formatting code that applies to document content identified by each child node subordinate to that parent node. Document content identified by all of the child nodes subordinate to a particular parent node may be simultaneously displayed.
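The parent-child extraction described above might be sketched as follows. This is an illustrative sketch only, assuming well-formed HTML-like mark-up with no void elements; the names `Node`, `TreeExtractor`, and `extract_tree` are hypothetical and do not come from the described system. It uses the Python standard library `html.parser.HTMLParser`.

```python
from html.parser import HTMLParser


class Node:
    """A tree node that identifies either a formatting code or a run of text."""

    def __init__(self, kind, value, parent=None):
        self.kind = kind          # "format" or "text"
        self.value = value        # tag name, or the text itself
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def content(self):
        """Return the text identified by all nodes subordinate to this node."""
        if self.kind == "text":
            return self.value
        return "".join(child.content() for child in self.children)


class TreeExtractor(HTMLParser):
    """Builds the tree: each formatting code becomes the parent of what it encloses."""

    def __init__(self):
        super().__init__()
        self.root = Node("format", "document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        self._current = Node("format", tag, self._current)

    def handle_endtag(self, tag):
        # Assumes properly nested tags; step back up to the parent node.
        if self._current.parent is not None:
            self._current = self._current.parent

    def handle_data(self, data):
        if data.strip():
            Node("text", data, self._current)


def extract_tree(markup):
    extractor = TreeExtractor()
    extractor.feed(markup)
    extractor.close()
    return extractor.root


tree = extract_tree("<p>Press the <b>power</b> button.</p>")
paragraph = tree.children[0]
```

Calling `paragraph.content()` gathers the text of every child subordinate to the `p` node, which corresponds to displaying that content together under its parent formatting code.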
Document content preferably is expressed in a computer-readable mark-up language.
In one embodiment, potential opportunities to leverage one or more portions of a second original-language document having a counterpart translation language document are identified to assist in translating the original document. The second original-language document preferably has an associated extracted tree structure, in which case potential leveraging opportunities are identified by identifying one or more matching portions of the tree structures extracted from the original document and the second original-language document. Potential leveraging opportunities are identified by performing a depth-first traversal through the tree structure extracted from the original document. Potential leveraging opportunities are identified by comparing document content identified by nodes of the tree structures extracted from the original document and the second original-language document. Portions of the counterpart translation language document corresponding to the one or more identified portions of the second original-language document matching corresponding portions of the original document also are identified. Identified potential leveraging opportunities are displayed.
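One way to picture the tree-based matching described above is the sketch below. It is a simplified illustration under stated assumptions, not the matching procedure of the described system: trees are represented as nested `(tag, [children])` tuples with plain strings for text nodes, only exact subtree matches are reported, and the function names `signature`, `index_subtrees`, and `find_leverage` are hypothetical. Mapping each match to the corresponding portion of the counterpart translation language document is omitted.

```python
def signature(node):
    """Return a hashable value covering a node and all content below it."""
    if isinstance(node, str):
        return ("text", node)
    tag, children = node
    return ("format", tag, tuple(signature(child) for child in children))


def index_subtrees(node, index=None):
    """Index every subtree of a previously translated original-language document."""
    if index is None:
        index = {}
    index.setdefault(signature(node), node)
    if not isinstance(node, str):
        for child in node[1]:
            index_subtrees(child, index)
    return index


def find_leverage(node, old_index, matches=None):
    """Depth-first traversal of the new document's tree.

    The largest matching subtree is reported and its children are skipped,
    so formatting context is kept together with the text it applies to.
    """
    if matches is None:
        matches = []
    if signature(node) in old_index:
        matches.append(node)
        return matches
    if not isinstance(node, str):
        for child in node[1]:
            find_leverage(child, old_index, matches)
    return matches


old_doc = ("body", [
    ("p", ["Press the ", ("b", ["power"]), " button."]),
    ("p", ["Close the lid."]),
])
new_doc = ("body", [
    ("p", ["Press the ", ("b", ["power"]), " button."]),
    ("p", ["Hold ", ("b", ["power"]), " for five seconds."]),
])

matches = find_leverage(new_doc, index_subtrees(old_doc))
```

Here the first paragraph of `new_doc` matches as a whole, while in the second paragraph only the bold `power` node matches; the remaining text would be left for the linguist to translate.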
Document content preferably is displayed in accordance with the tree structure extracted from the original document. A graphical user interface preferably is provided for simultaneously displaying on the remote network terminal user-selected portions of the original document and corresponding portions of the translation language document created by the user on the server. Potential opportunities to leverage one or more portions of a previously created translation language document preferably also are displayed on the remote network terminal. One or more authorized users may create one or more modified versions of the translation language document created on the server.
The language translation system preferably is implemented as a JAVA® computer program application.
As used herein, the term "document content" refers to all of the contents of a document, including text and formatting codes.
Among the advantages of the invention are the following.
The invention provides enhanced opportunities to leverage previously translated documents by maintaining the context provided by the formatting code nodes within a tree structure that is extracted from a document to be translated. Also, the invention provides linguists with greater context by displaying properly formatted text, thereby improving the efficiency and quality of the resulting translations. Furthermore, the invention provides enhanced document and project management capabilities by maintaining documents at a single location that is accessible by a plurality of remote users. One embodiment of the invention is implemented as a JAVA® computer program application and, therefore, users may interact with the language translation system with only a web browser and a computer network connection; a separate computer program does not have to be loaded onto a user's network terminal.
Other features and advantages will become apparent from the following description, including the drawings and the claims.