1. Field of the Invention
The present invention relates to a technique for enabling improved document translation for documents available for downloading via the Internet or World Wide Web (WWW). More particularly, the present invention provides a method, computer readable code and system for providing an improved machine translation process for documents on the WWW which permit superior translations to be produced.
2. Description of the Related Art
One problem that is slowing the expansion and acceptance of the World Wide Web (WWW), and keeping it from actually becoming “world wide,” is the problem of language. Generally, companies, individuals, governments or organizations that make documents available via the WWW make them available in a single language.
A number of language translation service companies, such as Transparent Language Inc. and Globalink Corporation, now provide Web translation software for Web users' personal Web browsers. Transparent Language Corporation provides a tool called Easy Translator which uses dynamic data exchange (DDE) to communicate with the browser. The Globalink solution provides an add-on software component for a Web browser.
Search engines also now provide translation support for documents retrieved in a search. For example, the AltaVista search engine utilizes technology from Systran Software, Inc. to provide users with the option to provide translated versions of retrieved documents. AltaVista is a trademark of Compaq Computer Corporation. Additionally, AltaVista provides a website at http://babelfish.altavista.com/cgi-bin/translate? at which a user may enter an address for a document on the WWW, indicate the original language of the document and a desired language into which the user wants to translate the document. The translation is done remotely at the server, so the user does not need to have any translation software loaded on his or her computer.
The computer-based automated translation technique utilized by the software and systems described above is known as “machine translation.” A background description of machine translation can be found at http://www.systransoft.com/FAQs.html. Machine translation software translates one natural language into another natural language. Machine translation takes into account the grammatical structure of each language and uses rules to transfer the grammatical structure of the source language into the target language.
However, given the complexities involved in languages, machine translation tends to be only about 30% to maybe 65% accurate. Many phrases and colloquial terms do not translate easily. Attempts to translate the names of towns, cities, places, etc. are made when they shouldn't be translated. Rules which are hard-coded for certain grammatical features may always be applied, even though many exceptions to the rules exist, since writing code for all the exceptions would be a prolonged task and would make the translation process quite slow. So a document translated by current machine translation techniques may or may not even be understandable to a user; worse yet, some important elements of the document may be translated incorrectly. Machine translation cannot replace a human translator, nor is it intended to. Therefore, many of the companies that provide machine translation software also provide old-fashioned human translation services to provide documents that are translated with a high degree of accuracy.
The accuracy problem with machine translation can be explained by a simple example. Using presently available machine translation, if a user were to translate a sentence from English to French, a certain degree of inaccuracy would be involved. In translating the sentence back to English using machine translation, the original translation inaccuracy is amplified, and the sentence will in most instances be different from the original English sentence.
Accordingly, a need exists for a technique which provides accurately translated WWW documents which are available for downloading.
Accordingly, an object of the present invention is to provide translated WWW documents having a high degree of accuracy.
Yet another object of the present invention is to provide a technique for combining machine translation with incrementally-enabled human translation.
Another object of the invention is to enhance server performance and minimize network utilization by providing a caching technique for translated WWW documents.
Still another object of the invention is to permit the owner of the document to control the translation and to prevent translations of old versions of a document from being sent to a document requester at a client.
These and other features, aspects, and advantages of the present invention will become better understood with reference to the following description, appended claims, and accompanying drawings, in which like numbers refer to like elements throughout.
Other objects and advantages of the present invention will be set forth in part in the description and the drawings which follow, and, in part, will be obvious from the description or may be learned by practice of the invention.
To achieve the foregoing objects, and in accordance with the purpose of the invention as broadly described herein, the present invention provides computer readable code stored on media for use with a server linked to a network and which provides documents to requesters at clients, the code responding to client requests to provide a document in a language other than an original language of the document, the code comprising first subprocesses, responsive to a request from a client for a document in a language other than an original language of the document, for determining if at least one translated version of the document is available for transmitting to the client; second subprocesses for, if the first subprocesses determine that at least one translated version is available, determining which of the translated versions is newest and whether the newest translated version is newer than the document, and if so, transmitting the newest translated version to the client; and third subprocesses for causing a machine translation version of the document to be created if the first subprocesses determine that no translated version is available or if the second subprocesses determine that the newest translated version is not newer than the document, and transmitting the machine translation to the client. The first subprocesses further determine if a machine translation version of the document or a perfected translation version of the document is available. The third subprocesses further cause the created machine translation version to be named in accordance with a naming convention and stored.
The present invention also provides, in a client-server environment, a system for providing a translated document to a client which requests a document from a server in a language other than its original language, comprising means for determining that a document has been requested by a client from a server in a language other than its original language; means for determining if a machine translation version of the document is available; means for determining if an edited translated version of the document is available; means for determining which of the machine translation version and the edited translation version is most recent and whether the most recent version is newer than the document; means for transmitting the newest version to the client if the most recent version is newer than the document; and means for creating a new machine translation version of the document and sending it to the client if the most recent version is not newer than the document or if no machine translation version and no edited translation version is found to be available. The system may further comprise means for creating, in response to the request for the document in a language other than its original language, filenames of the document that would correspond to the machine translation version of the document and the edited translation version of the document in the requested language, wherein the means for determining whether the machine translation version is available and the means for determining whether the edited translation version is available determine whether copies of the created filenames are available. 
Further, the system may comprise means for naming the newly created machine translation version of the document with a filename that corresponds to a predetermined naming convention that identifies the language of the newly created machine translation version and that the newly created machine translation version is a machine translation version of the document, and saving the newly created machine translation document for future use. Also, the means for determining which of the machine translation version and the edited translation version is most recent and whether the most recent version is newer than the document does so by checking a timestamp associated with each version and the document.
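The naming convention itself is not spelled out in this summary. As a minimal sketch, assuming a hypothetical convention in which a language code and a version tag (`mt` for a machine translation, `ed` for an edited translation) are inserted before the file extension, the candidate filenames for a requested language could be generated as follows. The function name, the tags and the language codes are illustrative assumptions only, not the convention of the invention.

```python
from pathlib import PurePosixPath

# Hypothetical tags: the text only requires that the filename identify the
# language and whether the file is a machine translation.
MACHINE_TAG = "mt"
EDITED_TAG = "ed"


def candidate_names(document: str, language: str) -> dict[str, str]:
    """Build the filenames that translated versions of `document` would carry."""
    path = PurePosixPath(document)
    names = {}
    for kind, tag in (("machine", MACHINE_TAG), ("edited", EDITED_TAG)):
        # e.g. docs/index.html -> docs/index.fr.mt.html
        new_name = f"{path.stem}.{language}.{tag}{path.suffix}"
        names[kind] = str(path.with_name(new_name))
    return names
```

Under these assumptions, a request for `docs/index.html` in French would cause the server to look for `docs/index.fr.mt.html` and `docs/index.fr.ed.html`.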
The present invention also provides a method in a client-server environment for delivering a document written in a first language from a server to a client in a requested second language, the method comprising the steps of (a) in response to a request from a client for a document written in a first language to be sent in a second language, determining if a second language version of the document is available for sending; (b) if more than one second language version of the document is found to be available, determining which of the versions has a most recent timestamp and selecting the most recent version; (c) if only one second language version was found in step (a) or for the selected second language version, determining whether the second language version has a timestamp which is more recent than a timestamp for the document; (d) if the second language version is found to have a more recent timestamp in step (c), sending the second language version to the client; and (e) if no second language version was found in step (a) or if the timestamp for the document is more recent than the timestamp of the second language version, creating a machine translation version of the document and sending the machine translation version to the client.
Step (e) may further comprise saving the machine translation version for consideration with respect to future client requests for the document in the second language, and step (a) may further comprise determining whether a machine translation version of the document is available in the second language and whether a perfected version of the document is available in the second language, and wherein if both are available, step (b) further comprises determining which has the most recent timestamp.
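Steps (a) through (e) can be sketched as follows. This is a minimal illustration and not the claimed implementation: it assumes that the timestamps are file modification times, that the paths of the stored machine translation and perfected versions are already known, and that `translate` is a hypothetical callable standing in for a machine translation engine.

```python
import os
from typing import Callable, Optional


def newest_existing(paths: list[str]) -> Optional[str]:
    """Steps (a)-(b): return the most recently modified existing path, if any."""
    existing = [p for p in paths if os.path.exists(p)]
    if not existing:
        return None
    return max(existing, key=os.path.getmtime)


def deliver(document: str, machine_path: str, edited_path: str,
            translate: Callable[[str], str]) -> str:
    """Return the text of the second language version to send to the client."""
    best = newest_existing([machine_path, edited_path])
    # Steps (c)-(d): reuse a stored version only if it is newer than the document.
    if best is not None and os.path.getmtime(best) > os.path.getmtime(document):
        with open(best, encoding="utf-8") as f:
            return f.read()
    # Step (e): otherwise machine-translate, save for future requests, and send.
    with open(document, encoding="utf-8") as f:
        translated = translate(f.read())
    with open(machine_path, "w", encoding="utf-8") as f:
        f.write(translated)
    return translated
```

In this sketch the saved machine translation acts as a cache: later requests are served from it until the original document is modified, at which point the stale translation is regenerated. A perfected version that is newer than both the document and the machine translation is served in preference to either.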
The present invention will now be described with reference to the following drawings, in which like reference numbers denote the same element throughout.