This invention relates in general to computer software, and in particular to a method and system for using machine translation with content language specification.
Machine Translation (MT), i.e., the translation of text from a first language into a second language, is a well-known but little-used technology. MT is currently provided by a number of vendors, such as Systran, L&H, Transparent Languages, and others. On-The-Fly (OTF) MT is a unique approach to MT introduced by IBM in its product WebSphere Application Server (WAS) 3.0 via the HTTP server for static HTML and CGIs. IBM OTF MT allows automatic MT to be initiated in a system by configuration, user preference, control data, or other factors, none of which require synchronous, direct human intervention. The mechanism for IBM OTF MT is based on configuration settings. In the current IBM OTF MT, three steps are necessary to configure and use it:
1) The administrator sets configuration settings in the HTTP or WAS server. This authorizes MT.
2) The administrator then defines a particular set of web sites, pages, or content as eligible for translation. This enables MT for that content.
3) A user then sets the language preference in the browser. This is a standard feature of most browsers. At this point, content to be served to this user will automatically be translated.
After these steps, content that is authorized and enabled, and that is destined for a user who has indicated a preference for a particular language, will be translated OTF. These are the factors that initiate MT. This is a unique approach to OTF MT. The Accept-Language field of the request-header for an HTTP request is set by the browser based on user language preferences. This approach provides a user-enabled yet administrator-authorized solution that is very valuable. The present invention describes a unique, alternative solution for providing OTF MT that is still administrator authorized but is now programmatically HTTP based. As defined herein, programmatic means actions that are initiated by a computer program. The user is not required to set any preference or take any action; in the present invention, the user is not involved in the MT decision.
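The three factors described above (administrator authorization, content enablement, and the browser-supplied Accept-Language value) can be pictured as a simple decision function. The following Python sketch is illustrative only and does not represent the actual IBM implementation. The function names `parse_accept_language` and `prior_otf_target`, their parameters, and the assumption that the source language of the content is already known are all hypothetical.

```python
from typing import List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language value, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a tag without a q-value defaults to q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position.
            entries.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def prior_otf_target(
    mt_authorized: bool,      # step 1: server-level authorization (assumed flag)
    content_enabled: bool,    # step 2: this content is eligible (assumed flag)
    accept_language: str,     # step 3: request header set by the browser
    source_language: str,     # assumed to be known for this content
) -> Optional[str]:
    """Return the MT target language, or None if no translation applies."""
    if not (mt_authorized and content_enabled):
        return None
    preferred = parse_accept_language(accept_language)
    if not preferred or preferred[0] == "*":
        return None
    top = preferred[0]
    # Compare primary subtags only, so "en-US" matches "en".
    if top.split("-")[0] == source_language.lower().split("-")[0]:
        return None
    return top
```

In this sketch, translation happens only when all three factors line up: for example, `prior_otf_target(True, True, "de, en;q=0.8", "en")` selects German as the target, while the same request against content that has not been enabled yields no translation.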
The present invention continues to be based on an administrator-authorized configuration, but now employs a new use for an HTTP response as the final factor in driving MT. In particular, the Content-Language field in the entity-header of the HTTP response is used. The HTTP 1.1 specification defines the purpose of the value in this field as "to identify and differentiate entities according to a user's own preferred language". Its implication is that the content body is in the language defined via the Content-Language field. For example, if the body content is appropriate only for a German-literate audience, a typical use of this information would be to execute some processing that avoids rendering the content to anyone outside that audience. This information is used in rendering decisions. It is proposed that this value be a determining factor in the MT decision as well. It is still used in the rendering decision, but if the content is not currently in the desired language, it will be translated into that language. Prior to the response being served, appropriate MT would be initiated based on this value. In addition to its current use, a new use of this field would be defined: it identifies the target language of MT as well as the language that is appropriate for rendering. Now an HTML author, or a program that dynamically creates content, can do so in only their language of choice, while the Content-Language field set during the HTTP response creation process defines the language into which the content should be dynamically translated. Known techniques for language guessing could be used to determine the source language. Normal use of the same Content-Language field would still be applicable for making target rendering decisions. This solution could be implemented in, for example, the IBM HTTP Server and/or the WebSphere Application Server, or any other system that supports HTTP.
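By way of illustration only, the response-side decision described above might be sketched as follows in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the names `apply_content_language_mt`, `guess_language`, and `translate` are hypothetical, the toy stand-ins merely take the place of a real language guesser and MT engine, and the sketch handles only a single language tag in a Content-Language header whose name is spelled exactly as shown.

```python
from typing import Callable, Dict, Tuple

Headers = Dict[str, str]


def primary_subtag(tag: str) -> str:
    """Reduce a language tag such as 'de-AT' to its primary subtag 'de'."""
    return tag.strip().lower().split("-")[0]


def apply_content_language_mt(
    headers: Headers,
    body: str,
    mt_authorized: bool,                         # administrator authorization (assumed flag)
    guess_language: Callable[[str], str],        # hypothetical language guesser
    translate: Callable[[str, str, str], str],   # hypothetical MT engine
) -> Tuple[Headers, str]:
    """Translate the body into the Content-Language value before serving."""
    target = headers.get("Content-Language", "").strip()
    # No MT without authorization, without a target, or (in this sketch)
    # when several languages are listed.
    if not mt_authorized or not target or "," in target:
        return headers, body
    source = guess_language(body)
    if primary_subtag(source) == primary_subtag(target):
        return headers, body  # already in the desired language
    return headers, translate(body, source, target)


# Toy stand-ins so the sketch runs; a real system would plug in actual components.
def toy_guess_language(text: str) -> str:
    return "de" if " und " in f" {text.lower()} " else "en"


def toy_translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"
```

With these stand-ins, a response carrying `Content-Language: de` and an English body is passed through the translation step, while a body already guessed to be German is served unchanged. The Content-Language header itself is left untouched, so it remains available for the normal rendering decision.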