1. Technical Field
The present invention generally relates to tools for developing software for international use and in particular to multi-language software development. Still more particularly, the present invention relates to a system for testing language translatability in computer software.
2. Description of the Related Art
As computers have become more prevalent, software developers have increasingly sought to market their products to people who do not speak the developers' native language. In particular, it is desirable that software developed in English be available to persons, both in the United States and in the rest of the world, who do not speak English. Accordingly, many software applications that are developed in English are later translated for use by non-English speakers.
The process of translating a software package into another (or more than one other) language is time-consuming and expensive. Each text message, menu, and button must be translated to allow the user to operate the program. The most direct way to do this is to search the entire program source code for every text string, i.e., every string of characters that would be displayed to the user, and translate each of these to the new language.
This approach has several problems. One is that the software must be specifically translated and compiled for each intended language. This is an expensive process in itself, and it means that any change in the source code requires each language version of the code to be edited and recompiled.
One solution to this problem is the use of separate localization files, in which the text strings that are to be displayed are stored separately from the executable code itself. As the software is executed, the text for every given display screen is simply read from the localization files, in whichever language is stored in the file. In this manner, the text in the localization file can be translated without disturbing the executable, and the executable can be changed or replaced without disturbing the translated text (except, of course, that if the text to be displayed changes, the corresponding entries in the localization files must also be changed). The localization files may be in any number of formats, including compiled message catalogs, Java resource files, HTML bundles, and many others.
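The separation of display text from executable code described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the JSON file format, the key names, the sample German strings, and the helper names `load_catalog` and `get_text` are all hypothetical and are not taken from any of the localization formats listed above.

```python
import json
import os
import tempfile


def load_catalog(path):
    """Read a localization file (hypothetical JSON format) into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def get_text(catalog, key):
    """Return the display string for key, or a visible marker if it is missing."""
    return catalog.get(key, f"??{key}??")


# Demonstration: write two tiny catalogs, then read the same keys from each.
# The program logic never changes; only the file that is loaded differs.
tmpdir = tempfile.mkdtemp()
catalogs = {
    "en": {"file.open": "Open", "file.save": "Save"},
    "de": {"file.open": "Öffnen", "file.save": "Speichern"},
}
for lang, strings in catalogs.items():
    with open(os.path.join(tmpdir, f"{lang}.json"), "w", encoding="utf-8") as f:
        json.dump(strings, f, ensure_ascii=False)

english = load_catalog(os.path.join(tmpdir, "en.json"))
german = load_catalog(os.path.join(tmpdir, "de.json"))
print(get_text(english, "file.open"))
print(get_text(german, "file.open"))
print(get_text(german, "file.quit"))  # key absent from the file: marker is shown
```

Returning a visible marker for a missing key, rather than failing silently, is one possible design choice; it makes text that is missing from a localization file easy to spot on screen.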
However the translation is handled, each screen of the program in operation must then be proofread to ensure that the translated text properly fits the display in place of the original text. Because different languages require different numbers of letters and spaces to express corresponding ideas, it is possible that the translated text will be truncated or misaligned when put in place of the original text. The programmer, who probably speaks only her native language, would be unable to reliably proofread the translated display to ensure that the translated results are displayed properly. Therefore, it has become common practice to hire individuals with backgrounds in other languages to proofread each screen of the translated program, in each language, to be sure that the translated text isn't truncated, missing, or otherwise misformatted. These errors, of course, would not be readily apparent to one who does not speak that language.
In fact, at the time the programmer is testing the software, translations are typically unavailable. The translations are usually done much later in the software development process, and the software programmer is unable, using conventional tools, to determine if the software being developed will be able to properly handle the language translations at all.
The International Business Machines Corporation has published guidelines for software design that take into account the typical amount of “extra” space needed to display the translation of an English word or phrase of a given length; see IBM National Language Design Guide: Designing Internationalized Products (IBM, 4th Ed. 1996), which is hereby incorporated by reference. By following these guidelines, programmers are generally able to design screen displays with sufficient extra display space so that when another language is used (preferably by reading entries in a localization file), the text will display correctly.
Even using these guidelines, it would be desirable to provide a system to allow a programmer to examine each screen for possible internationalization problems without requiring the participation of those fluent in the foreign languages.
It is therefore one object of the present invention to provide an improved tool for developing software for international use.
It is another object of the present invention to provide an improved tool for multi-language software development.
It is yet another object of the present invention to provide an improved system for testing language translatability in computer software.
The foregoing objects are achieved as is now described. A mock translation system, method, and program is provided which takes single-byte base-language data, which is United States English in the preferred embodiment, and performs a mock translation on it to produce internationalization test data. The test data takes the form of the corresponding base-language text transliterated into and displayed using a multi-byte character set, so that each English character appears as a double-wide English character. The double-wide characters increase the field length, approximating the additional space typically needed when the text is translated into a different language. This data is stored in localization files and displayed in a software application in place of the English or foreign-language text. By visually inspecting each screen, the programmer or proofreader is able to easily recognize many internationalization errors, without requiring the ability to read any foreign languages. These errors include display problems on the GUI or command line, such as truncation, expansion, alignment, or other formatting errors, and programming errors such as hard-coded text, localization files missing from the program build, or text missing from localization files. The use of multi-byte characters also identifies build process problems and localization file retrieval mechanism problems, in addition to presentation software problems such as fonts and those previously mentioned.
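The mock translation summarized above can be approximated in a few lines. The following is a sketch only, not the embodiment itself: it assumes the target multi-byte character set contains the Unicode fullwidth forms (U+FF01 through U+FF5E, plus the ideographic space U+3000), and the names `mock_translate`, `display_columns`, and `looks_hard_coded` are hypothetical helpers introduced for illustration.

```python
import unicodedata

# Distance from ASCII '!' (0x21) to its fullwidth form (U+FF01).
FULLWIDTH_OFFSET = 0xFEE0


def mock_translate(text):
    """Transliterate printable ASCII into double-wide (fullwidth) characters."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")  # ideographic (double-wide) space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + FULLWIDTH_OFFSET))
        else:
            out.append(ch)  # leave newlines and non-ASCII text untouched
    return "".join(out)


def display_columns(text):
    """Approximate the number of fixed-width display columns the text occupies."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1 for ch in text
    )


def looks_hard_coded(displayed):
    """True if an ASCII letter survives on screen, i.e. it bypassed the mock data."""
    return any(ch.isascii() and ch.isalpha() for ch in displayed)


label = "Open File"
mock = mock_translate(label)
print(mock)
print(display_columns(label), display_columns(mock))  # 9 18
print(looks_hard_coded(mock), looks_hard_coded("Cancel"))  # False True
```

Because the mock text remains readable as English, a tester can see at a glance both whether a field is wide enough for the doubled text and whether any ordinary single-width English remains on screen, which would suggest a hard-coded string. A practical tool would presumably need to leave format placeholders and markup untouched; that handling is omitted here.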
The mock translation provides an environment for testing software that will be run using multi-byte character sets, such as those used in Asian countries. For ease of reference, multi-byte character sets will be discussed as “double-byte” characters and character sets, but those of skill in the art will recognize that these teachings apply to any character set which uses more than one byte to represent a single character. The translation of English (single-byte) software into Asian languages often results in many problems specific to double-byte text. By transliterating English text into double-byte English characters that look like wide English characters, an English speaker can identify problems that would otherwise have appeared only after the actual translation into an Asian language. Thus, the mock translation system tests the translatability of computer software, and aids in the internationalization of software programs.
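One kind of problem specific to double-byte text can be illustrated with a short sketch. It assumes Shift_JIS as the double-byte encoding (an assumption made purely for illustration; the text above does not name a specific encoding) and uses a hypothetical helper, `truncate_bytes`, that mimics a fixed-size byte buffer sized for single-byte text.

```python
def truncate_bytes(text, encoding, max_bytes):
    """Naively cut the encoded text at max_bytes, as a fixed-size buffer would."""
    return text.encode(encoding)[:max_bytes]


plain = "OPEN"
wide = "\uff2f\uff30\uff25\uff2e"  # the same word in fullwidth (double-wide) letters

# Each fullwidth letter occupies two bytes in this encoding.
print(len(plain.encode("shift_jis")), len(wide.encode("shift_jis")))  # 4 8

# A 5-byte buffer holds all of the single-byte text, but cuts the double-byte
# text in the middle of its third character.
clipped = truncate_bytes(wide, "shift_jis", 5)
try:
    clipped.decode("shift_jis")
    broke_character = False
except UnicodeDecodeError:
    broke_character = True
print(broke_character)  # True
```

The single-byte string passes through the same buffer unharmed, which is why this class of defect tends to stay hidden until double-byte data is actually run through the software.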
The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.