Statement of the Technical Field
The present invention relates to the internationalization of computer software, and more particularly, to testing bi-directional character display in an application under test.
Description of the Related Art
Internationalizing computer software can be difficult and expensive. Yet, the internationalization of computer software can be critical to ensure the global success of computer software. In this regard, it has been estimated that worldwide business-to-business e-commerce will have grown to $30 billion by the early 21st century, while at the same time non-English speakers will constitute more than 50 percent of the world's online population. With more than half of the world's Internet users predicted to be non-native English speakers in the near future, going global is not merely a business advantage in the 21st century; it is a business imperative.
In the past, the process of accommodating a specific country's language, conventions, and culture was done on a more or less ad hoc basis, essentially retrofitting software to accommodate a particular locale. Merely separating the text in a user interface from one's program is not an acceptable solution, however. Even after translating software prompts, help messages, and other textual information to the target languages, one still has to address basic issues of displaying and printing characters in the target language. Particular challenges can arise in handling languages which incorporate bi-directional script.
Bi-directional language scripts refer to text which is written in both directions: the script itself runs from right to left, while embedded numbers or segments of text in Western scripts run from left to right. Bi-directional scripts generally can be found in languages spoken by more than half a billion people in the Middle East, Central and South Asia and in Africa. Prominent among these languages are Arabic, Persian (Farsi), Hebrew, and Yiddish, to name a few. Notably, languages that utilize Arabic script also include special ligature, diacritic and shaping features which add a level of complexity to the display and printing of these languages that does not apply to most European and Asian languages.
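By way of illustration only, the mixture of directions described above can be observed in the Unicode character database. The following minimal Python sketch is not part of the related art; it assumes only the standard `unicodedata` module, and the helper name `bidi_classes` and the sample string are hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return (character name, bidirectional class) pairs for each character in text."""
    return [(unicodedata.name(ch, "UNKNOWN"), unicodedata.bidirectional(ch)) for ch in text]

# Hypothetical sample: Arabic letter BEH, Hebrew letter ALEF, Latin "a", digit "1"
sample = "\u0628\u05d0a1"
for name, cls in bidi_classes(sample):
    print(f"{cls:>2}  {name}")
```

In this sketch, the classes `AL` and `R` denote right-to-left characters, `L` denotes a left-to-right character, and `EN` denotes a European number, which together form the kind of mixed-direction content that a bi-directional display must order correctly.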
Unlike text in many unidirectional languages, bi-directional Arabic text is cursive and characters are generally connected one to another so that they appear hand written, even when printed. In this regard, shape refers to the way a character is positioned relative to preceding and following characters. For instance, in the Arabic language, depending upon syntax, scripts can contain from one to four shapes for each character or ligature. The possible shapes for an Arabic character can include (1) Isolated: the character is not linked to either the preceding or the following character; (2) Final: the character is linked to the preceding character but not to the following one; (3) Initial: the character is linked to the following character but not to the preceding one; and (4) Middle: the character is linked to both the preceding and following characters. In a text string, the shaping rules that govern a character, its neighbors, and its position within a word determine its presentation shape.
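As a non-limiting illustration of the four shapes described above, the Unicode character database records positional forms of Arabic letters as compatibility decompositions in the Arabic Presentation Forms-B block. The following minimal Python sketch assumes only the standard `unicodedata` module; the helper name `presentation_shapes` is hypothetical, and Unicode labels the "Middle" shape as "medial".

```python
import unicodedata

# Arabic Presentation Forms-B block, which holds positional forms of basic Arabic letters
FORMS_B = range(0xFE70, 0xFF00)

def presentation_shapes(letter):
    """Map each shape name (isolated, final, initial, medial) to the
    presentation-form code point recorded for a single Arabic letter."""
    shapes = {}
    for cp in FORMS_B:
        parts = unicodedata.decomposition(chr(cp)).split()
        # Expect "<shape> XXXX"; skip unassigned code points and multi-letter ligatures
        if len(parts) == 2 and int(parts[1], 16) == ord(letter):
            shapes[parts[0].strip("<>")] = cp
    return shapes

for letter in ("\u0628", "\u0627", "\u0621"):  # BEH, ALEF, HAMZA
    found = presentation_shapes(letter)
    print(unicodedata.name(letter), {k: hex(v) for k, v in found.items()})
```

Under these assumptions, the letter BEH is reported with all four shapes, ALEF with only the isolated and final shapes, and HAMZA with only the isolated shape, consistent with a character having from one to four shapes.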
As more companies deploy software products worldwide, software testing must change to verify software products developed for deployment in non-English environments. In order to test the translatability of a product, use is sometimes made of pseudo translations as described in U.S. Pat. No. 6,453,462 to Meade et al. Such pseudo translations, however, do not allow non-speakers of a bi-directional language to test the special bi-directional script handling abilities of the product under test, because the pseudo translation text is either static or not displayed in the true bi-directional environment that is needed to test the script handling.
A true test of bi-directional text handling requires actual bi-directional language data because bi-directional languages often include special forms of many characters as well as justification spacers known in the Arabic language as “kashidas”. Existing techniques require testers who are not literate in Arabic either to memorize the appearance of a standard bi-directional language text segment or to compare actual text output with images of the identical text that is known to have been rendered correctly. The drawback of this approach is that it is very time consuming, and the testers may not be able to detect text in the bi-directional language that is not quite correct and may therefore fail to detect real defects. Moreover, the standard text may not fit well into the user interface, requiring additional standard test strings to be used. Accordingly, detecting errors in the placement of a bi-directional script such as Arabic can require of the tester intense language skills not normally possessed by test staff. Such a requirement can be expensive and restrictive, as it often means that the most technically qualified staff may not possess the language proficiency necessary to properly test the application.