(a) Field of the Invention
The invention relates to a code converter with the specific capability of converting certain predetermined combinations of two or more code words of a first coding system to a single code word of a second coding system. The invention more specifically relates to such a converter which includes means for recognizing the predetermined combinations.
In a specific embodiment, the invention relates to a code converter for converting certain Arabic language characters, presently requiring two code words, of a first coding system, for the representation thereof, to a single code word of a second coding system. More specifically, the invention relates to such a converter which includes means for recognizing the certain characters. The invention also relates to the recognizer means per se.
(b) Description of Prior Art
The machine writing and printing of scripts such as Arabic, Farsi and Urdu is complicated by the fact that the shape of each alphabetic character is determined by its position in the text, i.e., isolated or within a word and, in the latter case, initial, medial or terminal. In handwritten cursive script, this presents no particular problem, but in machine writing and printing a processor is required to make the correct choice of shape for each character. This processor releases the correct shape of a character as soon as the character that follows it is signalled. The processor makes it possible to generate faultless text using only one key for each alphabetic character, or two keys in the case of certain Arabic characters.
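The position-dependent choice of shape described above can be illustrated with a minimal sketch. It is not the method of the Hyder or Diab patents; it assumes, purely for illustration, that every character joins to both of its neighbours, that a space marks a word boundary, and that Latin letters stand in for the alphabetic characters. The names `position_form` and `select_shapes` are hypothetical. Each character is held until the next key is signalled, at which point its form can be decided.

```python
from typing import Iterable, Iterator, Optional, Tuple

WORD_BREAK = " "  # assumption: a space ends the current word


def position_form(has_prev: bool, has_next: bool) -> str:
    """Name the form of a character from its neighbours within a word."""
    if has_prev and has_next:
        return "medial"
    if has_prev:
        return "terminal"
    if has_next:
        return "initial"
    return "isolated"


def select_shapes(keys: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (character, form) pairs, releasing each character only
    once the key that follows it has been seen (one-key lookahead)."""
    held: Optional[str] = None  # character waiting for its successor
    has_prev = False            # does the held character have a predecessor?
    for key in keys:
        is_letter = key != WORD_BREAK
        if held is not None:
            # The following key is now known, so the held form is decided.
            yield held, position_form(has_prev, is_letter)
        has_prev = is_letter and held is not None
        held = key if is_letter else None
    if held is not None:
        # End of input: nothing follows the last held character.
        yield held, position_form(has_prev, False)
```

A real processor would also have to account for characters that do not join to the character that follows them, which this sketch deliberately ignores.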
Methods and apparatus for such processing are taught in, for example, U.S. Pat. No. 3,938,099, Hyder, issued Feb. 10, 1976 and U.S. Pat. No. 4,145,570, Diab, issued Mar. 20, 1979.
There are also six special characters in the Arabic languages which require special attention. The six characters are illustrated as items 33 to 38 in FIG. 8 of the Diab patent. The characters are also illustrated in FIG. 9A of the same patent. Each of the six characters consists of an overdot and an undercharacter. The undercharacters, illustrated in FIG. 9B of the Diab patent, can stand alone as characters entirely different and separate from the special overdot characters.
In the Diab patent, the code for each of the special characters consists of two separate code words. The first code word identifies the overdot. The second code word is the code word for the undercharacter of the special character. In reproducing the special character on a teleprinter, the code word for the overdot is first applied to the teleprinter, whereupon the teleprinter prints the overdot and retains the carriage in a stationary position. The undercharacter is then printed below the overdot to produce the special character.
It can therefore be seen that each of the special characters requires two code words for the representation thereof. Thus, each of the special characters is not separately represented by a unique code word.
Such a pseudocode is not acceptable with the newest generation of teleprinters and data terminals which require that each character be represented by a unique, discrete and single code word. It is contemplated that such teleprinters will have text editing capability in the local memory of the teleprinters so that texts can be corrected before they are transmitted. In addition, having a discrete code for each character also simplifies tape preparation.
In addition, the growing use of computers together with teleprinters for text processing, storage and retrieval requires a consistent code representation scheme in which all characters of the language are treated uniformly. A code system having discrete codes for a majority of characters, with six characters represented by a dot code followed by another code, is both inconsistent and difficult to use in this environment.
It is therefore desirable to design a new code which is consistent and in which each character can be treated uniformly. Thus, in the new code, each of the six characters would be represented by a single code word. The new code would then be processed by a machine, such as described in the Hyder patent above, to determine the specific shape that each character should take depending on the position of that character in a word and on the preceding and following characters.
However, it is perhaps unreasonable to assume that, once such a new code is designed and machines are designed to go with the code, the whole world will immediately switch over to the new code and the new machines. It would therefore be necessary for the new machines to have the facility to handle codes transmitted by the present generation of machines. This facility could be provided by a code converter which would convert the present generation code to the new code, and which would in particular convert the two-word code of each special character to the single-word code.
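The kind of conversion contemplated above can be sketched in a few lines. This is an illustration under stated assumptions, not the converter of the invention: the code values `OVERDOT`, the entries of `COMBINED`, and the function name `convert` are all hypothetical, and only two of the six predetermined combinations are shown. The sketch holds an overdot code word until the next code word arrives, emits a single code word if the pair is a recognized combination, and otherwise passes both code words through unchanged.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

OVERDOT = 0x1F  # hypothetical first-system code word for the overdot

# Hypothetical table: (overdot, undercharacter) -> single second-system code word.
COMBINED: Dict[Tuple[int, int], int] = {
    (OVERDOT, 0x01): 0x41,
    (OVERDOT, 0x02): 0x42,
}


def convert(words: Iterable[int]) -> Iterator[int]:
    """Replace each recognized two-word combination with one code word;
    every other code word is passed through unchanged (an assumption)."""
    pending: Optional[int] = None  # overdot awaiting its undercharacter
    for word in words:
        if pending is not None:
            single = COMBINED.get((pending, word))
            if single is not None:
                yield single      # recognized pair: emit the single code word
                pending = None
                continue
            yield pending         # not a recognized pair: release held word
            pending = None
        if word == OVERDOT:
            pending = word        # hold until the next code word is known
        else:
            yield word
    if pending is not None:
        yield pending             # input ended while an overdot was held
```

For example, `list(convert([0x05, OVERDOT, 0x01, 0x06]))` yields `[0x05, 0x41, 0x06]`, while an overdot followed by a code word outside the table is left as two code words.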