This invention relates to keyboard input devices and particularly those for the Chinese language although it is to be understood that the principles of the invention may well be applied to other languages which use hieroglyphic or other symbols or characters rather than alphabet oriented characters.
Chinese characters are generally made up of strokes and radicals. Strokes are essentially single components whereas radicals are effectively subentities or characters. The latter embody specific meaning and are primarily used for dictionary search, because Chinese characters are classified according to their dominant radicals and their number of strokes.
Of course, a simple character may just have one or more simple strokes and more complex characters can be a single radical, or they could be a combination of radicals and strokes.
The conventional Chinese manual typewriter has a bank of character dice. It is a bulky and heavy device and therefore not readily portable. When a particular character is selected, the character die is removed from the bank, and struck onto the paper and then returned to the bank.
This is an extremely complex and difficult operation requiring a high level and range of mental and physical skills. Many thousands of character dice need to be stored in the bank if a modest and satisfactory vocabulary were to be incorporated. The operator of such a typewriter has to be extremely skilled to search and locate, and then manipulate the device to achieve a result within a meaningful time frame. The utilisation of such a manual typewriter is by and large restrictive and useful only for commercial and publishing enterprises.
Since computerisation, many forms of Chinese language computer related typing or input devices, methods or systems have been proposed or come into existence. Such devices, methods or systems are fundamentally based on the standard electronic “alphanumeric keyboard” (Qwerty keyboard) and, in one form, use a combination of keys to activate an electronic search. Such a search normally brings up a number of possible characters with similar homophonic or graphic properties, from which the operator may select the one required.
Electronically, Chinese characters are fundamentally linked to two particular “Character Sets” or “Character Codes” respectively. These sets or codes are often referred to as “Internal Codes”. “Traditional” characters are linked to the Big-5 set, whereas “Simplified” characters are linked to the GuoBiao set, commonly referred to as GB.
The Big-5 character set originated in Taiwan and is made up of 13,050 “Traditional” characters. The characters are arranged traditionally, that is, according to the order of strokes and radicals. Each character is given a four letter-numeral reference in descending order. These references are often referred to as the Internal Codes of the “Character Set”.
On the other hand, the GB character set came from the People's Republic of China and has about 7,000 “Simplified” characters, where common words are arranged in a phonetic order, and rare words are arranged according to radical groups. It has a similar four letter-numeral internal code. While important for technical and programming applications, these codes are seldom, if ever, seriously considered as a practical basis for user input methods.
Based on similar schematic structures, the “Traditional” Big-5 and the “Simplified” GB share many common schematic properties. Unfortunately, these common properties refer to very different objects, making it logically impossible for the two “Character Sets” to actively share the same environment at any given time. In other words, the two sets cannot be accessed simultaneously. They are logically incompatible.
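The incompatibility described above can be illustrated with a short sketch. It assumes Python and its standard `big5` and `gb2312` codecs as stand-ins for the two character sets; the sample character and the variable names are chosen here purely for illustration and do not come from the original disclosure.

```python
# Sketch: the same Internal Code value means different things in each set.
# Assumes Python's built-in "big5" and "gb2312" codecs (an assumption, not
# something named in the source text).

sample = "中"  # a character that exists in both character sets

big5_code = sample.encode("big5")   # Internal Code under Big-5
gb_code = sample.encode("gb2312")   # Internal Code under GB

# Reading the Big-5 bytes as if they were GB yields a different symbol
# (or a replacement marker if those bytes are unassigned in GB).
misread = big5_code.decode("gb2312", errors="replace")

print(big5_code.hex(), gb_code.hex(), misread)
```

Each code prints as four hexadecimal digits, which corresponds to the four letter-numeral references mentioned above. Because one character has two different codes, and one code value is read differently under each set, software must know in advance which set is in force.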
Moreover, in the process of character simplification and vocabulary reduction, many “Traditional” characters were retained. In the interest of reduction in the size of the vocabulary and the simplification of form and structure, many simplified characters were contrived to individually replace two or more different “Traditional” characters. For conversion purposes, this has resulted in the inherent difficulties of matching one “Character Set” with the other. Thus while it is possible to convert from “Traditional” to “Simplified” on a many-to-one basis, it is logically impossible to do so, without human intervention, from “Simplified” to “Traditional”. To put this another way, where the two “Character Sets” are involved, it may be possible to contrive a means by which “Traditional” texts can be automatically converted into “Simplified” forms. But when it comes to reversing the process, human intervention is necessary.
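The many-to-one relationship described above can be sketched in a few lines of Python. The four-entry table below is a hypothetical sample chosen for illustration (two “Traditional” characters that share one “Simplified” form, plus two one-to-one pairs); it is not a conversion table from the original disclosure, and the function names are likewise assumptions.

```python
from collections import defaultdict

# Hypothetical sample table: Traditional -> Simplified (many-to-one).
TRAD_TO_SIMP = {
    "發": "发",  # "to emit, to prosper"
    "髮": "发",  # "hair" (shares the same Simplified form)
    "頭": "头",
    "財": "财",
}

def to_simplified(text):
    """Automatic conversion: every Traditional character has one target."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)

def build_reverse(table):
    """Invert the table; one Simplified form may map to several originals."""
    reverse = defaultdict(set)
    for trad, simp in table.items():
        reverse[simp].add(trad)
    return dict(reverse)

SIMP_TO_TRAD = build_reverse(TRAD_TO_SIMP)

def traditional_candidates(ch):
    """Return every Traditional character that could stand behind ch."""
    return sorted(SIMP_TO_TRAD.get(ch, {ch}))

print(to_simplified("頭髮"), traditional_candidates("发"))
```

In this sketch the forward direction is a plain lookup, while the reverse direction returns two candidates for the one simplified form, which is the point at which a human (or some contextual rule) has to choose.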
As stated, the two “Character Sets” are mutually incompatible. Popular opinion is that a better, more inclusive and very different coding system is needed. One such code known as Unicode has been created. Its inclusiveness extends across languages including English, Chinese, Japanese and Korean. Although its current version has problems and lacks widespread acceptance, it is nevertheless a very promising development.
The present invention has been developed in part in an endeavour to adapt a keyboard input device to any existing Internal Code, including Unicode or its successors, as well as any graphic based language.
The “Traditional” characters were used universally until the emergence of the People's Republic of China (PRC). Since then they have remained the standard outside the PRC. Within the PRC, the “Simplified” characters, developed concurrently with the Pinyin spelling method, became the norm for Mainland China. As a whole, the two are significantly different from one another. Users of the two forms have much difficulty in understanding and writing each other's version. Furthermore, as romanisation of Chinese gained popularity over the Mandarin Phonetic symbols, spelling systems developed along separate lines, in and outside Chinese circles, leaving behind the current legacy of diversity. For “Traditional” characters, most popular dictionaries tend to follow the Thomas Wade and the Gwoyeu Romatzyh spelling systems. For “Simplified” characters, the PRC has its Pinyin instead.
Previously proposed devices and systems fall into two broad input categories, namely the “construction” and the “spelling” categories.
The input process of the “construction” approach involves constructing graphic characters from building components of strokes and radicals, the latter being more than 220 in number. Systems have been developed to reduce the multitude of components that make up Chinese characters to a manageable number so that the essential components may be represented by the keys of the Qwerty keyboard. The alphanumeric keys that identify the various proposed building components of characters, and the precise sequence that these keys must follow in the input process, are commonly referred to as the External Code of the characters in question. The External Code is thus inseparably linked to the Internal Code of the “Character Set” (Big-5 or GB).
All “construction” input methods develop their own unique External Code. Naturally, they differ from one another in their choice and number of building components, the alphanumeric representations thereof, and the strict order by which the building components are to be strung together.
The key arrangement and keyboard operation vary from one device, method or system to another. In the simplest form, each stroke or stroke-form is given a number or a letter of the alphabet, and depending upon the form of device, method or system, there could be from four to six different numbers or letters. These numbers or letters, or a combination of the two, are then keyed in sequentially. Normally the sequence follows the order in which a particular character would be written, and continues until the keystrokes are completed or lead to what is seen as an unambiguous character, or to characters showing some dominant common features.
These forms of operation need a highly skilled operator with the following basic requirements:
(a) knowledge and efficiency in the use of the “alphanumeric keyboard”;
(b) a good knowledge and ability to use a given code;
(c) familiarity with a set of given rules which are often complex, rigid and inconsistent.
In the final stage of the process the operator often needs to make a selection of the particular character in mind from a number of presented characters.
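The stroke-numbering scheme and the final selection step described above can be sketched as follows. The five stroke classes, the digit assigned to each, and the small character table are assumptions made for illustration; they do not reproduce the External Code of any particular prior system.

```python
# Hypothetical stroke classes: 1 horizontal, 2 vertical, 3 left-falling,
# 4 right-falling or dot, 5 turning. Each character's External Code is its
# stroke sequence in writing order (illustrative data only).
EXTERNAL_CODE = {
    "十": "12",
    "人": "34",
    "八": "34",
    "入": "34",
    "大": "134",
    "木": "1234",
    "天": "1134",
}

def candidates(keyed):
    """Characters whose External Code begins with the keys entered so far."""
    return [ch for ch, code in EXTERNAL_CODE.items() if code.startswith(keyed)]

def exact_matches(keyed):
    """Characters whose complete External Code equals the keyed sequence."""
    return [ch for ch, code in EXTERNAL_CODE.items() if code == keyed]

print(candidates("12"))      # prefix still matches more than one character
print(exact_matches("34"))   # a complete code shared by several characters
```

Even in this tiny table, one complete stroke sequence is shared by three characters, so the operator still has to pick the intended one from a list.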
“Construction” systems and devices enjoy limited currency, success or lasting appeal. As noted above, the reasons for their short shelf life and poor appeal are obvious. With few exceptions, they require a fairly high level of Chinese literacy to carry out word analysis. They also require knowledge and skills to follow a rigid order of correct keystrokes. The user is faced with a complex, rigid and daunting barrage of rules and definitions. More often than not, to achieve a desired level of typing speed, special External Codes need to be committed to memory. Distinctions may need to be made regarding “common” and “rare” words so that they may be treated differently. With few exceptions, if any, there is only one way to construct any particular character. They fail to provide for marginal errors or for users' lack of familiarity with the many different forms of variant words (current, archaic, corrupted, popular, in-use, out-of-use, printed, or hand-written forms), and for input purposes they do not accept such variants. When typing mixed Chinese and English texts, users are required to manually and repeatedly switch between the two (Chinese and English) input methods. Often when users reach an input impasse, they have no other choice but to switch over to other input methods in order to carry on, if at all possible. Finally, having done some or all of the above, the users must look for and select the targeted word from word lists presented on the screen.
In brief, current “construction” devices, methods and systems based on the Qwerty keyboard are rigid and user-unfriendly. Prospective users are dissuaded from using them because of the skills and levels of commitment required of them.
Most marketed products are based on the “spelling” approach. While it is recognised that the Pinyin romanisation has gained ascendancy in this field, it must be borne in mind that there is no universally recognised standard of romanisation. Nor is there likely to be one in the foreseeable future.
Like their western counterparts, romanised words can be arranged alphabetically and phonetically in descending order. They are relatively simple to classify, encode and manipulate. However, unlike Latin based languages, Mandarin Chinese (Putonghua) in particular is a homophonic language with four specific levels of tonal values. Though completely different in their meaning, usage or form, many Chinese words share common phonetic and tonal values.
Furthermore, the same words may change their context-specific meanings, often resulting in changes of phonetic and tonal values. On the other hand, different words may have the same phonetic and tonal values whether they have the same meaning or not. Therefore one may list words which share certain common sequential letters of the alphabet, or all the letters of words. But owing to the commonly shared homophonic and tonal values, it is logically impossible to eliminate the process of selection. The process of indexing may reduce substantial difficulties. For example, it may help narrow word lists, or reduce the task of typing out the full words. But whatever their improved capabilities may be, by themselves or in concert, they cannot provide any absolute solution.
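The reason selection cannot be eliminated under a “spelling” approach can be seen in a small sketch. The lexicon below is a hypothetical sample of Pinyin syllables with tone numbers; a real product would hold thousands of entries, and the function name is an assumption for illustration.

```python
# Hypothetical miniature lexicon: (syllable, tone) -> characters sharing it.
LEXICON = {
    ("shi", 1): ["诗", "师"],
    ("shi", 2): ["十", "时", "石"],
    ("shi", 3): ["史", "使"],
    ("shi", 4): ["是", "事", "市", "世"],
    ("ma", 1): ["妈"],
    ("ma", 3): ["马"],
}

def candidates(syllable, tone=None):
    """List characters for a spelling; the tone, if given, narrows the list."""
    result = []
    for (syl, t), chars in LEXICON.items():
        if syl == syllable and (tone is None or t == tone):
            result.extend(chars)
    return result

print(len(candidates("shi")))   # all tones pooled together
print(candidates("shi", 4))     # tone supplied, yet several remain
```

Supplying the tone narrows the list but does not reduce it to one entry, so the user must still choose among homophones. Indexing by frequency or context could reorder such a list, but it could not guarantee that the first entry is the intended one.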
Various methods of indexing have given rise to various “intelligent systems” or intelligent features. Indexes are established for frequently used words, words used in association with one another, words used in association with terms or phrases, words used in context with immediately preceding words, and so on. These “intelligent” features are incorporated into many systems or are offered as options to be turned on and off. At their best, these features are helpful only some of the time. At their worst, they may become woeful distractions, liabilities, or downright nuisances. The truth is that no “intelligent” system or index can anticipate with certainty what the user has in mind.
Speakers of Chinese as a second language have found it necessary to use something like Pinyin to get them started. But even at a very early stage, they need to make a quick transition from the romanised to the graphic forms if they are to make any progress at all. Once the transition is made, learning takes place in the traditionally Chinese fashion: constant practice, hand writing exercises and word drills. Once the transition happens, confidence and competency in romanisation are often diminished through disuse and lack of practice.
For native speakers, language acquisition begins at an early age. Though one may possibly be introduced to romanisation at a later stage of the learning process, it would hardly ever be necessary to think or operate in a romanisation environment, except for computer operations. As in the case of speakers of Chinese as a second language, romanisation inevitably suffers the same fate. Thus it is not surprising that most Chinese speakers do not know romanisation, or lack the confidence and competence in it, to be enthusiastically interested in using “spelling” products.
One of the serious difficulties with all the existing “spelling” devices, methods or systems is that they rely on an imported “foreign” phonetic system. There is no universally accepted standard of spelling and it is unlikely that such a standard will be adopted in the foreseeable future.
Another difficulty relates to the complexity of the Chinese language. It is common that users who possess a high level of Chinese literacy may not know the pronunciation, much less the correct pronunciation, of a great many words, even though they may know their meaning and use perfectly well. There are also many instances when they may not know the numerous variant forms of the same words. As noted earlier, such variant forms extend over a range of current, archaic, corrupted, popular, in-use, out-of-use, printed and handwritten forms. A search of the dictionary may not necessarily resolve the difficulty because of the differing spelling systems, which are based either on a foreign language alphabet (English) or on the Mandarin Phonetic alphabet. Comparatively few are familiar with the latter.
The fundamental difficulty with the “spelling” system is that when the user is not able to spell a given word, or spell it correctly according to the spelling system in use, he would find it difficult to proceed. To spell properly, the user needs to know the correct and standard pronunciation of words, distinguish subtle differences in phonetics and accent, deliberate on linguistic, geographical and subjective cultural differences, consult different spelling systems and dictionaries, and so on. Thus, ultimately, if it is not possible to pronounce a word perfectly and correctly, an impasse is reached.
The present invention attempts to address at least some of the fundamental problems mentioned above. The outcome lies not in the incorporation of foreign elements into the system or the acceptance of the Qwerty keyboard as the ultimate tool. Its approach is based on the nature of the Chinese character itself, and, in particular forms, the invention makes the prior art, alphabet-oriented spelling approach redundant.
Speech Recognition and Writing Pad
Great advances are being made in the area of speech recognition and electronic writing pads. As practical and efficient input methods, however, they are still a very long way from displacing, if ever, the generic need for a keyboard, although they do have practical and useful applications. Furthermore, almost all the above mentioned problems faced by “spelling systems” apply equally, if not more, to speech recognition.
As discussed herein, it can be seen that proposals hitherto are targeted at specific minority groups. Proposals utilising the “construction” and “spelling” methods have serious limitations and are not easy to use. Without exception, they are totally reliant on the Qwerty keyboard. It is considered that such restrictive dependence is their common, most serious and fundamental shortcoming. The Qwerty keyboard evolved from the specific nature of the English language that is fundamentally and generically different from and far less complex than Chinese. Thus, attempting to fit Chinese into an English model must lead to difficulties.
The mass market would not be better served by a proliferation of more of the same kind, or improved versions of what are already in the market. A solution lies in decisively moving away from a slavish dependence on the Qwerty keyboard.
Accordingly, it is desirable to provide a keyboard, and in one particular embodiment, a keyboard for use with the Chinese language, that substantially overcomes the restrictions and difficulties of existent input devices, methods or systems set out hereinbefore.
It is also desirable to provide a keyboard that can be readily and flexibly operated by an operator without the necessary prerequisites of high degrees of skill and knowledge of the system and respective languages as is required by other input devices.
It is also desirable to provide a keyboard input which is able to be used in a method of identifying characters.
According to one aspect of the invention there is provided a keyboard for inputting graphical indicium representations of language characters formed from one or a combination of character units, the keyboard having a plurality of keys, each key having at least one unit associated therewith, the units all being different and each forming at least a part of a character, the keys being so arranged on the keyboard that visually similar units are associated with the same or adjacent keys, a key mapping for each key whereby selection of a key generates a mapped value of the unit associated with the selected key, the mapped value having a relational correlation with one or more relevant characters, and selection of one or more further keys which are associated with the same or other units provide further mapped values consistent with at least one of the relevant characters such that a character that is unique to the selected key or combination of keys is determined.
According to another aspect of the invention there is provided a method of constructing graphical indicium representations of language characters formed from one or a combination of character units including the steps of:
providing a keyboard having a plurality of keys each having at least one unit associated therewith, each unit being different from each other unit, and each forming at least a part of a character,
arranging the keys so that visually similar units are on the same or adjacent keys,
mapping each key to a mapped value which is a relational correlation with one or more relevant characters, generating one or more characters in response to a first or subsequent key actuation and constructing a character in accordance with the key actuations or the sequence of key actuations.
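As a rough illustration of the key mapping and narrowing described in the two aspects above, the following sketch uses a hypothetical three-key layout. The key identifiers, the units placed on each key, and the character table are all assumptions invented for this example; they are not the layout or mapping of the invention.

```python
# Hypothetical layout: key id -> character units shown on that key.
KEY_UNITS = {
    "K1": ("日", "月"),  # visually similar units share one key
    "K2": ("木",),
    "K3": ("寸",),
}

# Hypothetical relation between characters and sequences of mapped values.
CHAR_KEYS = {
    "明": ("K1", "K1"),        # 日 + 月
    "时": ("K1", "K3"),        # 日 + 寸
    "林": ("K2", "K2"),        # 木 + 木
    "森": ("K2", "K2", "K2"),  # 木 + 木 + 木
    "村": ("K2", "K3"),        # 木 + 寸
}

# Sanity check: every mapped value refers to a key that exists.
assert all(k in KEY_UNITS for keys in CHAR_KEYS.values() for k in keys)

def relevant_characters(pressed):
    """Characters still consistent with the keys selected so far."""
    pressed = tuple(pressed)
    return [c for c, keys in CHAR_KEYS.items() if keys[:len(pressed)] == pressed]

def resolve(pressed):
    """Return the character unique to this key combination, if there is one."""
    pressed = tuple(pressed)
    matches = [c for c, keys in CHAR_KEYS.items() if keys == pressed]
    return matches[0] if len(matches) == 1 else None

print(relevant_characters(["K1"]), resolve(["K1", "K3"]))
```

In this sketch the user does not need to distinguish 日 from 月 when pressing a key: the first selection yields a set of relevant characters, and each further selection narrows that set until one character corresponds to the combination.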