Access to data objects in large data collections using limited input interfaces may be a problem for a wide range of consumer and industrial applications and devices. These objects may comprise media files, goods, books, personal information, etc. An important feature of such objects may be that they have a simple or composite textual description, which is used for access, so that input of this description is equivalent to access to the object. Typical user interfaces may provide a limited number of possible inputs, such as buttons, touch screen areas, or gestures. For example, one application of such an access system is generic text input for mobile devices, where object descriptions are words of some language dictionary or corpus.
Any human language may have hundreds of thousands or millions of words and word forms to describe the world around us. Representing all these words in written form, in order to store them as information, has been a challenging problem. Semantic writing systems of ancient human languages, such as Egyptian, Maya, or Chinese, were object-based and used symbols of different types: pictograms resembling the objects they represent, logograms for parts or forms of words, and ideograms for abstract ideas. The scripts of modern semantic languages, such as Chinese and Japanese, have been simplified and changed over time, but may still include up to tens of thousands of basic symbols, without any upper limit on the number of more sophisticated symbols.
A huge step for mankind was the development of the phonetic writing system, which had a tremendous impact on human communication and progress. Phonetic writing systems are based on symbols representing basic phonemes, i.e. sounds and groups of sounds of the spoken language, rather than objects and notions. Therefore, each word can be represented by a sequence of basic symbols corresponding to an access path in a hierarchical dictionary tree whose branches correspond to basic symbols. The number of basic symbols in phonetic scripts may range from 20-40 for languages using alphabets (Latin, Cyrillic, Hebrew, Korean, etc.) to 40-100 for syllabic languages (Japanese Katakana and Hiragana, most of the Indic, Pacific and South-Eastern languages).
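The correspondence between a word and an access path in a dictionary tree can be illustrated with a minimal sketch. The class and function names below (TrieNode, insert, access_path) and the three-word sample dictionary are assumptions made for illustration only and do not come from any particular input system.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node of the dictionary tree; each branch is labeled with a basic symbol."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False  # True if the path from the root to here spells a word


def insert(root: TrieNode, word: str) -> None:
    """Add a word by creating one branch per symbol where needed."""
    node = root
    for symbol in word:
        node = node.children.setdefault(symbol, TrieNode())
    node.is_word = True


def access_path(root: TrieNode, word: str) -> Optional[List[str]]:
    """Return the branch labels followed to reach `word`, or None if it is absent."""
    node = root
    path: List[str] = []
    for symbol in word:
        if symbol not in node.children:
            return None
        node = node.children[symbol]
        path.append(symbol)
    return path if node.is_word else None


# Hypothetical sample dictionary.
root = TrieNode()
for w in ["cat", "car", "cart"]:
    insert(root, w)
```

In this sketch, access_path(root, "car") follows the branches c, a, r, while a prefix such as "ca" that is not itself a dictionary word yields None. Words sharing a prefix share the corresponding part of the path.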
With the development of computers and electronic devices, data input has become one of the fundamental problems of computer-human interaction. Typical data input systems are usually based on language writing systems, with elementary input actions corresponding to symbols of the language script. Any language object may be represented by a sequence of these elementary input actions. The most common implementation of such a script-based input system is the keyboard, with input keys representing symbols of the script. Due to its simplicity, the keyboard input approach is a desirable input method for a broad range of alphabet-based languages and for electronic devices having enough real estate for a keyboard (key sizes being constrained by the sizes of human fingers).
A keyboard may become a cumbersome and inconvenient solution for languages using scripts with a number of symbols greater than the number of keys of regular keyboards. The typical approaches to text input for such languages are based on converting native symbols into some simple artificial sequential form. This form may be a phonetic representation based on a script of a foreign language, such as Pinyin or Jyutping, or on an artificial script, such as Zhuyin (Bopomofo). This approach may be inconvenient due to differences in pronunciation and transcription of the same word in different regional varieties (e.g. Mandarin, Cantonese) and may require study of a new script system.
Another approach is graphical decomposition of the structure of a symbol, representing the number, order, and relative positions of its basic strokes or glyphs, as in Cangjie and the four-corner method. Such artificial representations may lead to a large number of necessary input actions and/or a large number of synonyms represented by these simplified input actions. The next step is sequential input of the artificial representation. If a sequential representation has several synonyms, then an additional step of selecting the desired word from a list of native words may be required.
Miniaturization of electronic devices in general, the development of mobile devices in particular, and the addition of communication functions to typical devices with limited input interfaces may cause issues with text input. The limited surface area or compact design of a device may not provide enough space for a convenient keyboard, while miniature keyboards may be difficult to use and may increase the size of the device, which may be undesirable. A typical device may include hardware providing a limited number of input actions, such as a keypad, a directional pad, a plurality of buttons, a plurality of switches, and touch and gesture pads.
The number of basic input actions may be much less than the number of symbols even in alphabet-based languages, such as in reduced keyboard applications. An approach for input systems with limited interfaces may comprise combining symbols of the script into a few symbol groups corresponding to input actions. A user may then either select a symbol within a group by additional actions (e.g. Multi-tap), or the input system may generate a list of all words represented by these input actions (e.g. T9), in which case an additional word selection step may be required. To optimize selection procedures for mobile text input, some approaches may utilize different prediction methods based on additional statistical information about the frequency of use of letters and letter combinations (e.g. LetterWise), or of words (e.g. T9, WordWise). They place the most likely candidates at the beginning of candidate lists to speed up selection. Input for non-alphabet languages on devices with limited input interfaces may be an even more difficult process. It may combine both conversion of native symbols into some sequential form and grouping of elements of the sequential representation. In some approaches, the native symbols may be represented as sequences of very few elemental strokes. For example, the 5 strokes method for Chinese language phone input uses only five basic strokes: horizontal, vertical, diagonal, dash, and hook. A similar issue may arise for touch input interfaces. The size of human fingers may constrain the sizes of touch input elements, and many interfaces designed for mouse pointer input may become cumbersome.
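The two selection strategies described above, selecting a symbol within a group by repeated presses and selecting a word from a frequency-ordered candidate list, can be sketched as follows. This is an illustrative sketch only, not the actual Multi-tap or T9 implementation: the key grouping assumes the common 12-key phone keypad layout, the word frequencies are made-up numbers, and all names (KEY_GROUPS, multitap, key_sequence, build_index) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Assumed symbol groups: the common 12-key phone keypad letter layout.
KEY_GROUPS: Dict[str, str] = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_KEY = {ch: key for key, letters in KEY_GROUPS.items() for ch in letters}


def multitap(presses: List[str]) -> str:
    """Decode multi-tap input; each item is one key pressed repeatedly, e.g. "44" -> "h"."""
    out = []
    for run in presses:
        letters = KEY_GROUPS[run[0]]
        # The number of presses selects the letter within the group, wrapping around.
        out.append(letters[(len(run) - 1) % len(letters)])
    return "".join(out)


def key_sequence(word: str) -> str:
    """Map a word to the sequence of keys (one press per letter) that represents it."""
    return "".join(LETTER_TO_KEY[ch] for ch in word.lower())


def build_index(word_freq: Dict[str, int]) -> Dict[str, List[str]]:
    """Group words by key sequence, most frequent candidates first."""
    index: Dict[str, List[str]] = defaultdict(list)
    for word in word_freq:
        index[key_sequence(word)].append(word)
    for seq in index:
        index[seq].sort(key=lambda w: (-word_freq[w], w))
    return dict(index)


# Made-up frequencies, for illustration only.
WORD_FREQ = {"good": 50, "home": 40, "gone": 20, "hood": 5, "cat": 30}
INDEX = build_index(WORD_FREQ)
```

With these assumed frequencies, the four words "good", "home", "gone", and "hood" all share the key sequence 4663, so four key presses produce a candidate list from which the user still has to select. Multi-tap avoids the selection step but needs seven presses for "home" (44, 666, 6, 33).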
Phonetic keyboard input methods may be inefficient when the number of input objects is small or is limited by other properties, by the application, or by content. Such object lists could be: days of the week, months, names of authors or singers, titles of books or songs, names of towns or states, commands of an application, etc. In these cases, object lists include far fewer objects than the whole language word corpus, and the use of generic phonetic input methods is excessive. Simple selection methods (e.g. scrolling lists, menus, pie charts, hierarchical menus) and prediction may be used for data input for object lists, but very often their structure is difficult to use and is non-optimal.
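Why full phonetic input may be excessive for a short object list can be seen by counting how many symbols are actually needed to tell the objects apart. The sketch below computes the shortest unique prefix of each item. The function name shortest_unique_prefixes is hypothetical, and the English day names are used only as a sample list.

```python
from typing import Dict, List


def shortest_unique_prefixes(items: List[str]) -> Dict[str, str]:
    """Map each item to its shortest prefix that no other item in the list shares."""
    lowered = [s.lower() for s in items]
    result: Dict[str, str] = {}
    for item, low in zip(items, lowered):
        for n in range(1, len(low) + 1):
            prefix = low[:n]
            # Count how many items start with this prefix (the item itself included).
            if sum(other.startswith(prefix) for other in lowered) == 1:
                result[item] = item[:n]
                break
        else:
            # No distinguishing prefix exists (e.g. the item is a prefix of another item).
            result[item] = item
    return result


DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
PREFIXES = shortest_unique_prefixes(DAYS)
```

For this sample list, one or two symbols per day are enough to identify it, 11 symbols in total, whereas spelling out all seven names takes 50.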
Another area of application of such access systems is online databases storing information about a large number of objects having a composite textual description, such as online stores, directories, libraries, and catalogs. Usually, the individual fields of a composite description are provided as lists, so input of the description is a combination of the methods used for list input, with all their limitations.
Efficient object access has some elements in common with computer search applications. Computer search algorithms are well studied in computer science, and there are proven optimal data structures and algorithms, such as optimal search trees and alphabetic trees. These optimal computer algorithms and structures may not be well suited for human use. One issue may be that a computer needs the whole precise description to be entered before the search begins. Another principal aspect is that humans can easily solve many cognitive problems that require very sophisticated computer algorithms. This is due to the native parallelism of human thinking and vision, and to the ability to compare, evaluate, and categorize objects without clearly described or sophisticated properties. For example, a human can easily compare two written Chinese words or know that a phone is a mobile electronic device, but this may be a very difficult task for a computer. A human can also simultaneously compare and select from several objects. Yet another problem is that the data structures used in computer search algorithms are subdivision trees, which may not be optimal for data access.