The present invention relates to a method for addressing and searching for a unit or word within an assembly of such units or words arranged in the order of the Japanese syllabary or of the alphabet, such as the catchwords of a dictionary.
When a dictionary or the like is converted into a corresponding information file, the code length needed to store the head portion and the content portion of each word (i.e., each unit object to be searched) varies widely from word to word. Memory spaces must therefore be allotted in advance with due allowance for this variation, and these memory spaces are then addressed to permit the search for the corresponding words. Under such a procedure, however, the memory spaces would be unreasonably large. The word itself, moreover, may range from a single character to a dozen characters, so that the length of a word is variable. In addition, character combinations follow no particular rule, and the number of words is thus countless.
If the conventional procedure is employed to search the object as set forth above, a large number of bits will be necessary when the number of words is large, even if each word contained in a group of words (i.e., an assembly of unit objects to be searched) is coded simply by a serial number indicating its position in the order in which the words of said group are arranged. As a result, a large number of detector elements and processor mechanisms will be required, and the code portion of the file will be large. Accordingly, the speed at which the object is detected or searched will necessarily be low in practical operation.
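The growth of a serial-number code with the size of the word group can be illustrated by a short calculation. The following is a minimal sketch only; the function name `serial_number_bits` and the sample word counts are hypothetical and are not taken from the specification.

```python
def serial_number_bits(word_count: int) -> int:
    """Bits needed to give each of `word_count` words a distinct serial number.

    Serial numbers are assumed to run from 0 to word_count - 1 (an assumption
    made for this illustration).
    """
    if word_count < 1:
        raise ValueError("word_count must be at least 1")
    # The largest serial number is word_count - 1; at least one bit is used.
    return max(1, (word_count - 1).bit_length())


if __name__ == "__main__":
    # Hypothetical dictionary sizes, chosen only for illustration.
    for count in (1_000, 100_000, 1_000_000):
        print(count, "words ->", serial_number_bits(count), "bits per serial number")
```

Under these assumptions a group of 100,000 words needs 17 bits per serial number, and every one of those bits must be handled by the detector elements on each search.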
In addition to the problems of information capacity set forth above, there is a further requirement: either the device must be able to establish, or the operator must know, what position in the assembly of arranged and coded unit words is occupied by a particular word. The former is difficult to realize because of the irregularity peculiar to the character combination of each word. The latter is impossible, in view of limited human ability and the large number of words contained in a dictionary or the like, unless the address of the particular word to be searched is indicated by another dictionary, in which case the mechanical search would no longer offer any efficiency. It is therefore desired that the search be performed by entering the elementary factors which form a word successively, in the order of the character combination particular to that word, by means of operating members such as depression keys corresponding to the 51 characters of the Japanese syllabary or the 26 characters of the English alphabet. Although such mechanical input or reading out has been commonly employed in information transfer systems such as Telex, information search would require a much bulkier device in view of the required memory capacity. With respect to combinations of the alphabet, for example, each of the 26 alphabet characters may be coded by 5 bits, so that 100 bits must be allotted for the code marks of each word if the maximum length of a single word is taken as 20 characters. The device would be highly costly because of the large space required by the identification marks and the complexity of the detector elements and processing mechanisms.
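The arithmetic behind the 100-bit figure can be reproduced as follows. This is an illustrative sketch; the function names are hypothetical, and the code assumes a fixed-width binary code in which every word is allotted the maximum word length.

```python
def bits_per_character(alphabet_size: int) -> int:
    """Smallest number of bits that can distinguish `alphabet_size` characters."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    return max(1, (alphabet_size - 1).bit_length())


def fixed_width_code_bits(alphabet_size: int, max_word_length: int) -> int:
    """Bits allotted per word when every word is padded to `max_word_length`."""
    return bits_per_character(alphabet_size) * max_word_length


if __name__ == "__main__":
    # 26 alphabet characters at 5 bits each, 20 characters per word.
    print(fixed_width_code_bits(26, 20))  # 100
    # The same arithmetic applied to the 51-character Japanese syllabary.
    print(fixed_width_code_bits(51, 20))  # 120
```

By the same arithmetic, the 51 characters of the Japanese syllabary would need 6 bits each, or 120 bits per word at the same maximum length; this second figure is derived here and is not stated in the text above.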
To overcome the above disadvantages, the present invention provides an improved method in which the number of bits forming the code marks for identification or designation of successively arranged unit objects to be searched (i.e., words) may be substantially reduced by making the best use of the context in said successive arrangement.
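One well-known way in which the context of an ordered arrangement can shorten code marks is front coding, in which each word is recorded only by the number of leading characters it shares with its predecessor plus the remaining characters. The sketch below is offered purely as an illustration of that general idea and should not be read as the method of the present invention; the function names and the sample word list are hypothetical.

```python
def shared_prefix_length(a: str, b: str) -> int:
    """Number of leading characters that `a` and `b` have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_code(sorted_words: list[str]) -> list[tuple[int, str]]:
    """Encode each word as (shared prefix length with previous word, suffix)."""
    coded = []
    previous = ""
    for word in sorted_words:
        k = shared_prefix_length(previous, word)
        coded.append((k, word[k:]))
        previous = word
    return coded


def front_decode(coded: list[tuple[int, str]]) -> list[str]:
    """Rebuild the original word list from its front-coded form."""
    words = []
    previous = ""
    for k, suffix in coded:
        word = previous[:k] + suffix
        words.append(word)
        previous = word
    return words


if __name__ == "__main__":
    words = ["cat", "catalog", "cater", "dog"]  # hypothetical sample, already sorted
    coded = front_code(words)
    print(coded)  # [(0, 'cat'), (3, 'alog'), (3, 'er'), (0, 'dog')]
    assert front_decode(coded) == words
```

Because neighbouring entries in an alphabetically ordered list tend to begin with the same characters, only the differing tail of each word needs to be stored in full.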