1. Field of the Invention
The present invention relates to signal analysis, and more specifically, to an apparatus and method for syntactic signal analysis. The invention includes a method for coding features, and a method for converting a pixel pattern representing lines of text obtained by means of an optical scanner, into a string of primitives and features suitable for pattern recognition.
2. Discussion of Related Art
Signals are processed in a large number of technical fields. However, these signals are sometimes mutilated, for various reasons. In most cases, it is advantageous to have unmutilated signals available. If it is known beforehand that the unmutilated signals must satisfy certain rules, it is possible to check whether the available signals really do satisfy these rules and, if this does not appear to be the case, to discard or correct the signals.
A technique known in the art for checking whether signals satisfy certain rules is the syntactic method or parsing technique. For an introduction, see "Digital Pattern Recognition", Springer-Verlag (1980), by K. S. Fu (editor), pp. 95-134, and "Digital Image Processing", John Wiley & Sons (1978), by W. K. Pratt, pp. 574-578. In this method, the signal is first segmented into primitive elements. These primitive elements, which will hereinafter be designated primitives, are then classified, and each is replaced by a normalized primitive. A normalized primitive of this kind is a prototype of a certain class or category. A check is then made whether the string of normalized primitives obtained in this way satisfies a grammar. A grammar comprises a number of rules, known as rewriting or production rules, each consisting of a number of terms, which are terminals or non-terminals, and each predetermining a way in which a set of terms may be rewritten to form a non-terminal. The grammar also defines a starting symbol, which is a non-terminal. The defined classes or categories, and accordingly the normalized primitives, correspond one-to-one to the terminals of the grammar.
A bottom-up parsing procedure consists of applying suitable rewriting rules to a string of normalized primitives and to intermediately obtained rewriting results until the starting symbol is obtained. Each application of a rewriting rule joins simpler terms together to form a more complex term, and the procedure finally results in a hierarchical composition of the signal in the form of a tree structure, also known as a parsing tree, solution tree or analysis, having the starting symbol at the root, the normalized primitives at the leaves, and the non-terminals of the applied rewriting rules at the intermediate nodes. The grammar defines all such hierarchical compositions which are permissible.
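By way of illustration only, the bottom-up procedure described above can be sketched in Python as an exhaustive search over rewriting steps. The toy grammar, the symbol names and the function `reduces_to_start` are assumptions made for this sketch and do not come from the cited references; practical parsers use far more efficient table-driven techniques.

```python
# Hypothetical toy grammar: each rule rewrites its right-hand side
# (a sequence of terms) to the non-terminal on its left-hand side.
RULES = (
    ("S", ("NP", "VP")),
    ("NP", ("det", "noun")),
    ("VP", ("verb", "NP")),
)
START = "S"  # the starting symbol, a non-terminal


def reduces_to_start(symbols, rules=RULES, start=START):
    """Return True if the string of normalized primitives can be
    rewritten, rule by rule, until only the starting symbol remains."""
    seen = set()  # intermediate strings already explored

    def search(form):
        if form == (start,):
            return True
        if form in seen:
            return False
        seen.add(form)
        for lhs, rhs in rules:
            n = len(rhs)
            for i in range(len(form) - n + 1):
                # Join the simpler terms form[i:i+n] into the term lhs.
                if form[i:i + n] == rhs:
                    if search(form[:i] + (lhs,) + form[i + n:]):
                        return True
        return False  # no rewriting rule leads to the starting symbol

    return search(tuple(symbols))


accepted = reduces_to_start(["det", "noun", "verb", "det", "noun"])
rejected = reduces_to_start(["det", "verb", "noun"])
```

In this sketch the first string is reduced to the starting symbol, whereas for the second string no applicable rewriting rule remains before the starting symbol is reached.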
If at a certain time it is not possible to apply any further rewriting rule in the parsing procedure when the starting symbol has not yet been found, then the signal does not comply with the grammar and the signal is rejected. It is then possible to adapt one or more primitives or normalized primitives so that the rules in the grammar are complied with: the signal is then corrected. A correction method of this kind is shown in "Digital Image Processing", by W. Pratt, p. 571, FIG. 20.2-3.
If the parsing procedure actually delivers the starting symbol, then the signal complies with the predetermined rules and it has one of the permissible hierarchical compositions. If verification of the signal is all that is required, then the fact that the starting symbol has been reached is sufficient to prove that the signal complies with the rules. In the event that recognition of the signal is also the purpose of the apparatus, the objects for recognition can be derived directly from the analysis provided. An object for recognition is represented by the non-terminal symbol which is the root of a sub-tree corresponding to the object for recognition. If a signal has been converted into a string of recognized symbols in this way, then it is possible to subject the resulting symbols after some intermediate operations, if necessary, to a following parsing step in which a check is made whether the string of recognized symbols complies with another system of rules. This can in turn lead to adaptation of the symbols, and this can in turn lead to adaptation of the signal. It is also possible that this parsing step will again yield a structural description with which further processing is carried out.
Various types of parsers are known. A robust parser, i.e. a parser which also processes signals which do not comply with the rules in the grammar, together with a correction mechanism co-operating with the parser, is described in EP-A-0361570. In "Efficient parsing for natural language", Kluwer Academic Publishers (1986), M. Tomita describes a "generalized LR parser". This parsing technique is based on the LR (Left-to-right scan, Rightmost derivation) parsing technique, a technique generally known in the art, introduced by Knuth in 1965. Tomita extended this LR technique by making it suitable for ambiguous input, in which case the parser is capable of giving more than one structural description for such an input. An example of ambiguous input in natural language is the input sentence: "He saw a man with a telescope", in which the phrase "with a telescope" may be either an adverbial clause qualifying the verb "saw", or an adjectival clause qualifying the noun "man". The GLR parser is at the present time one of the quickest and most efficient parsing techniques.
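The ambiguity of the example sentence can be made concrete with a small Python sketch that counts the structural descriptions a grammar assigns to an input. The sketch uses a simple chart (CYK-style) count rather than the GLR technique itself, and the lexicon, the rules and the function `count_parses` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical lexicon and binary rewriting rules for the example sentence.
LEXICON = {
    "he": ["NP"], "saw": ["V"], "a": ["Det"],
    "man": ["N"], "with": ["P"], "telescope": ["N"],
}
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # "with a telescope" qualifies the verb
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # "with a telescope" qualifies the noun
    ("PP", "P", "NP"),
]


def count_parses(words, start="S"):
    """Count the distinct parsing trees rooted in `start` for `words`."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, symbol) -> number of trees over words[i:j]
    for i, word in enumerate(words):
        for symbol in LEXICON.get(word, []):
            chart[i, i + 1, symbol] += 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two terms
                for lhs, left, right in BINARY_RULES:
                    chart[i, j, lhs] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]


ambiguous = count_parses("he saw a man with a telescope".split())
unambiguous = count_parses("he saw a man".split())
```

Under these assumed rules the full sentence receives two structural descriptions, one for each reading of "with a telescope", while the shorter sentence receives one.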
There are a number of technical fields in which the parsing technique is applicable for checking and correcting signals. Optical character recognition systems are systems intended to read text in by means of an optical scanner and convert it to a form which can be processed by a computer. The electrical signal delivered by the optical scanner is thresholded and stored in a memory as a pixel pattern. This pixel pattern will not be an accurate image of the original, for example because of inaccuracies in the optical system, uneven illumination, or rounding-off errors during thresholding. It is also possible that the original for scanning is of poor quality, or that the characters are eroded, mutilated, or handwritten. It is known, however, that the signal obtained must represent characters. This knowledge can be utilized by having a parser check whether the pixel pattern actually has structures which represent characters. For this purpose, the pixel pattern must be converted to a series of primitives and normalized primitives, which normalized primitives correspond directly to terminals from the grammar, whereupon the terminals are tested against the rewriting rules of the grammar. "Digital Pattern Recognition" by K. S. Fu (editor) gives primitives for handwritten English characters on p. 103 and a number of grammars which describe specific patterns on p. 110 ff.
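The thresholding step mentioned above can be illustrated with a minimal Python sketch. The grey-level range of 0 to 255, the convention that dark values represent ink, the threshold level of 128 and the function name `threshold` are all assumptions made for this example.

```python
def threshold(grey_rows, level=128):
    """Map each grey level to 1 (ink) if it is darker than `level`, else 0."""
    return [[1 if grey < level else 0 for grey in row] for row in grey_rows]


# Hypothetical scanner output: two rows of grey levels.
# The values 120 and 130 lie close to the threshold, so a small change in
# illumination could move either of them to the other side of it.
scan = [
    [250, 120, 130, 245],
    [240, 10, 20, 250],
]
pixels = threshold(scan)
```

The grey levels 120 and 130 differ only slightly yet end up as different pixel values, which shows how the stored pixel pattern can deviate from the original.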
A further field of application is the field of object recognition. If only a small number of objects has to be recognized, it is possible to compare the input signal originating from a detection means with all the permissible objects and to decide which object best agrees with the input signal. If, however, a large number of objects is involved, it is more advantageous to describe each object as a set of primitive elements. These description rules form the grammar. An input signal is segmented into the primitive elements, then a parser checks whether the rules of the grammar are complied with, and the parsing process delivers as a result the object that corresponds to the structure found. "Digital Pattern Recognition" by K. S. Fu (editor) describes, inter alia, a grammar for the recognition of chromosomes on p. 113.
The syntactic method can also be advantageously used in speech recognition. "Digital Pattern Recognition" by K. S. Fu (editor) gives an example of a speech recognition system on p. 177. In this apparatus an acoustic processor is followed by a linguistic processor. After a number of processing operations the acoustic processor delivers a string of phonemes. This string of phonemes is fed to the linguistic processor which, inter alia on the basis of syntactic rules, converts the string of phonemes into a string of words.
Another example of a technical field in which parsing techniques are applicable for checking and correcting signals is the area of "natural language interface" systems. A system of this kind is aimed at assisting man in his interaction with a computer system. This is of the greatest importance, for example, for enabling the layman to consult a database. A "natural language interface" of this kind must allow the user to formulate complex questions in natural language. Sentences input by the user in natural language are analyzed by a parser, which then delivers a number of questions in a form suitable for feeding to the database. A system of this kind is discussed in the textbook by M. Wallace: "Communicating with databases in Natural Language", Wiley & Sons Inc. (1984).
European application EP-A-0 513 918 describes a spelling checking system. The use of a parser for this application makes such spelling checking systems more versatile and more accurate. It is not just an isolated word that is checked but also the inflections of the words and the syntax of the sentence. As a result, the number of possible alternatives for a wrongly spelled word is greatly reduced. In addition, a parser is, by its nature, also a suitable instrument for use as a grammar checker for a natural language.
In machine translation systems, parsers are suitable for analyzing the sentence for translation and for synthesizing a translated sentence from the translated words with their grammatical functions. The use of a parser in such an application will be found in EP-A-0 357 344.
It is also possible, using a parser, to reconstruct mutilated signals originating from a corrupt mass memory system or a poor communication channel since the signals obey rules that are known beforehand.
A parser can also advantageously be used for indexing systems. The object of this type of system is to make a list of index words for a set of texts. For this purpose, nouns and verbs reduced to a normalized form (e.g. singular and infinitive respectively) are extracted from the texts by means of a parser. A system of this kind is described in the article by C. Berrut and P. Palmer: "Solving grammatical ambiguities within a surface syntactical parser for automatic indexing", ACM Conference on Research and Development in Information Retrieval (1986).
As is apparent from the foregoing, the parsing technique is applicable to a large number of areas of the art.
An important improvement in parsing techniques is obtained by using feature unification. Primitives from which the input signal is constructed are provided with features for this purpose. These features specify a primitive in greater detail. For this purpose the terms from the grammar rules are also provided with features. During the parsing process the features of the primitives are tested against the features of the corresponding terms in the grammatical rules. The features can also be passed on to terms which can then in turn be further tested in the parsing process against a following applicable grammatical rule. Feature unification enables more complex structures to be processed with parsing techniques.
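The testing and passing on of features can be sketched in Python as follows. The sketch assumes flat attribute-value features held in dictionaries; it is not the method of any cited publication, and the names `unify` and `apply_rule` as well as the feature names are hypothetical.

```python
def unify(a, b):
    """Merge two feature sets; return None if any feature values conflict."""
    merged = dict(a)
    for name, value in b.items():
        if name in merged and merged[name] != value:
            return None  # the same feature carries two different values
        merged[name] = value
    return merged


def apply_rule(term_features, primitive_features):
    """Test each primitive against the corresponding term of a rule.

    Returns the features passed on to the resulting non-terminal,
    or None if the rule is not applicable.
    """
    passed = {}
    for term, primitive in zip(term_features, primitive_features):
        checked = unify(term, primitive)
        if checked is None:
            return None
        passed = unify(passed, checked)  # features must also agree mutually
        if passed is None:
            return None
    return passed


# Hypothetical rule S -> NP VP whose terms impose no features of their own,
# so the rule only requires that NP and VP agree with each other.
agree = apply_rule([{}, {}], [{"number": "sing"}, {"number": "sing", "tense": "past"}])
clash = apply_rule([{}, {}], [{"number": "sing"}, {"number": "plur"}])
```

In the first call the features agree and are passed on to the new term; in the second the conflicting values for `number` prevent the rule from being applied.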
An example of feature unification is described in the article "The Generalized LR Parser/Compiler V8-4: A software package for practical NL projects", published in Proceedings of the Coling-90, Helsinki 1990, by M. Tomita. However, a disadvantage of the method described there for performing feature unification is the complexity of the feature processing. Although a parser can, in this way, be made suitable for processing more complex structures, the large number of extra steps that must be added to the parsing method is very detrimental to its utility for practical purposes.