Electronic pictures and handwritten documents are entering common use in computers due to the introduction of pen-based interfaces. Recent products have replaced the keyboard entirely with a pen, with which all data entry is performed.
In a paper by D. Lopresti and A. Tomkins, entitled "Pictographic Naming", INTERCHI '93 Adjunct Proceedings, April, 1993 (which is incorporated herein by reference for its teachings on use of pictographic names), the authors propose extending the range of acceptable document names to include arbitrary hand-drawn pictures. When a document is created or first stored in a storage medium, the author draws a pictographic name instead of typing in a textual name. To subsequently retrieve one of the documents, the pictographic names may be displayed in a menu or "browser", from which the user selects the desired pictographic name.
If the database includes more than about 8-12 documents, it becomes impractical to display all of the pictographic names during retrieval.
In an alternative method of subsequently retrieving one of the documents, the pictographic name is redrawn using a pen-based interface. Because the hand-drawn pictures are not drawn in exactly the same way each time, a pattern recognition technique is needed to determine which document (or output sequence) a given hand-drawn picture (or input sequence) is intended to represent.
One of the proposed techniques for identifying a document by its pictographic name involves the use of Hidden Markov Models (HMMs) to provide a list of candidate documents that have pictographic names most closely resembling the input sequence. From this list, one file is selected using the pen. HMMs provide a powerful tool for picture and handwritten document matching. Several researchers have used HMMs to model handwriting and handwritten documents.
Rabiner, L. R., "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, 77(2):257-285, February 1989, is incorporated by reference herein for its teachings on the use of Hidden Markov Models for pattern recognition.
Formally, an HMM is a doubly stochastic process that contains a non-observable (hidden) underlying stochastic process that is uncovered by a set of stochastic processes that produce the sequence of observed symbols. Mathematically, an HMM is a tuple $\langle \sigma, Q, a, b \rangle$,
where:
1) $\sigma$ is a (finite) alphabet of output symbols. A symbol is typically a subset of a character.
2) $Q$ is a set of states, $Q = \{0, \ldots, N-1\}$ for an N-state model.
3) $a$ is a probability distribution that governs the transitions between states. The probability of going from state $i$ to state $j$ is denoted by $a_{ij}$. The transition probabilities $a_{ij}$ are real numbers between 0 and 1, such that:
   For all $i \in Q$: $\sum_{j \in Q} a_{ij} = 1$ (1)
   The distribution includes the initial distribution of states, that is, the probability $a_i$ of the first state being $i$.
4) $b$ is an output probability distribution $b_i(s)$ that governs the distribution of output symbols for each state. That is, $b_i(s)$ is the probability of producing the symbol $s \in \sigma$ while being in state $i$. These probabilities follow the rules:
   For all $i \in Q$ and $s \in \sigma$: $0 < b_i(s) \leq 1$ (2)
   For all $i \in Q$: $\sum_{s \in \sigma} b_i(s) = 1$ (3)

(i) determining the probability that a subset of one of the sequences of pointers leading from the root node to that node represents a subset of the output symbols in one of the documents being indexed;
(ii) invoking the procedure for the next level, if the determined probability exceeds the minimum probability value of that level and the next level is between the first level and the (T-1)th level; and
(iii) adding a pointer to the one document in the list of pointers of the leaf node associated with that sequence of pointers, if the next level is the Tth level and the probability is greater than the threshold value.
Usually, when HMMs are used, the transition probabilities (a) and the state set (Q) are computed by best-fitting the model to a series of samples. (This is known as training the model.) Each sample consists of a sequence of output symbols (points), with which the parameters of the model may be adjusted. However, in applications such as recognition of handwritten documents, the model is described using a single sample (a sequence of output symbols for the document that is to be indexed). Quite commonly, then, the structure of the model is "fixed" to make up for the lack of samples with which to train it. That is, once a model is selected for an index, that model is used for the life of the index. The model is not changed dynamically after the index is created. For example, a left-to-right HMM may be used, i.e., a model in which it is only possible to remain in the current state or to jump to the next state in sequence.
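As an illustration only, a fixed left-to-right structure of the kind just described might be sketched in Python as follows. The function names and the `stay_prob` parameter are hypothetical and are not taken from the referenced papers; the sketch also assumes that the model always starts in state 0, which the text above does not state.

```python
def left_to_right_hmm(n_states, stay_prob=0.5):
    """Build (initial, a) for a left-to-right HMM with a fixed structure.

    Assumptions for illustration: the model starts in state 0, and every
    non-final state stays put with probability stay_prob or advances to
    the next state with probability 1 - stay_prob.
    """
    initial = [1.0] + [0.0] * (n_states - 1)
    a = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        a[i][i] = stay_prob            # remain in the current state
        a[i][i + 1] = 1.0 - stay_prob  # jump to the next state in sequence
    a[n_states - 1][n_states - 1] = 1.0  # the last state can only repeat
    return initial, a


def rows_are_distributions(matrix, tol=1e-9):
    """Check that each row holds values in [0, 1] that sum to 1."""
    return all(
        all(0.0 <= p <= 1.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


initial, a = left_to_right_hmm(4, stay_prob=0.6)
```

Every transition other than "stay" or "advance by one" has probability zero, so the structure is determined by the number of states alone and does not need to be learned from multiple samples.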
For the handwritten document problem, each picture or document in the database is modeled by an HMM. As a result, given an input pattern, the recognition process involves executing each HMM in the database and selecting the one that generates the input pattern with the highest probability. This is very time consuming. The primary impediment to using HMMs is execution speed, especially in the context of large databases. Executing a respective HMM for each document in the database in real time to retrieve one of the documents introduces an unacceptable delay into the process of retrieving a document, making the use of pictographic names by this method impractical.
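The cost of this exhaustive approach can be seen in a minimal Python sketch. It assumes that "executing" an HMM means evaluating the probability of the input sequence with the standard forward algorithm (as described in the Rabiner tutorial); the model names, the symbols "u" and "d", and all probabilities below are invented for illustration.

```python
def forward_probability(initial, a, b, observations):
    """Probability that the model (initial, a, b) generates observations.

    initial[i] is the probability of starting in state i, a[i][j] is the
    transition probability, and b[i] maps each symbol to its output
    probability in state i.
    """
    if not observations:
        raise ValueError("observations must be non-empty")
    n = len(initial)
    alpha = [initial[i] * b[i].get(observations[0], 0.0) for i in range(n)]
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * a[i][j] for i in range(n)) * b[j].get(symbol, 0.0)
            for j in range(n)
        ]
    return sum(alpha)


def best_match(models, observations):
    """Linear scan: score every model and return the best-scoring name."""
    return max(
        models,
        key=lambda name: forward_probability(*models[name], observations),
    )


# Two hypothetical two-state left-to-right models sharing one structure.
transitions = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "house": ([1.0, 0.0], transitions,
              [{"u": 0.9, "d": 0.1}, {"u": 0.1, "d": 0.9}]),
    "tree": ([1.0, 0.0], transitions,
             [{"u": 0.1, "d": 0.9}, {"u": 0.9, "d": 0.1}]),
}
query = ["u", "u", "d"]
winner = best_match(models, query)
```

Each evaluation costs on the order of N*N*T operations for N states and T input symbols, and `best_match` repeats that work once per stored document, so retrieval time grows linearly with the size of the database.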