Pattern recognition has been studied for over three decades. However, recent advances have allowed pattern recognition to become widely used in various applications, such as face recognition systems, character/handwriting recognition systems, biometric recognition systems, and video surveillance systems.
FIG. 1a shows a conventional pattern recognition system for recognizing faces. As shown in FIG. 1a, a camera 102 takes an image of scene 100 and provides an image signal to a recognizer 104. Typically, recognizer 104 includes a face detector module 106 and a face matcher 108.
Face detector module 106 detects a portion of the image signal (i.e., an image of scene 100) that is relevant for matching. For example, as illustrated in FIG. 1b, face detector module 106 may detect a face of a person in an image portion 130 of scene 100. After detecting image portion 130, face detector module 106 then segments image portion 130 from the image signal. For example, as illustrated in FIG. 1c, face detector module 106 may isolate image portion 130, e.g., the person's face, to form an image segment 140.
Face matcher 108 receives image segment 140 from face detector module 106. Face matcher 108 includes a feature extractor module 110 and a matching module 112. Feature extractor module 110 extracts any relevant features identified in image segment 140. For example, as illustrated in FIG. 1d, feature extractor module 110 extracts features 142, 144, 146, and 148, e.g., location of the eyes, distance between the eyes, location of the nose, and location of the mouth, from image segment 140.
Matching module 112 searches a memory 114 to find a stored pattern (not shown) that matches image segment 140 based on features 142, 144, 146, and 148. Matching module 112 typically makes a decision as to which stored pattern or patterns match image segment 140 according to a predetermined decision rule.
Output module 116 receives the decision from matching module 112 and outputs the decision to a user. For example, as shown in FIG. 1a, output module 116 may output the three closest matching faces to a display 118.
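As an illustration only, the matching and output steps described above might be sketched as follows. The sketch assumes that each face is reduced to a short numeric feature vector and that the predetermined decision rule is a nearest-neighbor ranking by Euclidean distance; the function name `closest_matches`, the labels, and the feature values are hypothetical and are not taken from any particular conventional system.

```python
import math

def closest_matches(query, stored_patterns, k=3):
    """Return the labels of the k stored patterns nearest to the query."""
    # Assumed decision rule: rank stored patterns by Euclidean distance
    # between their feature vectors and the query feature vector.
    ranked = sorted(
        stored_patterns.items(),
        key=lambda item: math.dist(query, item[1]),
    )
    return [label for label, _ in ranked[:k]]

# Hypothetical feature vectors: (eye distance, eye-to-nose, nose-to-mouth).
stored = {
    "person_a": (62.0, 40.0, 25.0),
    "person_b": (58.0, 44.0, 28.0),
    "person_c": (70.0, 38.0, 22.0),
    "person_d": (60.0, 42.0, 27.0),
}
query = (61.5, 40.5, 25.5)

# The three closest stored faces, nearest first.
top3 = closest_matches(query, stored)
```

In such a sketch, any variation that perturbs the extracted feature values also perturbs the distances, and therefore the ranking that is output.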
Unfortunately, due to variation factors, conventional pattern recognition systems are often inaccurate. For example, variation factors such as scale, e.g., caused by a person being either closer or farther away, and rotation, e.g., caused by a person being slightly turned relative to camera 102, may cause the input image to not be matched to its corresponding stored pattern. Also, detecting relevant portions, e.g., image portion 130, of certain types of images, such as gray-level images, may be incomplete or imprecise and may cause an image portion to have missing or extraneous features. Thus, conventional pattern recognition systems often produce erroneous recognition decisions.
In addition, conventional pattern recognition systems often use a single recognizer. A single recognizer makes a recognition decision based on a single recognition algorithm. Many efforts have been made to develop more sophisticated recognition algorithms to improve the performance of conventional single recognizer systems. However, such systems using sophisticated recognition algorithms are still prone to inaccurate results because of the above-mentioned variation factors.
Stricter normalization of images during pre-processing has also been studied as a way to improve performance. For example, normalization of images may be used to minimize the effect of variation factors such as scale and rotation. For example, images may be normalized to a fixed size. However, some variation factors, e.g., border shift, are difficult to detect. Also, even if a variation factor is detected, it may be difficult to accurately compensate for its effect to ensure an accurate recognition decision.
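By way of a hedged example, one simple form of normalization for scale and rotation may be sketched as below. The sketch assumes that the two eye locations have already been detected accurately and uses them as reference points; the function name `normalize_landmarks` and its parameters are hypothetical, and the approach is only one of many possible normalization schemes.

```python
import math

def normalize_landmarks(points, left_eye, right_eye, eye_distance=1.0):
    """Map points into a frame where the left eye is at the origin and
    the right eye lies on the positive x-axis at a fixed distance."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    # Scale factor that brings the inter-eye distance to eye_distance.
    scale = eye_distance / math.hypot(dx, dy)
    # Rotation that makes the line between the eyes horizontal.
    angle = -math.atan2(dy, dx)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    normalized = []
    for x, y in points:
        tx, ty = x - left_eye[0], y - left_eye[1]
        normalized.append((scale * (tx * cos_a - ty * sin_a),
                           scale * (tx * sin_a + ty * cos_a)))
    return normalized

# The same hypothetical nose location captured upright, and again after
# the face is rotated by 90 degrees and doubled in size.
nose_upright = normalize_landmarks([(1.0, 1.0)], (0.0, 0.0), (2.0, 0.0))[0]
nose_rotated = normalize_landmarks([(-2.0, 2.0)], (0.0, 0.0), (0.0, 4.0))[0]
```

Both calls yield approximately the same normalized nose location. The result depends entirely on the accuracy of the detected eye locations, so an imprecise detection carries directly into the normalized features.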
Finally, some conventional pattern recognition systems may combine several recognizers using multiple recognition algorithms and/or modules thereof to enhance recognition accuracy. For example, combining several different recognizers with different matching modules may increase accuracy since the different recognizers may complement each other in a group decision. However, combining several recognizers is expensive to implement and requires a large amount of memory.
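A group decision of the kind mentioned above could, for instance, be formed by a simple plurality vote. The following is a minimal sketch under that assumption; the function name `group_decision` and the labels are hypothetical, and actual combined systems may use weighted or trained combination rules instead.

```python
from collections import Counter

def group_decision(decisions):
    """Return the label chosen by a strict plurality of recognizers,
    or None when there is no input or the top labels are tied."""
    if not decisions:
        return None
    counts = Counter(decisions).most_common()
    top_label, top_votes = counts[0]
    # A tie for first place means the recognizers do not agree.
    if len(counts) > 1 and counts[1][1] == top_votes:
        return None
    return top_label

# Two of three hypothetical recognizers agree on the same stored face.
agreed = group_decision(["person_a", "person_a", "person_b"])

# Every recognizer reports a different face, so no decision is reached.
undecided = group_decision(["person_a", "person_b", "person_c"])
```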
In addition, under certain circumstances, the combined recognizers may not complement each other, since they may disagree on a recognition decision. Combining several recognizers also requires a large number of samples to “train” the recognizers to work together. Therefore, combining several recognizers often results in a complex system, which may not perform well when only a relatively small number of training examples are available. It is accordingly desirable to improve recognition accuracy using relatively simple and inexpensive systems.