1. Field of the Invention
This invention relates to the field of image processing, and in particular to pattern recognition for image classification.
2. Description of Related Art
Pattern recognition techniques are often used to classify images, or portions of images. A particular application of pattern recognition for classification is the “visual” inspection of manufactured items. Traditionally, human inspectors view the item at various stages of the manufacturing process, looking for telltale evidence of a manufacturing error. This staged approach is used to identify a defective item before subsequent costs are incurred by continuing the manufacture of the defective item, and/or to identify the defect at a stage when the repair of the defect is economically feasible. Generally speaking, the cost of defect detection and correction increases exponentially with each processing step.
A particular example of visual inspection is in the manufacture of a display device, such as a Cathode Ray Tube (CRT). The manufacture of a CRT is a multi-step process. Various layers of materials are applied to the interior of the display surface, for example, before the tube is sealed and evacuated. A critical defect in the application of any layer of material renders the CRT unsuitable for sale, and therefore inspection procedures are implemented to identify such defects before the tube is sealed. Often, critical defects are correctable, for example, by “undoing” the manufacturing steps back through the step at which the critical defect was introduced. In the CRT process, for example, layers of material can be removed, or defects can be corrected before the next layer is applied. Also, if the error in manufacturing that caused the defect is systematic, rather than random, a rapid identification of a defect can minimize the number of subsequent items produced by this systematic error. Manufacturing errors often produce visually apparent anomalies that have characteristics that can be used to identify the particular manufacturing error that caused the anomaly. In the CRT example, one of the internal components is a “shadow mask” that has a hole corresponding to each pixel location on the screen, through which electron beams travel before impinging upon the luminescent red, green, and blue phosphor dots corresponding to the intended color content of each pixel. A defect in the shadow mask is visually apparent as an anomaly that spans all three phosphor dots. If three adjacent red, green, blue dot locations are substantially dimmer than surrounding dots, it is highly likely that the corresponding hole in the shadow mask is obstructed, and appropriate corrective measures can be taken to clear the obstruction before other manufacturing steps are taken that would make the correction economically infeasible.
Anomalies with different visual characteristics imply other classes of defect, and different corrective measures would typically be applied to correct each class of defect. The automation of the defect classification task is effected by processing an image of the screen to search for the characteristic patterns, such as three adjacent dim dots, for each defect class.
Another application of pattern recognition is the classification of images for subsequent retrieval. For example, one might classify each painting in an art collection into portraits, landscapes, figures, seascapes, and so on. An automation of the classification task may include, for example, scanning the painting for a large central area of flesh-color (portrait), a large upper area of blue and/or white, with little, if any, blue content below (landscape), and so on. Edge detecting and edge characterization processes are also commonly used to classify image content.
A variety of techniques are available for classifying images based on characteristic patterns. As presented above, algorithms, or rules, can be defined corresponding to each classification, such as “if 3-adjacent dim dots, then shadow mask defect”, or “if upper area is blue and lower area is not blue, then landscape”, and so on. Such systems are viable when a set of rules can be determined for each classification. The rules for each classification must be broad enough to encompass the possible range of characteristic patterns within that classification, yet not so broad as to include images that do not properly belong within it.
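The two quoted rules can be sketched in code as follows. This is a minimal illustration only, not a method taken from this description: the function names, the use of the mean brightness as a stand-in for the "surrounding dots", and the threshold values are all assumptions made for the example.

```python
def is_shadow_mask_defect(brightness, dim_ratio=0.5):
    """Rule: "if 3-adjacent dim dots, then shadow mask defect".

    brightness is a sequence of dot brightness values along one row.
    A dot is treated as dim when it falls below dim_ratio times the
    mean brightness of the row (an assumed definition of "dim").
    """
    if len(brightness) < 3:
        return False
    cutoff = dim_ratio * sum(brightness) / len(brightness)
    return any(
        all(b < cutoff for b in brightness[i:i + 3])
        for i in range(len(brightness) - 2)
    )


def is_landscape(upper_blue_fraction, lower_blue_fraction, threshold=0.5):
    """Rule: "if upper area is blue and lower area is not blue, then landscape".

    The fractions are the share of blue pixels in each area; the
    threshold separating "blue" from "not blue" is an assumed value.
    """
    return upper_blue_fraction > threshold and lower_blue_fraction <= threshold
```

The sketch also shows why such rules are hard to tune: a cutoff that is too strict misses genuine defects, while one that is too loose admits images that do not belong to the class.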
As an alternative to a rule based system that requires specific rules, learning systems are commonly employed to develop characterization processes based on representative samples of each classification. Neural networks are commonly employed to effect such learning systems. A conventional neural network comprises one or more input nodes, one or more output nodes, and a plurality of intermediate, or hidden, nodes that are arranged in a series of layers between the input and output nodes. In a common neural net architecture, each input node is connected to each hidden node in a first layer of nodes, each hidden node in the first layer of nodes is connected to each hidden node in a second layer of nodes, and so on until each node of the last layer of hidden nodes is connected to each output node. The output of each node is a function of a weighted combination of each input to the node. In a feedforward neural net, when a set of input values is applied to the input nodes, the weighted values are propagated through each layer of the network until a resultant set of output values is produced. Other configurations of nodes, interconnections, and effect propagation are also common.
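The feedforward propagation described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the sigmoid node function, the bias term stored as the last weight of each node, and the names `sigmoid` and `forward` are choices made for the example, since the description requires only that each node output be a function of a weighted combination of its inputs.

```python
import math


def sigmoid(x):
    """Assumed node function; any function of the weighted sum could be used."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(layers, inputs):
    """Propagate input values through a fully connected feedforward network.

    layers[k][j] is the list of weights of node j in layer k: one weight
    per node of the previous layer, followed by a bias (an assumption).
    The last entry of layers is the output layer.
    """
    activations = list(inputs)
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(node[:-1], activations)) + node[-1])
            for node in layer
        ]
    return activations


# Example: 2 input nodes, one hidden layer of 2 nodes, 1 output node.
example_layers = [
    [[0.5, -0.5, 0.0], [0.25, 0.75, 0.1]],  # hidden layer
    [[1.0, -1.0, 0.0]],                     # output layer
]
example_output = forward(example_layers, [1.0, 2.0])
```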
In a learning mode, the resultant set of output values is compared to the set of output values that a properly trained network should have produced, to provide an error factor associated with each output node. In the case of pattern matching for classification, each output node may correspond to a particular class. The output node of the true class corresponding to the set of input values should have a large output value, while the incorrect class output nodes should have a low value. The error factor is propagated back through the network to modify the weights of each input to each node so as to minimize a composite of the error factors. The composite is typically the sum of the square of the error factor at each output node. Conceptually, the node weights that contributed to the outputs of the incorrect class are reduced, while those that contributed to the output of the correct class are increased.
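The composite error and a conventional weight modification can be written as follows. The symbols are assumptions introduced for illustration: K is the number of output nodes, t_k is the output value a properly trained network should produce at node k, o_k is the value actually produced, w_ij is the weight applied to input i of node j, and η is the magnitude of the modification. The gradient-descent form of the update is the customary choice and is shown as one possibility, not as the only way to minimize the composite.

```latex
E = \sum_{k=1}^{K} \left( t_k - o_k \right)^2,
\qquad
w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}
```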
Thereafter, the next input set of values is applied with the adjusted weights, the error factors are recomputed, and the weights are readjusted. This process is repeated for each set of input values used for training, and then the entire process is repeated for a fixed number of iterations or until subsequent iterations demonstrate a convergence to the correct class, or until some other termination criterion is achieved. Once the set of weights is determined, the resultant network can be used to classify other items, items that were not part of the training set, by providing the corresponding set of input values from each of the other items, and choosing the class having the highest output node value. Note that the magnitude of the feedback modification to the weights is chosen to balance between overcorrecting and undercorrecting. An overcorrection of weights for each training input set may result in an oscillation of weight values that precludes convergence; an undercorrection of weights for each training input set may require an excessive number of iterations to reach convergence, or may converge to a local minimum. The magnitude of the modification of weights may be different for each layer in the network.
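The training cycle described above can be sketched for the simplest case, a network with no hidden layer and one output node per class. This is a sketch under stated assumptions and not a definitive implementation: the sigmoid node function, the zero initial weights, the toy training samples, and the parameter names `rate`, `max_epochs`, and `tolerance` are all hypothetical. The `rate` parameter plays the role of the magnitude of the weight modification.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def outputs(weights, x):
    # weights[k] holds the input weights of output node k, with a bias last.
    return [
        sigmoid(sum(w * xi for w, xi in zip(node[:-1], x)) + node[-1])
        for node in weights
    ]


def train(samples, n_classes, rate=0.5, max_epochs=2000, tolerance=0.05):
    """Adjust the weights after each training input set; stop after a fixed
    number of passes or once the summed squared error falls below tolerance."""
    n_inputs = len(samples[0][0])
    weights = [[0.0] * (n_inputs + 1) for _ in range(n_classes)]
    for _ in range(max_epochs):
        total_error = 0.0
        for x, true_class in samples:
            out = outputs(weights, x)
            for k, node in enumerate(weights):
                target = 1.0 if k == true_class else 0.0
                err = target - out[k]
                total_error += err * err
                # Gradient of the squared error through the sigmoid node.
                step = rate * 2.0 * err * out[k] * (1.0 - out[k])
                for i, xi in enumerate(x):
                    node[i] += step * xi
                node[-1] += step
        if total_error < tolerance:
            break
    return weights


def classify(weights, x):
    """Choose the class whose output node has the highest value."""
    out = outputs(weights, x)
    return max(range(len(out)), key=out.__getitem__)


# Toy training set: (input values, true class).
samples = [([0.0, 0.1], 0), ([0.1, 0.0], 0), ([1.0, 0.9], 1), ([0.9, 1.0], 1)]
trained = train(samples, n_classes=2)
```

Calling `classify(trained, [0.05, 0.05])` then applies the trained weights to an item that was not part of the training set. Raising `rate` makes each correction larger, and lowering it increases the number of passes needed, which is the trade-off noted above.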
The performance of the neural network for a given problem set depends upon a variety of factors, including the number of network layers, the number of hidden nodes in each layer, the weight adjustment factors, and so on. Given a particular set of network factors, or network architecture, the network will perform differently on different problem sets. That is, the performance of the neural network used for classification based on pattern recognition will depend upon the selected architecture for the neural network and the various factors associated with this architecture. In like manner, the performance of other pattern recognition systems, such as rule based systems, will depend upon the particular parameters selected for the recognition process.
It is an object of this invention to provide an image classification system that does not require an a priori definition of rules. It is a further object of this invention to provide an image classification system that does not require an a priori definition of a specific learning system architecture.
These objects and others are achieved by providing an evolutionary algorithm that provides alternative architectures and parameters to an image classification system. In a preferred embodiment, a learning system is employed, and during the training period of the learning system, the architecture of the learning system is evolved so as to create a learning system that is well suited to the particular classification problem set. In like manner, other parameters of the image classification system are evolved by the evolutionary algorithm, including those that effect image characterization, learning, and classification. An initial set of parameters and architectures is used to create a set of trial classification systems. A number of pre-classified evaluation images are then applied to each system, and each system's resultant classification of each test case is compared to the proper classification of that test case. Subsequent trial classification systems are evolved based upon the parameters and architecture of the better performing classification systems. The best performing classification system is then selected as the production classification system for classifying new images.
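The evolutionary cycle summarized above can be sketched as follows. This is a hedged illustration and not the claimed embodiment: the parameter names `hidden_nodes` and `learning_rate`, the selection, crossover, and mutation scheme, and the stand-in fitness function are all assumptions. In practice, `evaluate` would build and train a trial classification system from the given parameters and return the fraction of pre-classified evaluation images that it classifies correctly.

```python
import random


def mutate(params, rng):
    """Return a slightly altered copy of a parameter set (assumed scheme)."""
    child = dict(params)
    child["hidden_nodes"] = max(1, child["hidden_nodes"] + rng.choice((-1, 0, 1)))
    child["learning_rate"] = child["learning_rate"] * rng.uniform(0.8, 1.25)
    return child


def evolve(evaluate, initial_population, generations=20, survivors=2, rng=None):
    """Evolve trial systems from the better performing ones.

    evaluate maps a parameter set to a score, higher being better.
    survivors must be at least 2 so that two parents can be combined.
    """
    rng = rng or random.Random(0)
    population = list(initial_population)
    for _ in range(generations):
        ranked = sorted(population, key=evaluate, reverse=True)
        parents = ranked[:survivors]
        children = []
        while len(parents) + len(children) < len(population):
            a, b = rng.sample(parents, 2)
            # Each parameter is inherited from one of the two parents.
            child = {key: rng.choice((a[key], b[key])) for key in a}
            children.append(mutate(child, rng))
        population = parents + children
    return max(population, key=evaluate)


def stand_in_fitness(params):
    # Hypothetical score that peaks at 6 hidden nodes and a rate of 0.3;
    # a real system would score accuracy on pre-classified images.
    return -abs(params["hidden_nodes"] - 6) - abs(params["learning_rate"] - 0.3)


initial = [
    {"hidden_nodes": 2, "learning_rate": 0.9},
    {"hidden_nodes": 12, "learning_rate": 0.05},
    {"hidden_nodes": 4, "learning_rate": 0.5},
    {"hidden_nodes": 9, "learning_rate": 0.2},
]
best = evolve(stand_in_fitness, initial)
```

Because the better performing parameter sets are carried forward unchanged into each new generation, the best score in this sketch never decreases from one generation to the next.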