Among the applications of computer vision, face detection is particularly challenging. For example, in images acquired by surveillance cameras, the lighting of a scene is usually poor and uncontrollable, and the cameras are of low quality and usually distant from potentially important parts of the scene. Significant events are unpredictable. Often, a significant event is people entering a scene. People are typically identified by their faces. The orientation of the faces in the scene is usually not controlled. In other words, the images to be analyzed are substantially unconstrained.
Face detection has a long and rich history. Some techniques use neural network systems; see Rowley et al., “Neural network-based face detection,” IEEE Patt. Anal. Mach. Intell., Vol. 20, pp. 22–38, 1998. Others use Bayesian statistical models; see Schneiderman et al., “A statistical method for 3D object detection applied to faces and cars,” Computer Vision and Pattern Recognition, 2000. While neural network systems are fast and work well, Bayesian systems have better detection rates at the expense of longer processing time.
The uncontrolled orientation of faces in images poses a particularly difficult detection problem. In addition to Rowley et al. and Schneiderman et al., there are a number of techniques that can successfully detect frontal upright faces in a wide variety of images. Sung et al., in “Example-based learning for view based face detection,” IEEE Patt. Anal. Mach. Intell., Vol. 20, pp. 39–51, 1998, described an example-based learning technique for locating upright, frontal views of human faces in complex scenes. The technique models the distribution of human face patterns by means of a few view-based “face” and “non-face” prototype clusters. At each image location, a difference feature vector is computed between the local image pattern and the distribution-based model. A trained classifier determines, based on the difference feature vector, whether or not a human face exists at the current image location.
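The prototype-based scheme of Sung et al. can be illustrated with a minimal sketch. It is not the published implementation: the Euclidean distance, the function names, and the nearest-prototype decision rule are all assumptions chosen for brevity, and the decision rule merely stands in for the trained classifier described above.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def difference_feature_vector(pattern: Vector,
                              prototypes: Sequence[Vector]) -> List[float]:
    """Return one distance per prototype cluster center.

    Assumption: plain Euclidean distance is used here for illustration.
    """
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(pattern, proto)))
        for proto in prototypes
    ]


def is_face(pattern: Vector,
            face_prototypes: Sequence[Vector],
            nonface_prototypes: Sequence[Vector]) -> bool:
    """Hypothetical stand-in for the trained classifier.

    Declares a face when the pattern lies closer to some "face"
    prototype than to every "non-face" prototype.
    """
    features = difference_feature_vector(
        pattern, list(face_prototypes) + list(nonface_prototypes))
    n_face = len(face_prototypes)
    return min(features[:n_face]) < min(features[n_face:])
```

In a full detector, a function such as `is_face` would be evaluated on the local pattern at each image location.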
While the definition of “frontal” and “upright” may vary from system to system, the reality is that many images contain rotated, tilted or profile faces that are difficult to detect reliably.
Non-upright face detection was described in a paper by Rowley et al., “Rotation invariant neural network-based face detection,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 38–44, 1998. That neural-network-based classifier first estimated an angle of rotation of a front-facing face in an image. Only the angle of rotation in the image plane was considered, i.e., the amount of rotation about the z-axis. Then, the image was rotated to an upright position and classified. For further detail, see Baluja et al., U.S. Pat. No. 6,128,397, “Method for finding all frontal faces in arbitrarily complex visual scenes,” Oct. 3, 2000.
FIG. 1 shows the steps of the prior art face detector. A rotation of a front-facing face in an image 101 is estimated 110. The rotation 111 is used to rotate 120 the image 101 to an upright position. The rotated image 121 is then classified 130 as either a face or a non-face 131. That method detects only faces with in-plane rotation; it cannot detect faces having an arbitrary orientation in 3D.
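The estimate, rotate, and classify steps of FIG. 1 can be sketched as follows. This is an illustrative outline only: the rotation estimator and the upright classifier are passed in as hypothetical callables (in the prior art they are neural networks), and the nearest-neighbor rotation routine and all function names are assumptions, not part of the cited method.

```python
import math
from typing import Callable, List

Image = List[List[float]]


def rotate_in_plane(image: Image, angle_deg: float) -> Image:
    """Rotate an image about its center, counterclockwise as displayed.

    Uses nearest-neighbor inverse mapping; pixels with no source are 0.
    """
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Inverse mapping: locate the source pixel for this output pixel.
            sx = int(round(cos_t * dx - sin_t * dy + cx))
            sy = int(round(sin_t * dx + cos_t * dy + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def detect_with_derotation(image: Image,
                           estimate_rotation: Callable[[Image], float],
                           classify_upright: Callable[[Image], bool]) -> bool:
    """Prior-art style pipeline: estimate in-plane angle, derotate, classify."""
    angle = estimate_rotation(image)          # step 110, yields rotation 111
    upright = rotate_in_plane(image, -angle)  # step 120, yields image 121
    return classify_upright(upright)          # step 130, yields face/non-face 131
```

Because the only correction applied is a rotation about the z-axis, a pipeline of this shape has no means of compensating for out-of-plane orientations such as profile views.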
Therefore, there is a need for a system and method that can accurately detect arbitrarily oriented objects in images.