Object detection is a fundamental research topic in computer vision and has broad application prospects in areas such as face recognition, security monitoring, and dynamic tracking. Object detection means that, for any given image, a particular object (such as a face) in the image is detected and recognized, and the position and size of the object are returned, for example by outputting a bounding box surrounding the object. Object detection is a complex and challenging pattern detection problem, and its main difficulties lie in two aspects. One is caused by internal changes of the object, such as changes in detail and occlusion; the other results from changes in external conditions, such as the imaging angle, illumination, the focal length of the imaging device, the imaging distance, and different ways of acquiring the image.
Object detection methods based on deep convolutional neural networks (CNNs) are currently among the more advanced object detection methods. Existing CNN-based object detection methods generally include three steps: 1) extracting, from an image, several candidate regions that may contain an object to be detected by using a conventional region proposal method; 2) inputting the extracted candidate regions to the CNN for recognition and categorization; and 3) applying bounding box regression to refine the coarse candidate objects into more accurate object bounds. However, the detection results obtained by current CNN-based object detection methods still suffer from technical problems such as sensitivity to internal changes of the object, inaccurate object recognition, and low detection efficiency.
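The refinement in step 3) can be illustrated with a minimal sketch. The text does not specify how the regression is parameterized, so the sketch assumes the center/size offset form commonly used in R-CNN-style detectors: a predicted offset (tx, ty, tw, th) shifts the center of a candidate box in proportion to its size and rescales its width and height exponentially. The names `Box`, `refine_box`, and `to_corners` are hypothetical and chosen only for illustration.

```python
import math
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """Axis-aligned box given by its center (cx, cy), width w, and height h."""
    cx: float
    cy: float
    w: float
    h: float


def refine_box(proposal: Box, deltas: Tuple[float, float, float, float]) -> Box:
    """Apply regression offsets to a coarse candidate box.

    Assumed parameterization (not stated in the text):
      cx' = cx + w * tx,   cy' = cy + h * ty,
      w'  = w * exp(tw),   h'  = h * exp(th)
    """
    tx, ty, tw, th = deltas
    return Box(
        cx=proposal.cx + proposal.w * tx,
        cy=proposal.cy + proposal.h * ty,
        w=proposal.w * math.exp(tw),
        h=proposal.h * math.exp(th),
    )


def to_corners(box: Box) -> Tuple[float, float, float, float]:
    """Convert a center/size box to (x_min, y_min, x_max, y_max)."""
    half_w, half_h = box.w / 2.0, box.h / 2.0
    return (box.cx - half_w, box.cy - half_h, box.cx + half_w, box.cy + half_h)


if __name__ == "__main__":
    # A coarse candidate region and illustrative offsets; in a real detector
    # the offsets would be produced by the network, not written by hand.
    coarse = Box(cx=50.0, cy=50.0, w=20.0, h=40.0)
    refined = refine_box(coarse, (0.1, -0.05, 0.0, 0.0))
    print(to_corners(refined))
```

With all offsets equal to zero the candidate box is returned unchanged, so the regression only has to learn a small correction relative to each candidate region.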