The present invention relates to a method of detecting a pedestrian, and to a recording medium and a terminal for performing the method. More specifically, it relates to a method of detecting a pedestrian through feature information prediction based on an integral image, applicable to pedestrian detection that is performed in advance to avoid collisions with the pedestrian, and to a recording medium and a terminal for performing the method.
According to a report by the National Police Agency on the occurrence of traffic accidents (injuries and deaths) over the past decade, 200,000 or more traffic accidents occur every year, in which 300,000 or more people are injured and 5,000 or more people die. Pedestrians account for as much as 40% of all deaths from traffic accidents. While safety devices for drivers and passengers, such as air bags and seat belts, are mandatory and strictly enforced, there is no comparable safety device for pedestrians. Therefore, interest in pedestrian safety is increasing. Recently, laws mandating the mounting of a vehicle pedestrian safety system have been enacted in places such as Europe and the USA.
In Europe, regulations defining the maximum impact for individual body parts were enacted in 2003 in order to minimize the damage resulting from the shock when a vehicle collides with a pedestrian. In 2008, regulations defining active safety requirements for detecting a pedestrian near a vehicle and warning the driver were enacted. In addition, it was announced that an automatic braking system would be added as a vehicle evaluation item in 2014.
Such a pedestrian protection system is broadly divided into pedestrian detection and collision avoidance control. Pedestrian detecting technology, which serves as the basis for collision avoidance, is the key technology. Forward obstacle detecting technology using radar has excellent accuracy, but its search takes a long time, which makes it difficult to respond to traffic accident events. Accordingly, research on the use of advanced computer vision technology and relatively inexpensive camera systems is underway.
In particular, when a single channel feature is used to classify the pedestrian, the operation time is short since the feature extracting operation does not consume much time. However, in order to build a reference that can classify the pedestrian more clearly, a method in which image features of various channels are extracted and used in combination to detect an object has been proposed. To detect pedestrians of various appearances, partial regions of an input image are compared with a model, and a feature vector formed of a combination of the features of each partial region is generated, which requires repetitive computation. In order to decrease the time consumed by this computation, a method of using an integral image of the channel features is widely used. However, there is a problem in that the features must be redetected whenever the size of the input image is adjusted.
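As an illustration of why an integral image reduces the repetitive computation described above, the following is a minimal sketch in Python. It is not the method of the invention; the function names integral_image and box_sum and the sample channel values are assumptions chosen for illustration. Once the integral image of a channel has been built, the sum of the channel over any rectangular partial region is obtained with four lookups, regardless of the size of the region.

```python
def integral_image(channel):
    """Build a summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of channel[0:y][0:x], so ii has shape
    (height + 1) x (width + 1).
    """
    height = len(channel)
    width = len(channel[0]) if height else 0
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0  # running sum of the current row
        for x in range(width):
            row_sum += channel[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of channel rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 3x3 channel (for example, gradient magnitudes).
channel = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(channel)
window_sum = box_sum(ii, 1, 1, 3, 3)  # lower-right 2x2 region
full_sum = box_sum(ii, 0, 0, 3, 3)    # whole channel
```

The table is valid only for the image size at which it was built, which is why resizing the input image forces the channel features and their integral images to be computed again.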
In order to address the above-described problem, Dollár et al. (The fastest pedestrian detector in the west, P. Dollár, S. Belongie, and P. Perona, In the British Machine Vision Conference, 31 Aug.-3 Sep., 2010) used a channel feature that they had proposed themselves and changed both the pedestrian model and the input image, thereby reducing the number of operations. Instead of redetecting features in each image resized to various sizes, they predicted the features using only a scale value, based on the features of the input image, and thus simplified the feature extracting operation.
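The idea of predicting a feature from a scale value alone can be sketched as follows. This is a hedged illustration, not the cited authors' implementation: it assumes that the ratio between a channel feature at scale s and the same feature at the original scale follows a power law of the form s to the power of minus lambda, and the function names predict_feature and fit_lambda, the exponent value, and the sample numbers are all hypothetical.

```python
import math


def predict_feature(feature_at_base, scale, lam):
    """Predict a channel feature at `scale` from its value at scale 1.

    Assumes a power-law relation: f(scale) ~ f(1) * scale ** (-lam).
    `lam` is a per-channel exponent that would be estimated from data.
    """
    return feature_at_base * scale ** (-lam)


def fit_lambda(scales, ratios):
    """Least-squares fit of lam in log(ratio) = -lam * log(scale).

    `ratios` are measured values of f(scale) / f(1) for each scale.
    """
    numerator = sum(math.log(r) * math.log(s) for s, r in zip(scales, ratios))
    denominator = sum(math.log(s) ** 2 for s in scales)
    return -numerator / denominator


# Hypothetical measurements: the feature doubles when the image is halved.
lam = fit_lambda([0.5, 2.0], [2.0, 0.5])
predicted = predict_feature(100.0, 0.5, lam)
```

Under this assumption, only the features of the input image at its original size need to be extracted; values at other scales are obtained by multiplication.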
Meanwhile, as another method of decreasing the operation time, Benenson et al. (Pedestrian detection at 100 frames per second, R. Benenson, M. Mathias, R. Timofte, and L. Van Gool, In the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2903-2910, 16-21 Jun., 2012) proposed a method in which the search region itself is reduced based on distance information obtained through stereo vision.
However, according to the above methods, various image features are used in combination in order to improve the performance of pedestrian detection, and a plurality of repetitive search operations are necessary to detect pedestrians of various sizes. In this case, repetitive operations such as enlargement/reduction of the image, the resulting feature extraction, and feature vector generation are performed many times. Therefore, in the related art, there is a problem in that the operation speed is low and separate hardware such as a stereo vision system and a GPU needs to be used in order to decrease the operation time. As a result, the cost of building the system increases significantly.
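The amount of repetition involved in such a multi-scale search can be estimated with a short sketch. The function count_windows and all parameter values below are assumptions chosen for illustration, not figures from the related art. The sketch counts how many times the image is resized and how many partial regions must be compared with the model when a fixed-size window slides over every resized image.

```python
def count_windows(img_w, img_h, win_w, win_h, stride, scale_step, min_scale):
    """Count pyramid levels and sliding-window positions over all levels.

    The image is repeatedly shrunk by `scale_step` (a factor below 1)
    until it is smaller than the window or the scale drops below
    `min_scale`. Each level requires its own feature extraction.
    """
    levels = 0
    total_windows = 0
    scale = 1.0
    while scale >= min_scale:
        w = int(img_w * scale)
        h = int(img_h * scale)
        if w < win_w or h < win_h:
            break  # the window no longer fits in the resized image
        windows_x = (w - win_w) // stride + 1
        windows_y = (h - win_h) // stride + 1
        total_windows += windows_x * windows_y
        levels += 1
        scale *= scale_step
    return levels, total_windows


# Hypothetical setting: 128x256 image, 64x128 window, stride of 8 pixels.
levels, total_windows = count_windows(128, 256, 64, 128, 8, 0.5, 0.25)
```

Each counted level stands for one image resizing and one feature extraction pass, and each counted window stands for one feature vector generation, so the cost grows quickly with finer scale steps and larger images.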