Machine learning technology is continually evolving and has come to support many aspects of modern society, from web searches, content filtering, automated recommendations on merchant websites, automated game playing, to object detection, image classification, speech recognition, machine translations, and drug discovery and genomics. The current state of the art in the field of machine learning are deep neural networks, which use computational models composed of multiple processing layers which learn representations of data (usually, extremely large amounts of data) with multiple levels of abstraction—hence, the terminology “deep learning”, “deep networks,” etc. See, e.g., LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature, vol. 521, pp. 436-444 (28 May 2015), which is hereby incorporated herein by reference in its entirety.
Deep learning approaches have shown excellent performance for general object detection. However, the detection of certain objects and/or certain situations have been more difficult, even using deep learning. Pedestrian detection, which has many real-world applications, such as autonomous driving and advanced driving assistance systems, is one area where detection via deep learning has had somewhat limited results.