The present disclosure relates to machine learning.
Traffic accidents kill over 1.2 million people a year worldwide, and more than 30,000 people die annually in the US alone, according to the World Health Organization's global status report on road safety and reports from the National Highway Traffic Safety Administration. Many of these accidents are caused by risky driving behaviors and could be prevented if the behaviors could be predicted, even just a few seconds in advance, so that drivers could be warned and/or compensation strategies generated. Generally, current state-of-the-art Advanced Driver Assistance System (ADAS) solutions are unable to provide high-precision driver behavior prediction in a cost-effective manner due to the limitations in their systems/models.
Some existing approaches attempt to predict driver behavior using only limited data related to driving. For instance, He L., Zong C., and Wang C., “Driving intention recognition and behavior prediction based on a double-layer hidden Markov model,” Journal of Zhejiang University-SCIENCE C (Computers & Electronics), Vol. 13 No. 3, 2012, 208-217, describes a double-layer Hidden Markov Model (HMM) that includes a lower-layer multi-dimensional Gaussian HMM performing activity recognition and an upper-layer multi-dimensional discrete HMM performing anticipation. However, this model only considers Controller Area Network (CAN) data such as braking, accelerating, and steering, and fails to account for important features that affect driving, such as road conditions and a driver's location familiarity and steering patterns.
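For illustration only, the anticipation step of an upper-layer discrete HMM can be sketched as a forward filter followed by a one-step prediction. This is a minimal sketch and not the cited authors' implementation: the two hidden states (keep lane, turn), the two activity symbols assumed to come from a lower recognition layer, and all probability values are hypothetical.

```python
def forward_filter(init, trans, emit, observations):
    """Return P(state_t | obs_1..t) at the last step of a discrete HMM."""
    n = len(init)
    # Initial belief weighted by the likelihood of the first observation.
    belief = [init[i] * emit[i][observations[0]] for i in range(n)]
    total = sum(belief)
    belief = [b / total for b in belief]
    for obs in observations[1:]:
        # Propagate the belief through the transition model.
        predicted = [sum(belief[i] * trans[i][j] for i in range(n))
                     for j in range(n)]
        # Reweight by the likelihood of the new observation, then normalize.
        belief = [predicted[j] * emit[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
    return belief


def predict_next(belief, trans):
    """Return P(state_{t+1} | obs_1..t), the one-step-ahead anticipation."""
    n = len(belief)
    return [sum(belief[i] * trans[i][j] for i in range(n)) for j in range(n)]


# Hypothetical model: state 0 = keep lane, state 1 = turn.
# Hypothetical symbols: 0 = steady driving, 1 = braking while steering.
init = [0.8, 0.2]
trans = [[0.9, 0.1], [0.3, 0.7]]
emit = [[0.9, 0.1], [0.2, 0.8]]

belief = forward_filter(init, trans, emit, [0, 1, 1])
next_state = predict_next(belief, trans)
```

With these assumed values, two consecutive braking-while-steering symbols shift the predicted next state toward a turn. The sketch sees only symbols derived from CAN-style signals, which is the limitation noted above.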
Some approaches require feature extraction before driver behavior recognition and prediction. For instance, Jain, A., Koppula S., Raghavan B., Soh S., and Saxena A., “Car that knows before you do: anticipating maneuvers via learning temporal driving models,” ICCV, 2015, 3182-3190, considers an elaborate multi-sensory domain for predicting a driver's activity using an Auto-regressive Input-Output HMM (AIO-HMM). In a first step, Jain describes extracting features from input signal data, such as high-level features from a driver-facing camera to detect a driver's head pose, object features from a road-facing camera to determine a road occupancy status, etc. However, Jain's approach requires a substantial amount of human involvement, which makes it impractical for dynamic systems and possibly dangerous. Further, the number of sensory inputs considered by Jain is not representative of typical human driving experiences, and the model is unable to consider important features affecting a driver's actions, such as steering patterns, local familiarity, etc.
Some approaches, such as Jain A., Koppula S., Raghavan B., Soh S., and Saxena A., “Recurrent neural networks for driver activity anticipation via sensory-fusion architecture,” arXiv:1509.05016v1 [cs.CV], 2015, describe using a generic model developed with data from a population of drivers. However, a model like Jain's is unable to adequately model and predict driver behavior and thus to reduce the risk of an accident occurring. In particular, Jain's model is based on a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and is trained using a backpropagation through time (BPTT) algorithm. Training this model can be computationally expensive, and the memory limitations of the BPTT algorithm can limit the maximum achievable horizon for driver behavior prediction. The model further suffers from a precision vs. recall tradeoff. Moreover, since the model only tries to minimize the anticipation error over the horizon, it offers reduced flexibility in design and embodiment choices.
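The precision vs. recall tradeoff mentioned above can be illustrated with a short sketch. The scores, labels, and thresholds below are hypothetical and are not taken from the cited work; they only show how moving the decision threshold of a single predictor trades one metric against the other.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when predicting positive at score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    # Conventions for empty denominators are an assumption of this sketch.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical predictor confidences that a maneuver will occur,
# with ground truth (1 = maneuver occurred, 0 = it did not).
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.20]
labels = [1, 1, 0, 1, 0, 0]

strict = precision_recall(scores, labels, 0.9)   # few warnings, all correct
lenient = precision_recall(scores, labels, 0.5)  # all maneuvers caught, one false alarm
```

In this example the strict threshold gives perfect precision but catches only one of the three maneuvers, while the lenient threshold catches all three at the cost of a false alarm.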