The ability of a driver of a car to look at a person (who is walking, driving another car, or riding a bike on or near a street) and predict what that person wants to do may be the single most important part of urban driving. For example, when a driver of a car sees people near the car, determining whether one person will cross the street, whether another person will remain standing on a street corner, and whether yet another person will change lanes on his or her bicycle is necessary to safely drive the car and avoid hitting the people. This ability is so fundamental that operating in cities without it would be nearly impossible.
Fortunately, human drivers have a natural ability to predict a person's behavior. In fact, they can do it so effortlessly that they often do not even notice that they are doing it. However, computers and autonomous driving vehicles cannot adequately predict the behavior of people, especially in urban environments.
For example, autonomous driving vehicles may rely on methods that make decisions on how to control the vehicles by predicting “motion vectors” of people near the vehicles. This is accomplished by collecting data on a person's current and past movements, determining a motion vector of the person at a current time based on these movements, and extrapolating a future motion vector representing the person's predicted motion at a future time based on the current motion vector. However, such methods do not predict a person's actions or movements based on any observations other than his or her current and past movements, which leads to inferior results in predicting the person's future behavior.
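The motion-vector approach described above can be illustrated with a minimal sketch. It assumes a constant-velocity model over timestamped 2D positions; the names `Observation`, `motion_vector`, and `extrapolate` are hypothetical and chosen only for illustration, and real systems may estimate and extrapolate motion differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One observed position of a person (hypothetical data layout)."""
    t: float  # time in seconds
    x: float  # position in meters
    y: float  # position in meters


def motion_vector(track):
    """Estimate the current motion vector as the average velocity
    between the oldest and the newest observation in the track."""
    if len(track) < 2:
        raise ValueError("need at least two observations")
    first, last = track[0], track[-1]
    dt = last.t - first.t
    if dt <= 0:
        raise ValueError("timestamps must be strictly increasing")
    return ((last.x - first.x) / dt, (last.y - first.y) / dt)


def extrapolate(track, horizon):
    """Predict the position `horizon` seconds after the newest
    observation, assuming the motion vector stays constant."""
    vx, vy = motion_vector(track)
    last = track[-1]
    return (last.x + vx * horizon, last.y + vy * horizon)


# A person walking along the x axis at 1 m/s for one second.
track = [
    Observation(0.0, 0.0, 0.0),
    Observation(0.5, 0.5, 0.0),
    Observation(1.0, 1.0, 0.0),
]
predicted = extrapolate(track, 2.0)  # predicted position 2 s later
```

Under this sketch, a person standing still at a curb has a zero motion vector, so the extrapolation predicts that the person stays in place regardless of any other cue that he or she is about to cross. This is the kind of limitation noted above.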