The present application relates to road scene understanding.
One of the central goals of 3D scene understanding is to localize the 3D positions and orientations of objects in complex scenes. For instance, using stereo imagery, several visual cues have been combined to simultaneously determine object locations and a rough intersection topology. 3D localization in road scenes from monocular video is an important problem for applications in autonomous driving, and conventional systems have also considered monocular frameworks. Notably, occlusions are handled in such systems by considering partial object detectors. A detailed part-based representation of objects based on annotated CAD models has also been used for monocular scene understanding, which allows reasoning about mutual occlusions between objects.