The specification relates to image processing. In particular, the specification relates to inferring scenes from images.
Existing solutions for image analysis often rely on computationally expensive methods such as object recognition, pixel-level segmentation, or scanning a detection window over an image. During object recognition, these existing approaches often analyze the pixel data in the image to determine how the image should be segmented, which requires significant processing time and can thus introduce latency or lag that can annoy users. As a result, it is generally impractical to use these solutions on mobile computing devices to analyze, in real time, video streams being captured by those devices.
Many existing vehicular video systems provide little or no interpretation or analysis of the images they capture, such as images captured by current rear-view cameras. These systems may overlay road geometry on the images to highlight various aspects (e.g., a footprint of a vehicle when parking), but do so without interpreting the scene depicted by the images. Also, some systems may store images, or share them with other vehicles, in compressed or reduced form, but they generally do not provide analysis or interpretation of the images. Other specialized vehicular video systems may identify specific driving-relevant features in the image, such as lane boundaries or crosswalks, but they generally do not provide a holistic scene-level analysis to characterize the image.