Digital imaging systems acquire data for each pixel (picture element) in an array of pixels, typically a two-dimensional array, by converting the intensity of light impinging on the pixel into an electrical voltage and storing the digitized value in binary form (e.g., 8 or 16 bits per pixel). The image data (or simply the image) can be in grayscale or in color (RGB, for example).
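As an illustration of the per-pixel digitization described above, the following is a minimal sketch in Python. It assumes that the light intensity has already been normalized to the range 0.0 to 1.0; the function name `quantize` and the sample values are hypothetical and serve only to show how a bit depth of 8 or 16 bits determines the stored value.

```python
def quantize(intensity, bits=8):
    """Map a normalized intensity in [0.0, 1.0] to an integer code.

    Assumption: the sensor voltage has already been scaled to [0.0, 1.0].
    """
    max_code = (1 << bits) - 1                # 255 for 8 bits, 65535 for 16 bits
    clamped = min(max(intensity, 0.0), 1.0)   # guard against out-of-range input
    return int(clamped * max_code + 0.5)      # round to the nearest code


# Hypothetical 2 x 3 array of normalized intensities (rows of pixels).
intensities = [
    [0.0, 0.5, 1.0],
    [1.0, 0.5, 0.0],
]

# Grayscale images at two bit depths: one integer per pixel.
image_8bit = [[quantize(v, 8) for v in row] for row in intensities]
image_16bit = [[quantize(v, 16) for v in row] for row in intensities]

# A color (RGB) pixel stores one code per channel.
rgb_pixel = tuple(quantize(v, 8) for v in (1.0, 0.5, 0.0))
```

In this sketch a grayscale image is a list of rows of integers, while a color pixel is a triple of such integers, one for each of the red, green and blue channels.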
Machine or computer vision algorithms operating on image data from sensors (e.g., digital cameras) are key to a wide variety of innovations, such as autonomous driving, automated manufacturing, robotics, surveillance, and interactive devices using augmented or virtual reality.
For typical consumers, computer vision algorithms and display screens use image data to produce crisp still images and movies. For these purposes, although the data can be compressed using formats such as JPEG, GIF, MPEG or MP4, the primary goal of such compression is to reduce the data volume with minimal sacrifice of image quality. For high-fidelity images, lossless compression is preferable to lossy compression. Similarly, any gain in the efficiency of vision algorithms operating on the data must not come at the expense of image quality in consumer applications.
In contrast, for industrial machine vision applications (machine vision and computer vision are synonymous in this document), image quality is less important than for consumer applications. In many industrial applications, the first processing steps after image pre-processing focus on reducing the redundancy in the data. Even though these redundancy suppression algorithms amount to a form of data compression, they differ from compression algorithms for consumer applications, where the aim is to sacrifice as little image quality as possible. The goal of the redundancy suppression steps in computer vision is to reduce the volume of data to what is actually needed to perform a given task. Such redundancy suppression steps include the computation of color or orientation histograms and feature detection and description steps such as edge, corner or blob detection.
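To make the idea of redundancy suppression concrete, the following Python sketch reduces a small grayscale image to an intensity histogram and to a list of edge positions. It is an illustrative simplification only: the function names, the bin count, the threshold and the sample image are assumptions, and the edge detector is a plain horizontal neighbor difference rather than any particular detector used in practice.

```python
def gray_histogram(image, bins=4, max_value=255):
    """Count pixels per intensity bin (a coarse grayscale histogram)."""
    hist = [0] * bins
    for row in image:
        for value in row:
            index = min(value * bins // (max_value + 1), bins - 1)
            hist[index] += 1
    return hist


def horizontal_edges(image, threshold=50):
    """Return (row, col) positions where the intensity jump to the
    right-hand neighbor exceeds the threshold."""
    edges = []
    for r, row in enumerate(image):
        for c in range(len(row) - 1):
            if abs(row[c + 1] - row[c]) > threshold:
                edges.append((r, c))
    return edges


# Hypothetical 3 x 4 image: dark left half, bright right half.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

hist = gray_histogram(image)     # [6, 0, 0, 6]
edges = horizontal_edges(image)  # [(0, 1), (1, 1), (2, 1)]
```

Here twelve pixel values are reduced to four bin counts or to three edge positions. The original image cannot be reconstructed from either result, but each retains information sufficient for a task such as brightness classification or boundary localization.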
Holding a complete digital image frame in memory and processing the digital data pixel by pixel is usually a performance bottleneck for computer vision, which is why using parallel architectures such as FPGAs, GPUs or dedicated visual processors (e.g., Movidius Myriad 2 or Microsoft HPU) often results in significant performance improvements.