In a video camera, an image sensor acquires the image formed by the imaging optics of the camera. The image sensor is typically a matrix of pixels sensitive to radiation, usually in the form of light.
The raw image as read from the image sensor is usually not fit for direct display, for several reasons, so it is subjected to substantial processing before it is forwarded for display. The general purpose of the video camera is to acquire an image and to prepare the image for viewing. A video camera as referred to herein is mostly used for monitoring operations, such as surveillance. In such a camera the image leaves the camera as one frame in a video stream, and the camera therefore includes an encoder that prepares and forwards the video stream.
The processing steps may include operations performed on the image as such, e.g. demosaicing, balancing intensities, balancing colors and correcting for image distortions. Furthermore, the image may be resized, rotated and finally processed in the encoder. The mentioned steps are examples only, and are not given in any particular order.
When processing the image, use may be made of metadata, e.g. data deduced from the raw image. To give a few relevant examples, the metadata may concern:
The signal to noise ratio (SNR) for various portions of the image. SNR data may be used to configure or change filters inside the camera, such as noise filtering, and it may also be used to trigger external lights for improvement of light conditions.
Identification of regions where motion has been detected. Such regions are typically identified if the video camera is used for monitoring or surveillance purposes, where a change in the image typically indicates an event of interest.
Identified or preset regions of interest (ROI) that are particularly relevant for the image processing (or identified by a user as being particularly interesting), such as a face, a particular shape etc.
A final example of this type of metadata relates to a compression map for the image. A compression map may be a table provided to an encoder to change its compression parameters spatially, and it could relate to a compression level, a table with constants and thresholds, or constants for block type decisions. By comparing the image to previous images, a map indicating how the image may be encoded according to a particular protocol may be generated.
The above examples of metadata may be extracted from the raw image as it has been read from the image sensor, and are usable for downstream processes.
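As an illustration only, the image-derived metadata listed above could be computed per block roughly as in the following Python sketch. The function names, the block size, the threshold and the two compression levels are hypothetical choices made for this example and are not taken from the disclosure. The SNR estimate (mean divided by standard deviation) and the frame-difference motion test are deliberately crude stand-ins for whatever estimators an actual camera would use.

```python
from statistics import mean, pstdev


def split_blocks(frame, block):
    """Yield (block_row, block_col, pixels) for each block x block tile of a 2D list."""
    rows, cols = len(frame), len(frame[0])
    for by in range(0, rows, block):
        for bx in range(0, cols, block):
            pixels = [frame[y][x]
                      for y in range(by, min(by + block, rows))
                      for x in range(bx, min(bx + block, cols))]
            yield by // block, bx // block, pixels


def snr_map(frame, block=2):
    """Crude per-block SNR estimate: mean intensity divided by standard deviation."""
    out = {}
    for r, c, px in split_blocks(frame, block):
        sigma = pstdev(px)
        out[(r, c)] = float("inf") if sigma == 0 else mean(px) / sigma
    return out


def motion_map(prev, curr, block=2, threshold=10.0):
    """Flag blocks whose mean absolute difference to the previous frame exceeds threshold."""
    prev_blocks = {(r, c): px for r, c, px in split_blocks(prev, block)}
    out = {}
    for r, c, px in split_blocks(curr, block):
        diff = mean(abs(a - b) for a, b in zip(px, prev_blocks[(r, c)]))
        out[(r, c)] = diff > threshold
    return out


def compression_map(motion, moving_level=24, static_level=36):
    """Assign a (hypothetical) lower compression level to blocks flagged as moving."""
    return {key: (moving_level if moved else static_level)
            for key, moved in motion.items()}


# Usage: a 4x4 frame in which only the top-left 2x2 block changes.
previous = [[0] * 4 for _ in range(4)]
current = [row[:] for row in previous]
for y in range(2):
    for x in range(2):
        current[y][x] = 100

motion = motion_map(previous, current)
levels = compression_map(motion)
```

In this sketch the resulting tables are keyed by block position, so a downstream stage such as an encoder could look up a value for each block without re-examining the pixels.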
Metadata does not have to comprise information extracted from the image to be considered as metadata in the context of the present disclosure. An example of this type of metadata may be related to various masks used for correction or adjustment of the image at a later stage in the image processing. Another example may relate to a region of interest preset by a user. The metadata may also relate to user-defined regions of interest, privacy masks, priority regions (a map of where image quality should be prioritized if needed), and information relating to settings of the camera or input from sensors, such as zoom level, shutter speed and tilt sensors.
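A minimal sketch of how such non-image metadata might be carried alongside a frame, and how a privacy mask might be applied at a later stage, is given below in Python. The class names, field names and default values are assumptions made for illustration, and painting the masked area with a fill value is only one conceivable way of applying a mask.

```python
from dataclasses import dataclass, field


@dataclass
class Rect:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class FrameMetadata:
    """Metadata that is not extracted from the image itself (illustrative fields)."""
    zoom_level: float = 1.0
    shutter_speed_s: float = 1 / 60
    tilt_deg: float = 0.0
    privacy_masks: list = field(default_factory=list)
    priority_regions: list = field(default_factory=list)


def apply_privacy_masks(frame, meta, fill=0):
    """Return a copy of the frame with every privacy-mask rectangle painted over."""
    rows, cols = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for m in meta.privacy_masks:
        # Clip each mask to the frame so partly off-frame masks are handled.
        for y in range(max(m.y, 0), min(m.y + m.height, rows)):
            for x in range(max(m.x, 0), min(m.x + m.width, cols)):
                out[y][x] = fill
    return out


# Usage: mask the lower-right part of a 3x3 frame.
meta = FrameMetadata(privacy_masks=[Rect(x=1, y=1, width=5, height=5)])
masked = apply_privacy_masks([[9, 9, 9], [9, 9, 9], [9, 9, 9]], meta)
```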
The imaging optics of a video camera will most often introduce a certain degree of distortion to the image. Some common examples are barrel distortion and pincushion distortion. Other types of distortion may include chromatic aberration, monochromatic aberration, and related subgroups.
These distortions imply that the image as read from the image sensor is not fit for display in its present form; a transformation may be needed before the image is displayed.
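By way of example, such a transformation can be sketched with a one-parameter radial model, in which a point at normalized radius r from the image centre is moved to radius r(1 + k1 r^2). This model, the coefficient name k1, the normalization and the nearest-neighbour sampling are assumptions chosen for brevity and are not the method of the disclosure. Under this particular convention a negative k1 pulls points toward the centre, as in barrel distortion, and a positive k1 pushes them outward, as in pincushion distortion; other sign conventions exist.

```python
def distort_point(xu, yu, k1):
    """Map an undistorted, centre-relative point to its distorted position."""
    r2 = xu * xu + yu * yu
    scale = 1.0 + k1 * r2
    return xu * scale, yu * scale


def undistort_image(img, k1, fill=0):
    """Correct radial distortion by inverse mapping.

    For every pixel of the corrected output, the forward model gives the
    position in the distorted input from which the value is sampled
    (nearest neighbour). Positions outside the input receive `fill`.
    """
    rows, cols = len(img), len(img[0])
    cx, cy = (cols - 1) / 2.0, (rows - 1) / 2.0
    norm = max(cx, cy) or 1.0  # normalize so the longer half-axis has length 1
    out = [[fill] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            xu, yu = (x - cx) / norm, (y - cy) / norm
            xd, yd = distort_point(xu, yu, k1)
            sx, sy = round(xd * norm + cx), round(yd * norm + cy)
            if 0 <= sx < cols and 0 <= sy < rows:
                out[y][x] = img[sy][sx]
    return out


# Usage: with k1 = 0 the model is the identity and the image is unchanged.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
corrected = undistort_image(image, k1=0.0)
```

Inverse mapping is used here so that every output pixel receives exactly one value; mapping input pixels forward instead could leave holes in the corrected image.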
The image processing described may be performed in an encoder, a transcoder or a video management system, to mention a few alternatives to a video camera, i.e. the processing may be the same or similar irrespective of whether the image is read from an image sensor or provided from a file.