In modern communications systems a video signal may be sent from one terminal to another over a medium such as a wired and/or wireless network, often a packet-based network such as the Internet. Typically the frames of the video are encoded by an encoder at the transmitting terminal in order to compress them for transmission over the network. The encoding for a given frame may comprise intra frame encoding whereby blocks are encoded relative to other blocks in the same frame. In this case a target block is encoded in terms of a difference (the residual) between that block and a neighbouring block. Alternatively the encoding for some frames may comprise inter frame encoding whereby blocks in the target frame are encoded relative to corresponding portions in a preceding frame, typically based on motion prediction. In this case a target block is encoded in terms of a motion vector identifying an offset between the block and the corresponding portion from which it is to be predicted, and a difference (the residual) between the block and the corresponding portion from which it is predicted. A corresponding decoder at the receiver decodes the frames of the received video signal based on the appropriate type of prediction, in order to decompress them for output to a screen. A generic term that may be used to refer to an encoder and/or decoder is a codec.
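As an illustration of the inter frame case described above, the following is a minimal sketch of encoding one block as a motion vector plus a residual, and of the matching decoder step. It assumes frames are plain lists of rows of integer samples, and it uses an exhaustive search with a sum-of-absolute-differences cost. The function names, the search strategy and the cost measure are hypothetical choices made for this example and are not taken from any particular codec.

```python
def get_block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def encode_block_inter(ref, cur, top, left, size, search):
    """Encode one block of `cur` relative to the preceding frame `ref`.

    Returns (motion_vector, residual), where motion_vector is the (dy, dx)
    offset to the best-matching portion of `ref` within +/- `search`
    samples, and residual is the sample-wise difference from that portion.
    """
    target = get_block(cur, top, left, size)
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip candidates that fall partly outside the reference frame.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            cost = sad(target, get_block(ref, t, l, size))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    dy, dx = best_mv
    predicted = get_block(ref, top + dy, left + dx, size)
    residual = [[c - p for c, p in zip(row_c, row_p)]
                for row_c, row_p in zip(target, predicted)]
    return best_mv, residual


def decode_block_inter(ref, top, left, size, motion_vector, residual):
    """Rebuild the block from the reference frame, motion vector and residual."""
    dy, dx = motion_vector
    predicted = get_block(ref, top + dy, left + dx, size)
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(predicted, residual)]
```

When the content of the target block has simply moved between frames, the residual is close to zero and so costs few bits to encode, which is where the compression comes from in this sketch.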
Prior to prediction coding the samples of each block are typically quantized in order to reduce the bitrate incurred in encoding the block. Quantization refers to the process of taking samples represented on a relatively large scale or from amongst values of a relatively large set, and converting them to samples represented on a relatively small scale or from amongst a relatively small set (which may be referred to as the quantization levels). For instance quantization may refer to the process of converting an effectively continuous variable (e.g. a digital approximation of a continuous variable) into a variable constrained to a set of substantially discrete levels. The granularity of the quantization refers to the size of the spacing between the possible quantized values of the scale or set from which samples to be represented are constrained to being selected, i.e. the size of the steps between quantization levels. This may also be described as the coarseness or fineness of the quantization. Depending on the granularity, the quantization introduces some distortion into the representation of a video image but also reduces the number of bits required to represent the image.
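The trade-off between granularity and distortion can be illustrated with a minimal sketch of uniform scalar quantization. It assumes integer samples and an integer step size, with values rounded to the nearest level (halves rounded up). The function names and the choice of a uniform quantizer are assumptions made for this example; practical codecs may use other quantizer designs.

```python
def quantize(samples, step):
    """Map each integer sample to the index of its nearest quantization level.

    Levels are spaced `step` apart; the integer arithmetic computes
    floor(sample / step + 1/2), i.e. round to nearest with halves rounded up.
    """
    return [(2 * s + step) // (2 * step) for s in samples]


def dequantize(levels, step):
    """Reconstruct approximate sample values from quantization level indices."""
    return [q * step for q in levels]


def max_error(samples, step):
    """Largest absolute reconstruction error for the given step size."""
    reconstructed = dequantize(quantize(samples, step), step)
    return max(abs(s - r) for s, r in zip(samples, reconstructed))


samples = [0, 3, 7, 12, 100, 255]
coarse = quantize(samples, 8)  # fewer distinct levels, larger error
fine = quantize(samples, 2)    # more distinct levels, smaller error
```

In this sketch the reconstruction error is never more than half the step size, so a larger step (coarser quantization) permits more distortion while leaving fewer distinct level indices to encode.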
Some video codecs, such as those designed according to the H.264 standard, allow the quantization granularity to be set as a parameter of the encoding (and signalled to the decoder in the form of side information transmitted along with the encoded bitstream). It is also possible to define a region-of-interest (ROI) within the area of the video frames, and to set a difference in quantization parameter between the inside and the outside of the ROI, defined by a fixed quantization parameter offset. A codec designer can potentially use the ROI to cover any region of the video where it is desired to spend more bits on better quality. One possible use is to cover the face or facial features. In this way, for example, more of the potentially limited bandwidth available for transmitting the video over a network can be spent on providing quality in the ROI, while relatively few bits need be spent encoding the background and/or regions of lesser significance.
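The fixed-offset ROI scheme described above might be sketched as a per-block map of quantization parameter (QP) values, where a higher QP means coarser quantization. The sketch assumes a rectangular ROI expressed in block units and a positive offset applied outside it. The function name, the rectangle representation and the default clamping range (the 0 to 51 range used for 8-bit H.264 video) are assumptions made for this example and do not represent any real encoder's API.

```python
def roi_qp_map(block_rows, block_cols, roi, base_qp, qp_offset,
               qp_min=0, qp_max=51):
    """Build a per-block QP map with a fixed offset outside the ROI.

    roi is (top, left, bottom, right) in block units, with bottom and
    right exclusive. Blocks inside the ROI use base_qp; blocks outside
    use base_qp + qp_offset, clamped to [qp_min, qp_max].
    """
    top, left, bottom, right = roi
    outside_qp = max(qp_min, min(qp_max, base_qp + qp_offset))
    inside_qp = max(qp_min, min(qp_max, base_qp))
    qp_map = []
    for r in range(block_rows):
        row = []
        for c in range(block_cols):
            inside = top <= r < bottom and left <= c < right
            row.append(inside_qp if inside else outside_qp)
        qp_map.append(row)
    return qp_map


# Hypothetical 4 x 4 grid of blocks with a 2 x 2 ROI in the centre.
qp_map = roi_qp_map(4, 4, roi=(1, 1, 3, 3), base_qp=26, qp_offset=8)
```

With these example values the blocks covering the ROI keep the finer base QP, while the background blocks receive the coarser offset QP and therefore consume fewer bits.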