Many techniques are known in the art for compressing and decompressing multidimensional signals or signals that evolve over time. Examples include audio signals, video signals and other multidimensional signals such as the volumetric signals used in scientific and medical applications. In order to achieve high compression ratios, those techniques exploit the spatial and time correlation inside the signal. Conventional methods identify a reference and try to determine the difference of the signal between a current location and the given reference. This is done both in the spatial domain, where the reference is a portion (e.g., a block, or “macro-block”) of an already received and decoded spatial plane, and in the time domain, where a single instance in time of the signal (e.g., a video frame in a sequence of frames) is taken as a reference for a certain duration. This is the case, for example, of MPEG-family compression algorithms, where previously decoded macro-blocks are taken as reference in the spatial domain and I-frames and P-frames are used as reference for subsequent P-frames in the time domain.
Known techniques exploit spatial correlation and time correlation in many ways, adopting several different approaches in order to identify, simplify, encode and transmit differences. In conventional methods, in order to leverage the spatial correlation of residuals within a block, a domain transformation is performed (for example into a frequency domain), and the transformed information is then subjected to lossy deletion and quantization, typically introducing some degree of block artifacts. In the time domain, instead, conventional methods transmit the quantized difference between the current sample and a motion-compensated reference sample. In order to maximize the similarity between samples, encoders try to estimate the modifications that occurred over time relative to the reference signal. In conventional encoding methods (e.g., MPEG family technologies, VP8, VP9, etc.), this is called motion estimation and compensation.
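By way of non-limiting illustration, the time-domain approach described above can be sketched as an exhaustive block-matching search followed by quantization of the motion-compensated difference. The function names, the sum-of-absolute-differences cost, the search range and the uniform quantization step are all assumptions made for this sketch; they are not taken from any particular standard, and real encoders use considerably more elaborate searches and transforms.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def estimate_motion(cur, ref, bx, by, n, search):
    """Exhaustive block matching: return the displacement (dx, dy) within
    +/- `search` pixels that minimises the SAD cost."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference.
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def quantized_residual(cur, ref, bx, by, mv, n, step):
    """Difference between the current block and the motion-compensated
    reference block, divided by `step` and rounded to an integer."""
    dx, dy = mv
    return [
        [round((cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx]) / step)
         for x in range(n)]
        for y in range(n)
    ]


# Hypothetical 8 x 8 reference frame and a current frame whose content
# has moved one pixel to the right.
ref_frame = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur_frame = [[row[max(x - 1, 0)] for x in range(8)] for row in ref_frame]

motion_vector = estimate_motion(cur_frame, ref_frame, bx=2, by=2, n=4, search=1)
residual = quantized_residual(cur_frame, ref_frame, 2, 2, motion_vector, 4, step=4)
```

In this toy case the search finds the block one pixel to the left in the reference, so the residual to be transmitted is all zeros, which is the situation motion compensation is meant to approach.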
Encoding methods in the known art, aside from a few attempts, typically neglect the quality scalability requirement. A scalable encoding method would encode a single version of the compressed signal and enable its delivery at different levels of quality, for different bandwidth availabilities and decoder complexities. Scalability has been taken into consideration in known methods like MPEG-SVC and JPEG2000, with relatively poor adoption so far due to computational complexity and, generally speaking, their similarity with non-scalable techniques.
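By way of non-limiting illustration, the notion of a single encoded signal that can be decoded at more than one level of quality can be sketched with a two-layer scheme on a one-dimensional signal. This is a generic sketch and not the method of MPEG-SVC or JPEG2000; the function names, the pair-averaging downsampler, the sample-repetition upsampler and the even-length input are assumptions chosen only to keep the example short.

```python
def downsample(signal):
    """Halve the resolution by taking the integer mean of each sample pair.
    Assumes an even number of samples."""
    return [(signal[i] + signal[i + 1]) // 2 for i in range(0, len(signal) - 1, 2)]


def upsample(signal):
    """Double the resolution by repeating every sample."""
    out = []
    for sample in signal:
        out.extend([sample, sample])
    return out


def encode_layers(signal):
    """Produce a low-resolution base layer plus an enhancement layer that
    holds the difference between the signal and the upsampled base."""
    base = downsample(signal)
    prediction = upsample(base)
    enhancement = [s - p for s, p in zip(signal, prediction)]
    return base, enhancement


def decode(base, enhancement=None):
    """A decoder that receives only the base layer reconstructs a lower
    quality rendition; one that also receives the enhancement layer
    reconstructs the full-quality signal."""
    prediction = upsample(base)
    if enhancement is None:
        return prediction
    return [p + e for p, e in zip(prediction, enhancement)]


signal = [10, 12, 20, 24, 30, 30, 7, 9]
base_layer, enhancement_layer = encode_layers(signal)
low_quality = decode(base_layer)
full_quality = decode(base_layer, enhancement_layer)
```

The same pair of layers serves both decoders: a constrained decoder discards the enhancement layer, while a more capable one combines the two. The sketch omits quantization, so the full-quality reconstruction here is exact.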
Since MPEG-based technologies (e.g., MPEG2, MPEG4, H.264, H.265) are international standards, several dedicated hardware chips were developed in order to perform signal decoding with dedicated hardware blocks. It is thus difficult for different encoding technologies to gain adoption, due to the lack of a decoding device ecosystem.
In other cases of video transmission, such as cable transmission to display devices via methods such as HDMI or DisplayPort, the transmission of video content to decoding/display devices is constrained by the capacity of the transmission cable. This makes it impossible to transmit video content above a given level of quality (either resolution or frame rate). Since the amount of data to transmit keeps growing over time (due to the continuous increase of resolutions and frame rates supported by commercial display devices), the constraints posed by connection cables are becoming relevant issues, often forcing decoding/display devices to perform various kinds of interpolation (e.g., frame rate interpolation from 60 Hz to 240 Hz) to make up for the fact that the transmission cable lacks the capacity to carry the levels of quality that the devices would be able to display.
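By way of non-limiting illustration, the growth in data volume can be estimated with simple arithmetic: the uncompressed rate of the active picture is the product of width, height, frame rate and bits per pixel. The resolution, the 24 bits per pixel and the link capacity below are hypothetical values chosen for the sketch and do not represent any particular HDMI or DisplayPort version; blanking intervals and link-level coding overhead are ignored.

```python
def raw_bit_rate(width, height, frames_per_second, bits_per_pixel):
    """Uncompressed active-picture data rate in bits per second."""
    return width * height * frames_per_second * bits_per_pixel


def fits_link(required_bps, link_capacity_bps):
    """True if the required rate does not exceed the link capacity."""
    return required_bps <= link_capacity_bps


# Hypothetical link capacity, for illustration only.
LINK_CAPACITY_BPS = 20_000_000_000

rate_60 = raw_bit_rate(3840, 2160, 60, 24)    # 11,943,936,000 bit/s
rate_240 = raw_bit_rate(3840, 2160, 240, 24)  # 47,775,744,000 bit/s

fits_at_60 = fits_link(rate_60, LINK_CAPACITY_BPS)    # True
fits_at_240 = fits_link(rate_240, LINK_CAPACITY_BPS)  # False
```

Under these assumptions the 60 Hz signal fits the link while the 240 Hz signal requires four times the rate and does not, which is the kind of situation in which a display device resorts to interpolation.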
In other cases of video transmission, such as video conferencing, a large installed base of decoder devices is only able to decode legacy SD and/or HD video content, while newer and more powerful telepresence systems can decode video content at much higher resolutions and quality. Current methods make it impossible to serve both legacy decoder devices and newer decoder devices with a single encoded data stream (i.e., without encoding/transcoding into multiple distinct video streams).
In other cases of video distribution, such as Blu-ray discs, a large ecosystem of devices is only able to decode legacy HD video encoding formats, while new decoding devices are able to decode and display UltraHD video. Current methods make it impossible to distribute a single legacy-compatible Blu-ray disc that can be read as HD video by the wide installed base of legacy devices and as UltraHD video by new decoding devices.