A digital video sequence can contain a very large amount of data. Transferring a video efficiently with current technology therefore requires a large transmission bandwidth. However, wireless data transmission bandwidth is a limited and sometimes expensive resource. Consequently, it is desirable to use compression techniques to encode the video using fewer bits than the original video contains. Compressing the video in this way reduces the bandwidth required to transmit it over a network.
It is desirable to compress a video so efficiently that it can be transmitted through an ultra-low bit rate channel, such as an SMS channel. Short message service (“SMS”), sometimes called “texting,” is one of the most popular person-to-person messaging technologies in use today. SMS functionality is available in almost all modern mobile phones. However, SMS has a very limited capacity to transmit information: each SMS message carries a payload of at most 140 bytes, which corresponds to 160 characters in 7-bit encoding. Multimedia messaging service (“MMS”) is another possible way to send messages that include multimedia content. However, MMS messaging cannot utilize the existing SMS infrastructure, so it costs more than SMS messaging. There is no mechanism today to effectively send a video message over an ultra-low bandwidth wireless channel, particularly a very low bandwidth channel such as an SMS channel.
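The capacity limit described above can be made concrete with a short calculation. The following is a minimal sketch, not part of the described method: the function names are hypothetical, and the message-count estimate assumes that the full 140-byte payload is available for data, ignoring any per-message header overhead that a real segmentation scheme would add.

```python
# Payload budget of a single SMS message, using the 140-byte figure from the text.
SMS_PAYLOAD_BYTES = 140
BITS_PER_BYTE = 8


def sms_payload_bits(payload_bytes: int = SMS_PAYLOAD_BYTES) -> int:
    """Total number of bits one SMS payload can carry."""
    return payload_bytes * BITS_PER_BYTE


def sms_char_capacity(bits_per_char: int,
                      payload_bytes: int = SMS_PAYLOAD_BYTES) -> int:
    """Whole characters that fit in one payload at a given character width."""
    return sms_payload_bits(payload_bytes) // bits_per_char


def messages_needed(data_bytes: int,
                    payload_bytes: int = SMS_PAYLOAD_BYTES) -> int:
    """Messages required to carry data_bytes (assumption: no header overhead)."""
    # Ceiling division without floating point.
    return -(-data_bytes // payload_bytes)


if __name__ == "__main__":
    print(sms_payload_bits())     # bits available per message
    print(sms_char_capacity(7))   # characters at 7 bits each
    print(messages_needed(1000))  # messages for a hypothetical 1000-byte clip
```

Under these assumptions, one message carries 1120 bits, which is why 140 bytes and 160 seven-bit characters describe the same limit, and even a 1000-byte compressed clip would span several messages.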
For speech compression, a large body of work over the past four decades has analyzed various compression concepts. In a typical voice compression technique for wireless communications, such as ITU G.723 or FS MELP, a voice record is analyzed for its correlation properties in the acoustical sense. Speech compression programs are typically based on waveform coders, such as the Code Excited Linear Prediction (“CELP”) algorithm. While a number of past approaches have achieved very low bit rates of 200-300 bps, the voice quality has been compromised. The mean opinion score (“MOS”) factor of such a compressed voice record is typically about 2, where the MOS provides a numerical indication of the quality of the voice record after compression as perceived by users. It is desirable to have a method to compress and transmit a voice record at a very low bit rate while maintaining high voice quality, with an MOS factor between 4 and 5.
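To relate the bit rates quoted above to a 140-byte SMS payload, the following minimal sketch estimates how many seconds of compressed speech would fit in one message. The function name is hypothetical, and the calculation assumes a constant bit rate and a payload used entirely for speech data.

```python
# Seconds of constant-bit-rate speech that fit in one 140-byte SMS payload.
SMS_PAYLOAD_BITS = 140 * 8  # 1120 bits


def speech_seconds_per_message(bit_rate_bps: float,
                               payload_bits: int = SMS_PAYLOAD_BITS) -> float:
    """Duration of speech one payload holds at the given bit rate."""
    if bit_rate_bps <= 0:
        raise ValueError("bit_rate_bps must be positive")
    return payload_bits / bit_rate_bps


if __name__ == "__main__":
    # The 200-300 bps range is taken from the text above.
    for rate in (200, 300):
        print(rate, round(speech_seconds_per_message(rate), 2))
```

At 200 bps a single payload would hold 5.6 seconds of speech, and at 300 bps roughly 3.7 seconds, which illustrates why bit rates in this range matter for message-based channels.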