Desktop sharing has become an important feature of current collaboration software. It allows virtual meeting attendees to view the same material or content (video, documents, etc.) during a discussion. To make desktop sharing possible, the screen content shared by the sending computing device during a collaboration session must be continuously captured, encoded, transmitted, and finally rendered at the receiving computing devices for display.
Desktop sharing applications can compress screen content into H.264 standard video bit streams. The shared screen content is typically treated as ordinary camera-captured video, with frames of the screen content encoded using intra-frame and/or inter-frame encoding techniques. When a suitable match is found between a current frame and a previous (reference) frame, the current frame (or portions of it) need not be encoded redundantly: the coding of the reference frame (or portions thereof) serves as a reference for the current frame. This reduces the amount of content that must be coded and decoded to share content between two or more computing devices.
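The idea of finding a suitable match between a current frame and a reference frame can be illustrated with a minimal block-matching sketch. It is an illustration only, not the method of any particular encoder: it assumes frames are small 2-D lists of luma values, uses an exhaustive search, and scores candidates with the sum of absolute differences (SAD). The names `sad`, `get_block`, and `best_match` are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(current, reference, top, left, size, search_range):
    """Exhaustively search the reference frame for the block that best
    matches the current block. Returns (cost, dy, dx), where (dy, dx) is
    the displacement from the current block position to the match."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            cost = sad(target, get_block(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


# Assumed toy data: an 8x8 reference frame with a unique value per pixel.
reference = [[y * 16 + x for x in range(8)] for y in range(8)]
# The current frame is the reference scrolled up by two rows
# (the bottom two rows are filled with zeros).
current = [reference[y + 2] if y + 2 < 8 else [0] * 8 for y in range(8)]

match = best_match(current, reference, top=2, left=2, size=2, search_range=2)
```

Here the block at row 2 of the current frame is found two rows lower in the reference with zero cost, so an encoder working along these lines would only need to signal the displacement rather than re-encode the pixels.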
However, screen content video has features and characteristics that can differ from those of camera video, such as frequent page switching and scrolling back and forth within certain types of content (e.g., text documents). For screen content coding using video compression, enabling multiple reference frames can greatly benefit compression efficiency, because screen content video can be encoded efficiently by inter-frame prediction when a proper reference is available. Because frame resolutions in screen content coding are relatively large, though, searching among multiple reference frames to find the best match for the current frame involves substantial complexity.
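The trade-off between multiple reference frames and search cost can be sketched as follows. This is a simplified illustration under stated assumptions, not an actual encoder search: frames are small 2-D lists of luma values, each reference is compared against the current frame with a whole-frame sum of absolute differences, and the hypothetical helper `pick_reference` also counts the pixel comparisons it performs.

```python
def frame_sad(frame_a, frame_b):
    """Sum of absolute differences over two equally sized frames."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
    )


def pick_reference(current, references):
    """Compare the current frame against every reference frame.
    Returns (best_index, best_cost, pixel_comparisons)."""
    best_index, best_cost = None, None
    comparisons = 0
    pixels_per_frame = len(current) * len(current[0])
    for index, ref in enumerate(references):
        cost = frame_sad(current, ref)
        comparisons += pixels_per_frame  # one full pass per reference
        if best_cost is None or cost < best_cost:
            best_index, best_cost = index, cost
    return best_index, best_cost, comparisons


# Assumed toy data: two 4x4 "pages" with uniform pixel values.
page_a = [[10] * 4 for _ in range(4)]
page_b = [[200] * 4 for _ in range(4)]

# The user viewed page A, switched to page B, then switched back to page A.
references = [page_a, page_b]  # page_b is the most recent frame
current = [row[:] for row in page_a]

best_index, best_cost, comparisons = pick_reference(current, references)
```

In this toy case the older reference (index 0) matches the current frame exactly, which a single most-recent reference would miss. The comparison count grows with both the number of references and the frame size: under the same one-pass-per-reference assumption, four references at 1920x1080 would already require 8,294,400 pixel comparisons per frame, before any block-level motion search.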