Media content such as images, graphics, text, audio and video data, or any combination thereof, can be delivered across communication networks and rendered on user terminals including media players and multimedia players.
Multimedia players are devices that render combinations of video, audio or data content for consumption by users. The rendering or reproduction of the media content may be performed by visual display, audible sound, etc. When different media content entities are delivered to a user terminal in the form of multimedia, it is important to determine the timing synchronization and the display positions of the media content components for effective consumption and presentation.
The MPEG-H Part 1 standard (also known as MPEG Media Transport or MMT), for example, defines a solution for the packaging, transport and composition of timed and non-timed media content, such as, for example, image data, audio data, text data and the like. While MMT primarily addresses IP networks, it also supports delivery of content over any type of packet-based network. In particular, MMT may be used for the delivery of audiovisual services over broadcast networks such as terrestrial, cable or satellite networks.
In MMT, the term “Asset” refers to a data entity containing data with the same transport characteristics, composed of one or more MPUs (Media Processing Units) sharing the same Asset ID. The term “Package” refers to a logical collection of data composed of one or more Assets together with their related Asset Delivery Characteristics (i.e., a description of the Quality of Service required for delivery of the Assets) and Composition Information (CI) (i.e., a description of the spatial and temporal relationships between the Assets).
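For illustration, the logical structure of a Package described above may be sketched as follows; the element and attribute names are illustrative only and do not reproduce the normative MMT signalling syntax:

```xml
<!-- Illustrative sketch of an MMT Package: Assets (each a sequence of
     MPUs with the same Asset ID), their delivery characteristics, and
     the Composition Information relating them. -->
<Package id="pkg-1">
  <Asset id="asset-video-1">
    <MPU seq="1"/>
    <MPU seq="2"/>
  </Asset>
  <Asset id="asset-text-1">
    <MPU seq="1"/>
  </Asset>
  <!-- Required QoS for delivering each Asset (values illustrative) -->
  <AssetDeliveryCharacteristics assetId="asset-video-1" bitrate="5000000"/>
  <AssetDeliveryCharacteristics assetId="asset-text-1" bitrate="64000"/>
  <!-- Spatial and temporal relationships between the Assets -->
  <CompositionInformation/>
</Package>
```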
MMT-CI (where CI refers to Composition Information) enables content providers to define the initial display composition of media data on a display and how that initial composition evolves with time. Indeed, MMT-CI specifies the relational arrangement between MMT Assets for consumption and presentation, and defines the following:
- spatial composition, i.e. the location at which each Asset is rendered in a view;
- temporal composition, i.e. the timing synchronization between media Assets, that is, the evolution in time of the spatial composition;
- multiple views, for supporting simultaneous displays or virtual displays on one or more display devices.
The initial spatial composition is defined in a spatial composition data set, such as an HTML page associated with an MMT package, and the evolution in time of the composition is defined in a temporal composition data set by means of CI elements described in XML fragments. The XML fragments may be sent separately over time.
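The split between the initial HTML page and a subsequent CI fragment might look as sketched below; the element and attribute names are illustrative assumptions, not verbatim MMT-CI schema names:

```xml
<!-- Initial spatial composition: an HTML page declaring display areas
     (illustrative), which later CI fragments reference by id. -->
<!--
  <html><body>
    <div id="videoArea"></div>
    <div id="subtitleArea"></div>
  </body></html>
-->

<!-- A CI fragment sent later in time, binding Assets to areas of the
     initial page and giving presentation time windows (names illustrative). -->
<CI>
  <MediaSync divId="videoArea"    assetId="asset-video-1" begin="0s"  end="600s"/>
  <MediaSync divId="subtitleArea" assetId="asset-text-1"  begin="10s" end="600s"/>
</CI>
```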
The initial HTML page and the subsequent MMT-CI fragments are all associated with the same MMT package and are typically stored in the same package. While this does not present any issue with stored media content of finite duration, such a system is unlikely to work with content of infinite duration (such as TV channels), since the HTML5 page is likely to change significantly from one program to another within the same TV channel. A potential solution could be to signal a TV channel as a succession of MMT packages (TV events, movies, ads, . . . ), each with its own HTML page and CI fragments, but such an approach would result in significant amounts of signaling information and therefore much higher receiver complexity.
Moreover, if CI is used for a broadcast channel, many CI fragments can be expected to be sent. If the display information is very rich, both the HTML page and the CI fragments can be significantly large, and a high number of areas per view, multiplied by a high number of views, would require significant processing in the receiver.
The present invention has been devised with the foregoing in mind.