SMIL (Synchronized Multimedia Integration Language), standardized by the W3C (World Wide Web Consortium), is an existing technology for integrating media including text data, still-image data, video data, voice data, and music data, and for describing their spatial and temporal arrangement.
Here, still-image data denotes bit-mapped compressed data such as JPEG or PNG (Portable Network Graphics), or vector-type compressed data such as SVG (Scalable Vector Graphics); video data denotes compressed data using the MPEG-4 or H.264 system; voice data denotes compressed data using the G.729 or AMR (Adaptive Multi-Rate) system; and music data denotes compressed data using the MP3 or MIDI system.
Hereinafter, text data, still-image data, video data, voice data, and music data are referred to generically as media data.
SMIL is a description language similar to HTML (HyperText Markup Language), the hypertext description language now in very wide use on the Internet, and is better suited than HTML to distributing multimedia data that includes video data.
Hereinafter, a method will be described, using the accompanying drawings, whereby multimedia data described by means of SMIL and held on a server is distributed to a client via a network.
FIG. 1 is a configuration diagram of a conventional media data distribution system. In this diagram, server 11 stores SMIL files, server 12 stores audio files and video image files, and server 13 stores text files and still-image files. Servers 11 through 13 and client 15 are mutually connected via network 14.
Client 15 accesses server 11 storing SMIL files using a communication protocol such as HTTP (Hyper Text Transfer Protocol), and acquires an SMIL file in which media are described. Client 15 decodes the acquired SMIL file, and acquires the respective media data described therein, that is, text data, still-image data, video data, audio data, and so forth. Specifically, video data and audio data are acquired from server 12, and text data and still-image data are acquired from server 13.
Based on space and time information written in the acquired SMIL file, client 15 plays back the respective media data (video data, audio data, text data, and/or still-image data) at an appropriate location and appropriate time. SMIL data and the various kinds of media data may also be stored in the same server.
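The playback decision described above can be illustrated with a minimal Python sketch. It assumes the client has already reduced each media entry to a start time and a duration in seconds; the `MediaItem` type, the `active_at` helper, and all time values are hypothetical and are not taken from the SMIL specification or from the drawings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MediaItem:
    kind: str               # "video", "audio", "text" or "image"
    region: Optional[str]   # display region id; None for media with no display area
    begin: float            # start time in seconds (assumed value)
    dur: float              # duration in seconds (assumed value)


def active_at(items: List[MediaItem], t: float) -> List[MediaItem]:
    """Return the items that should be playing at time t (seconds)."""
    return [m for m in items if m.begin <= t < m.begin + m.dur]


# Hypothetical timing for the four kinds of media handled by client 15.
ITEMS = [
    MediaItem("video", "v", 0.0, 30.0),
    MediaItem("audio", None, 0.0, 30.0),
    MediaItem("text", "t", 0.0, 30.0),
    MediaItem("image", "i", 5.0, 10.0),
]

for t in (2.0, 7.0, 20.0):
    print(t, [m.kind for m in active_at(ITEMS, t)])
```

In this sketch the still image is shown only between 5 and 15 seconds, while the other media play for the full 30 seconds.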
Next, an example of an SMIL file description will be explained using an accompanying drawing. FIG. 2 is a drawing showing an example of a conventional SMIL file description. In this drawing, <smil> in line 1 indicates that this is an SMIL document, and </smil> in line 18 indicates the end of the SMIL document.
The area from <head> in line 2 to </head> in line 9 is an area in which time-unrelated information is written, and here, information showing the spatial layout of the media is written in the area from <layout> in line 3 to </layout> in line 8.
The area from <body> in line 10 to </body> in line 17 is an area in which time-related information is written, and here, information showing the media playback time is written in the area from <par> in line 11 to </par> in line 16.
To give a more concrete explanation, lines 5 through 7 define areas v, t, and i in which video data, text data, and still-image data respectively are positioned, and lines 12 through 15 define time information for playback of video data, voice data, text data, and still-image data respectively.
Item “src=” in each of lines 12 through 15 specifies a URL (Uniform Resource Locator) for acquiring media data. In this example, video data and voice data are specified as being acquired by means of RTSP (Real Time Streaming Protocol), and text data and still-image data are specified as being acquired by means of HTTP.
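As an illustration of how a client might act on the “src=” values, the following Python sketch groups URLs by scheme and host, so that RTSP sources and HTTP sources are requested from the appropriate servers. The host names, paths, and the `fetch_plan` helper are assumptions made for this example; only `urllib.parse.urlsplit` is a standard library call.

```python
from urllib.parse import urlsplit


def fetch_plan(srcs):
    """Group media URLs by (scheme, host) so each server is contacted once."""
    plan = {}
    for src in srcs:
        parts = urlsplit(src)
        scheme = parts.scheme.lower()
        if scheme not in ("rtsp", "http"):
            raise ValueError("unsupported scheme in src: " + src)
        plan.setdefault((scheme, parts.hostname), []).append(parts.path)
    return plan


# Hypothetical src values standing in for lines 12 through 15.
SRCS = [
    "rtsp://media.example.com/clip",         # video data
    "rtsp://media.example.com/voice",        # voice data
    "http://files.example.com/caption.txt",  # text data
    "http://files.example.com/logo.png",     # still-image data
]

plan = fetch_plan(SRCS)
for (scheme, host), paths in plan.items():
    print(scheme, host, paths)
```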
Item “region=” specifies a location where media data is displayed, and corresponds to one of the regions specified in line 5 through line 7. For example, text data specified in line 14 has a “region=t” specification, and therefore corresponds to the region specified in line 6.
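The correspondence between “region=” and the layout definitions can be sketched in Python as follows. The embedded SMIL text follows the 18-line arrangement described above but is not a reproduction of FIG. 2: the content of line 4, the URLs, and all coordinate values are assumptions, and `resolve_regions` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative SMIL document; attribute values are invented for this sketch.
SMIL_TEXT = """<smil>
  <head>
    <layout>
      <root-layout width="176" height="208"/>
      <region id="v" left="0" top="0" width="176" height="144"/>
      <region id="t" left="0" top="144" width="176" height="32"/>
      <region id="i" left="0" top="176" width="176" height="32"/>
    </layout>
  </head>
  <body>
    <par>
      <video src="rtsp://media.example.com/clip" region="v"/>
      <audio src="rtsp://media.example.com/voice"/>
      <text src="http://files.example.com/caption.txt" region="t"/>
      <img src="http://files.example.com/logo.png" region="i"/>
    </par>
  </body>
</smil>"""


def resolve_regions(smil_text):
    """Pair each media element in <par> with the region it is displayed in."""
    root = ET.fromstring(smil_text)
    # Spatial layout from the <head> area, keyed by region id.
    regions = {r.get("id"): dict(r.attrib)
               for r in root.findall("./head/layout/region")}
    placements = []
    for media in root.find("./body/par"):
        # Media without a region attribute (audio here) resolve to None.
        region = regions.get(media.get("region"))
        placements.append((media.tag, media.get("src"), region))
    return placements


placements = resolve_regions(SMIL_TEXT)
for tag, src, region in placements:
    print(tag, src, region)
```

Here the text element carries region="t" and therefore resolves to the region defined with id="t", while the audio element has no display region.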
By using SMIL in this way, a content provider can freely describe the layout of media to be distributed.
Next, a method of changing the layout of distributed media during reception of distributed data will be described. According to Patent Literature 1, various control buttons for moving or eliminating (erasing) media being displayed are provided as appropriate in a screen displaying media. By manipulating these buttons, a user can change the layout smoothly without interrupting the display of information.
Patent Literature 1: Unexamined Japanese Patent Publication No. 2002-312090