SMIL (Synchronized Multimedia Integration Language) is a language standardized by the W3C (World Wide Web Consortium) for describing the spatial and temporal layout of content that combines text, static images, video, and sound. SMIL differs from the Hyper Text Markup Language (HTML), the most popular description language on the Internet, in that time information is included in the content.
A method will now be explained whereby a client plays back, via a network, content that is stored on a server and described in SMIL.
FIG. 1 is a diagram illustrating content distribution using SMIL. In this figure, client 201 accesses server 203 over network 200, acquires an SMIL file in which content is described, and interprets the acquired SMIL file. Next, the media described in the SMIL file, such as text, static images, video, and sound, are acquired from server 202 and server 204. Then, based on the time information described in the SMIL file, each medium (text, static images, video, and sound) is played back at the appropriate time. Incidentally, in this figure the SMIL file is stored on server 203, and the media are stored on servers 202 and 204 (sound and video on one, text and static images on the other); however, all of these can be stored on a single server.
Now, the transmission method for the SMIL file and for each medium will be explained.
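The interpretation step in the flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not the method of any particular player: the SMIL text, the host names, and the function name `list_media_sources` are hypothetical, and the actual network transfers are omitted. The sketch only shows how a client might work out which media to request from which server after acquiring the SMIL file.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical SMIL document, standing in for the file acquired from server 203.
# The host names below are invented for illustration.
SMIL_TEXT = """
<smil>
  <body>
    <par>
      <video src="rtsp://server202.example/clip.mpg"/>
      <img src="http://server204.example/photo.jpg"/>
    </par>
  </body>
</smil>
"""

# Element names treated as media references in this sketch.
MEDIA_TAGS = {"text", "img", "video", "audio"}


def list_media_sources(smil_text):
    """Return {host: [(tag, src), ...]} for every media element that has a src."""
    root = ET.fromstring(smil_text)
    by_host = {}
    for elem in root.iter():
        if elem.tag in MEDIA_TAGS and "src" in elem.attrib:
            src = elem.attrib["src"]
            host = urlparse(src).hostname
            by_host.setdefault(host, []).append((elem.tag, src))
    return by_host


if __name__ == "__main__":
    for host, items in list_media_sources(SMIL_TEXT).items():
        print(host, items)
```

A real client would then open a connection to each host and schedule playback according to the time information in the file.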
The communication protocol typically used to transmit media files such as SMIL files, static image files, and text files from servers 202, 203, and 204 to client 201 is TCP (Transmission Control Protocol). TCP is a reliable protocol and, like HTML, is widely used on the Internet. In contrast, the communication protocols frequently used to transmit temporally continuous data such as sound data and video data are RTP (Real-time Transport Protocol) and UDP (User Datagram Protocol). IP (Internet Protocol) is commonly used as the lower-layer protocol that carries TCP or RTP/UDP.
The above TCP, RTP, UDP, and IP protocols are all standardized by the IETF (Internet Engineering Task Force) and are widely deployed on the Internet.
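The division of media between the two transports described above can be summarized in a small lookup. The following Python sketch is only illustrative: the media kind labels and the function name `transport_for` are assumptions made here, and the mapping reflects the typical usage stated in the text rather than a fixed rule.

```python
# Media kinds that are temporally continuous (assumed labels for this sketch).
CONTINUOUS_MEDIA = {"audio", "video"}

# Media kinds transferred as whole files (assumed labels for this sketch).
DISCRETE_MEDIA = {"smil", "text", "img"}


def transport_for(media_kind):
    """Return the protocol stack typically used for the given media kind."""
    if media_kind in CONTINUOUS_MEDIA:
        # Continuous data: RTP carried over UDP, which is carried over IP.
        return "RTP/UDP/IP"
    if media_kind in DISCRETE_MEDIA:
        # File-like data: reliable delivery with TCP over IP.
        return "TCP/IP"
    raise ValueError(f"unknown media kind: {media_kind!r}")


if __name__ == "__main__":
    for kind in ("smil", "img", "video"):
        print(kind, "->", transport_for(kind))
```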
Next, a content description method for SMIL files will be briefly explained.
FIG. 2 is a diagram showing a sample description in SMIL. In this figure, the numbers at the left end (1, 2, . . . ) are line numbers provided for clarification, and the text at the right end is explanatory. Neither the numbers nor the explanatory text appears in the actual SMIL file.
The text from the number 1 to the number 15 is the SMIL document, and its contents consist of a header portion, shown from the number 2 to the number 8, and a body, shown from the number 9 to the number 14. The header portion describes layout information, which is unrelated to time. The body describes information that relates to time, namely the actual media data and its playback. The description shown by the number 11 is a control statement for displaying video. In addition, the description shown by the number 12 is a control statement for displaying static images. These are enclosed by the <par> tags shown by the numbers 10 and 13. The elements enclosed by the <par> tags are played back at the same time. In the present example, the video and the static image are played back at the same time. The location of each medium is described by “src.” In addition, the playback time of each medium is specified by “begin,” “end,” “dur,” and the like. Here, “begin” specifies the time at which playback of the medium starts, “end” specifies the time at which it ends, and “dur” specifies the duration of playback.
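The way “begin,” “end,” and “dur” determine a playback interval can be sketched as follows. This is a simplified illustration in Python under stated assumptions: the <par> fragment, its file names, and its attribute values are hypothetical and are not a reproduction of FIG. 2; the helper names `seconds`, `interval`, and `schedule_par` are invented here; and only plain values in seconds such as "3s" are handled, not the other time formats that SMIL allows.

```python
import xml.etree.ElementTree as ET

# Hypothetical <par> fragment: a video and a static image played in parallel.
PAR_TEXT = """
<par>
  <video src="movie.mpg" region="a" begin="3s" end="20s"/>
  <img src="photo.jpg" region="b" dur="10s"/>
</par>
"""


def seconds(value):
    """Parse a plain seconds value such as "3s"; other formats are rejected."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported time value: {value!r}")
    return float(value[:-1])


def interval(elem):
    """Return (start, end) in seconds; end is None if neither end nor dur is given."""
    begin = seconds(elem.get("begin", "0s"))  # assume playback starts at 0 by default
    if "end" in elem.attrib:
        end = seconds(elem.attrib["end"])
    elif "dur" in elem.attrib:
        end = begin + seconds(elem.attrib["dur"])
    else:
        end = None
    return begin, end


def schedule_par(par_text):
    """Return [(tag, region, start, end), ...] for the children of a <par> element."""
    par = ET.fromstring(par_text)
    return [(child.tag, child.get("region"), *interval(child)) for child in par]


if __name__ == "__main__":
    for entry in schedule_par(PAR_TEXT):
        print(entry)
```

Because both children belong to the same <par>, their intervals are measured from the same starting point, so the two media overlap in time.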
In addition, the description shown by the number 11 indicates that the video data specified by “src” will be displayed in the region shown as “a” from 3 seconds to 20 seconds. The description shown by the number 12 indicates that the static image data specified by “src” will be displayed for 10 seconds in the region shown as “b.” In the descriptions shown by the numbers 11 and 12, the playback start and end are specified by absolute time; however, they can also be specified by sequence, as shown in Examples (1) and (2) below.