1. Field of the Invention
The present invention is directed to processing information received in Real-Time Transport Protocol ("RTP") packets.
2. Background Information
In packet networks, one cannot predict a packet's time of arrival from its time of transmission. One packet may reach its destination well before a packet that was transmitted previously from the same source. This is a difference between packet-switched and circuit-switched networks. In circuit-switched networks, a channel is dedicated to a given session throughout the session's life, and the time of arrival tracks the time of transmission.
Since the order in which data packets are transmitted is often important, various packet-network transport mechanisms provide the packets with sequence numbers. These numbers give the packets' order. They do not otherwise specify their relative timing, though, since most transmitted data have no time component. But voice and video data do, so protocols have been developed for specifying timing more completely. A protocol intended particularly for this purpose is the Real-Time Transport Protocol ("RTP"), which is set forth in the Internet Community's Request for Comments ("RFC") 1889.
For the sake of example, FIG. 1 shows the format of an Ethernet frame that carries an RTP packet, although not all such packets are sent over Ethernet links. FIG. 1 shows the Ethernet frame's header and trailer fields, which are separated by the Ethernet frame's payload. The Ethernet header not only identifies the frame's (same-link) recipient node but also includes a de-multiplexing field, which specifies the destination-node process that is to use the Ethernet frame's payload.
Again for the sake of example, we assume that the Ethernet header's de-multiplexing field has specified that the Ethernet frame's payload is an Internet Protocol ("IP") packet, as the IP header's presence in FIG. 1 indicates. The IP process's purpose is to determine where to forward the IP packet's payload; ordinarily, the recipient node is simply to forward that payload and not otherwise use it. It is only the ultimate-destination node that uses the IP packet's payload. The IP header includes a de-multiplexing field, which enables the destination node to decide which of its processes is to use the payload. We assume for present purposes that the de-multiplexing field specifies the User Datagram Protocol ("UDP") process.
That process is a transport process: it selects a further process, specified by the UDP header's own de-multiplexing field, to which the UDP packet's payload is to be delivered. The contents of the UDP header's de-multiplexing field are known as a "port" number, and the situation of particular interest here is the one in which that port number has been assigned to an RTP session.
When it has, the receiving process interprets the first part of the UDP payload as an RTP header, whose format FIG. 2 depicts. The RTP-header field of particular interest here is the timestamp field. The data being transmitted can usually be thought of as samples of a function. Each video frame is a sample from a moving scene, for instance, and audio data typically result from sampling sound pressure. Timestamps represent the relative times at which the transmitted samples were taken.
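By way of illustration only, the following Python sketch shows how a receiving process might decode the fixed twelve-byte RTP header that RFC 1889 defines and thereby obtain the timestamp field. The names RtpHeader and parse_rtp_header are hypothetical and are not drawn from any particular implementation; any CSRC list or header extension that follows the fixed header is ignored here.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Fields of the fixed RTP header (hypothetical container type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(udp_payload: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header at the start of a UDP payload."""
    if len(udp_payload) < 12:
        raise ValueError("payload too short to hold an RTP header")
    # Network byte order: two flag bytes, a 16-bit sequence number,
    # a 32-bit timestamp, and a 32-bit synchronization-source identifier.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", udp_payload[:12])
    return RtpHeader(
        version=b0 >> 6,
        padding=bool(b0 & 0x20),
        extension=bool(b0 & 0x10),
        csrc_count=b0 & 0x0F,
        marker=bool(b1 & 0x80),
        payload_type=b1 & 0x7F,
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )
```

The difference between the timestamp values of two such headers, divided by the media clock rate, gives the interval between the instants at which the corresponding samples were taken.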
At its page 11, RFC 1889 discusses specific approaches to generating RTP timestamps. One way is for the transmitting node initially to select a random number and note the offset between that number and the current local "wall-clock" time. It thereafter uses this offset to calculate from wall-clock time the values of the timestamps it places in respective RTP packets' headers.
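The offset approach just described can be sketched as follows. This is a minimal illustration rather than a prescribed implementation: the class name TimestampGenerator, the clock_rate_hz parameter (the number of timestamp ticks per second for the medium in question), and the optional now argument are all assumptions introduced for the example.

```python
import random
import time
from typing import Optional


class TimestampGenerator:
    """Derives 32-bit RTP timestamps from wall-clock time plus a fixed offset."""

    def __init__(self, clock_rate_hz: int, now: Optional[float] = None) -> None:
        self.clock_rate_hz = clock_rate_hz
        start = time.time() if now is None else now
        # Select a random initial timestamp and note the offset between it
        # and the current wall-clock time expressed in timestamp ticks.
        initial = random.getrandbits(32)
        self.offset = initial - round(start * clock_rate_hz)

    def timestamp(self, now: Optional[float] = None) -> int:
        """Return the timestamp for the given (or current) wall-clock time."""
        wall = time.time() if now is None else now
        # The timestamp field is 32 bits wide, so the value wraps modulo 2**32.
        return (round(wall * self.clock_rate_hz) + self.offset) & 0xFFFFFFFF
```

With an assumed clock rate of 8000 ticks per second, two timestamps taken half a second apart differ by 4000, modulo 2**32, whatever random starting value was chosen.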
As was explained above, the node participating in an RTP session will have allocated a port number to that session, and it will interpret UDP payloads delivered to that port as having the just-described RTP format. But it also needs control information to support the session, and it allocates an adjacent-numbered port to the session for that purpose. UDP payloads delivered to this port will be interpreted as RTCP packets, whose formats RFC 1889 also describes. Of the several RTCP message formats, FIG. 3 depicts the sender-report format, which is of particular interest to the present discussion. The contents of that packet's PT field identify it as having that format, which contains two types of timestamp fields.
The RTP timestamp field is the same as the timestamp field included in FIG. 2's RTP format; it is calculated from wall-clock time with the offset used for the same session's RTP packets. Its value represents the time at which the RTCP packet was sent. The other field, the NTP timestamp, is larger and contains the sender's wall-clock time from which the RTP-timestamp value was computed. The other session participant may use these values to make judgments concerning packet delay, jitter, and so forth.
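For illustration, the two timestamp fields of a sender report might be extracted as in the following sketch, which follows the sender-report layout of RFC 1889 (packet type 200, with a 64-bit NTP timestamp followed by a 32-bit RTP timestamp). The names SenderReport and parse_sender_report are hypothetical, and any reception-report blocks that follow the sender information are ignored.

```python
import struct
from typing import NamedTuple

RTCP_SENDER_REPORT = 200  # value of the PT field for a sender report


class SenderReport(NamedTuple):
    """Sender information from an RTCP sender report (hypothetical type)."""
    ssrc: int
    ntp_seconds: float   # NTP timestamp as seconds, with fractional part
    rtp_timestamp: int   # same units and offset as the session's RTP packets
    packet_count: int
    octet_count: int


def parse_sender_report(udp_payload: bytes) -> SenderReport:
    """Decode the header and sender-information section of a sender report."""
    if len(udp_payload) < 28:
        raise ValueError("payload too short to hold a sender report")
    (b0, pt, _length, ssrc, ntp_msw, ntp_lsw,
     rtp_ts, packets, octets) = struct.unpack("!BBHIIIIII", udp_payload[:28])
    if b0 >> 6 != 2 or pt != RTCP_SENDER_REPORT:
        raise ValueError("not an RTCP sender report")
    # The NTP timestamp is a 64-bit fixed-point number: 32 bits of seconds
    # followed by 32 bits of fraction.
    ntp_seconds = ntp_msw + ntp_lsw / 2**32
    return SenderReport(ssrc, ntp_seconds, rtp_ts, packets, octets)
```

A recipient that retains successive (ntp_seconds, rtp_timestamp) pairs has what it needs to compare how the two kinds of timestamp advance from one report to the next.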
The various nodes that receive RTP packets can use their timestamps to time outputs that they generate from the data that the RTP packets contain. Various recipients' outputs can have different forms, and the ways in which they base those outputs on incoming timestamps vary, too.
Consider the teleconference arrangement that FIG. 4 depicts, for example. A multipoint-conference unit 12 conducts RTP sessions with various nodes 14, 16, 18, and 20. Multipoint-conference unit 12 receives input in RTP packets from one or more of the participant units 16, 18, 20, and 24. From the data thereby received it generates data that it sends in RTP packets to the various participant units. For example, it may receive audio and video from several of the units, identify the unit that is generating the loudest audio, and forward that unit's video output to all of the other units while it sends them a mixture of all units' audio in respective parallel sessions. By employing the timestamps, each unit can determine the relative timing at which to apply the data to the screens and speakers: it can base actual output-signal production on the timestamps it receives.
The same is true of gateway 14. It receives video and audio data in RTP sessions with the multipoint conference unit 12 and forwards the data over an ISDN line to a further conference terminal 24. Since it is using an ISDN line, gateway 14's output-data timing must be implicit from the time at which node 24 receives the data, so gateway 14 bases the timing of its transmissions on the input timestamps.
The situation is different in the case of the multipoint conference unit 12. It neither presents the data in screens and speakers, as nodes 16, 18, 20, and 24 do, nor sends it by way of an implicitly-timed channel. Instead its output takes the form of packets sent in the various RTP sessions. These packets' times of transmission are not necessarily determined by the timestamps on the packets in which unit 12 received the data from which it generated its output. So unit 12's output timing takes the form of the outgoing packets' timestamp values, and it is these values that the multipoint conference unit 12 bases on the timestamps that it receives.
RTP thus provides a versatile way of sending time-based data and can serve as a foundation for using packet-switched networks for information of that type. But it has turned out that a lot of equipment that ostensibly uses RTP does not employ timestamp values effectively. In many cases this is because the equipment applies and/or uses the timestamps incorrectly. In other cases the equipment uses timestamps as intended but still performs poorly because it communicates with other equipment that does not. This has made RTP use less attractive than it should be.
But I have recognized that performance can be improved as a practical matter by basing output timing on input timestamps only selectively, in accordance with tests that are made by comparing various packets' timestamps.
There are a number of tests that can be employed for this purpose. One is based on observing the progressions of the RTCP packets' RTP and NTP timestamps. If the amount by which the RTP timestamps advance from one RTCP packet to the next differs excessively from the amount by which the same packets' NTP timestamps do, then the timestamps are likely unreliable, and the output timing should be based on the incoming packets' time of arrival rather than the timestamps that they carry. Other indications that timing should occur on a time-of-arrival basis rather than on a timestamp basis may include the fact that the estimated transmission delay is too great or varies excessively. Yet another may be that the timestamp in an RTCP message represents a time before those that too many previously received timestamps represent.
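The first of these tests, the comparison of RTP-timestamp and NTP-timestamp progressions between consecutive sender reports, might be sketched as follows. The function name timestamps_consistent, the clock_rate_hz parameter, and the ten-percent default tolerance are assumptions made for the example; the text above does not specify what difference is to be regarded as excessive.

```python
from typing import Tuple

# Each report is represented as (ntp_seconds, rtp_timestamp).
Report = Tuple[float, int]


def timestamps_consistent(prev: Report, curr: Report,
                          clock_rate_hz: int,
                          tolerance: float = 0.1) -> bool:
    """Return True if the RTP and NTP timestamps advanced by similar amounts.

    The tolerance is an assumed fraction of the NTP advance; a result of
    False suggests that time-of-arrival timing should be used instead.
    """
    ntp_advance = curr[0] - prev[0]
    if ntp_advance <= 0:
        # Wall-clock time that stands still or runs backward is treated
        # as unreliable in this sketch.
        return False
    # RTP timestamps are 32 bits wide, so the difference is taken
    # modulo 2**32 and then converted from ticks to seconds.
    rtp_advance = ((curr[1] - prev[1]) & 0xFFFFFFFF) / clock_rate_hz
    return abs(rtp_advance - ntp_advance) <= tolerance * ntp_advance
```

For example, with an assumed clock rate of 8000 ticks per second, reports five seconds apart whose RTP timestamps differ by 40000 pass the test, whereas reports five seconds apart whose RTP timestamps differ by 80000 do not.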
These criteria may be employed in a number of ways. For example, the time-of-arrival approach may be employed when a session begins, and the node may then convert to the timestamp approach after particular ones of these criteria have been met. As another example, the node may monitor its incoming traffic during the use of the timestamp approach to determine whether it meets those and/or other criteria. If it does not, or if it fails to meet those criteria too frequently, then the node can revert to the time-of-arrival approach.
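One possible arrangement of the two approaches is sketched below as a simple two-state selector. It is offered only as an illustration: the names TimingMode and TimingSelector are hypothetical, and the thresholds passes_to_trust and failures_to_revert, as well as the choice to count consecutive results rather than a failure rate, are assumptions that the foregoing description does not dictate.

```python
from enum import Enum


class TimingMode(Enum):
    ARRIVAL = "time-of-arrival"
    TIMESTAMP = "timestamp"


class TimingSelector:
    """Chooses between time-of-arrival and timestamp-based output timing."""

    def __init__(self, passes_to_trust: int = 3,
                 failures_to_revert: int = 2) -> None:
        self.passes_to_trust = passes_to_trust
        self.failures_to_revert = failures_to_revert
        self.mode = TimingMode.ARRIVAL  # a session begins on time of arrival
        self._passes = 0
        self._failures = 0

    def observe(self, criteria_met: bool) -> TimingMode:
        """Record one test result and return the mode to use from now on."""
        if self.mode is TimingMode.ARRIVAL:
            # Count consecutive passes; any failure restarts the count.
            self._passes = self._passes + 1 if criteria_met else 0
            if self._passes >= self.passes_to_trust:
                self.mode = TimingMode.TIMESTAMP
                self._failures = 0
        elif criteria_met:
            self._failures = 0
        else:
            # Count consecutive failures while timestamps are in use.
            self._failures += 1
            if self._failures >= self.failures_to_revert:
                self.mode = TimingMode.ARRIVAL
                self._passes = 0
        return self.mode
```

With the default thresholds, three consecutive passing results move the selector to the timestamp approach, and two consecutive failing results thereafter return it to the time-of-arrival approach.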
By using criteria such as those described above, a receiving node can benefit from RTP timestamps but avoid much of the adverse performance impact in which improper timestamp application can result.