1. Technical Field
The invention is related to the real-time transmission of media data from a sender to a receiver over a packet-based network.
2. Background Art
Real-time media, such as radio and television programs, are broadcast from a single sender to multiple, geographically distributed receivers, who have all "tuned" to that sender. Commonly, the signals are broadcast from the sender by a terrestrial antenna, but satellite and wired solutions also exist. For example, in cable TV, the signals are broadcast from a sender by propagating a voltage along a coaxial cable to receivers connected to the cable.
It is also possible to use the Internet infrastructure to broadcast audio and video information. This can be accomplished using the Internet Protocol (IP) Multicast mechanism and its associated protocols. An Internet broadcast (or more properly, "multicast") is provided to the set of receivers who have first "subscribed" to the information. Specifically, through an announcement mechanism, such as a web page, a broadcaster announces the IP multicast group address to which it will send a particular broadcast. The multicast group address is just a special case of an ordinary IP address. However, unlike an ordinary address which is used to identify the "location" of a receiver where data is to be sent, a multicast group address is used by routers in the network to identify data being transmitted on the network as part of the broadcast, so that it can be routed to a subscribing receiver (who will have a completely different address). The receiver's address is not included in the broadcasted information. A receiver subscribes to the broadcast by notifying the network that it wishes to "join" the multicast group. The subscriptions cause various routers in the network to update their states, to ensure that the multicast information eventually reaches the subscribers. At some point the sender begins to send packets to the specified address. When a router receives a packet with that address, it sends copies of the packet through each outgoing interface that leads to a subscriber. This causes the packets to reach the subscribers at some point, albeit with the inevitable packet loss due to network congestion and buffer overflow.
Alternately, the audio and video information could be sent directly to a receiver using its Internet address. This form of audio and video data transfer would more accurately be referred to as a real-time unicast multimedia presentation, rather than a real-time broadcast or multicast multimedia presentation.
When data is transferred over a network, and particularly over the Internet, the channels between the sender and each receiver can vary dramatically in capacity, often by two or three orders of magnitude. These differences in capacity exist because the data transmission rates associated with the connections to a particular receiver can vary (e.g., phone line capacity, LAN and/or modem speeds). This heterogeneity in capacity can cause problems in the context of a unicast or multicast presentation of real-time audio and video information. For example, a particular receiver may not have the bandwidth available to receive the highest quality transmission that a sender is capable of providing. One early attempt to cope with this problem involved simulcasting or transmitting the audio and video data at different transmission rates to different receivers, with the quality being progressively better in the data transmitted at the higher rates. The receiver then received the transmission that suited its capability. However, this solution was very storage or bandwidth intensive as much of the same information had to be repeated for each transmission rate. To overcome this problem, audio and video information can be transmitted via a layered unicast or multicast presentation.
In a layered unicast or multicast presentation, audio and video information is encoded in layers of importance. Each of these layers is transmitted in a separate data stream, each of which is in essence a sequence of packets. The base layer is an information stream that contains the minimal amount of information, for the least acceptable quality. Subsequent layers enhance the previous layers, but do not repeat the data contained in a lower layer. In order to obtain the higher quality, a receiver must receive the lower layers in addition to the higher layers that provide the desired quality. Thus, the layers are hierarchical in that there is at least one base layer (typically one audio base stream and one video base stream), and one or more additional higher level layers. There can in fact be several hierarchical layers building up from a base layer with each subsequent layer being dependent on the data of one or more lower level layers and enhancing those lower level layers. In particular, it is possible to have two or more enhancement layers depending on the same lower level layer, but not depending on each other. Each of the layers would enhance the lower level layers on which they depend in a different way. For example, a stream in one higher-level layer might include data that enhances the frame rate of a foreground object in a preceding lower-level layer, while a separate higher-level layer might increase the resolution of a background object in this lower-level layer. A receiver may use either such enhancement layer without the other, or may use both such enhancement layers. However, a receiver may only use an enhancement layer if it also receives all of the layers on which it depends either directly or indirectly.
In addition, there can also be one or more error correction layers associated with each base and enhancement layer in the unicast or multicast presentation. For example, such a layered error correction technique was described in a co-pending U.S. patent application entitled "RECEIVER-DRIVEN LAYERED ERROR CORRECTION MULTICAST OVER HETEROGENEOUS PACKET NETWORKS", which was filed on May 21, 1999, and assigned Ser. No. 09/316,869.
In a layered unicast or multicast presentation scheme, a receiver can subscribe to as many layers as it wants, provided the total bandwidth of the layers is not greater than the bandwidth of the most constrained link in the network between the sender and the receiver. For example, if the receiver is connected to the Internet by a 28.8 Kbps modem, then it can feasibly subscribe to one, two, or three 8 Kbps video layers. If it subscribes to more than three layers, then congestion will certainly result and many packets will be dropped randomly, resulting in poor video quality. Given this, a question arises as to which ones of the available layers should be transmitted to a receiver in view of the existing bandwidth constraint.
If the dependence between layers is sequential, for example, if a second layer depends on a first layer, a third layer depends on the second layer, and so on, then it is a simple matter to decide which layers to transmit: transmit the layers in order up to the bandwidth constraint. This maximizes quality subject to the bandwidth constraint. However, in many situations the dependence between layers is not sequential. For example, suppose both the second and third layers depend directly on the first layer, but not on each other. In this case there may be multiple sets of layers that can satisfy a given bandwidth constraint. Thus, the question arises as to which set of layers should be transmitted to a receiver.
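The sequential case can be illustrated with a minimal sketch. It assumes each layer is described only by its bit rate in Kbps; the function name `select_sequential_layers` and the four-layer example are hypothetical and are not taken from the described system.

```python
def select_sequential_layers(layer_rates_kbps, capacity_kbps):
    """Return how many leading layers fit within capacity_kbps.

    Layers are assumed to be listed in dependency order, so layer k
    is only useful if layers 0..k-1 are also selected.
    """
    total = 0.0
    count = 0
    for rate in layer_rates_kbps:
        if total + rate > capacity_kbps:
            break  # adding this layer would exceed the constraint
        total += rate
        count += 1
    return count


# Four 8 Kbps video layers over a 28.8 Kbps modem link.
print(select_sequential_layers([8, 8, 8, 8], 28.8))
```

With the modem example given earlier, three 8 Kbps layers (24 Kbps) fit and a fourth would not.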
The present invention is directed at resolving the issue of determining which ones of the available data layers should be transmitted to a receiver in view of a bandwidth constraint and in view of the dependences between them. In its most general terms the system and process according to the present invention involves tagging prescribed portions of the data of each layer in a layered unicast or layered multicast presentation with an indicator of the importance or utility that the data provides to the receiver, and with an indicator of the bandwidth or cost of that data. Together with a graph of the dependences between layers, these indicators can be used to select the set of layers that should be transmitted to the receiver to maximize utility and minimize cost. The aforementioned portions of the data can be an entire data stream of a layer, or some part thereof all the way down to the individual packets making up the stream.
The importance or utility associated with the tagged data refers to its benefit to the receiver. For example, the utility could be couched in terms of how much the quality is increased, or the distortion is decreased, or the resolution is improved by the addition of the tagged data. The cost associated with the tagged data refers to the cost to transmit it. For example, the cost could be couched in terms of the bandwidth of the tagged data if it is a stream, or the size in bits of the tagged data if it is a packet. For the purposes of the description that is to follow, the tagged data will be assumed to be a single packet.
Once the packet of each layer has been tagged with indicators of its contributions to the overall utility that can be realized and the overall cost that can be incurred should the packet be transmitted, all that is left to do is to determine which combination of layers provides the greatest overall utility within the overall cost constraint associated with the time period it will take to transmit the selected combination of layers. It is noted that the utility and cost contribution indicators associated with the packets that are to be transmitted concurrently are additive. Thus, the overall utility of a set of packets being considered for transmission is simply the sum of the utility contribution indicators associated with the packets, and the overall cost is simply the sum of the individual costs associated with the packets. Of course, when considering which layers to include in the transmission it must be remembered that some layers are dependent on other layers. Thus, in order to select a particular layer for transmission, packets from all of the layers in the chain of layers that the selected layer is dependent on back to the base layer must be transmitted as well, and so considered in the process of determining the combination of packets that maximize the overall utility, while not exceeding the overall cost of the transmission.
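The additivity of the indicators and the need to include every layer in the dependency chain can be sketched as follows. The layer names, utility values, and cost values are invented for illustration, and `required_layers` and `totals` are hypothetical helper names.

```python
def required_layers(layer, parents):
    """Return layer plus every layer it depends on, directly or indirectly."""
    needed = set()
    stack = [layer]
    while stack:
        current = stack.pop()
        if current not in needed:
            needed.add(current)
            stack.extend(parents.get(current, ()))
    return needed


def totals(layers, utility, cost):
    """Overall utility and cost are plain sums of the per-packet tags."""
    return sum(utility[n] for n in layers), sum(cost[n] for n in layers)


# Hypothetical presentation: two enhancement layers on one base layer,
# plus a further enhancement of the foreground layer.
parents = {"base": [], "fg": ["base"], "bg": ["base"], "fg_hi": ["fg"]}
utility = {"base": 10.0, "fg": 4.0, "bg": 3.0, "fg_hi": 2.0}
cost = {"base": 8.0, "fg": 8.0, "bg": 8.0, "fg_hi": 8.0}

chosen = required_layers("fg_hi", parents)
print(sorted(chosen), totals(chosen, utility, cost))
```

Selecting the hypothetical `fg_hi` layer pulls in `fg` and `base`, so the cost charged against the constraint is that of all three packets.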
In regard to the aforementioned analysis, several different methods could be employed. In order to better understand each of these methods, the previously described hierarchical layer structure and the dependencies between the layers can be represented graphically using a directed acyclic graph (DAG) whose nodes represent the available layers (and particularly the aforementioned packet thereof) and whose links represent the dependencies between the layers. In particular, a link directed from one layer to a second layer represents the direct dependence of the second layer on the first layer. In this case, the first layer is said to be the parent of the second layer, and the second layer is said to be the child of the first layer. A layer may have zero or more children. A layer may also have zero or more parents. A layer with zero parents is said to be a base layer. Otherwise it is said to be an enhancement layer. There must be at least one base layer, but in general there may be more than one base layer. A layer with zero children is said to be a leaf layer. A DAG with one base layer and one leaf layer, in which every layer has at most one parent and at most one child, is said to be sequential. A DAG in which every layer has at most one parent (but may have more than one child) is said to be a tree. Hence a tree is a type of DAG and a sequential DAG is a type of tree. A DAG in which some layer has two or more parents is said to be a multi-dependence DAG. If one layer is an ancestor of a second layer (that is, is its parent, its parent's parent, or so on), then the first layer is said to be at a higher level than the second layer, and the second layer is said to be at a lower level than the first layer. The foregoing DAG representation of a layered unicast or multicast presentation scheme will be employed in the description of the present invention that is to follow.
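The terminology above can be made concrete with a small sketch that classifies a dependency graph. It assumes the graph is supplied as a mapping from each layer to the list of its parents, that every parent also appears as a key, and that the graph is acyclic; the function name `classify` is hypothetical.

```python
def classify(parents):
    """Return (base layers, leaf layers, kind) for a layer dependency DAG.

    parents maps each layer to the list of layers it directly depends on.
    """
    children = {n: [] for n in parents}
    for node, node_parents in parents.items():
        for p in node_parents:
            children[p].append(node)

    bases = [n for n in parents if not parents[n]]     # zero parents
    leaves = [n for n in parents if not children[n]]   # zero children

    if any(len(ps) >= 2 for ps in parents.values()):
        kind = "multi-dependence"
    elif (len(bases) == 1 and len(leaves) == 1
          and all(len(c) <= 1 for c in children.values())):
        kind = "sequential"
    else:
        kind = "tree"
    return bases, leaves, kind


print(classify({"a": [], "b": ["a"], "c": ["b"]}))
print(classify({"a": [], "b": ["a"], "c": ["a"]}))
print(classify({"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}))
```

Because a sequential DAG is also a tree, the sketch reports only the most specific of the three labels.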
In simple cases where there are relatively few nodes to consider, it is possible to simply enumerate all the possible legal combinations of these nodes, and then evaluate each one, choosing the combination having the highest total summed utility among those whose total summed cost is below the cost constraint (such as the anticipated maximum available bandwidth). It is noted that a legal combination is defined as one in which a node is included only if all of its ancestors are also included in the combination.
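For a small graph, the enumeration just described might look like the following sketch. The layer names and numbers are hypothetical, the function name `best_legal_combination` is invented, and the sketch treats a total cost equal to the constraint as acceptable, which is an assumption.

```python
from itertools import combinations


def best_legal_combination(parents, utility, cost, budget):
    """Exhaustively search all legal node sets within the cost budget."""
    nodes = list(parents)
    best, best_utility = frozenset(), 0.0
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            chosen = set(combo)
            # Legal: every chosen node has all of its parents chosen too,
            # which by induction means all of its ancestors are chosen.
            legal = all(set(parents[n]) <= chosen for n in chosen)
            total_cost = sum(cost[n] for n in chosen)
            total_utility = sum(utility[n] for n in chosen)
            if legal and total_cost <= budget and total_utility > best_utility:
                best, best_utility = frozenset(chosen), total_utility
    return best, best_utility


parents = {"base": [], "fg": ["base"], "bg": ["base"], "fg_hi": ["fg"]}
utility = {"base": 10.0, "fg": 4.0, "bg": 3.0, "fg_hi": 2.0}
cost = {"base": 8.0, "fg": 8.0, "bg": 8.0, "fg_hi": 8.0}

print(best_legal_combination(parents, utility, cost, budget=28.8))
```

The search visits every subset of the nodes, so its running time grows exponentially with the number of nodes, which is why it is suited only to simple cases.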
Taking into account the probability that the data associated with each node arrives at the receiver improves the decision as to which packets to send. The probability is factored into the analysis by computing an expected total utility associated with a candidate combination. In general, this entails identifying all the possible results of sending a candidate combination of packets and computing the probability of each result occurring. Based on the anticipated packet loss rate of the network and the amount of error correction data transmitted, the probability of each of the outcomes can be computed via conventional methods. It is noted that there is a total utility associated with each possible outcome. The total expected utility associated with the candidate combination is then computed by multiplying the total utility associated with each possible outcome by the probability that that outcome will occur, and then summing the products of this calculation.
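The expected total utility computation can be sketched by enumerating every arrival outcome. The sketch assumes that packets are lost independently with a single hypothetical loss rate, and that a packet contributes its utility only when it and all of its ancestors arrive; the names and values are illustrative.

```python
from itertools import product


def ancestors_and_self(node, parents):
    """Return node together with every layer it depends on."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def expected_utility(sent, parents, utility, loss_rate):
    """Sum over all arrival outcomes of probability times total utility."""
    sent = list(sent)
    expected = 0.0
    for arrivals in product([True, False], repeat=len(sent)):
        arrived = {n for n, ok in zip(sent, arrivals) if ok}
        prob = 1.0
        for ok in arrivals:
            prob *= (1.0 - loss_rate) if ok else loss_rate
        # A packet is usable only if everything it depends on also arrived.
        usable = [n for n in arrived if ancestors_and_self(n, parents) <= arrived]
        expected += prob * sum(utility[n] for n in usable)
    return expected


parents = {"base": [], "fg": ["base"], "bg": ["base"]}
utility = {"base": 10.0, "fg": 4.0, "bg": 3.0}
print(expected_utility(["base", "fg", "bg"], parents, utility, loss_rate=0.1))
```

With a 10% loss rate the base packet is usable with probability 0.9 and each enhancement packet with probability 0.81, so the result is close to 10(0.9) + 4(0.81) + 3(0.81) = 14.67.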
Many different transmission options are possible, each of which can apply to each of the layers involved in the combination. For example, in addition to being transmitted or not, if a packet is transmitted it can be transmitted immediately or with any prescribed delay. Further, the packet could be transmitted with low or high priority using the network's quality-of-service mechanism, if any, or with high or low loss protection using an error correction mechanism such as described in the aforementioned co-pending application. In fact these are just a few of the many possible transmission options that are sometimes employed. Each of these transmission options will have a unique utility contribution and transmission cost, as well as a different probability of arriving at the receiver on time. Thus, it would be advantageous to consider the various transmission options for each packet in a candidate combination when determining which combination provides the highest total expected utility within the cost constraint. This is accomplished by not just considering all the possible combinations of packets, but considering all the combinations of packets at each of the various prescribed transmission options.
A process of enumerating all the possible combinations, computing the total expected utility for each, and then selecting the combination having the highest expected utility that does not exceed the anticipated cost constraint, could be employed to optimize the transmission.
Alternately, this problem can be solved by the use of a novel process that can determine the optimum combination for any cost constraint, no matter what form the DAG of the layered scheme takes, whether arrival probability is considered, or whether multiple transmission options exist.
The process involves characterizing the foregoing factors as follows: Find the transmission option $\pi_n$ for each node $n$ in the graph $G_{\max}$, which represents the set of all packets that could be transmitted in a given timeframe, such that the expected utility is maximized subject to a cost constraint. The expected utility is the sum over all packets $n$ of the expected increase $E[\Delta U_n \prod_{n' \le n} I_{n'}]$, where the product is over all packets $n'$ on which packet $n$ depends (including itself) (the notation $n' \le n$ means that $n'$ is an ancestor of $n$ in the dependency graph or is equal to $n$), and $I_{n'}$ is a random variable equal to 1 if packet $n'$ arrives on time at its destination and is equal to 0 otherwise. If the packets are transmitted independently, then the expected utility equals $\sum_n \Delta U_n \prod_{n' \le n} (1 - \epsilon(\pi_{n'}))$. Likewise, the cost is the sum over all packets $n$ of $\Delta C_n$ times the transmission option cost $\rho(\pi_n)$, or $\sum_n \Delta C_n \rho(\pi_n)$. Thus, the problem is to find the transmission option $\pi_n$ for each node to maximize $\sum_n \Delta U_n \prod_{n' \le n} (1 - \epsilon(\pi_{n'}))$ subject to a constraint on $\sum_n \Delta C_n \rho(\pi_n)$. Notice that this generalizes the original problem, in which there are only two transmission options: $\pi_n = 0$ (don't transmit) or $\pi_n = 1$ (transmit), for which $\epsilon(0) = 1$, $\epsilon(1) = 0$, $\rho(0) = 0$, and $\rho(1) = 1$. This being said, the process can be described as follows. To describe the algorithm, let $\pi = (\pi_1, \ldots, \pi_N)$ be the vector of transmission options, where $N$ is the number of nodes in the graph. It is sought to maximize the expected utility $U(\pi) = \sum_n \Delta U_n \prod_{n' \le n} (1 - \epsilon(\pi_{n'}))$, subject to a constraint on the cost $C(\pi) = \sum_n \Delta C_n \rho(\pi_n)$.
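The expected utility and cost of a given vector of transmission options can be evaluated directly from these sums. The sketch below assumes independent transmission and uses the two-option special case (0 for don't transmit, 1 for transmit); the layer names and the utility and cost values are hypothetical, as are the function names.

```python
from math import prod


def ancestors_and_self(node, parents):
    """Return node together with every layer it depends on."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def expected_utility(options, parents, delta_u, loss):
    """U = sum over n of delta_u[n] * product over n' <= n of (1 - loss)."""
    return sum(
        delta_u[n]
        * prod(1.0 - loss[options[m]] for m in ancestors_and_self(n, parents))
        for n in parents
    )


def total_cost(options, delta_c, rho):
    """C = sum over n of delta_c[n] * rho(option chosen for n)."""
    return sum(delta_c[n] * rho[options[n]] for n in delta_c)


parents = {"base": [], "fg": ["base"], "bg": ["base"], "fg_hi": ["fg"]}
delta_u = {"base": 10.0, "fg": 4.0, "bg": 3.0, "fg_hi": 2.0}
delta_c = {"base": 8.0, "fg": 8.0, "bg": 8.0, "fg_hi": 8.0}
loss = {0: 1.0, 1: 0.0}  # loss probability for each option
rho = {0: 0.0, 1: 1.0}   # cost multiplier for each option

options = {"base": 1, "fg": 1, "bg": 0, "fg_hi": 0}
print(expected_utility(options, parents, delta_u, loss),
      total_cost(options, delta_c, rho))
```

In this special case the expected utility reduces to the plain sum of the utilities of the transmitted packets whose ancestors are all transmitted.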
Consider the set of points in the Cost-Utility plane, $\{(C(\pi), U(\pi))\}$, where the vector $\pi$ takes on all possible combinations of values. It is sought to find those points $(C(\pi^*), U(\pi^*))$ such that $U(\pi^*) \ge U(\pi)$ for all points $(C(\pi), U(\pi))$ with $C(\pi) \le C(\pi^*)$. Certainly this is satisfied by points $(C(\pi^*), U(\pi^*))$ on the upper convex hull of the set $\{(C(\pi), U(\pi))\}$. For each point $(C(\pi^*), U(\pi^*))$ on this upper convex hull, there exists a Lagrange multiplier $\lambda > 0$ such that $J_\lambda(\pi^*) \ge J_\lambda(\pi)$ for all $\pi$, where $J_\lambda(\pi) = U(\pi) - \lambda C(\pi)$. Conversely, for each Lagrange multiplier $\lambda > 0$, the point $(C(\pi^*), U(\pi^*))$ satisfying $J_\lambda(\pi^*) \ge J_\lambda(\pi)$ for all $\pi$ lies on the upper convex hull. Thus, by restricting the problem to this upper convex hull, the original problem can be solved by finding the $\pi$ maximizing the Lagrangian $J_\lambda(\pi) = U(\pi) - \lambda C(\pi) = \sum_n [\Delta U_n \prod_{n' \le n} (1 - \epsilon(\pi_{n'})) - \lambda \Delta C_n \rho(\pi_n)]$.
The approach to solving this problem is based on the method of alternating variables for multivariate minimization. The objective function $J_\lambda(\pi_1, \ldots, \pi_N)$ is maximized one variable at a time, keeping the other variables constant, until convergence. To be precise, let $\pi^{(0)}$ be any initial vector of transmission options and let $\pi^{(t)} = (\pi_1^{(t)}, \ldots, \pi_N^{(t)})$ be determined for $t = 1, 2, \ldots$, as follows. Select one component $n(t)$ in $\{1, \ldots, N\}$ to optimize at step $t$. This can be done round-robin style, e.g., $n(t) = ((t - 1) \bmod N) + 1$.
Then for $n \ne n(t)$ let $\pi_n^{(t)} = \pi_n^{(t-1)}$, while for $n = n(t)$, let

$$
\begin{aligned}
\pi_n^{(t)} &= \arg\max_{\pi} J_\lambda\left(\pi_1^{(t)}, \ldots, \pi_{n-1}^{(t)}, \pi, \pi_{n+1}^{(t)}, \ldots, \pi_N^{(t)}\right) \\
&= \arg\min_{\pi} \left[ S_n^{(t)}\, \epsilon(\pi) + \lambda\, \Delta C_n\, \rho(\pi) \right] \\
&= \arg\min_{\pi} \left[ \epsilon(\pi) + \lambda'\, \rho(\pi) \right],
\end{aligned}
$$

where
$S_n^{(t)} = \sum_{n' \ge n} \Delta U_{n'} \prod_{n'' \le n',\, n'' \ne n} (1 - \epsilon(\pi_{n''}^{(t)}))$, and
$\lambda' = \lambda \Delta C_n / S_n^{(t)}$.
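The step from the maximization to the minimization can be spelled out as follows. This is a sketch of the reasoning under the assumption that $S_n^{(t)} > 0$, so that dividing by it does not change the minimizer; "const" collects the terms that do not depend on the option chosen for node $n$.

```latex
% Only the terms with n' >= n involve the option \pi chosen for node n.
% Factoring (1 - \epsilon(\pi)) out of each such product gives
\begin{aligned}
J_\lambda(\ldots, \pi, \ldots)
  &= (1 - \epsilon(\pi)) \sum_{n' \ge n} \Delta U_{n'}
     \prod_{n'' \le n',\, n'' \ne n} \bigl(1 - \epsilon(\pi_{n''}^{(t)})\bigr)
     - \lambda\, \Delta C_n\, \rho(\pi) + \text{const} \\
  &= -\bigl[ S_n^{(t)}\, \epsilon(\pi) + \lambda\, \Delta C_n\, \rho(\pi) \bigr]
     + S_n^{(t)} + \text{const}.
\end{aligned}
% Maximizing J_\lambda over \pi therefore minimizes the bracketed term,
% and dividing that term by S_n^{(t)} > 0 yields \epsilon(\pi) + \lambda' \rho(\pi).
```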
The minimization of $\epsilon(\pi) + \lambda' \rho(\pi)$ over an individual transmission option $\pi$ can be performed by a conventional exhaustive search technique or by some other method.
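The iteration as a whole can be sketched as below. This is an illustrative sketch rather than the described implementation: it evaluates the Lagrangian directly for each candidate option instead of using the equivalent reduced form, it starts from an arbitrary initial vector, and the two-layer example, the 10% loss value, and all function names are hypothetical. The sketch makes no claim about reaching a global optimum from every starting point.

```python
from math import prod


def ancestors_and_self(node, parents):
    """Return node together with every layer it depends on."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def lagrangian(options, parents, delta_u, delta_c, loss, rho, lam):
    """J = U - lam * C for one vector of transmission options."""
    value = 0.0
    for n in parents:
        arrive = prod(1.0 - loss[options[m]]
                      for m in ancestors_and_self(n, parents))
        value += delta_u[n] * arrive - lam * delta_c[n] * rho[options[n]]
    return value


def optimize(parents, delta_u, delta_c, loss, rho, lam, max_sweeps=100):
    """Maximize J one component at a time until a full sweep changes nothing."""
    first_option = next(iter(loss))
    options = {n: first_option for n in parents}  # arbitrary initial vector
    for _ in range(max_sweeps):
        changed = False
        for n in parents:  # round-robin over the components
            def score(opt, node=n):
                trial = {**options, node: opt}
                return lagrangian(trial, parents, delta_u, delta_c,
                                  loss, rho, lam)
            best = max(loss, key=score)  # exhaustive search over options
            if score(best) > score(options[n]):
                options[n] = best
                changed = True
        if not changed:
            break
    return options


parents = {"base": [], "enh": ["base"]}
delta_u = {"base": 10.0, "enh": 4.0}
delta_c = {"base": 1.0, "enh": 1.0}
loss = {0: 1.0, 1: 0.1}  # option 0: do not send; option 1: send
rho = {0: 0.0, 1: 1.0}

print(optimize(parents, delta_u, delta_c, loss, rho, lam=1.0))
print(optimize(parents, delta_u, delta_c, loss, rho, lam=4.0))
```

Sweeping the multiplier traces out different operating points: in this example a small value of `lam` selects both packets, while a larger value keeps only the base packet.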