The art and popularity of digital video have grown significantly in recent years. Digital video, which represents and stores video signals in a digital format, may provide various benefits over analog video. For example, digital video may provide improved video quality over analog video. As another example, digital video may be easier to store, reproduce, and transport over a network such as the Internet. As yet another example, digital video may be easier to search, edit, and/or analyze. Many other benefits exist as well. Because of these benefits, digital video is becoming the preferred solution for video surveillance.
Most digital video formats use video compression to reduce the quantity of data used to represent captured video content, thus reducing network resources required to transport the video content. Captured video content is basically a three-dimensional array of color pixels. Two dimensions serve as spatial (i.e., horizontal and vertical) directions of the video content, and one dimension represents the time domain. A frame is a set of all pixels that correspond to a single point in time, and is basically equivalent to a still picture. Typically, a large amount of data in a video frame is unnecessary for achieving good perceptual quality. As such, video compression is typically lossy, meaning that at least some data from a video frame is discarded during compression.
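By way of illustration only, the three-dimensional arrangement described above may be sketched as a nested structure indexed by time, vertical position, and horizontal position. The dimensions, the pixel values, and the names `make_video` and `raw_size_bytes` are hypothetical and are chosen only to keep the example small; three bytes per pixel (one per color channel) is likewise an assumption.

```python
# Hypothetical toy dimensions; real captured video is far larger.
WIDTH, HEIGHT, NUM_FRAMES = 4, 3, 5


def make_video(num_frames, height, width):
    """Build video[t][y][x] -> (r, g, b), with one 8-bit value per channel."""
    return [
        [[((x + t) % 256, y % 256, t % 256) for x in range(width)]
         for y in range(height)]
        for t in range(num_frames)
    ]


def raw_size_bytes(video):
    """Uncompressed size, assuming 3 bytes per pixel."""
    return sum(3 * len(row) for frame in video for row in frame)


video = make_video(NUM_FRAMES, HEIGHT, WIDTH)

# A frame is the set of all pixels for a single point in time.
frame = video[2]

print(len(video), len(frame), len(frame[0]))  # time, vertical, horizontal
print(raw_size_bytes(video))                  # raw bytes before compression
```

Even at these toy dimensions the raw size grows with the product of all three dimensions, which is why compression is generally applied before transport.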
Further, captured video content often contains spatial and temporal redundancy from one frame to another. As such, the captured video content may be encoded by registering the similarities and/or differences between frames. In this respect, video compression standards may encode the captured video content using different frame types, known generally as intraframes and interframes, that indicate the encoding of the frames. For example, MPEG standards may encode captured video content using intraframes known as intra-coded frames (I-frames) and interframes known as predictive-coded frames (P-frames) and bidirectionally-predictive-coded frames (B-frames).
Intraframes are simply compressed versions of uncompressed raw frames of the captured video content. As such, intraframes are encoded without reference to other frames in the captured video content, and intraframes typically require more data to encode than other frame types. Oftentimes, intraframes are necessary when differences between frames in video content make it impractical to reference other frames in the video content, such as when significant movement occurs in the captured video content.
Interframes take advantage of the redundancy between frames and basically encode the differences between a current frame and preceding and/or subsequent frames in captured video content. As such, interframes reference other frames in the video content, and require prior decoding of the referenced frames before the interframes can be decoded. Interframes may contain frame data for the captured video content and/or motion vector displacements representing frame differences. Interframes typically require less data to encode than intraframes, because interframes copy data from other encoded frames. In this respect, the size of interframes is typically proportional to an amount of difference between the frames in captured video content.
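As a simplified sketch of the interframe concept described above, a frame may be encoded as a list of only those pixels that differ from a reference frame. This is not the encoding used by any particular standard, which would typically also involve motion compensation and transform coding; the function names and the one-dimensional pixel lists are hypothetical and serve only to show that the encoded size tracks the amount of difference and that decoding requires the reference frame.

```python
def encode_interframe(reference, current):
    """Record only the (index, value) pairs that differ from the reference."""
    return [
        (i, cur)
        for i, (ref, cur) in enumerate(zip(reference, current))
        if ref != cur
    ]


def decode_interframe(reference, deltas):
    """Rebuild the frame; the reference must already have been decoded."""
    frame = list(reference)
    for i, value in deltas:
        frame[i] = value
    return frame


# Hypothetical 8-pixel frames, flattened to one dimension for brevity.
reference = [10, 10, 10, 10, 10, 10, 10, 10]
small_change = [10, 10, 99, 10, 10, 10, 10, 10]
large_change = [1, 2, 3, 4, 5, 6, 7, 10]

small_deltas = encode_interframe(reference, small_change)
large_deltas = encode_interframe(reference, large_change)

print(len(small_deltas), len(large_deltas))
print(decode_interframe(reference, small_deltas) == small_change)
```

In this sketch the frame with little change encodes to a single entry, whereas the frame with extensive change encodes to nearly as many entries as there are pixels, at which point an intraframe may be the more practical choice.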
The different frame types may then be arranged in “groups of pictures” (GOPs), each of which typically begins with an intraframe and ends just before the next intraframe. One common arrangement for a GOP according to an MPEG standard is the fifteen-frame sequence IBBPBBPBBPBBPBB. One or more consecutive GOPs may then form an encoded video stream. As can be readily seen, the amount of data, and thus the data rate, of the encoded video stream will vary depending on the frame types used to encode the captured video content. For example, if the video stream includes a higher rate of intraframes (e.g., if there is frequent movement in the captured video content), the video stream may have a higher data rate. Alternatively, if the video stream includes a lower rate of intraframes (e.g., if there is only infrequent movement in the captured video content), the video stream may have a lower data rate.
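The relationship between frame types and data rate may be illustrated with a rough calculation. The per-frame sizes and the frame rate below are hypothetical values chosen only so that intraframes are the largest and B-frames the smallest; actual sizes depend on the content and the encoder. The shorter sequence `IBBPBB` is likewise an assumed example of a stream with a higher rate of intraframes.

```python
from collections import Counter

# Hypothetical average encoded sizes per frame type, in kilobytes.
FRAME_SIZE_KB = {"I": 100, "P": 40, "B": 15}
FPS = 30  # assumed frame rate


def gop_size_kb(gop):
    """Total encoded size of one GOP under the assumed frame sizes."""
    return sum(FRAME_SIZE_KB[frame_type] for frame_type in gop)


def data_rate_kb_per_s(gop, fps=FPS):
    """Data rate of a stream formed by repeating the given GOP."""
    return gop_size_kb(gop) * fps / len(gop)


common_gop = "IBBPBBPBBPBBPBB"  # the fifteen-frame sequence from the text
short_gop = "IBBPBB"            # assumed: an intraframe every six frames

print(Counter(common_gop))
print(data_rate_kb_per_s(common_gop))
print(data_rate_kb_per_s(short_gop))
```

Under these assumptions the stream with more frequent intraframes has the higher data rate, consistent with the description above.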
As noted above, one benefit of digital video is that it can be transported over a network more efficiently than analog video. For example, digital video may be sent over a network using the well-known TCP/IP protocol suite. In this respect, video content from multiple remote locations may be aggregated at a central network entity by placing network-enabled video capture devices at the remote locations and then sending the video content from the video capture devices via the network to the central network entity. In one example, the central network entity may be a storage device, such as a network server, that archives the video content. Additionally, the central network entity may be connected to a client station, such as a personal computer, that is arranged to receive and display the video content. In this respect, the video content may be streaming video content that the client station receives and displays in real-time.
The communication links between the remote locations and the network, which carry video content to the network, may take a variety of forms. For example, the communication links may be wireline links, such as T1 and/or E1 lines. As another example, the communication links may be wireless links, such as satellite links. Other examples are possible as well.
Unfortunately, these communication links, which may be referred to as “backhaul” links, often suffer from limited capacity. In turn, the limited capacity of the backhaul may limit the quality of service (QoS) provided to video capture devices at a given remote location, thus impacting the quality and/or quantity of video content sent from video capture devices to the network. This is especially a concern with streaming video, which is highly time-sensitive. Further, the limited capacity of backhaul links is often divided equally between video capture devices in a fixed manner, and thus does not efficiently track the varying data rates of video streams produced by the video capture devices. Further yet, the limited capacity of backhaul links is often expensive. Accordingly, there is a need for a system and/or method that efficiently manages the limited capacity of a communication link connected to video capture devices at remote locations.
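The inefficiency of a fixed, equal division of backhaul capacity may be illustrated with a small calculation. The link capacity, the per-device data rates, and the function name `equal_split` are hypothetical and are used only to show the problem described above, not any particular solution.

```python
def equal_split(capacity_kbps, demands_kbps):
    """Divide capacity equally and report the mismatch for each device."""
    share = capacity_kbps / len(demands_kbps)
    return [
        {
            "demand": demand,
            "share": share,
            "unused": max(share - demand, 0),     # capacity left idle
            "shortfall": max(demand - share, 0),  # demand that cannot be met
        }
        for demand in demands_kbps
    ]


# Hypothetical backhaul capacity and instantaneous stream data rates.
capacity_kbps = 1500
demands_kbps = [200, 900, 300]

report = equal_split(capacity_kbps, demands_kbps)
for entry in report:
    print(entry)
```

In this example the aggregate demand is below the link capacity, yet the second device cannot send its full stream while the shares assigned to the other two devices sit partly idle.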