The advent and maturing of Internet technology over the last few decades have fundamentally changed the landscape of the information technology (IT) industry. The success and popularity of (mostly Ethernet based) Internet Protocol (IP) networks have made this technology the prime architectural choice in most IT environments. Central mainframe computers have in most cases been replaced by distributed client-server architectures connected by very powerful IP networks.
This technology is steadily being introduced in other industries as well. Although adoption and, above all, acceptance of these new technologies initially occurred rather slowly in the media world, IP based architectures are now fully accepted as the standard solution for file based media production and have drastically changed the way broadcasters operate and function internally.
FIG. 1 illustrates how broadcasters have long been operating in a sequential video tape based workflow model. Since video tapes were used as the physical carrier of the essence material, the person in possession of the video tape had sole access to the material. Distributing copies of the same material to other people had to be accomplished by playing out the video tape in real time and recording it on one or more other tape recorders. In order to perform all tasks in the production workflow, the tasks were executed in a sequential or linear way, each time handing over the result, stored on video tape, to the next workstation. Metadata was typically passed along with the video cassette on small paper slips or attached post-its. This led to long lead times in production, or to deadlines well before the moment of broadcasting on antenna.
In recent years broadcasters have finally started to embrace Internet technology in their production back-office, leading to a collaborative workflow model. Applying an ICT based infrastructure and IP networks as means of transport, in particular in video/media production, introduces a number of substantial possible benefits, facilitating the fundamental shift from traditional tape-based video manipulation to a file-based production paradigm. This technology leap enables video to be treated, processed, stored and transported as ordinary files independent of the video format, instead of as the continuous streams used by the classical media technology of today. Among the most profound technology changes are:
IP-network based access and transport of the media
Central disk-based media storage
Server-based (non-linear) video editing or processing
Software-based media management and media production systems
Together with the appearance of standards like MXF (Material eXchange Format) and AAF (Advanced Authoring Format), which provide for a generic file-container for the media essence, these changes lead to the file-based paradigm of media essence. As a practical consequence, this brought some of the major broadcasters to their ‘tape-less’ TV production vision. This idea is further supported by the appearance of camera devices with storage facilities other than the traditional videotapes, e.g. optical disks (Sony) or Solid State memory cards (Panasonic).
Typically, camera crews now enter the facilities with their video stored as ordinary files on memory cards instead of on video tape. The memory cards are put into ingest stations, e.g. ordinary PCs, and the files are transferred as fast as possible, preferably faster than real time, into a central disk based storage system. Once the material is stored in the central system, everybody can access it simultaneously. Most video operations, such as visioning, video selection, high resolution editing in postproduction, sonorisation, subtitling, annotation and even archiving can be performed in parallel without different people having to wait for each other. This leads in principle to a much more efficient workflow, provided this new workflow and the underlying data flows make optimal use of the benefits and opportunities created by this architecture. Production lead times should become shorter and deadlines should be closer to the moment of broadcasting on antenna.
A typical example of such a file based infrastructure is schematically depicted in FIG. 2. On the left hand side, the video sources are depicted. These include uploads from old tape based archives, real-time feeds, tape based inputs (video players), file based camera inputs and production servers in studios. At the top, one finds the non-linear, file-based, high resolution software-based editing suites, either local or at a remote site. On the right hand side, several play-out channels are listed, including classical linear broadcasting and internet web farms. A web farm is a group of web servers serving different web sites, whereby each web site is administered by its own webmaster and centralized monitoring provides load balancing and fault tolerance. All these peripheral systems are connected via a central IP network with the central infrastructure, which provides a massive central storage warehouse for different flavours of the media, e.g. high resolution video, low resolution proxy video and audio. Additional central services such as trans-coding and backup/archiving are also included. All media essence is managed by a central Media Asset Management (MAM) system.
An example of a simple workflow may be the ingest of a video clip from a file based camera into the central storage and the selection of the material and transport to the high resolution editing work centre. The workflow consists of essentially 3 steps:
The material is transferred from the memory card of the file-based camera into the central storage system.
A low resolution proxy is created so that any journalist can view the material and select the relevant clips. The journalist creates an editing decision list (EDL) to mark his selection.
The system uses this EDL to transport the selected pieces of material to the non-linear file-based editing suite. There the journalist together with the professional editing technician performs the editing and creates the result again as a media file, or multiple media files.
In the earlier tape-based linear era this workflow would also consist of three consecutive steps:
The camera crew comes in with the media material on a video tape and hands the tape over to one of the journalists.
The journalist views the tape to select the relevant shots and notes the time codes.
The journalist takes the original tape to the tape-editing work centre where a professional editing technician spools the tape back and forth to the noted time codes and records the resulting video, together with some effects, back on a final new video tape.
Note that in the linear tape-based workflow, the steps have to be executed one after the other and only one person can use the material at a time. In the non-linear flow, material can be accessed by multiple persons. As soon as the first frames of the material are being transferred to the central storage, the creation of the first frames of the low resolution version is started and any journalist can start viewing the low resolution proxy.
The moment the first selection is made, the corresponding high resolution material can already be transferred to the editing suite, where the final editing can start.
When this workflow is translated into the data transfers required to execute it for each ingested clip, the resulting actual data flow is clearly much more complex than the simple workflow would suggest. There are several reasons for this unexpected complexity. First, file-based media solutions initially appeared in the separate work centres as small islands, e.g. non-linear editing systems with the size of a small workgroup, each with their own small local IP network, their own storage and some servers and clients. Secondly, most of the time the original tape-based linear workflow is simply mimicked on the overall file-based architecture by connecting the file-based work centre solutions together with a central IP network and performing the same linear workflow as before, but now using files instead of tapes. Thirdly, since the file-based architecture of the individual work centres has not been optimized to fit efficiently in an integrated overall file-based workflow, a lot of extra file transfers are required. For example, trans-coding is a separate work centre from the central storage. Hence, format conversion to proxy video requires fetching the video from the central storage, transporting it over the central IP network to the trans-coding engines, trans-coding there locally and transporting the result back again to the central storage. If this trans-coding process consists of several consecutive steps, multiple transfers back and forth over the central IP network are required. Furthermore, different work centres from different vendors require different patterns of MXF file wrappers, i.e. different ways in which media is packed into a file or multiple files. For example, the camera stores video and audio each in a separate file, the media asset management system requires the media to be stored in one single file on the central storage, and the editing work centre again only deals with separate video and audio files.
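The trans-coding round trips described above can be made concrete with a small calculation. The following Python sketch is illustrative only: it assumes that every trans-coding step fetches its input from the central storage and writes its output back, and the function names and size ratios are hypothetical.

```python
def central_network_transfers(transcode_steps: int) -> int:
    """Transfers over the central network for a chain of trans-coding steps.

    Assumption: every step fetches its input from central storage and writes
    its output back, so each step costs two transfers.
    """
    if transcode_steps < 0:
        raise ValueError("transcode_steps must be non-negative")
    return 2 * transcode_steps


def bytes_over_central_network(clip_bytes: float, size_ratios: list[float]) -> float:
    """Total bytes crossing the central network for the whole chain.

    size_ratios[i] is the (hypothetical) output size of step i relative to
    its input; the output of one step is the input of the next.
    """
    total = 0.0
    current = float(clip_bytes)
    for ratio in size_ratios:
        produced = current * ratio
        total += current + produced  # fetch the input, then return the output
        current = produced
    return total
```

Under these assumptions, a chain of three consecutive steps already causes six transfers over the central IP network for a single clip.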
Using IT technology like disk-based storage requires additional operations typically not used in a classical video tape-based broadcast environment, such as mirroring a second copy on a separate storage system and making several back-up copies on data tape, either on-line in a tape robot or off-line on a tape shelf. These extra operations mainly have to do with redundancy and/or data protection schemes, such as back-up and recovery, which are standard procedure in the IT industry. Lack of trust between media engineers and IT architects results in storing extra copies of the media, to be extra sure not to lose the essence because of failing technology. Most of the time, the mapping of the workflow onto the actual data flow and the resulting file transfers is done ad hoc and is not well documented by the suppliers of the media work centre solutions or by the overall integrator.
At peak time, when many of the different workflows are executed at the same time, as many as 500 or more simultaneous file transfers may be launched on the central infrastructure. Evidently, this puts a very high and largely unexpected load on the central IP network. Transfers share the bandwidth on some of the links and server interfaces on the network. The network becomes oversubscribed, with mutual interference of different transfers as a consequence. On top of this, media traffic is intrinsically different from IT traffic: the files are much larger and the traffic is burstier. Consequently, an IP network as it is usually designed in a classical IT environment reacts differently to this new kind of traffic, with unexpected delays and even broken transfers as a result. Packet loss is far less acceptable when dealing with media files. Whereas a slow e-mail is still an e-mail, a slow video is no longer a video.
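The effect of many transfers sharing one link can be illustrated with simple arithmetic. The Python sketch below assumes an idealised equal split of the link capacity among the transfers; the link capacity and per-transfer demand used in the note that follows are hypothetical values, not figures from the infrastructure described here.

```python
def fair_share_bps(link_capacity_bps: float, concurrent_transfers: int) -> float:
    """Bandwidth per transfer if the link is split equally (an idealisation)."""
    if concurrent_transfers <= 0:
        raise ValueError("concurrent_transfers must be positive")
    return link_capacity_bps / concurrent_transfers


def oversubscription_ratio(demands_bps: list[float], link_capacity_bps: float) -> float:
    """Total offered demand divided by link capacity; above 1.0 means oversubscribed."""
    if link_capacity_bps <= 0:
        raise ValueError("link_capacity_bps must be positive")
    return sum(demands_bps) / link_capacity_bps
```

For instance, 500 simultaneous transfers on an assumed 10 Gb/s link leave 20 Mb/s per transfer under an equal split, and if each transfer were to demand 50 Mb/s the link would be oversubscribed by a factor of 2.5.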
The need for a system that adequately handles the data transfer requirements of multimedia systems, particularly broadcast quality video, was already recognised in patent document U.S. Pat. No. 6,223,211. It proposes a system wherein the problems of latency, flow control and data loss are overcome and data movement within a client system memory in a distributed multimedia system is minimized so as to enable real-time transmission of broadcast quality media data over the network. Latency is reduced by having the server estimate the client needs. Data loss is prevented and flow control is provided by permitting the server to send only as much information as the network interface can reliably receive. Data movement is minimized by copying data directly from the network interface to memory of input/output devices such as display and audio processors as well as main memory.
In the paper “Packet Spacing: an enabling mechanism for delivering multimedia content in computational grids” (A. C. Feng et al., Journal of Supercomputing, vol. 23, pp. 51-66, 2002) the authors mention that they observed significant packet loss even when the offered load was less than half of the available network bandwidth. This behaviour was found to be due to simultaneous bursts of traffic coming from client applications and overflowing the buffer space in the bottleneck router. Metaphorically, this could be viewed as what happens at a major highway interchange during rush hour, where everyone wants to go home simultaneously at 5 pm, thus “overflowing” the highway interchange. To avoid such a situation, some people regulate themselves by heading home at a different time, i.e., by spacing themselves out from other people. Similarly, a solution is proposed wherein packets are spaced out over time. The inter-packet spacing with control feedback enables UDP-based applications to perform well, as the packet loss can be reduced considerably without adversely affecting the delivered throughput.
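The difference between bursty and spaced traffic at a bottleneck buffer can be illustrated with a toy simulation. The Python sketch below is not the mechanism of the cited paper; it is a minimal discrete-time model under assumed parameters (one packet drained per tick, a fixed buffer size, hypothetical schedule helpers) showing that the same offered load can overflow the buffer when sent in bursts and pass without loss when spaced.

```python
from collections import Counter


def simulate_drops(arrival_ticks: list[int], buffer_size: int, drain_per_tick: int = 1) -> int:
    """Count packets dropped by a bottleneck queue with a finite buffer."""
    arrivals = Counter(arrival_ticks)
    queue = 0
    drops = 0
    last_tick = max(arrival_ticks) if arrival_ticks else -1
    for tick in range(last_tick + 1):
        incoming = arrivals.get(tick, 0)
        accepted = min(incoming, buffer_size - queue)  # the rest overflows
        drops += incoming - accepted
        queue += accepted
        queue = max(0, queue - drain_per_tick)  # bottleneck forwards packets
    return drops


def burst_schedule(senders: int, packets_each: int) -> list[int]:
    """Every sender emits its packets back to back, all starting at tick 0."""
    return [i for _ in range(senders) for i in range(packets_each)]


def paced_schedule(senders: int, packets_each: int, gap: int) -> list[int]:
    """Sender s emits packet i at tick s + i * gap (spaced and staggered)."""
    return [s + i * gap for s in range(senders) for i in range(packets_each)]


if __name__ == "__main__":
    # 4 senders, 10 packets each; both schedules offer 40 packets in total.
    print("burst drops:", simulate_drops(burst_schedule(4, 10), buffer_size=8))
    print("paced drops:", simulate_drops(paced_schedule(4, 10, gap=8), buffer_size=8))
```

In this model the paced schedule keeps the long-term offered load at half of the bottleneck capacity while never presenting more than one packet per tick, which is why no buffer overflow occurs.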
A similar approach was presented in the paper “Performance Optimization of TCP/IP over 10 Gigabit Ethernet by Precise Instrumentation” (Yoshino et al., ACM/IEEE Conference on High Performance Computing, SC2008, November 2008, Austin, Tex., USA), where long-distance large-bandwidth networks are discussed. The paper deals with the difficulties that must be resolved before Long Fat-pipe Networks (LFNs) can be utilized with TCP. One of the identified problems is short-term bursty data transfer. A precise packet analyzer is disclosed that is capable of analyzing the real data transfer over 10 GbE LFNs. In order to avoid packet loss, a packet pacing method is used.
European patent document EP1427137 relates to characterising the behaviour of network devices in a packet based network based on in-line measured delay and packet loss. It describes a measurement architecture to obtain the per-hop one-way delay between input and output interfaces of the same network device. It does so by generating packet headers with associated timestamps at the relevant interfaces and, internally within the same device, comparing the timestamps in the related headers of the same packets. A second measuring strategy is developed to determine the number of lost packets across a network device and between different consecutive network devices. Packets belonging to the same flow are filtered and/or identified in a trace. Packets in the trace are correlated to detect and count missing packets over a certain time interval.
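The correlation of timestamped traces described above can be sketched as follows. This Python fragment is a simplified illustration and not the architecture of EP1427137: it assumes that each trace is available as a mapping from a hypothetical packet identifier to a timestamp in seconds, and that both interfaces share a common clock.

```python
def per_hop_delays(ingress: dict[int, float], egress: dict[int, float]) -> dict[int, float]:
    """One-way delay per packet seen at both the input and output interface."""
    return {pid: egress[pid] - ts for pid, ts in ingress.items() if pid in egress}


def lost_packets(ingress: dict[int, float], egress: dict[int, float]) -> list[int]:
    """Identifiers seen at the input interface but missing from the output trace."""
    return sorted(pid for pid in ingress if pid not in egress)
```

For example, with an input trace `{1: 0.0, 2: 0.5, 3: 1.0}` and an output trace `{1: 0.25, 3: 1.5}`, packet 2 is reported as lost and delays are obtained only for packets 1 and 3.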
WO97/37310 discloses a method for monitoring the performance along a virtual connection in an ATM network. In ATM this performance is mainly expressed in terms of acceptable cell delay and acceptable cell loss rate. In the method, a new type of management packet is inserted into the path of the virtual connection, taking the same route as the actual data packets, so that these extra management packets experience the same QoS as the data packets. Information about both packet loss and packet delay is measured at the individual nodes and gathered in the fields of the payload, thereby changing the payload of the management packets. At the end point of the virtual connection, both the individual node delay/packet loss and the cumulative delay/packet loss can be extracted from the resulting payload. The intention is to allow in-service monitoring to be used to accurately define the QoS actually seen by users and to detect exact causes of performance deterioration before users are seriously affected.
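The gathering of per-node measurements in the payload of a management packet can be sketched as a small data structure. The following Python fragment is only an illustration of the idea; the class name, field layout and units are assumptions and do not reflect the actual cell format of WO97/37310.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementPacket:
    """Hypothetical management packet whose payload grows at every node."""
    records: list[tuple[str, int, int]] = field(default_factory=list)

    def record(self, node: str, delay_us: int, lost: int) -> None:
        """Called at each node: append (node, measured delay, lost packets)."""
        self.records.append((node, delay_us, lost))

    def cumulative(self) -> tuple[int, int]:
        """At the end point: total delay and total loss along the path."""
        total_delay = sum(delay for _, delay, _ in self.records)
        total_lost = sum(lost for _, _, lost in self.records)
        return total_delay, total_lost
```

Because the individual records are retained, the end point can read both the per-node values and the cumulative totals from the same payload.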
The paper “Adaptive bandwidth control for efficient aggregate QoS provisioning” (Siripongwutikorn et al., Globecom'02, November 2002, pp. 2435-2439) proposes an adaptive bandwidth control algorithm to provide QoS defined by guaranteed aggregated packet loss and optimal bandwidth allocation. Its basic assumption is that static bandwidth allocation fails due to the lack of a detailed characterisation of the input aggregate traffic, which is most of the time specified only roughly in terms of the long-term average rate or peak rate. In the absence of such a characterisation, it describes as an alternative an adaptive fuzzy feedback mechanism that adapts the allocated bandwidth, or service rate, to maintain an average queue length which indirectly achieves the target ‘short term’ loss requirement. It thereby only needs the long-term average arrival rate of the input traffic. However, in order to avoid bandwidth thrashing and to keep the feedback control mechanism stable, the solution is limited to control timescales in the order of 1 second or longer.
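The principle of adapting the service rate to hold an average queue length at a target can be sketched with a simple feedback rule. Note that the Python fragment below deliberately substitutes a plain proportional controller for the fuzzy controller of the cited paper; the gain, the rate limits and the function name are assumptions chosen for illustration.

```python
def adapt_service_rate(rate: float, avg_queue: float, target_queue: float,
                       gain: float, min_rate: float, max_rate: float) -> float:
    """One control step: raise the rate when the queue is above target, lower it otherwise.

    A proportional rule is used here as a stand-in for the fuzzy controller.
    """
    if target_queue <= 0:
        raise ValueError("target_queue must be positive")
    relative_error = (avg_queue - target_queue) / target_queue
    new_rate = rate * (1.0 + gain * relative_error)
    return max(min_rate, min(max_rate, new_rate))  # keep within allowed bounds
```

In such a loop the function would be invoked once per control interval, which in the cited paper is in the order of 1 second or longer.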
The paper “Controlling Short-Term Packet Loss Ratios Using an Adaptive Pushout Scheme” (Nananukul et al.; Proceedings of IEEE conf 2000 Heidelberg Germany, June, pp. 49-54) describes a theoretical model to characterize a packet loss process accurately. It operates under the assumption that QoS requirements can be classified into packet delay requirements and packet loss requirements. It proposes to include short-term packet loss ratios and maximum packet loss ratio on top of long-term packet loss ratio to define a deterministic loss model. It then proposes an adaptive pushout scheme to selectively drop packets at a single network node which allows this single network node to monitor and control the packet loss process at this network node so that the flow locally conforms to the loss model. Simulations were performed at very low service rates (64 kbps) to validate the results.
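The distinction between long-term and short-term packet loss ratios used in such a loss model can be illustrated numerically. The Python sketch below only computes the two metrics over a recorded sequence of per-packet outcomes; it does not implement the adaptive pushout scheme, and the window length is an assumed parameter.

```python
def long_term_loss_ratio(outcomes: list[int]) -> float:
    """Fraction of lost packets over the whole sequence (1 = lost, 0 = delivered)."""
    if not outcomes:
        raise ValueError("outcomes must not be empty")
    return sum(outcomes) / len(outcomes)


def worst_short_term_loss_ratio(outcomes: list[int], window: int) -> float:
    """Highest loss ratio observed in any run of `window` consecutive packets."""
    if not 0 < window <= len(outcomes):
        raise ValueError("window must be between 1 and len(outcomes)")
    lost = sum(outcomes[:window])
    worst = lost
    for i in range(window, len(outcomes)):
        lost += outcomes[i] - outcomes[i - window]  # slide the window by one packet
        worst = max(worst, lost)
    return worst / window
```

A sequence such as `[0, 0, 1, 1, 0, 0, 0, 0]` has a long-term loss ratio of 0.25 but a worst short-term ratio of 0.5 over windows of four packets, which is why a long-term figure alone can hide bursts of loss.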