Before media data can be transmitted using the Real-time Transport Protocol (RTP), it has to be packetized according to rules specific to the media type. For example, RFC 2250 defines the RTP payload format for MPEG-1 and MPEG-2 data. To avoid parsing the same file repeatedly, the data can be packetized once and stored for future use. The QuickTime file format uses “hint” tracks for this purpose.
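As an illustration of packetizing once and storing the result, the following minimal sketch prepends the 12-byte fixed RTP header (layout per RFC 3550) to payload fragments that have already been split. The function names `build_rtp_header` and `prepacketize` are hypothetical, the fragments are assumed to belong to a single video frame, and the MPEG video-specific header that RFC 2250 places after the RTP header is omitted for brevity.

```python
import struct

RTP_VERSION = 2
MPV_PAYLOAD_TYPE = 32  # static RTP payload type for MPEG-1/MPEG-2 video


def build_rtp_header(payload_type, sequence_number, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no padding, no extension, no CSRCs)."""
    first_byte = RTP_VERSION << 6
    second_byte = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        sequence_number & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def prepacketize(fragments, timestamp, ssrc, first_sequence_number=0):
    """Build the packets for one frame once so they can be stored and reused.

    Assumption: all fragments share one timestamp, and the marker bit is set
    only on the last packet of the frame.
    """
    packets = []
    for i, fragment in enumerate(fragments):
        is_last = i == len(fragments) - 1
        header = build_rtp_header(
            MPV_PAYLOAD_TYPE, first_sequence_number + i, timestamp, ssrc, marker=is_last
        )
        packets.append(header + fragment)
    return packets
```

A server that stores the output of such a step can later send the packets without reparsing the media file, which is the role the hint tracks play in the QuickTime file format.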
The QuickTime file format was designed for local playback, and does not perform well in streaming applications. The format is non-linear, so gathering the data for a single RTP packet requires several seek operations within each file. The time-to-sample, sample-to-chunk, chunk-to-offset, sample size, and hint sample offset tables within the metadata all have to be consulted before the actual media data can be read. These operations make very inefficient use of the system caches. Caching adds a further cost: as a cached file grows, various tables within the metadata must be constantly updated. The metadata structures therefore need to be kept in memory, and cannot be saved to disk until each caching session ends. Such metadata is usually 1-2% of the media data in size, so caching multiple large files can quickly make RAM itself a bottleneck.
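The chain of table lookups described above can be sketched as follows. This is a simplified model rather than a QuickTime parser: the tables are plain Python lists standing in for the time-to-sample, sample-to-chunk, chunk-to-offset, and sample size tables, the hint sample offset table is left out, and the names `sample_at_time`, `chunk_of_sample`, and `locate_sample` are hypothetical.

```python
def sample_at_time(time_to_sample, media_time):
    """time_to_sample holds (sample_count, sample_duration) runs; returns a 0-based sample index."""
    sample, elapsed = 0, 0
    for count, duration in time_to_sample:
        run_duration = count * duration
        if media_time < elapsed + run_duration:
            return sample + (media_time - elapsed) // duration
        sample += count
        elapsed += run_duration
    raise ValueError("media_time is past the end of the track")


def chunk_of_sample(sample_to_chunk, chunk_count, sample):
    """sample_to_chunk holds (first_chunk, samples_per_chunk) runs with 1-based chunk numbers.

    Returns (0-based chunk index, index of the first sample in that chunk).
    """
    first_sample = 0
    for i, (first_chunk, per_chunk) in enumerate(sample_to_chunk):
        # A run covers every chunk up to the one before the next run's first chunk.
        if i + 1 < len(sample_to_chunk):
            next_first_chunk = sample_to_chunk[i + 1][0]
        else:
            next_first_chunk = chunk_count + 1
        samples_in_run = (next_first_chunk - first_chunk) * per_chunk
        if sample < first_sample + samples_in_run:
            chunk_in_run = (sample - first_sample) // per_chunk
            return first_chunk - 1 + chunk_in_run, first_sample + chunk_in_run * per_chunk
        first_sample += samples_in_run
    raise ValueError("sample index is past the end of the track")


def locate_sample(tables, media_time):
    """Consult all four tables to find the (file offset, size) of the sample at media_time."""
    sample = sample_at_time(tables["time_to_sample"], media_time)
    chunk, first_in_chunk = chunk_of_sample(
        tables["sample_to_chunk"], len(tables["chunk_offsets"]), sample
    )
    # Skip over the samples that precede the target inside the same chunk.
    offset = tables["chunk_offsets"][chunk] + sum(tables["sample_sizes"][first_in_chunk:sample])
    return offset, tables["sample_sizes"][sample]
```

Even in this reduced form, every read needs the whole set of tables available before the file offset is known, which is why the metadata has to stay resident in memory while a file is being served or cached.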
The complexity of the QuickTime file format also prevents building lightweight kernel modules for high-performance streaming. What is needed is a file format that can be used as a generic container for both streaming and caching applications.