It is common in conferencing systems today to have multiple streams with varying delay requirements traversing the same paths. For example, consider the use case of a video conferencing presentation. In such a case, there may be at least three different media streams: the actual presentation stream (e.g., presentation slides), a video stream (e.g., a webcam feed of the presenter), and an associated audio stream (e.g., the captured voice of the presenter).
Different types of media may have different delay requirements. For example, audio and video streams may have much tighter delay requirements than a slide presentation stream; notably, a conferencing participant may not even notice a slight delay in the presentation stream. Furthermore, even between the video and audio of the same visual session, the audio data may have a tighter delay constraint than the associated video stream. For example, displaying a video frame slightly late may be imperceptible to the user, while a gap in the audio of a speaker can be highly distracting.