Various audio and video system manufacturers have attempted to provide a multi-channel networking system of audio and/or video devices, where digital audio can be inserted and extracted at various locations within the network. Typically, such systems have routed digital audio as data in a standard Ethernet switched-packet network. While such approaches take advantage of readily available components, they do not perform adequately for real-time streaming media for a number of reasons.
For example, most switched packet systems require a star topology, where every device is connected to a central “server.” As such, every device requires a separate cable connecting it to the server. This is a sub-optimal configuration, due to cable cost and other considerations, when multiple devices are located in close proximity, but are separated from the server by a great distance.
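The cabling penalty described above can be illustrated with simple arithmetic. The following is a minimal sketch, assuming every device in a star is wired straight back to the server and that a hypothetical daisy-chained alternative needs one long run plus short hops between neighboring devices. The function names and the distances used are illustrative assumptions, not figures from any particular system.

```python
def star_cable_length(n_devices, distance_to_server):
    """Total cable when every device has its own run to the server."""
    return n_devices * distance_to_server


def daisy_chain_cable_length(n_devices, distance_to_server, spacing):
    """Total cable for one long run to the first device plus short hops
    between neighboring devices (hypothetical alternative layout)."""
    if n_devices == 0:
        return 0
    return distance_to_server + (n_devices - 1) * spacing


# Assumed example: 8 devices grouped 10 feet apart, 300 feet from the server.
print(star_cable_length(8, 300))             # 2400
print(daisy_chain_cable_length(8, 300, 10))  # 370
```

Under these assumed figures, the star layout uses more than six times as much cable as the chained layout.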
Additionally, in some cases, it is preferable to have the audio data available at every location within a networking system. Thus, systems have been developed where data flows bi-directionally from an input device. In some cases, it could be desirable to use a star topology where several devices are connected to a “hub” that is centrally located. In order for traditional devices to support bi-directional transfer of audio data, the routing of this data must be handled at a very high application layer, adding delay, or latency, which is undesirable in real-time performance situations. For example, it is not unusual for professional or highly skilled musicians to discern very slight delays in audio processing (as small as 1 ms), which can cause performance problems. Traditional systems that support bi-directional data passing at a lower layer revert to being unidirectional when they encounter a standard Ethernet switch.
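To put the 1 ms figure in perspective, a delay can be expressed as a number of audio sample periods. The sketch below is illustrative only; the function name is hypothetical, and the sample rates are common rates chosen as examples.

```python
def delay_in_samples(delay_us, sample_rate_hz):
    """Convert a delay in microseconds to a count of sample periods."""
    return delay_us * sample_rate_hz / 1_000_000


# A 1 ms (1000 microsecond) delay at several common sample rates.
for rate in (44_100, 48_000, 96_000, 192_000):
    print(rate, delay_in_samples(1000, rate))
```

At 48 kHz, for instance, a 1 ms delay corresponds to only 48 sample periods, which leaves little room for buffering at an application layer.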
Another advantage of bi-directional systems is that the user need not be aware of whether a device is “upstream” or “downstream” from another device. As such, all devices appear equal in the system.
Moreover, conventional Ethernet packets can only be routed from one port to another in a contiguous form. Such a packet cannot add new data from additional ports during transmission to update the data packet. As such, existing digital networking systems either become unidirectional when an Ethernet switch is encountered or add considerable latency as the channel data is merged at the application layer and re-inserted into the networking system.
Existing digital systems support a very narrow range of sample rate clocks or require a dedicated hardware clock signal for system synchronization. If analog audio data is introduced and the conversion to digital is synchronous with the network clock, this solution may be sufficient. However, when digital audio is introduced to the system, the data will be asynchronous with respect to an independent network clock. Therefore, the data must be sample-rate-converted to match the network clock. This is undesirable because (i) the sample rate conversion introduces delay to the signal; (ii) sample rate conversion adds cost to the system; and (iii) sample rate converters have to convert the raw audio data to fit the desired timing by mathematically estimating new sample points that lie between the original samples. Accordingly, particularly for a professional audio application, sample rate conversion can result in an undesirable coloration of the input audio data.
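The estimation of new sample points between the original samples can be illustrated with a deliberately simple converter. The following is a minimal sketch that uses linear interpolation, which is an assumption made for brevity; practical sample rate converters generally use more elaborate filtering, and the function name is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linearly interpolating between neighboring input samples."""
    if len(samples) < 2:
        return list(samples)
    # Count the output samples that fall within the span of the input.
    n_out = (len(samples) - 1) * dst_rate // src_rate + 1
    out = []
    for n in range(n_out):
        # Position of output sample n on the input grid, split exactly
        # into an integer index and a fractional offset.
        index, remainder = divmod(n * src_rate, dst_rate)
        frac = remainder / dst_rate
        if index + 1 < len(samples):
            left, right = samples[index], samples[index + 1]
            out.append(left + frac * (right - left))
        else:
            out.append(samples[index])
    return out


# Input samples taken from the curve y = x * x at x = 0, 1, 2.
print(resample_linear([0.0, 1.0, 4.0], 1, 2))  # [0.0, 0.5, 1.0, 2.5, 4.0]
```

In this example the interpolated values 0.5 and 2.5 differ from the values 0.25 and 2.25 that the underlying curve actually takes at those instants, which shows in miniature how estimated sample points can alter the original signal.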
A conventional professional audio system may require operation at different sample rates. For example, many systems operate at 48 kHz, 96 kHz, and/or 192 kHz. However, audio CDs are mastered at 44.1 kHz. As such, a system that can only operate at 48 kHz must use sample rate converters to obtain digital content from audio CDs. Furthermore, systems that perform video post-production (where raw film is transferred to video, edited for audio content, and transferred back to film) must use “SMPTE pull-up” or “SMPTE pull-down” sample rates that can be as far as +/−4% from the original content rate. (This difference is required to accommodate the difference in frame rates between film (24 frames per second) and video (29.97 frames per second).)
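The rate relationships mentioned above can be worked through with exact arithmetic. The sketch below is illustrative; the function names are hypothetical, and the 4% tolerance is simply the figure quoted above applied to a 48 kHz nominal rate as an example.

```python
from fractions import Fraction


def conversion_ratio(src_rate_hz, dst_rate_hz):
    """Exact ratio of output samples to input samples for a rate change."""
    return Fraction(dst_rate_hz, src_rate_hz)


def rate_window(nominal_hz, tolerance=Fraction(4, 100)):
    """Lowest and highest rates within a fractional tolerance of nominal."""
    return nominal_hz * (1 - tolerance), nominal_hz * (1 + tolerance)


# CD audio (44.1 kHz) brought into a 48 kHz system.
print(conversion_ratio(44_100, 48_000))  # 160/147

# Range a 48 kHz system would need to cover for a +/-4% pull-up or pull-down.
low, high = rate_window(48_000)
print(float(low), float(high))  # 46080.0 49920.0
```

The 160/147 ratio shows that the two rates share no simple relationship, so nearly every output sample falls between two input samples and has to be estimated.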
In addition, some systems generate an audio sample clock based on the rate of transmitted packets. If timing errors are introduced using this approach, such errors can accumulate in devices that are serially connected in the network. As a result, jitter and wander (low-frequency jitter) may be introduced in the packet rate. Accordingly, jitter and wander can also occur in the audio sample rate, which can cause a digital network system to lose sample “lock,” resulting in a loss of audio data.
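The accumulation of timing errors along a chain of devices can be approximated with a simple model. The following is a minimal sketch, assuming each device adds its own timing error when it regenerates the clock, that the errors are independent, and that nothing in the chain filters them out. The function names and the nanosecond figures are hypothetical.

```python
import math


def accumulated_rms_jitter(per_device_rms_ns, n_devices):
    """RMS jitter after n devices when independent errors add in power."""
    return per_device_rms_ns * math.sqrt(n_devices)


def accumulated_peak_jitter(per_device_peak_ns, n_devices):
    """Worst-case jitter after n devices when every error has the same sign."""
    return per_device_peak_ns * n_devices


# Assumed example: 2 ns RMS and 5 ns peak per device, 16 devices in series.
print(accumulated_rms_jitter(2.0, 16))   # 8.0
print(accumulated_peak_jitter(5.0, 16))  # 80.0
```

Under this assumed model, the error at the end of the chain grows with the number of serially connected devices, which is why a long chain is more likely to lose sample lock than a short one.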
In professional audio systems, it can also be necessary to slave the system to a continuously variable digital clock that may move slowly over a range. This is typically the case when the system is slaved to a tape deck that contains a recorded time code. Since the playback of the tape is subject to mechanical variations, the sample rate can fluctuate.
Typical networking systems can be used in setups where powered speaker systems are hung from rafters in an arena or stadium. For such setups, DSP algorithms (e.g., for determining crossover frequencies, power output, time alignment between various speaker cones, etc.) might need to be uploaded to the speaker prior to a performance. It can further be necessary to control DSP parameters in real time. At the same time, it is often desirable to download telemetry data, such as temperature, impedance and power output, from these remote devices during operation. This non-audio data must be addressable to one, several, or all of the devices in a networking system. Ideally, this non-audio data should be on the same wire as the audio data to avoid the extra cost of running wires simply for the command and status reporting requirements.
Some existing audio devices that use the MIDI standard for control employ an electrical connection that is limited to very short distances (50 feet or less) and point-to-point connections. In order to have multiple devices receive the same MIDI stream, a dedicated MIDI splitter or daisy-chained devices (at most 2-3 devices can be daisy-chained before data integrity is compromised) is currently required.
Accordingly, a need exists for networking systems supporting the transfer of audio and/or video data in devices arranged in any topology.
A need exists for networking systems that transmit audio and/or video data bi-directionally or in parallel.
A need exists for networking systems that transmit audio and/or video data using hubs that combine data from multiple inputs.
A need exists for networking systems to be connected in a manner that the connection of devices is not dependent on data flow.
A need exists for networking systems that merge data from different packets arriving on different data streams and output the merged data as a new stream with minimal latency.
A need exists for methods and systems for inserting new digital media into a network without using sample rate converters.
A need exists for networking systems that derive the master clock signal for each device from the network packet rate.
A need exists for audio networking systems that accommodate a broad range of sample rates.
A need exists for networking systems that minimize jitter when multiple network devices are connected in series.
A need exists for networking systems that accurately track a master clock that provides a variable sample rate.
A need exists for audio and/or video networking systems that permit non-audio/video data to be transmitted.
A further need exists for networking systems that can route performance control data.
A further need exists for networking systems that permit a single MIDI device to be inserted into the audio network that allows control data to be routed up to 500 feet.
A still further need exists for networking systems that permit MIDI data to be read by any device on the network.