In professional audio, a mixing console, or audio mixer, also called a sound board, mixing desk, or mixer, is an electronic device for combining (also called mixing), routing, and changing the level, timbre and/or dynamics of audio signals. A mixer can mix analog or digital signals or both, depending on the type of mixer. The modified signals (voltages or digital samples) are summed to produce the combined output signals.
Mixing consoles are used in many applications, including recording studios, public address systems, sound reinforcement systems, broadcasting, television, and film post-production. A simple example is enabling the signals from two separate microphones (one for each of two vocalists singing a duet, perhaps) to be heard through one set of speakers simultaneously. When used for live performances, the signal produced by the mixer will usually be sent directly to an amplifier, unless the mixer is “powered” (that is, has a built-in amplifier) or is connected to powered speakers.
The output of a mixer is referred to as a mix bus or simply a bus. As used herein, the term “mix bus” refers to an audio signal produced by combining multiple audio source signals in a weighted summation operation, where typically the individual weights applied to each source signal are under user control (for example using the linear faders or knobs of a mixing console). The term “mix matrix” is used to refer to an operation that produces multiple mix busses from a common group of audio source signals. At any instant in time, this operation can be mathematically represented by the matrix equation C=B*A, where C is a vector of N output signal states, A is a vector of M input signal states, B is an N-by-M rectangular matrix of summing weights (one row per output, one column per input), and * is the matrix multiplication operator. In some cases a mix bus might be a single discrete audio channel, while in other cases it may include more than one audio channel having a common association (for example, a stereo mix bus has two channels, left and right, and a surround mix bus has more than two channels corresponding to the surround speaker configuration targeted by the particular mix).
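The mix-matrix operation can be illustrated with a minimal sketch for a single sample instant. The function name `mix` and the example gains are hypothetical and chosen only for illustration. The weights are stored as one row of M values per output bus, which is the shape the product B*A requires.

```python
from typing import List


def mix(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """Compute C = B * A for one sample instant.

    weights: B, one row per output bus, one column per input source.
    inputs:  A, the M input signal states.
    Returns  C, the N output (mix bus) signal states.
    """
    m = len(inputs)
    for row in weights:
        if len(row) != m:
            raise ValueError("each row of weights needs one gain per input")
    # Each bus is a weighted sum of all inputs.
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Illustrative values only: M = 3 sources, N = 2 mix busses.
A = [0.5, -0.25, 1.0]
B = [
    [1.0, 1.0, 0.0],  # bus 0: sources 0 and 1 at unity gain, source 2 muted
    [0.5, 0.0, 0.5],  # bus 1: sources 0 and 2 at half gain, source 1 muted
]
C = mix(B, A)
print(C)  # [0.25, 0.75]
```

A real mixer would apply this operation to every sample (or block of samples) and would usually smooth changes to the weights to avoid audible clicks; the sketch leaves that out.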
In common practice, the mixing console serves as a central “hub” in the audio system, allowing all of the audio source signals in a given application to be acquired, treated, combined into various mixes, and then re-distributed outward to monitoring equipment (loudspeakers and headphones), recording equipment (tape decks or hard disk recorders), or broadcast feeds (satellite uplinks, webcasts, other remote feeds) from a central point in the system. The use of this centralized architecture has been necessary in designing analog mixing consoles, because these devices employ analog circuitry that is physically attached to the various control knobs, switches, faders (rheostats), and LED indicators. In order for a single person to operate all the controls of the system in an ergonomically convenient manner, all of these analog circuits needed to be located underneath, or behind, a common physical control panel. With the penetration of digital technology into mixing console design, some equipment makers have chosen to physically separate the user interface controls from the audio processing hardware elements; however, the audio signal flow architecture has remained essentially the same, with the mixer being the center of the audio system.
Audio systems are not always built around one single mixer. In fact, it is common practice to use multiple mixers in a given application to perform sub-mixing. In this model, the mixing (combining) of audio signals occurs in a hierarchical fashion, with groups of signals being pre-mixed in one mixer, and the result of that pre-mix being fed into another mixer where it is combined with other individual signals or other pre-mixes coming from other sub-mixers. In a live concert application, it is common practice to separate the “front-of-house” mixing task from the “on-stage monitoring” mixing task using two separate mixing consoles, each having its own operator. In this model, each source signal is split into two feeds (often using a device called a “splitter snake,” which performs this function for many sources), one for the front-of-house mixer and one for the on-stage monitoring mixer. The front-of-house operator creates the audience mix, while the monitor mix operator creates mixes that let the performers on stage hear themselves and their co-performers as clearly as possible.
Despite its continued prevalence over many years, the conventional, centralized mixer approach has some distinct and important disadvantages. A first problem with conventional audio mixing systems is that they do not scale in a natural and easy way. Most users of mixing consoles service a wide range of audio production applications and scenarios, requiring anywhere from one or two channels and a simple mono mix, up to dozens or even hundreds of channels and dozens of separate mixes. Therefore, when purchasing a mixer, it is difficult to determine exactly which size console to buy. Mixing console vendors offer a very wide range of sizes to cover the market space, and buyers must choose something that seems like the right fit, hoping to avoid spending more money or taking up more space than they need to or, on the other hand, hoping to avoid running out of channels or mix busses when they have a large job. Some buyers/users will purchase multiple, different sized mixers to handle different jobs.
A second problem to be solved occurs in networked audio mixing systems, i.e., those that use shared, packet-based networks to interconnect signal input and output (I/O) devices with signal processing devices. These systems typically impose considerable latency in the audio path from signal source to monitor output. This latency, typically on the order of 2 to 10 milliseconds, can negatively impact the experience of, and results achieved by, a performer who is singing or playing an instrument while monitoring himself or herself through the system. The reasons for this increased latency are twofold. First, packet-switched networks have queues and delays within their basic infrastructure, such that signal transport across the network takes an indeterminate amount of time. The receiving side must therefore allow for a minimum “safety bound” in order to avoid “buffer under-run” conditions that cause audio glitches; this bound is typically on the order of 1 or 2 milliseconds for optimized networks such as those using IEEE Audio Video Bridging standards, and higher for networks using older technologies. Second, conventional systems locate the I/O and signal processing/mixing functions in separate physical units; thus for a singer to hear herself in a monitor mix, her signal must make two trips across the network (from the I/O to the mixer and back to the I/O again). The network transport latency compounds with analog-to-digital and digital-to-analog conversion latency to impose a minimum latency typically of 2 milliseconds, and often much more, along the most latency-critical path.
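The latency budget described above can be sketched as simple arithmetic. The function name `monitor_round_trip_latency_ms` is hypothetical, and the 0.25 millisecond conversion figures are assumed placeholder values, not measurements from any particular converter; only the 1 to 2 millisecond network safety bound and the two network trips come from the text.

```python
def monitor_round_trip_latency_ms(
    network_safety_bound_ms: float,
    adc_ms: float,
    dac_ms: float,
    network_trips: int = 2,
) -> float:
    """Estimate source-to-monitor latency in a centralized networked mixer.

    The signal is converted to digital at the I/O unit, crosses the network
    to the mixer, crosses it again back to the I/O unit, and is converted
    back to analog. Each network trip costs at least the safety bound.
    """
    return adc_ms + network_trips * network_safety_bound_ms + dac_ms


# Assumed conversion latencies (placeholders for illustration only).
ADC_MS = 0.25
DAC_MS = 0.25

for bound_ms in (1.0, 2.0):  # safety bounds cited for optimized networks
    total = monitor_round_trip_latency_ms(bound_ms, ADC_MS, DAC_MS)
    print(f"safety bound {bound_ms} ms -> monitor path about {total} ms")
```

Under these assumptions, even the best-case 1 millisecond safety bound yields a self-monitoring path above 2 milliseconds, because the bound is paid once per network trip.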
The importance of minimizing latency for a self-monitoring path can be quantified as follows: Each millisecond of latency imposed on an audio signal corresponds to sound traveling through air a distance of about 0.34 meters (about 13 inches) at sea level. When a person sings, she hears her vocal cords within a fraction of a millisecond as the vibrations are conducted through bone, body tissue, and immediate surrounding air to her ears. When a person plays an acoustic guitar, he hears the sound from the guitar within about 2 milliseconds, since he is holding the instrument no farther than about 2 feet (about 0.6 meters) from his head. When a group of people perform together (or even when they have a conversation in the same room), they are typically located a few feet apart, thus they hear each other a few milliseconds later than each person hears his or her own voice or instrument. We therefore conclude that self-monitoring becomes unnatural when the signal path from voice or instrument to ears has a latency greater than about 2 milliseconds. However, monitoring others can seem perfectly natural when the signal path latency is 5 or 10 milliseconds or even more.
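The equivalence between latency and acoustic distance can be checked with a short sketch. The function and constant names are hypothetical; the 0.34 meters per millisecond figure and the 2 millisecond self-monitoring limit are the approximate values stated above.

```python
SPEED_OF_SOUND_M_PER_MS = 0.34   # approximate, in air at sea level (from the text)
METERS_PER_FOOT = 0.3048
SELF_MONITOR_LIMIT_MS = 2.0      # approximate limit stated in the text


def latency_to_distance_m(latency_ms: float) -> float:
    """Distance sound travels through air during the given latency."""
    return latency_ms * SPEED_OF_SOUND_M_PER_MS


def distance_to_latency_ms(distance_m: float) -> float:
    """Acoustic delay for a listener at the given distance from the source."""
    return distance_m / SPEED_OF_SOUND_M_PER_MS


def natural_for_self_monitoring(latency_ms: float) -> bool:
    """True if the latency is within the approximate self-monitoring limit."""
    return latency_ms <= SELF_MONITOR_LIMIT_MS


if __name__ == "__main__":
    guitar_ms = distance_to_latency_ms(2 * METERS_PER_FOOT)
    print(f"guitar held 2 ft from the head: {guitar_ms:.2f} ms")
    for ms in (1.0, 2.0, 5.0, 10.0):
        print(f"{ms} ms is like standing {latency_to_distance_m(ms):.2f} m away")
```

A guitar held 2 feet away works out to roughly 1.8 milliseconds, consistent with the "about 2 milliseconds" figure, while a 10 millisecond path corresponds to a source about 3.4 meters away, a plausible distance between performers but not between a singer and her own voice.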
A third problem with conventional audio mixing systems, as well as modern network-based mixing systems, is that their use of a centralized mix engine creates an inconvenient topology that hinders ergonomics and increases the cost of system setup and maintenance. The central mix engine needs to be set up, powered, and connected, typically with large numbers of cables, to the various devices at the extremities of the system, which are located near the actual users. This results in a large number of cables crossing the stage or room, and a large number of potential failure points in the system.
A fourth problem is that conventional systems lack fault tolerance, because they rely on a central mix engine for all of the audio processing. If a fault occurs in the central mixer (such as a power supply failure or a main CPU crash), the entire system can become inoperable.