Typically, audio scenes are captured using a set of microphones, each of which outputs a microphone signal. For an orchestra audio scene, for example, 25 microphones are used. A sound engineer then mixes the 25 microphone output signals into a standardized format such as a stereo format or a 5.1, 7.1, 7.2, etc. format. For a stereo format, the sound engineer or an automatic mixing process generates two stereo channels. For a 5.1 format, the mixing results in five channels and one subwoofer channel. Analogously, for a 7.2 format, the mixing results in seven channels and two subwoofer channels. When the audio scene is to be rendered in a reproduction environment, the mixing result is applied to electro-dynamic loudspeakers. In a stereo reproduction set-up, two loudspeakers exist; the first loudspeaker receives the first stereo channel and the second loudspeaker receives the second stereo channel. In a 7.2 reproduction set-up, seven loudspeakers exist at predetermined locations, together with two subwoofers. The seven channels are applied to the corresponding loudspeakers and the two subwoofer channels are applied to the corresponding subwoofers.
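The relationship between a format label and its channel count, and the mixing of many microphone signals into a few channels, can be illustrated with a minimal sketch. The function names parse_layout and downmix, the per-channel gain table and the sample values are assumptions made purely for illustration; an actual mix is determined by the sound engineer or the automatic mixing process and is not limited to a fixed linear combination.

```python
def parse_layout(name):
    """Split a format label such as '5.1' or '7.2' into
    (number of main channels, number of subwoofer channels).
    The label 'stereo' is treated as two main channels and no subwoofer."""
    if name == "stereo":
        return 2, 0
    main, _, sub = name.partition(".")
    return int(main), int(sub or 0)


def downmix(mic_signals, gains):
    """Mix M microphone signals into C output channels.

    mic_signals: list of M equally long lists of samples.
    gains: C rows of M gain coefficients (hypothetical values; in practice
           they reflect the engineer's mixing decisions).
    Returns C lists of samples, one per output channel."""
    n_samples = len(mic_signals[0])
    return [
        [sum(g * sig[t] for g, sig in zip(row, mic_signals))
         for t in range(n_samples)]
        for row in gains
    ]


if __name__ == "__main__":
    # 25 microphones, two samples each, mixed equally into two stereo channels.
    mics = [[1.0, 2.0] for _ in range(25)]
    stereo_gains = [[1.0 / 25] * 25, [1.0 / 25] * 25]
    channels = downmix(mics, stereo_gains)
    print(parse_layout("7.2"), len(channels))
```

In this sketch each output channel is a weighted sum of all microphone signals, which is only one simple way of modelling the mixing step described above.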
The usage of a single microphone arrangement on the capturing side and a single loudspeaker arrangement on the reproduction side typically neglects the true nature of the sound sources.
For example, acoustic musical instruments and the human voice can be distinguished with respect to the way in which the sound is generated, and they can also be distinguished with respect to their emission characteristics.
Trumpets, trombones, horns or bugles, for example, have a powerful, strongly directed sound emission. Stated differently, these instruments emit in a preferred direction and, therefore, have a high directivity.
Violins, cellos, contrabasses, guitars, grand pianos, small pianos, gongs and similar acoustic musical instruments, for example, have a comparatively low directivity or a correspondingly small emission quality factor Q. These instruments use so-called acoustic short-circuits when generating sounds. The acoustic short-circuit is generated by a communication between the front side and the back side of the corresponding vibrating area or surface.
The human voice has a medium emission quality factor. The air connection between mouth and nose causes an acoustic short-circuit.
String or bow instruments, xylophones, cymbals and triangles, for example, generate sound energy in a frequency range up to 100 kHz and, additionally, have a low emission directivity or a low emission quality factor. Specifically, the sounds of a xylophone and a triangle are clearly identifiable even within a loud orchestra, in spite of their low sound energy and their low quality factor.
Hence, it becomes clear that the sound generation by the acoustical instruments or other instruments and the human voice is very different from instrument to instrument.
When sound energy is generated, air molecules, for example diatomic and triatomic gas molecules, are stimulated. Three different mechanisms are responsible for the stimulation. Reference is made in this regard to German Patent DE 198 19 452 C1. The three mechanisms are summarized in FIG. 7. The first mechanism is translation, which describes the linear movement of the air molecules or atoms with reference to the molecule's center of gravity. The second mechanism is rotation, where the air molecules or atoms rotate around the molecule's center of gravity. The center of gravity is indicated in FIG. 7 at 70. The third mechanism is vibration, where the atoms of a molecule move back and forth toward and away from the molecule's center of gravity.
Hence, the sound energy generated by acoustic musical instruments and by the human voice is composed of an individual mixing ratio of translation, rotation and vibration.
In conventional electroacoustics, the definition of the vector sound intensity reflects only the translation. Unfortunately, a complete description of the sound energy, in which rotation and vibration are additionally acknowledged, is missing in conventional electroacoustics.
The complete sound intensity, by contrast, is defined as the sum of the intensities stemming from translation, rotation and vibration.
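The summation of the three intensity contributions, and the individual mixing ratio that results from it, can be sketched as follows. The function names and the numerical values are hypothetical and serve only as an illustration; no measured intensities for any particular instrument are implied.

```python
def total_intensity(translation, rotation, vibration):
    """Complete sound intensity as the sum of the three contributions."""
    return translation + rotation + vibration


def mixing_ratio(translation, rotation, vibration):
    """Fraction of the complete intensity contributed by translation,
    rotation and vibration, in that order."""
    total = total_intensity(translation, rotation, vibration)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return (translation / total, rotation / total, vibration / total)


if __name__ == "__main__":
    # Hypothetical contributions in arbitrary but identical units.
    print(total_intensity(2.0, 1.0, 1.0))
    print(mixing_ratio(2.0, 1.0, 1.0))
```

A description that considers only the translation corresponds to taking the first argument alone and disregarding the remaining two contributions.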
Furthermore, different sound sources have different sound emission characteristics. The sound emission generated by musical instruments and voices generates a sound field, and this field reaches the listener in two ways. The first way is the direct sound, where the direct sound portion of the sound field allows a precise localization of the sound source. The second way is the room-like emission. Sound energy emitted in all room directions generates the specific sound of an instrument or a group of instruments, since this room emission interacts with the room through reflections, attenuations, etc. A characteristic of all acoustic musical instruments and the human voice is a certain relation between the direct sound portion and the room-like emitted sound portion.
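One possible way of expressing the relation between the direct sound portion and the room-like emitted sound portion is an energy ratio in decibels. This choice of measure, the function name and the example values are assumptions made for illustration only; the text above does not prescribe a particular formula.

```python
import math


def direct_to_room_ratio_db(direct_energy, room_energy):
    """Ratio of direct sound energy to room-like emitted sound energy in dB.
    Both energies must be positive and given in the same (arbitrary) unit."""
    if direct_energy <= 0 or room_energy <= 0:
        raise ValueError("energies must be positive")
    return 10.0 * math.log10(direct_energy / room_energy)


if __name__ == "__main__":
    # Hypothetical source whose direct portion carries ten times the energy
    # of its room-like portion.
    print(direct_to_room_ratio_db(10.0, 1.0))
```

Under this assumed measure, a strongly directed source would yield a larger value than a source that emits mostly in all room directions.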