Video conferencing devices are known from “Empfehlungen zur Vorbereitung einer Videokonferenz” [“Recommendations for preparing a video conference”], July 2008, Kompetenzzentrum für Videokonferenzdienste, Technical University Dresden. The H.323 standard of the ITU-T (ITU Telecommunication Standardization Sector) for IP transmissions defines audio and video standards for video conferencing systems. The audio standards implemented in video conferencing systems are G.711, G.722, G.722.1 Annex C (Polycom Siren 14), G.723.1, G.728 and G.729. The video standards implemented are H.261, H.263, H.263+, H.263++ and H.264.
The video conferencing terminals that are used are divided into the following four major system classes: personal systems, office systems, group systems, and room systems. Desktop or personal systems are video conferencing systems for personal computers (PCs) and laptops. These software-based solutions are used with a USB camera and a headset (headphone/microphone unit). Moreover, cameras can also be connected through a video card integrated in the PC.
Desktop systems are designed for individual users. In addition to their low cost in comparison to all the other classes, these systems offer the advantage that the user has full access during the video conference to his data and the programs installed on his PC. Compact systems represent fully integrated video communications solutions. Generally, the only additional requirements for operating them are a monitor and the appropriate network connections (integrated services digital network (ISDN) and/or local area network (LAN)). The conference system and camera constitute a closed unit.
Room systems are video communications solutions with a modular design. Adaptable equipment properties make flexible system configurations possible for nearly every application. Cameras, room microphones, and large monitors allow these systems to be integrated into even large conference rooms, and the systems also allow for the integration of various peripheral equipment such as document cameras. Room systems make it possible for mid-sized to large groups of people to participate in video conferences.
The use of convolution in acoustics is known from “Convolution: Faltung in der Studiopraxis” [“Convolution: use in studios”], Philipp Diesenreiter, SAE Vienna 2005. The increasing computing power of special digital signal processors (DSPs) and of the home computer permits the use of convolution in sound studios. When one excites a room with a short (broadband) pulse, one hears an echo that is characteristic of this room and that emphasizes or damps specific frequency components of the pulse as a result of the room's geometry and dimensions, its basic structure, its interior, and other specific characteristics. If this echo is recorded, one obtains the impulse response of the room. The impulse response contains the complete characteristic of the (linear) room. In the technique of convolution, any other desired acoustic signal is combined with this impulse response through the mathematical process of convolution. For example, a discrete fast convolution based on the Fast Fourier Transform (FFT) for discrete (digitized) periodic signals is used to generate the acoustic characteristic of the room. As an alternative to measuring impulse responses for a specific room, the impulse response can also be obtained through modeling, such as ray tracing and the source image method.
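The fast convolution mentioned above can be illustrated with a minimal sketch. It assumes a short, non-empty signal and impulse response held as Python lists; the function names `fft` and `fast_convolve` are hypothetical and do not come from the cited reference. Both sequences are zero-padded to a common power-of-two length, transformed, multiplied bin by bin, and transformed back, which yields the same result as direct linear convolution.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two (unscaled)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fast_convolve(signal, impulse_response):
    """Linear convolution of two non-empty sequences via zero-padded FFTs."""
    out_len = len(signal) + len(impulse_response) - 1
    size = 1
    while size < out_len:  # next power of two that avoids circular wrap-around
        size *= 2
    a = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    b = fft([complex(v) for v in impulse_response]
            + [0j] * (size - len(impulse_response)))
    product = [p * q for p, q in zip(a, b)]
    y = fft(product, inverse=True)
    # The inverse transform above is unscaled, so divide by the FFT size here.
    return [v.real / size for v in y[:out_len]]


# Hypothetical data: a three-sample signal and a two-tap "room" response.
wet = fast_convolve([1.0, 2.0, 3.0], [1.0, 0.5])
```

In practice a measured room impulse response has many thousands of samples, and a block-wise variant of this scheme would typically be used; the sketch only shows the principle.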
When a room is bounded by flat surfaces, the reflected sound components can be calculated by means of the source image method by constructing mirror-image sound sources. By means of the modeling, it is possible to alter the position of the sound source and thus generate a new impulse response. By means of the impulse response, a signal for reproduction is faded out using an associated filter. The spatial impression is the auditory perception that one receives from the room itself when a sound event occurs. The spatial impression augments the acoustic information that comes directly from the sound source with important information about the environment, about the size and character of the room. The spatial impression consists of multiple components: the perception of the width and depth of the room, which is to say of the room size; the perception of liveness, which prolongs each sound event and fuses it with the following one; and the perception of space. Digital filters are one of the most important tools of digital signal processing. One implementation of a filter is achieved using convolution. This type of filter is called a Finite Impulse Response (FIR) filter.
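To make the relationship between convolution and an FIR filter concrete, the following is a minimal sketch of a direct-form FIR filter. The function name `fir_filter` and the sample values are assumptions chosen for illustration. Each output sample is the weighted sum of the current and previous input samples, with the filter coefficients playing the role of the impulse response.

```python
def fir_filter(samples, coefficients):
    """Direct-form FIR filter: y[n] = sum over k of h[k] * x[n - k].

    The output has the same length as the input; samples before the
    start of the input are treated as zero.
    """
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(coefficients):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


# Feeding a unit impulse returns the coefficients themselves, which is
# why the coefficient list is called the (finite) impulse response.
impulse_out = fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25])

# A two-tap average smooths a ramp (hypothetical input values).
smoothed = fir_filter([2.0, 4.0, 6.0], [0.5, 0.5])
```

If the coefficients are set to a sampled room impulse response, the same loop imposes that room's characteristic on the input signal, at a cost that grows with the product of the signal and filter lengths.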
The use of digital filters is known from “Image method for efficiently simulating small-room acoustics”, J. B. Allen and D. A. Berkley, J. Acoust. Soc. Am. 65(4), April 1979. That paper discusses the theoretical and practical use of image techniques for simulating, on a digital computer, the impulse response between two points in a small rectangular room.
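The idea of an image technique for a rectangular room can be sketched as follows. This is a simplified illustration, not the program from the cited paper: it assumes a single reflection coefficient `beta` shared by all six walls, rounds each arrival to the nearest sample, and truncates the image lattice at `max_order`. The function name `image_method_rir` and all default values (sampling rate, speed of sound, response length) are assumptions.

```python
import math
from itertools import product


def image_method_rir(room, source, receiver, beta=0.8, max_order=2,
                     fs=8000, c=343.0, length=1024):
    """Simplified image-source impulse response for a rectangular room.

    room, source, receiver: (x, y, z) tuples in metres, with one room
    corner at the origin. beta: assumed uniform wall reflection coefficient.
    """
    h = [0.0] * length
    axes = []
    for size, s in zip(room, source):
        images = []  # (image coordinate, number of wall reflections)
        for n in range(-max_order, max_order + 1):
            images.append((2 * n * size + s, abs(2 * n)))
            images.append((2 * n * size - s, abs(2 * n - 1)))
        axes.append(images)
    for (x, rx), (y, ry), (z, rz) in product(*axes):
        d = math.dist((x, y, z), receiver)
        idx = round(d * fs / c)  # propagation delay in samples
        if d > 0 and idx < length:
            # Each reflection scales by beta; 1/(4*pi*d) is spherical spreading.
            h[idx] += beta ** (rx + ry + rz) / (4 * math.pi * d)
    return h


# Hypothetical 5 m x 4 m x 3 m room.
rir = image_method_rir((5.0, 4.0, 3.0), (1.0, 1.0, 1.0), (3.0, 2.0, 1.5))
```

Moving the source only changes the image coordinates, so a new impulse response for a different source position can be generated by calling the function again with different arguments.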