Video recording on electronic apparatus is now common. Devices ranging from professional video capture equipment, consumer grade camcorders and digital cameras to mobile phones and even simple devices such as webcams can be used for electronic acquisition of motion pictures, in other words for recording video images. As video recording has become a standard feature on many mobile devices, the technical quality of such equipment and of the video it captures has rapidly improved. Recording personal experiences is quickly becoming an increasingly important use for mobile devices such as mobile phones and other user equipment. Combined with the emergence of social media and new ways to share content efficiently, this underlines the importance of the field and the new opportunities it offers for the electronic device industry.
Recording from mobile devices, however, produces overall quality levels which are limited in comparison to professionally created content. One key issue is the time the average consumer is able and willing to spend processing or editing the recorded content, which in many cases is close to zero. This is particularly true of content which is shared with others shortly after capture in social network applications. One aspect of improving the technical quality of the recorded content can therefore be to assist the user in getting the recording correct the first time, minimising the need for editing or post-processing the video.
Many devices now contain multi-microphone or spatial audio capture (SPAC) components, in which at least three microphones on the device or connected to the device record the acoustic signals surrounding the device. These spatial audio capture devices make it possible to detect the direction of audio signal components, in other words to add a spatial domain to the frequency and time domains, and furthermore enable the production of multichannel signals such as a 5.1 channel audio signal on a mobile device.
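By way of illustration only, the direction detection mentioned above can be sketched for a single pair of microphones as a time-difference-of-arrival estimate. The following is a minimal sketch under stated assumptions and does not describe the method of any particular device: the function names, the brute-force cross-correlation, the far-field (plane wave) model and the nominal speed of sound are all assumptions introduced for the example.

```python
import math

# Assumption: nominal speed of sound in air at room temperature, in m/s.
SPEED_OF_SOUND = 343.0


def estimate_delay_samples(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a.

    Hypothetical helper: brute-force cross-correlation over every lag in
    [-max_lag, max_lag], keeping the lag with the highest score.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig_a)):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += sig_a[i] * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_degrees(delay_samples, sample_rate, mic_spacing):
    """Convert an inter-microphone delay into an angle from broadside.

    Assumes a distant source (plane wave), so that
    sin(angle) = speed_of_sound * delay / mic_spacing.
    """
    delay_seconds = delay_samples / sample_rate
    sine = SPEED_OF_SOUND * delay_seconds / mic_spacing
    sine = max(-1.0, min(1.0, sine))  # guard against rounding beyond +/-1
    return math.degrees(math.asin(sine))


if __name__ == "__main__":
    # Synthetic test signal: one click that reaches microphone B
    # three samples after it reaches microphone A.
    mic_a = [0.0] * 10 + [1.0] + [0.0] * 20
    mic_b = [0.0] * 13 + [1.0] + [0.0] * 17
    lag = estimate_delay_samples(mic_a, mic_b, max_lag=8)
    angle = arrival_angle_degrees(lag, sample_rate=48000, mic_spacing=0.1)
    print(lag, angle)
```

A single microphone pair yields only one angle and cannot distinguish a source in front of the pair from one behind it, which is consistent with the use of at least three microphones noted above.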
The perception of the direction of video and audio signals differs. While the human visual system gathers information from a fairly large angle, which is processed by the brain and which quickly adapts to focussing on different parts of the area, it is not possible to see what is behind the viewer without turning the head. The human auditory system, on the other hand, is able to pick up cues from the whole environment or space within which it operates, and the listener is able to hear sounds from behind the head. Electronic devices record video which is defined by the angle of view of the lens, while audio recording typically includes all of the sounds around the recording apparatus, as defined by the microphone or microphone array coverage. In other words the video camera records everything that is in the line of sight and in the visible spectrum as defined for the recording device. An object which appears between the lens of the camera and the intended subject can block the subject. For audio, however, while a sound close to the microphone may mask sounds from further away, both the distance between the sound source and the microphone and the volume of the sound source are important, as distant sounds can also mask closer sounds depending on their loudness and timing. The effect of blocking or masking can thus be different for video and audio content.
Furthermore, in recordings made by consumer devices the user can control the video part fairly easily, for example avoiding an object blocking the subject by moving the camera or using a zoom lens. However, such devices typically do not enable the user to have refined control over the capturing of acoustic signals and the generation of audio signals.
Although there are directional microphones (for example shotgun microphones) which can be used to improve the directional separation by damping environmental sounds, this is not necessarily a desired effect for a complete video sequence. On the other hand, post-processing of the audio signal in order to achieve a better quality audio signal output requires additional time and technical expertise which most consumers lack.