Field
Various systems and methods may benefit from determination of environmental signatures in recordings. For example, such signatures may aid forensic analysis and alignment of media recordings, such as alignment of audio or video recordings.
Description of the Related Art
Advances in multimedia technologies have given rise to a proliferation of media recording devices such as voice recorders, camcorders, digital cameras, and the like. A huge amount of digital information created using such devices can be stored on disks or uploaded to social media platforms. Metadata describing important information such as the time and place of recording may be added manually or embedded in a media recording using built-in clocks and global positioning system (GPS) receivers in the recording devices. However, digital tools can be used to modify the stored information.
Forensic tools can be used to authenticate multimedia recordings using a signature, known as the Electric Network Frequency (ENF) signal, that emanates from power networks. ENF is the supply frequency of electric power in power distribution networks, and its nominal value is 50 or 60 Hz depending on the geographic location. A property of the electric network frequency signal is that its value fluctuates around the nominal value, on the order of approximately 50-100 mHz in the United States. These fluctuations are due to variations in the load on the power grid and can generally be considered random. Such a randomly varying electric network frequency signal can be embedded in multimedia recordings due to electromagnetic interference from nearby power lines in audio, and due to the invisible flickering of electrically powered indoor lighting.
Forensic analysis based on electric network frequency fluctuations can thus be used for multimedia authentication tasks such as time-of-recording estimation, timestamp verification, and detection of clip insertion/deletion forgery. The electric network frequency can fluctuate due to dynamic changes in load demand and power supply, and these fluctuations travel over the power lines at a finite speed.
The electric network frequency signal can be extracted from power signals measured from a power outlet using a step-down transformer and a simple voltage divider circuit. The power signal is divided into time-frames, and a frequency estimation algorithm is applied to each frame to determine its dominant frequency, thus estimating the instantaneous electric network frequency signal. The electric network frequency is important for multimedia forensics because it can also be present in audio or video recordings due to electromagnetic influences at the place of recording. The electric network frequency variations extracted from a clean power signal match the variations extracted from an audio signal recorded at the same time and in the same power grid as the power signal.
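The frame-wise estimation and matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names `estimate_enf` and `best_lag`, the 2-second non-overlapping frames, the ±1 Hz search band, the Hann window, the zero-padded FFT peak picking with parabolic interpolation, and the use of a normalized correlation coefficient as the matching criterion are all choices made here for illustration. Any other frequency estimator or similarity measure could be substituted.

```python
import numpy as np


def estimate_enf(signal, fs, frame_sec=2.0, nominal=60.0, band=1.0, pad_factor=8):
    """Estimate one dominant frequency per frame near the nominal ENF value.

    All parameter defaults are illustrative assumptions.
    """
    frame_len = int(frame_sec * fs)
    nfft = frame_len * pad_factor              # zero padding refines the FFT grid
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    bin_width = freqs[1] - freqs[0]
    in_band = (freqs >= nominal - band) & (freqs <= nominal + band)
    band_freqs = freqs[in_band]
    window = np.hanning(frame_len)

    estimates = []
    # Non-overlapping frames; a trailing partial frame is discarded.
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len] * window
        mag = np.abs(np.fft.rfft(frame, n=nfft))[in_band]
        k = int(np.argmax(mag))
        offset = 0.0
        # Parabolic interpolation around the peak bin for sub-bin resolution.
        if 0 < k < len(mag) - 1:
            a, b, c = mag[k - 1], mag[k], mag[k + 1]
            denom = a - 2.0 * b + c
            if denom != 0.0:
                offset = 0.5 * (a - c) / denom
        estimates.append(band_freqs[k] + offset * bin_width)
    return np.array(estimates)


def best_lag(reference_enf, query_enf):
    """Slide the query ENF sequence over the reference ENF sequence and
    return (lag, score) for the highest normalized correlation coefficient."""
    n = len(query_enf)
    q = query_enf - query_enf.mean()
    q_norm = np.linalg.norm(q)
    best = (0, -np.inf)
    for lag in range(len(reference_enf) - n + 1):
        r = reference_enf[lag:lag + n]
        r = r - r.mean()
        denom = np.linalg.norm(r) * q_norm
        score = float(r @ q) / denom if denom > 0 else 0.0
        if score > best[1]:
            best = (lag, score)
    return best
```

In this sketch, `estimate_enf` would be applied both to the power signal measured at the outlet and to the audio recording, using the same frame length for each, and `best_lag` would then report the frame offset at which the two resulting sequences agree best. The offset is expressed in frames, so multiplying it by the frame duration gives the offset in seconds.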
Digital video cameras have become increasingly popular, thanks to the rapid development of hardware and software technologies. As the amount of video data grows drastically every day, new applications arise for which multiple pieces of audio-visual data need to be analyzed and processed together.
When an event is recorded simultaneously by multiple independent video cameras, possibly from a variety of angles, combining the information in these videos may provide a better presentation and a more novel experience of the event than any single recording alone. For example, a dynamic scene may be reconstructed that allows people to choose from different viewing angles of an event during playback. A video sequence of high space-time resolution can be obtained by combining information from multiple low-resolution video sequences of the same dynamic scene. Synchronization, namely the task of temporally aligning video or other multimedia signals, is a fundamental issue in enabling these and other applications involving multiple pieces of audio-visual data.