The present disclosure relates to a signal processing apparatus, a signal processing method, a program, an electronic device, a signal processing system, and a signal processing method thereof. In particular, the present disclosure relates to a signal processing apparatus, a signal processing method, a program, an electronic device, a signal processing system, and a signal processing method thereof that can robustly generate synchronization information for contents.
When contents acquired by recording images or sound of the same event with a plurality of devices are reproduced or edited in time synchronization, it is necessary to find the temporal synchronization between the contents. This is because, even when the devices record images or sound of the same event, the recording start time differs between the devices, and a time lag accumulates because the internal clock frequency differs slightly between the devices. Here, the contents denote acoustic data, image data, acoustic data corresponding to image data, and so on.
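The two causes named above can be put into rough numbers. The following is a minimal sketch, assuming a simple linear drift model; the function name, the 2-second start difference, and the 50 ppm clock difference are hypothetical values chosen only for illustration and do not come from the disclosure.

```python
def sync_offset_seconds(elapsed_s: float, start_offset_s: float, clock_diff_ppm: float) -> float:
    """Offset between two recordings after `elapsed_s` seconds on the reference device.

    start_offset_s: difference in recording start time (hypothetical value).
    clock_diff_ppm: relative clock-frequency difference in parts per million
                    (hypothetical value).
    """
    # Constant part from the start-time difference plus a part that grows with time.
    return start_offset_s + elapsed_s * clock_diff_ppm * 1e-6


# Hypothetical numbers: device B starts 2 s late and its clock differs by 50 ppm.
for minutes in (0, 10, 60):
    offset = sync_offset_seconds(minutes * 60.0, 2.0, 50.0)
    print(f"after {minutes:2d} min: offset = {offset:.3f} s")
```

Under these assumed numbers the offset grows from 2.000 s to about 2.180 s over one hour, so a single offset measured at the start of the recordings would no longer align the contents at the end.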
One method of generating the synchronization information used to synchronize contents in time is to use the time information attached to the content files at the time of shooting. However, the time information attached to the files at the time of shooting is not necessarily accurate.
Another method of generating synchronization information is to use the common element of the acoustic data included in the contents. However, recorded acoustic data includes wind noise, microphone rubbing noise, and various other noise sounds, and therefore in many cases the common element is very small.
For example, the common element is small in a case where only the acoustic data recorded by one device includes noise sound, or where the acoustic data recorded by each device includes a different kind of noise sound. Also, although acoustic data recorded at a party venue or the like includes BGM (background music) as a common element, different conversations take place near each device, so the conversations are recorded overlapping the BGM and the common element becomes small. In particular, when the devices are far apart from each other, the common element is significantly reduced.
Therefore, a method of generating synchronization information in a robust manner with respect to noise sounds is desired.
However, the method of generating synchronization information using level information of acoustic data, which is disclosed in Japanese Patent Laid-Open No. 2009-10548, cannot generate synchronization information in a manner robust to noise sounds. Likewise, the method of generating synchronization information using a correlation of acoustic data, which is disclosed in Japanese Patent Laid-Open No. 2010-171625, cannot generate synchronization information in a manner robust to noise sounds. Furthermore, in the disclosure of Japanese Patent Laid-Open No. 2010-171625, since the correlation is calculated focusing on only a partial interval of the acoustic data, it is not possible to correct the synchronization difference that accumulates over time due to the slight difference in internal clock frequency between devices.
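The limitation of correlating only a partial interval can be illustrated with a small synthetic experiment. The sketch below is not the method of either cited publication; it uses a generic normalized cross-correlation, and the helper names `ncc` and `window_lag`, the white-noise signal, the 37-sample start difference, and the one-sample slip every 200 samples are all assumptions made for illustration.

```python
import math
import random


def ncc(a, b):
    """Normalized cross-correlation of two equal-length sequences."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0


def window_lag(ref, other, start, length, max_lag):
    """Lag (in samples) at which other[start:start+length] best matches ref."""
    seg = other[start:start + length]
    best_lag, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        lo = start + lag
        if lo < 0 or lo + length > len(ref):
            continue  # candidate window would fall outside ref
        score = ncc(ref[lo:lo + length], seg)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Synthetic data (all values hypothetical): white noise stands in for the shared sound.
rng = random.Random(0)
ref = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Device B starts 37 samples late and slips one extra sample every 200 samples,
# a crude stand-in for a slightly different internal clock frequency.
other = [ref[37 + i + i // 200] for i in range(1900)]

early = window_lag(ref, other, start=0, length=200, max_lag=60)
late = window_lag(ref, other, start=1000, length=200, max_lag=60)
print(early, late)  # 37 42
```

In this toy setup the lag estimated from the early interval (37 samples) differs from the lag estimated from the later interval (42 samples), so an offset derived from one partial interval alone would leave the later part of the contents misaligned.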
Meanwhile, a method of modeling human pitch perception and implementing it on a computer is disclosed in “A unitary model of pitch perception”, J. Acoust. Soc. Am. Volume 102, Issue 3, pp. 1811-1820 (1997), Ray Meddis and Lowel O'Mard.