The present disclosure relates to a sound collection and reproduction system, a sound collection and reproduction apparatus, a sound collection and reproduction method, a sound collection and reproduction program, a sound collection system, and a reproduction system. The present disclosure can be applied, for example, to a case where sounds (“sounds” here includes audio, other sounds, and the like) existing within a plurality of areas are collected for each area, and the sounds of each area are thereafter processed, mixed, and stereophonically reproduced.
Along with the development of information and communication technology (ICT), demand has increased for technology that uses video and sound information from a remote location to give a user the sensation of being at that location.
Non-Patent Literature 1 proposes a telework system that enables smooth communication with a remote location by connecting a plurality of offices in separate locations and mutually transferring video, sounds, and various types of sensor information. In this system, a plurality of cameras and a plurality of microphones are arranged at locations within the offices, and the video and sound information obtained from the cameras and microphones is transmitted to the other offices. A user can freely switch between the cameras of a remote location. Each time a camera is switched, the sounds collected by the microphones arranged close to that camera are reproduced, so that the condition of the remote location can be known in real time.
Further, Non-Patent Literature 2 proposes a system in which a plurality of cameras and microphones are arranged in an array within a room, and a user can freely select a viewing and listening position to appreciate content, such as an orchestra performance recorded as video and audio within that room. In this system, the sounds recorded by the microphone arrays are separated for each sound source by Independent Component Analysis (hereinafter, ICA). Sound source separation by ICA usually requires solving the permutation problem, in which the separated components of the sound sources are output in a different order for each frequency component. In this system, however, the frequency components are grouped on the basis of spatial similarity, so that collection and separation are performed for each group of sound sources existing near a given position. Although a plurality of sound sources may remain mixed in a separated sound, the influence on the final reproduction, in which all of the sound sources are played back, will be small. By estimating the position information of the separated sound sources and adding a stereophonic sound effect to them in accordance with the viewing angle of the selected camera, the system allows the user to hear sounds with a sense of presence.
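As a rough illustration of adding a stereophonic effect in accordance with a viewing angle, the following minimal sketch computes left and right gains for one separated sound source from its estimated position and the selected viewing direction. It is not the method of Non-Patent Literature 2: constant-power amplitude panning is substituted here as a simple stand-in for the stereophonic processing, and the function names, the two-dimensional coordinate convention, and the clamping of sources behind the listener are all assumptions made for the example.

```python
import math


def relative_azimuth(source_xy, listener_xy, view_angle_deg):
    """Return the source direction relative to the viewing direction, in degrees.

    Assumed convention: angles are measured counterclockwise from the +x axis,
    and a positive result means the source lies to the listener's right.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    source_angle_deg = math.degrees(math.atan2(dy, dx))
    diff = view_angle_deg - source_angle_deg
    # Wrap the difference into the range [-180, 180).
    return (diff + 180.0) % 360.0 - 180.0


def pan_gains(azimuth_deg):
    """Return (left_gain, right_gain) using constant-power panning.

    Simplification: azimuths beyond +/-90 degrees (sources behind the
    listener) are clamped to the nearest side.
    """
    az = max(-90.0, min(90.0, azimuth_deg))
    theta = (az + 90.0) / 180.0 * (math.pi / 2.0)  # 0 = full left, pi/2 = full right
    return math.cos(theta), math.sin(theta)


def render_stereo(mono_samples, azimuth_deg):
    """Apply the panning gains to a mono signal and return (left, right) lists."""
    left_gain, right_gain = pan_gains(azimuth_deg)
    left = [s * left_gain for s in mono_samples]
    right = [s * right_gain for s in mono_samples]
    return left, right
```

For example, with a listener at the origin looking along the +y axis (a viewing angle of 90 degrees under the assumed convention), a source located on the +x axis has a relative azimuth of 90 degrees and is rendered almost entirely in the right channel. When the viewing angle changes, only the azimuth needs to be recomputed for each separated source.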
Non-Patent Literature 1: Masato Nonaka, “An office communication system utilizing multiple videos/sounds/sensors”, Human Interface Society research report collection, Vol. 13, No. 10, 2011.
Non-Patent Literature 2: Kenta Niwa, “Encoding of large microphone array signals for selective listening point audio representation based on blind source separation”, The Institute of Electronics technical research report, EA, Application sounds, 107 (532), 2008.