1. Technical Field
The present invention relates to object coding and object decoding devices for coding and decoding acoustic signals in a conference system in which a large number of sites and speakers participate.
2. Background Art
With the recent development of broadband communication technologies, IP communication technology has become so common that telephone conference systems based on IP communication technology, and communication systems with a sense of presence in which video images are presented together with audio signals, have emerged even for conferences in ordinary business settings and for communication between ordinary households. In addition, improvements in the speed and stability of IP communications have contributed to the development of conference systems with a sense of presence in which a large number of sites and people can participate. As this enhanced convenience increases the use of conference and communication systems involving a large number of people and sites, it becomes important to provide a system in which people can participate more easily.
In a conventional video conference system, when a large number of people or sites participate, the display screen is divided evenly according to the number of people or sites. Accordingly, when an extremely large number of people or sites participate, the display screen becomes very confusing. Moreover, with a large number of participating people or sites, the audio signals of the conversations overlap, making it difficult to identify which person at which site is speaking. To solve this problem, each speaker must begin by explicitly stating who is about to speak, or an auxiliary tool must be provided to display an image showing who is speaking, either of which requires very cumbersome processing.
In addition, an increase in the number of participants at each site participating in a conference increases the number of audio/video signals to be coded and decoded in the transmitting unit and the receiving unit at each site, thus increasing the load on the transmitting unit and the receiving unit.
To solve these problems, a method is needed that can code multiple signals at the same time and at a low bitrate. Furthermore, a technology enabling flexible control of multiple audio signals is also necessary. In this regard, an audio object coding technique (hereinafter referred to as the object coding technique) has been proposed, and a device has been proposed that separately transmits and receives, at a low bitrate, the multiple object signals coded using this technique (see, for example, Patent Literature 1). When coding is performed using the object coding technique, down-mix information obtained by down-mixing multiple object signals into M acoustic signals and coding them is transmitted together with a small amount of control information, and on the receiving side the information can be reconstructed as N audio object signals (M is smaller than N).
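The relationship between the down-mix, the control information, and the reconstructed object signals can be illustrated with a minimal sketch. The sketch is illustrative only and is not the method of Patent Literature 1: it assumes M = 1, a simple sum as the down-mix, and one energy-based gain per object per frame as the control information. The names `encode`, `decode`, and `frame_len` are hypothetical.

```python
import math


def encode(objects, frame_len):
    """Down-mix N object signals into one signal (M = 1, an assumption)
    and compute per-frame control information for each object."""
    n_samples = len(objects[0])
    # Assumed down-mix rule: sample-wise sum of all object signals.
    downmix = [sum(obj[i] for obj in objects) for i in range(n_samples)]
    control = []
    for start in range(0, n_samples, frame_len):
        end = start + frame_len
        mix_energy = sum(x * x for x in downmix[start:end])
        gains = []
        for obj in objects:
            obj_energy = sum(x * x for x in obj[start:end])
            # Assumed control parameter: square root of the ratio of
            # object energy to down-mix energy within this frame.
            gains.append(math.sqrt(obj_energy / mix_energy) if mix_energy > 0 else 0.0)
        control.append(gains)
    return downmix, control


def decode(downmix, control, frame_len):
    """Rebuild N approximate object signals by scaling the down-mix
    with the per-frame gains carried in the control information."""
    n_objects = len(control[0])
    outputs = [[0.0] * len(downmix) for _ in range(n_objects)]
    for frame_index, gains in enumerate(control):
        start = frame_index * frame_len
        end = min(start + frame_len, len(downmix))
        for k, gain in enumerate(gains):
            for i in range(start, end):
                outputs[k][i] = gain * downmix[i]
    return outputs


if __name__ == "__main__":
    # Two speakers (N = 2) who talk in different frames.
    speakers = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 2.0, 2.0]]
    mix, info = encode(speakers, frame_len=2)
    print(mix)
    print(decode(mix, info, frame_len=2))
```

In this sketch only one signal and two gains per frame are transmitted for two objects. Reconstruction is exact here only because the two speakers are never active in the same frame; when objects overlap in time, a single gain per frame gives only an approximation.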