In recent years, with the development of communication network technology, voice announcements and the like, which were conventionally realized by using analog signals, can also be realized by converting a speaker's voice into digital signals and transmitting digital audio packets, obtained by packetizing the digital signals, over a digital communication network.
When a voice is digitized, an announcement can be freely made to any of a plurality of divided announcement areas without requiring complex wiring, and a plurality of speakers can simultaneously make announcements to the same area or to different areas.
Furthermore, by using the same receivers, speech communication can also be performed between a plurality of speakers on the same digital communication network. For example, one person makes an announcement to a target area while, at the same time, two other persons communicate with each other by using receivers connected to the same digital communication network. In this case, the human voices are digitally packetized on the transmission sides, multiplexed, and transmitted over the digital communication network; the reception side acquires only the necessary packets, and decodes and reproduces them to realize speech communication.
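The packetize, multiplex, and selectively receive flow described above can be illustrated with a minimal sketch. The packet layout (a stream identifier, a sequence number, and a sample count followed by 16-bit samples), the function names, and the frame length are all assumptions made for illustration; the text does not specify any particular packet format.

```python
import struct
from itertools import zip_longest

# Hypothetical header: stream id, sequence number, sample count (little-endian).
HEADER = struct.Struct("<HIH")

def packetize(stream_id, samples, frame_len=4):
    """Split 16-bit samples from one sound source into tagged packets."""
    packets = []
    for seq, start in enumerate(range(0, len(samples), frame_len)):
        frame = samples[start:start + frame_len]
        header = HEADER.pack(stream_id, seq, len(frame))
        payload = struct.pack(f"<{len(frame)}h", *frame)
        packets.append(header + payload)
    return packets

def multiplex(*streams):
    """Interleave packets from several senders onto one shared network."""
    network = []
    for group in zip_longest(*streams):
        network.extend(p for p in group if p is not None)
    return network

def receive(network, wanted_stream_id):
    """Acquire only the packets of the wanted stream and decode them."""
    samples = []
    for packet in network:
        stream_id, _seq, count = HEADER.unpack_from(packet)
        if stream_id != wanted_stream_id:
            continue  # packet belongs to another announcement or call
        samples.extend(struct.unpack_from(f"<{count}h", packet, HEADER.size))
    return samples
```

In this sketch, an announcement (stream 1) and a conversation (stream 2) share the same list standing in for the network, and each reception side recovers only its own stream by filtering on the stream identifier.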
In general, when an announcement is made to a place so distant from the speaker that the broadcast voice cannot be heard, or when, in a conversation between a plurality of speakers, the speaker's own voice is returned late or is not heard at all, the speaker may feel uncomfortable while making the announcement or conversing. In an announcement, moreover, the speaker has no way of confirming whether the voice is actually output to the target area, which may cause the speaker to feel uneasy.
In contrast, when an announcement or speech communication is realized by using a plurality of analog lines, the receiver or broadcasting apparatus that receives the speaker's voice feeds the received analog voice directly back to the speaker's receiver, which then outputs it. In this manner, the speaker's uncomfortable feeling is reduced, and the speaker can confirm that the voice reliably reaches the reception side.
However, when digital audio packets from a plurality of sound sources are used for various purposes, digital voice processing needs to be performed along the way. In this digital voice processing, the digital audio packets obtained over a predetermined period of time are necessarily buffered and then subjected to a mixing process with other digital audio packets or to a volume control process after the mixing. For this reason, processing delay is essentially inevitable. Because of this delay, a speaker who hears his or her own fed-back voice cannot avoid an uncomfortable feeling.
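The reason buffering implies delay can be made concrete with a minimal sketch: a frame cannot be mixed until all of its samples have arrived, so each buffering stage adds at least one frame duration. The function names, the 16-bit sample range, and the clipping behavior are assumptions for illustration and are not taken from the text.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def clip(value):
    """Keep a mixed or scaled sample within the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))

def mix(frames):
    """Mix equally long buffered frames by summing sample by sample."""
    return [clip(sum(column)) for column in zip(*frames)]

def apply_gain(frame, gain):
    """Volume control applied to a frame after mixing."""
    return [clip(int(round(sample * gain))) for sample in frame]

def buffering_delay_ms(frame_samples, sample_rate_hz, stages=1):
    """Minimum delay added by waiting for full frames at each buffering stage."""
    return 1000.0 * frame_samples * stages / sample_rate_hz
```

As an illustrative figure only, buffering 160 samples at a sampling rate of 8000 Hz gives `buffering_delay_ms(160, 8000)` = 20.0 ms per stage, before any time spent on the mixing and volume control computations themselves.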
Patent Literature 1 discloses a method of multiplexing a speaker's voice with another voice and feeding the multiplexed voice back to the speaker.
Patent Literature 2 discloses a method of selectively feeding back to a hearer, in a conference among a plurality of speakers, only the voice of a speaker required by the hearer.
However, in both Patent Literature 1 and Patent Literature 2 described above, as the number of voices required by a hearer increases, a CPU (Central Processing Unit) or a DSP (Digital Signal Processor) requires a higher processing capacity, and a delay large enough to make a person feel uncomfortable cannot be completely avoided.