Conventionally, as a technology for controlling localization of a sound image using a plurality of speakers, VBAP (Vector Base Amplitude Panning) is known (for example, refer to NPL 1).
In VBAP, sound is output from three speakers so that a sound image can be localized at one arbitrary point inside the triangle defined by those three speakers.
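As an illustration of the three-speaker localization described above, the following is a minimal sketch of how VBAP gains might be computed, assuming the commonly described formulation in which the unit vector toward the sound image is expressed as a weighted sum of the unit vectors toward the three speakers and the weights are then normalized to constant power. The function names (unit_vector, det3, vbap_gains) and the speaker angles are hypothetical and chosen only for this example; they are not taken from NPL 1.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Convert azimuth/elevation in degrees to a 3-D unit vector."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def vbap_gains(source, speakers):
    """Solve source = g1*l1 + g2*l2 + g3*l3 and normalize the gains.

    Assumption: gains are scaled so that the sum of their squares is 1.
    """
    l1, l2, l3 = speakers
    d = det3(l1, l2, l3)
    if abs(d) < 1e-12:
        raise ValueError("speaker directions do not span 3-D space")
    # Cramer's rule: replace one speaker vector at a time by the source vector.
    g = [det3(source, l2, l3) / d,
         det3(l1, source, l3) / d,
         det3(l1, l2, source) / d]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

# Hypothetical layout: two speakers at ear height and one elevated speaker.
speakers = [unit_vector(30, 0), unit_vector(-30, 0), unit_vector(0, 45)]

# A sound image inside the triangle yields three positive gains.
inside = vbap_gains(unit_vector(0, 15), speakers)

# A sound image exactly at the first speaker uses that speaker alone.
at_speaker = vbap_gains(unit_vector(30, 0), speakers)
```

In this sketch, all three gains are non-negative only when the requested direction lies inside the triangle defined by the three speakers, which corresponds to the localization range stated above.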
However, it is considered that, in the real world, a sound image is localized not at one point but in a partial space having a certain degree of extent. For example, it is considered that, while a human voice is generated by the vocal cords, the vibration of the voice propagates to the face, the body and so forth, and as a result, the voice is emitted from a partial space that is the entire human body.
As a technology for localizing sound in such a partial space as described above, namely, as a technology for extending a sound image, MDAP (Multiple Direction Amplitude Panning) is generally known (for example, refer to NPL 2). Further, MDAP is also used in a rendering processing unit of the MPEG-H 3D (Moving Picture Experts Group-High Quality Three-Dimensional) Audio standard (for example, refer to NPL 3).
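The idea of extending a sound image can be sketched as follows, under the assumption that MDAP computes amplitude-panning gains for several directions spread around the intended position, sums them per speaker, and normalizes the result. This is only an illustrative sketch and not the procedure of NPL 2 or NPL 3; the four-speaker layout, the triangle list, the set of spread directions and all function names are hypothetical.

```python
import math

def unit_vector(azimuth_deg, elevation_deg):
    """Convert azimuth/elevation in degrees to a 3-D unit vector."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b and c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

def triangle_gains(p, l1, l2, l3):
    """Unnormalized gains g with p = g1*l1 + g2*l2 + g3*l3 (Cramer's rule)."""
    d = det3(l1, l2, l3)
    return [det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d]

def mdap_gains(directions, speakers, triangles):
    """Sum per-direction panning gains over all speakers, then normalize."""
    total = [0.0] * len(speakers)
    for p in directions:
        for tri in triangles:
            g = triangle_gains(p, *(speakers[i] for i in tri))
            if min(g) >= -1e-9:  # p lies inside (or on the edge of) this triangle
                norm = math.sqrt(sum(x * x for x in g))
                for i, x in zip(tri, g):
                    total[i] += max(x, 0.0) / norm
                break
        else:
            raise ValueError("direction lies outside every speaker triangle")
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total]

# Hypothetical layout: four speakers forming two adjacent triangles.
speakers = [unit_vector(30, 0), unit_vector(-30, 0),
            unit_vector(30, 40), unit_vector(-30, 40)]
triangles = [(0, 1, 2), (1, 3, 2)]

# Hypothetical spread: the intended direction plus four directions around it.
directions = [unit_vector(0, 20),
              unit_vector(15, 8), unit_vector(-15, 8),
              unit_vector(15, 32), unit_vector(-15, 32)]

gains = mdap_gains(directions, speakers, triangles)
```

Because the spread directions fall in different triangles, more than three speakers receive a non-zero gain in this sketch, which is one way of picturing a sound image that occupies a partial space rather than a single point.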