Conventionally, as a three-dimensional (3D) audio technology, a technique has been proposed that performs rendering by mapping encoded sample data, on the basis of metadata, onto speakers located at arbitrary positions (for example, see Patent Document 1).