A video signal typically contains an enormous amount of information and is therefore usually compression-encoded before being transmitted or stored. To encode a video signal with high efficiency, each picture, handled in units of frames, is divided into a plurality of blocks, each consisting of a predetermined number of pixels. An orthogonal transform is performed on each block to decompose the picture content of that block into spatial frequency components. Each frequency component is obtained as a transform coefficient, and the transform coefficients are then encoded.
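The block division and per-block orthogonal transform described above can be illustrated with a minimal sketch. The text does not name a specific transform or block size; the sketch assumes an orthonormal DCT-II (one commonly used orthogonal transform) and 8x8 blocks, and the function names `split_into_blocks` and `transform_block` are hypothetical, chosen for illustration only.

```python
import math

BLOCK = 8  # assumed block size; the text only says "a predetermined number of pixels"


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def split_into_blocks(frame, n=BLOCK):
    """Divide a frame (list of pixel rows) into n x n blocks in raster order.

    Assumes the frame dimensions are multiples of n; padding is not handled here.
    """
    height, width = len(frame), len(frame[0])
    blocks = []
    for top in range(0, height, n):
        for left in range(0, width, n):
            blocks.append([row[left:left + n] for row in frame[top:top + n]])
    return blocks


def transform_block(block):
    """Apply a separable 2-D DCT to a square block: coefficients = C * block * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


# Example: a flat 16x16 frame in which every pixel has the value 100.
frame = [[100] * 16 for _ in range(16)]
blocks = split_into_blocks(frame)    # four 8x8 blocks
coeffs = transform_block(blocks[0])  # 8x8 array of transform coefficients
```

For this flat block, only the lowest-frequency (DC) coefficient `coeffs[0][0]` is nonzero, with a value of about 800; every other coefficient is zero up to floating-point rounding. This concentration of the signal into a few coefficients is what makes encoding the transform coefficients more efficient than encoding the pixels directly.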