In an H.265 high efficiency video coding (HEVC) codec, a block of data is conventionally transformed and quantized prior to entropy coding. An inverse transform is performed prior to reconstruction and display of the data block. Transform sizes of 4×4, 8×8, 16×16 and 32×32 are included in an H.265 encoding to find which block size leads to a better quality of encoding. The transform operations compute a two-dimensional transform T mathematically represented by formula 1 as follows:

T = A·X·Aᵀ  (1)

where “·” indicates matrix multiplication. The matrix A may be a coefficient matrix and the matrix X may be either a spatial domain residue matrix or a frequency domain transform coefficient matrix. The matrix Aᵀ is the transpose of the matrix A. All of the matrices are 4×4, 8×8, 16×16 or 32×32, depending upon the size of the transform.
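By way of illustration only, formula 1 may be sketched as two successive matrix multiplications in which the first product is held in full before the second multiplication begins. The sketch below is not taken from any particular codec implementation. The function names `matmul`, `transpose` and `transform_2d` are hypothetical, the matrix `A4` is assumed to hold the 4×4 HEVC core transform coefficients, and the intermediate scaling and rounding stages of a real codec are omitted.

```python
def matmul(P, Q):
    """Multiply two square matrices given as lists of rows."""
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(P):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*P)]


def transform_2d(A, X):
    """Compute T = A . X . A^T in two passes (no scaling or rounding)."""
    # First pass: the whole intermediate matrix Y = A . X is buffered here.
    Y = matmul(A, X)
    # Second pass: multiply the buffered intermediate by the transpose of A.
    return matmul(Y, transpose(A))


# Assumed 4x4 coefficient matrix (HEVC core transform values).
A4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

# Hypothetical flat residue block: every sample equals 1.
X = [[1] * 4 for _ in range(4)]

T = transform_2d(A4, X)
for row in T:
    print(row)
```

For the flat block in this sketch, only the top-left (DC) entry of T is non-zero. The intermediate matrix `Y` has the same dimensions as the transform, so a straightforward two-pass arrangement of this kind stores a full block of intermediate values between the passes.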
It would be desirable to implement a two-dimensional transformation with minimum buffering.