Deep Convolutional Neural Networks, or Deep CNNs, are at the core of the remarkable development in the field of Deep Learning. Although CNNs were employed to solve character recognition problems in the 1990s, it is only recently that they have become widespread in Machine Learning. For example, in 2012, a CNN significantly outperformed its competitors in an annual software contest, the ImageNet Large Scale Visual Recognition Challenge, and won the contest. Since then, the CNN has become a very useful tool in the field of machine learning.
However, there was a preconception that 32-bit floating point operations are needed for deep learning algorithms, so mobile devices were considered incapable of running programs that include deep learning algorithms.
Experiments later showed, however, that 10-bit fixed point operations, which require less computing power than 32-bit floating point operations, are sufficient for deep learning algorithms. Thus, there were many attempts to provide methods for using 10-bit fixed point operations for deep learning algorithms in devices with limited resources, such as mobile devices. Among these methods, one named “Dynamic Fixed Point”, suggested by a thesis named “Caffe-Ristretto”, became widespread. The “Dynamic Fixed Point” method is distinct from other methods in that a separate transitional FL value may be applied to each part of a CNN. Herein, the FL value is a parameter corresponding to the size of the LSB of the quantized values, and the LSB is the bit position in a binary number that has the smallest value, i.e., the unit value. Owing to the transitional FL values, different FL values can be applied to different channels during quantization, which approximates floating point values with fixed point values, so that quantization errors can be reduced. In the “Dynamic Fixed Point” method, the FL value is determined by referring to the quantization error of the largest value among the original floating point values.
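The per-channel quantization described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in this document: the LSB size is taken to be 2^(-FL), the values are stored as signed integers, and the FL value is chosen as the largest one for which the largest magnitude in the channel still fits without saturation. The names choose_fl, quantize, and bit_width are hypothetical and serve illustration only.

```python
import math

def choose_fl(values, bit_width=10):
    """Assumed rule: largest FL such that the largest magnitude still fits
    in a signed bit_width-bit integer (one reading of 'determined by the
    largest value'; not necessarily the exact rule of the cited method)."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bit_width - 1) - 1
    if max_abs == 0:
        return bit_width - 1
    fl = math.floor(math.log2(q_max / max_abs))
    # Guard against floating point rounding in the logarithm.
    while max_abs * 2.0 ** fl > q_max:
        fl -= 1
    return fl

def quantize(values, fl, bit_width=10):
    """Approximate floating point values with fixed point values whose
    LSB size is assumed to be 2**(-fl)."""
    lsb = 2.0 ** (-fl)
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    result = []
    for v in values:
        q = round(v / lsb)             # nearest integer multiple of the LSB
        q = max(q_min, min(q_max, q))  # saturate to the representable range
        result.append(q * lsb)
    return result

# Hypothetical channels: each one receives its own FL value.
channels = {
    "channel_1": [0.5, -0.25, 0.1],
    "channel_2": [100.0, 3.2, -0.7],
}
for name, vals in channels.items():
    fl = choose_fl(vals)
    print(name, "FL =", fl, "quantized =", quantize(vals, fl))
```

In this sketch, the channel holding only small values receives a large FL value (a fine LSB), whereas the channel containing the value 100.0 receives a small FL value (a coarse LSB).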
However, the process of determining the FL value proposed by the conventional “Dynamic Fixed Point” method has a critical shortcoming. In a neural network, the original floating point values, including values of parameters or feature maps, do not follow a certain distribution. Rather, the values are irregularly distributed: most of them are small and very few are large. Thus, if the FL value is determined by referring to the quantization error of the largest value among the original floating point values, the quantization errors of values much smaller than the largest value may be too large.
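The shortcoming can be made concrete with a small numerical sketch. The values below are invented for illustration, and the rule for choosing the FL value (the largest FL for which the largest magnitude fits in a signed 10-bit integer, with the LSB size taken to be 2^(-FL)) is an assumption rather than the exact rule of the conventional method. The names fl_from_largest and relative_errors are hypothetical.

```python
import math

def fl_from_largest(values, bit_width=10):
    """Assumed rule: the FL value is fixed by the largest magnitude alone."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bit_width - 1) - 1
    fl = math.floor(math.log2(q_max / max_abs))
    while max_abs * 2.0 ** fl > q_max:  # guard against rounding in log2
        fl -= 1
    return fl

def relative_errors(values, fl):
    """Relative quantization error of each nonzero value for a given FL.
    No saturation is applied, since FL is chosen so the largest value fits."""
    lsb = 2.0 ** (-fl)
    return [abs(round(v / lsb) * lsb - v) / abs(v) for v in values]

# Hypothetical data: one large value and several small ones.
values = [120.0, 0.9, 0.3, 0.05, 0.02]
fl = fl_from_largest(values)
errors = relative_errors(values, fl)
for v, e in zip(values, errors):
    print(f"value={v:<7} relative error={e:.3f}")
```

Under these assumptions the largest value is represented exactly, while the two smallest values are rounded to zero and therefore lose all of their information.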
FIG. 4 shows the variations of the quantized values included in each channel according to a conventional method.
Referring to FIG. 4, it can be seen that the difference between the variation of a first channel and the variation of a second channel is very large. This is because the FL value for the quantization is determined by referring to the quantization error of the largest value, so that small values are not quantized properly. This is a disadvantage because a large difference in variation among the channels may distort the output values.