The deep learning algorithm is at the core of artificial intelligence, and has greatly driven development in various fields, for example, speech recognition, image recognition, and natural language processing. The deep learning algorithm is a typical computation-intensive algorithm, and generally has a computational complexity of O(N³) (cubic order), which is usually one to two orders of magnitude higher than that of a conventional machine learning algorithm. On the other hand, the deep learning algorithm is often closely associated with large data. Generally, terabytes to petabytes of training data and hundreds of millions to hundreds of billions of training parameters are required to obtain a model of sufficient precision. Taken together, these two points mean that, in practical applications, the deep learning algorithm imposes computational demands that a conventional CPU (Central Processing Unit) cannot meet. To resolve the computing bottleneck of the deep learning algorithm, many companies have designed special-purpose chips for the deep learning algorithm, for example, Baidu's artificial intelligence computer and Google's TPU (Tensor Processing Unit).
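The cubic-order complexity claim above can be made concrete with a small sketch (an illustration under assumed shapes, not from the source): a dense layer that multiplies an N×N weight matrix by an N×N batch of inputs performs N³ multiply–accumulate operations when computed naively, which is the dominant cost in such networks.

```python
import numpy as np

def naive_matmul(a, b):
    """Naive N x N matrix multiply, counting multiply-accumulate operations."""
    n = a.shape[0]
    c = np.zeros((n, n))
    ops = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):  # three nested loops -> N^3 operations
                c[i, j] += a[i, k] * b[k, j]
                ops += 1
    return c, ops

# For N = 8, the naive multiply performs 8^3 = 512 multiply-accumulates.
a = np.ones((8, 8))
b = np.ones((8, 8))
c, ops = naive_matmul(a, b)
```

Doubling N here multiplies the operation count by eight, which is why the workload quickly outgrows a general-purpose CPU.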
As shown in FIG. 1, the network structure of a deep learning algorithm generally has N layers, where N may range from a few layers to tens of layers. Each layer may have a DNN (Deep Neural Network), RNN (Recurrent Neural Network), or CNN (Convolutional Neural Network) structure. There are activation functions between the layers, of which more than ten types are in common use; the activation functions between different layers may be identical or different. In the prior art, the various activation functions are usually computed in one of the following two approaches: 1. By software programming on a general-purpose processor. This approach is inefficient, because a general-purpose processor is relatively slow at computations as complex as activation functions. 2. By a dedicated hardware circuit. Implementing activation functions in dedicated circuitry is costly: on the one hand, activation functions are complex, and each function consumes considerable circuit resources; on the other hand, supporting multiple types of activation functions makes the total consumption of circuit resources great. In addition, a dedicated circuit structure is not flexible, and cannot flexibly support new activation functions.
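The first approach above, computing per-layer activation functions in software on a general-purpose processor, can be sketched as follows. This is a minimal illustration under assumed shapes and layer types (the function names and the three-layer example are not from the source); each layer applies its weights and then its own activation, which may differ from layer to layer as the text describes.

```python
import numpy as np

# Three commonly used activation functions, computed in software.
def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)

def forward(x, weights, activations):
    """Forward pass through N layers; each layer has its own activation."""
    for w, act in zip(weights, activations):
        x = act(w @ x)
    return x

# Example: a 3-layer network whose layers use different activations.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 4)) for _ in range(3)]
activations = [relu, sigmoid, tanh]
y = forward(rng.standard_normal(4), weights, activations)
```

Each activation call here costs many general-purpose instructions per element (exponentials, divisions), which is the inefficiency the first approach suffers from; the second approach moves exactly these functions into dedicated circuitry, at the resource and flexibility costs noted above.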