If you fold a piece of paper in half a mere fifty times, the resulting stack of paper would be roughly as thick as the distance from the Earth to the Sun. While each fold is a simple operation that doubles the thickness of the stack, the aggregate task is not at all simple. Likewise, many computations conducted by modern computing systems are composite computations composed of multiple simple component parts. Each component calculation may be trivial to execute, but the number of components may be astronomically large, resulting in a composite computation that is anything but trivial. Indeed, basic computations that have been handled with ease since the dawn of computing can, taken in the aggregate, result in a composite computation that is effectively intractable for a given application.
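The arithmetic behind the paper-folding illustration can be checked with a short sketch. The sheet thickness of 0.1 mm is an assumption made here for illustration and does not come from the text above; the mean Earth-Sun distance is taken as approximately 1.496 × 10^11 m. Under those assumptions, fifty doublings yield a stack of the same order of magnitude as that distance.

```python
# Assumed values for illustration only.
SHEET_THICKNESS_M = 1e-4         # hypothetical 0.1 mm sheet of paper
EARTH_SUN_DISTANCE_M = 1.496e11  # approximate mean Earth-Sun distance


def folded_thickness_m(folds, sheet_thickness_m=SHEET_THICKNESS_M):
    """Thickness of the stack after `folds` folds; each fold doubles it."""
    return sheet_thickness_m * 2 ** folds


thickness = folded_thickness_m(50)
ratio = thickness / EARTH_SUN_DISTANCE_M
print(f"Stack after 50 folds: {thickness:.3e} m")
print(f"Fraction of the Earth-Sun distance: {ratio:.2f}")
```

A thicker assumed sheet moves the result proportionally, but the point of the illustration is unchanged: fifty trivial doublings multiply the starting thickness by about 10^15.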
The field of machine learning algorithms, and particularly the field of artificial neural networks (ANNs), is held back in large part by the computational complexity involved in implementing the traditional algorithms used to instantiate an ANN. Assuming the execution of a given ANN used to recognize a word from a sound file takes 10 billion computations, even if each of those component computations could be executed in a microsecond, the composite task would still take roughly 10,000 seconds, or nearly three hours, to execute. Having speech recognition technology operating at that speed is essentially the same as not having speech recognition technology at all. The reason machine intelligence applications are so resource hungry is that the data structures being operated on are generally very large, and the number of discrete primitive computations that must be executed on each of the data structures is likewise immense. A traditional ANN takes in an input vector, conducts calculations using the input vector and a set of weight vectors, and produces an output vector. Each weight vector in the set of weight vectors is often referred to as a layer of the network, and the output of each layer serves as the input to the next layer. In a traditional network, the layers are fully connected, which requires every element of the input vector to be involved in a calculation with every element of the weight vector. Therefore, the number of calculations involved increases with a power law relationship to the size of each layer.
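The power law growth described above can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the text, that a fully connected layer mapping n inputs to m outputs performs one multiply-accumulate operation per input-output pair, for n × m operations in total. The function names and layer widths below are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def dense_layer_macs(n_inputs: int, n_outputs: int) -> int:
    """Multiply-accumulates for one fully connected layer (assumed n * m)."""
    return n_inputs * n_outputs


def network_macs(layer_sizes: Sequence[int]) -> int:
    """Total multiply-accumulates for one forward pass through the network.

    `layer_sizes` lists the vector width at each stage, so consecutive
    pairs describe one fully connected layer each.
    """
    return sum(
        dense_layer_macs(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical widths: two layers at width 1,000 versus width 10,000.
small = network_macs([1_000, 1_000, 1_000])
large = network_macs([10_000, 10_000, 10_000])
print(small, large, large // small)
```

Under this assumption, widening every layer by a factor of ten multiplies the operation count by a factor of one hundred, which is the quadratic case of the power law relationship.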
The latest surge of interest in machine learning algorithms owes its strength chiefly to improvements in the hardware and software used to conduct the composite calculations for the execution of the ANN, as opposed to the development of new algorithms. The improvements in hardware and software take various forms. For example, graphics processing units traditionally used to process the vectors used to render polygons for computer graphics have been repurposed in an efficient manner to manipulate the data elements used in machine intelligence processes. As another example, certain classes of hardware have been designed from the ground up to implement machine intelligence algorithms by using specialized processing elements such as systolic arrays. Further advances have centered around using collections of transistors and memory elements to mimic, directly in hardware, the behavior of neurons in a traditional ANN. There is no question that the field of machine intelligence has benefited greatly from these improvements. However, despite the intense interest directed to these approaches, machine intelligence systems still represent one of the most computationally intensive and energy-intensive computing applications of the modern age, and present a field that is ripe for further advances.