As images and videos have become more ubiquitous on the Internet, the need arises for algorithms with the capability to efficiently analyze their semantic content for various applications, including search and summarization. Convolutional neural networks (CNNs) have been shown to be effective tools for performing image recognition, detection, and retrieval. CNNs may be scaled up and configured to support large labeled datasets that are required for the learning process. Under these conditions, CNNs have been found to be successful in learning complex and robust image features.
A CNN is a type of feed-forward artificial neural network where individual neurons are tiled in a manner such that they respond to overlapping regions in a visual field. CNNs are inspired by the behavior of optic nerves in living creatures. CNNs process data with multiple layers of neuron connections to achieve high accuracy in image recognition. Developments in multi-layer CNNs have led to improvement in the accuracy of complex recognition tasks such as large-category image classification, automatic speech recognition, as well as other data classification/recognition tasks.
The limitations in computing power of a single processor have led to the exploration of other computing configurations to meet the demands for supporting CNNs. Among the areas of exploration, CNN accelerators which utilize hardware specialization in the form of general purpose computing on graphics processing units (GPGPUs), multi-core processors, field programmable gate arrays (FPGAs), and application specific integrated circuits (ASICs) have been researched.