The subject matter discussed in this section should not be assumed to be prior art merely as a result of its mention in this section. Similarly, a problem mentioned in this section or associated with the subject matter provided as background should not be assumed to have been previously recognized in the prior art. The subject matter in this section merely represents different approaches, which in and of themselves can also correspond to implementations of the claimed technology.
The technology disclosed makes it feasible to utilize convolutional neural networks (CNNs) in big-data scenarios such as medical imaging, where huge amount of data is needed to be processed with limited memory and computational capacity. A major technical problem with existing deep convolution neural networks (CNNs) is the requirement of significant computational resources. The technology disclosed solves this technical problem by adding so-called subnetworks within a 3D deep convolutional neural network architecture (DCNNA), which perform dimensionality reduction operations on 3D data before the 3D data is subjected to computationally expensive operations. Also, the subnetworks convolve the 3D data at multiple scales by subjecting the 3D data to parallel processing by different 3D convolutional layer paths (e.g., 1×1×1 convolution, 3×3×3 convolution, 5×5×5 convolution, 7×7×7 convolution). Such multi-scale operations are computationally cheaper than the traditional CNNs that perform serial convolutions. In addition, performance of the subnetworks is further improved through 3D batch normalization (BN) that normalizes the 3D input fed to the subnetworks, which in turn increases learning rates of the 3D DCNNA.
Machine learning is a field of study within the area of Artificial Intelligence (AI) that gives computers the ability to learn without being explicitly programmed. As opposed to static programming, machine learning uses algorithms trained on some data to make predictions related to that or other data. Deep learning is a form of machine learning that models high level abstractions in data by layers of low level analytics of that data. Recently, CNNs have led to major improvements in image classification and object recognition. By training multiple layers of convolutional filters, The generalization capability of many machine learning tools like support vector machines (SVM), PCA, linear discriminant analysis (LDA), Bayesian interpersonal classifier tend to get saturated quickly as the volume of the training increases. But, CNNs have shown to perform better as compared to traditional machine learning algorithms when trained with large number of diverse images at different times. CNNs are capable of automatically learning complex features for object recognition and achieve superior performance compared to hand-crafted features.
However, CNNs require large amount of training data, without which the network fails to learn and deliver impressive recognition performance. Training such massive data requires huge computational resources, like thousands of CPU cores and/or GPUs, making the application of CNN rather limited and not extendable to mobile and embedded computing. Therefore, CNN architectures that improve the performance of computational resources analyzing big data are required.