Deep learning is a technology used to cluster or classify objects or data. For example, computers cannot distinguish dogs and cats from photographs alone. But a human can easily distinguish those two. To this end, a method called “machine learning” was devised. It is a technique to allow a computer to classify similar things among lots of data inputted into the computer. When a photo of an animal similar to a dog is inputted, the computer will classify it as a dog photo.
There have already been many machine learning algorithms to classify data. For example, a decision tree, a Bayesian network, a support vector machine (SVM), an artificial neural network, etc. have been developed. The deep learning is a descendant of the artificial neural network.
Deep Convolution Neural Networks (Deep CNNs) are the heart of the remarkable development in deep learning. CNNs have already been used in the 90's to solve the problem of character recognition, but their use has become as widespread as it is now thanks to recent research. These deep CNNs won the 2012 ImageNet image classification tournament, crushing other competitors. Then, the convolution neural network became a very useful tool in the field of the machine learning.
Recently, an object detector based on an R-CNN for detection of an object in an image has become popular.
Such the object detector based on the R-CNN is learned using backpropagation by referring to loss values, and its performance depends on its learning result.
However, data sets are hard to come by which includes every object of interest to be detected by the object detector.
As one example, when developing the object detector for six kinds of objects like pedestrians, riders, cars, traffic signs, traffic lights, and animals in an image of a road seen from a vehicle in operation, data sets comprised of training images including all of the six kinds of objects are hard to prepare.
Therefore, a conventional object detector based on the R-CNN, if provided with each of the data sets for each of class groups into which the objects of interest are classified, creates each of R-CNN networks for said each of the data sets and learns the parameters of the R-CNN networks. Herein, the conventional object detector includes the R-CNN networks each of which is separately learned.
That is, by referring to FIG. 1, to detect the six kinds of objects like pedestrians, riders, cars, traffic signs, traffic lights, and animals, as shown in (A), at least one data set for pedestrians, riders, and cars is prepared and used for learning parameters of an R-CNN1, as shown in (B), at least one data set for traffic signs and traffic lights is prepared and used for learning parameters of an R-CNN2, and as shown in (C), at least one data set for animals is prepared and used for learning parameters of an R-CNN3.
Thereafter, the object detector based on the R-CNN is configured to include the R-CNN1, the R-CNN2, and the R-CNN3.
However, this conventional object detector based on the R-CNN includes different multiple deep learning networks corresponding to the number of data sets for learning, therefore, in a real-world test, the conventional object detector shows a problem that execution time increases in proportion to the number of its deep learning networks, compared to a detector including just one deep learning network.