A deep convolutional neural network (CNN) is trained as an N-way classifier to distinguish between N classes of data. CNN classifiers are used to classify images, detect objects, estimate poses, recognize faces, and perform other classification tasks. Typically, the structure of the CNN (e.g., number of layers, types of layers, connectivity between layers, and so on) is selected by the designer, and then the parameters of each layer are determined by training.
Multiple classifiers can be used in combination by averaging. In model averaging, multiple separate models are used. Each model is capable of classifying the full set of the categories and each one is trained independently. The main sources of their prediction differences include different initializations, different subsets of a global training set, and so on. The output of the combined models is the average output of the separate models.