With the development of network technology, users can search not only by keyword but also by image. In an image-based search, a user can choose photos from a gallery, or take photos on the spot, and upload them to a system. After obtaining an image uploaded by the user, the system may analyze the image, identify the subjects in it (for example, people or items) to determine type information of the image, perform a search according to the type information and feature information of the image, and return a search result to the user.
Currently, there are two main ways of identifying items in an image to predict a type of the image. One way is to identify objects in the image based on full image feature data and to output a classification label. The other way is to use an object detection technology to determine a subject area of the image, identify the object(s) in the subject area, and use the classification label of the object in the subject area as the type recognized for the full image. A process of identification based on full image feature data includes extracting visual features of an image to be processed, such as Histogram of Oriented Gradients (HOG) features or scale-invariant feature transform (SIFT) features. The system then generates a classification label for the image to be processed through a corresponding classifier, such as a support vector machine (SVM). A process of identification based on an object detection technology may include the following operations: analyzing the full image to determine its subject area, extracting corresponding characteristics of the subject area, and determining a classification label of the subject area according to those characteristics.
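The two processes above can be sketched side by side. The following is a minimal illustration under stated assumptions, not the implementation of any particular system: the descriptor is a single gradient-orientation histogram (a simplified, HOG-like feature without cells or block normalization), a nearest-centroid rule stands in for the SVM classifier, and the subject-area box is supplied by hand rather than produced by a detector. All function names, labels, and the toy images are hypothetical.

```python
import math

def orientation_histogram(img, bins=9):
    """Simplified HOG-like descriptor: one magnitude-weighted histogram of
    unsigned gradient orientations over the whole input (no cells/blocks)."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # horizontal central difference
            gy = img[y + 1][x] - img[y - 1][x]   # vertical central difference
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / 180.0 * bins) % bins] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def crop(img, box):
    """Cut out a subject area given as (top, left, bottom, right)."""
    top, left, bottom, right = box
    return [row[left:right] for row in img[top:bottom]]

def nearest_centroid(feature, centroids):
    """Stand-in for the SVM: pick the label whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))

def classify_full_image(img, centroids):
    # Approach 1: features from the full image, then a classifier.
    return nearest_centroid(orientation_histogram(img), centroids)

def classify_subject_area(img, box, centroids):
    # Approach 2: features from the subject area only, then a classifier.
    return nearest_centroid(orientation_histogram(crop(img, box)), centroids)

# Toy data (hypothetical): two stripe textures play the role of two types.
def vertical(n):
    return [[255 if x % 4 < 2 else 0 for x in range(n)] for _ in range(n)]

def horizontal(n):
    return [[255 if y % 4 < 2 else 0 for _ in range(n)] for y in range(n)]

centroids = {
    "vertical": orientation_histogram(vertical(8)),
    "horizontal": orientation_histogram(horizontal(8)),
}

# A 16x16 scene: horizontal-stripe background with a small 6x6 target
# of vertical stripes at the assumed subject box (5, 5, 11, 11).
box = (5, 5, 11, 11)
scene = horizontal(16)
for y in range(box[0], box[2]):
    for x in range(box[1], box[3]):
        scene[y][x] = 255 if (x - box[1]) % 4 < 2 else 0

full_label = classify_full_image(scene, centroids)
area_label = classify_subject_area(scene, box, centroids)
print(full_label, area_label)
```

In this toy scene the two calls share the same feature extractor and classifier and differ only in the pixels they look at, which is the essential distinction between the two approaches. A practical system would replace the histogram with a full HOG or SIFT pipeline, the nearest-centroid rule with a trained SVM, and the hand-written box with the output of an object detector.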
However, the above methods for recognizing objects in an image to predict a type of the image have the following problems. (1) Methods based on full image feature data analyze the full image and therefore inevitably introduce background information, which interferes with the recognition of the target object in the image and lowers the accuracy of the full image classification result. In particular, when the target object occupies a small proportion of the entire image area, the background information has a greater impact on the prediction result. (2) Methods based on a subject area analyze only the subject area of the image and identify the items in that area. Since a subject area does not usually contain the scene information and context information of the image, when target subjects of different types are relatively similar in shape, color, and other characteristics, the classification label of the target subject in the subject area cannot be accurately predicted. Moreover, any algorithm for detecting a subject area has a certain false detection rate. Area-based methods carry these detection errors into the type prediction results, which further reduces the accuracy of predicting the type of an object in an image.
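The way detection errors carry over into type prediction can be illustrated with a short calculation. This is an illustrative sketch rather than a result from the source: let D denote the event that the subject area is detected correctly and C the event that the final classification label is correct. The numeric values are hypothetical.

```latex
P(C) = P(D)\,P(C \mid D) + \bigl(1 - P(D)\bigr)\,P(C \mid \neg D)

% Hypothetical values: P(D) = 0.9,\; P(C \mid D) = 0.95,\; P(C \mid \neg D) = 0.1
P(C) = 0.9 \times 0.95 + 0.1 \times 0.1 = 0.865
```

Under these assumed values, a classifier that is 95% accurate on correctly detected subject areas yields an end-to-end accuracy of only 86.5%, because every false detection feeds the classifier a region that does not contain the target subject.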