With the development of Internet technologies and social media, users need to process an increasingly large number of pictures. Many pictures are uploaded by users to the Internet and to social networking software. Most of these pictures are not labeled with content information, for example, information indicating whether a picture shows a geographic location, a famous scenic spot, or a building. As a result, users cannot know what these pictures show or where they were taken. Therefore, a method for recognizing these pictures is needed.
In one solution in the existing technology, picture recognition proceeds in three steps: first, a feature is extracted from the picture to be recognized; then, an image database is searched for the several pieces of feature data most similar to that feature; finally, a content label for the picture is determined from the web page text associated with the similar data. However, this method has the following disadvantages. First, it depends on web page text related to the retrieved pictures, and a content label extracted from such text includes a lot of noise. Second, the image database to be searched is very large and itself includes a lot of noise, so pictures that have similar content but different semantics are likely to be retrieved, and inaccurate labels are consequently provided. Third, the recognized picture content falls into broad categories and is not precise enough.
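The retrieval-based procedure described above can be illustrated with a minimal sketch. It assumes that features have already been extracted as plain numeric vectors, that similarity is measured by cosine similarity, and that the label is chosen by a simple word vote over the web page text of the nearest entries. The names `cosine_similarity` and `label_by_retrieval` and the dictionary keys `feature` and `page_text` are hypothetical and serve only this illustration; the existing technology does not prescribe them.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_by_retrieval(query_feature, database, k=3):
    """Return (label, neighbors) for a query feature.

    database: list of dicts with hypothetical keys "feature" (list of
    floats) and "page_text" (web page text found near the picture).
    """
    # Step 2: rank database entries by similarity to the query feature.
    ranked = sorted(
        database,
        key=lambda entry: cosine_similarity(query_feature, entry["feature"]),
        reverse=True,
    )
    neighbors = ranked[:k]

    # Step 3: each neighbor votes once for every distinct word in its text.
    votes = Counter()
    for entry in neighbors:
        votes.update(set(entry["page_text"].lower().split()))

    if not votes:
        return None, neighbors
    label, _ = votes.most_common(1)[0]
    return label, neighbors


# Toy data, invented for illustration only.
toy_database = [
    {"feature": [0.90, 0.10, 0.00], "page_text": "tower paris"},
    {"feature": [0.80, 0.20, 0.10], "page_text": "tower sale cheap"},
    {"feature": [0.10, 0.90, 0.20], "page_text": "beach sunset"},
    {"feature": [0.85, 0.15, 0.05], "page_text": "tower tokyo"},
]

label, neighbors = label_by_retrieval([0.88, 0.12, 0.02], toy_database, k=3)
print(label)
```

Even in this toy data, the disadvantages are visible. The retrieved text contains unrelated words such as "sale" and "cheap", which is the label noise of the first disadvantage. The three nearest entries refer to towers in different places, so visually similar pictures with different semantics are retrieved together, and the resulting label is only a broad category rather than a specific landmark.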