1. Field of the Invention
The present invention relates generally to methods of detecting pornographic images transmitted through a communications network, and more particularly to a detection method wherein pixels of a questionable image are compared with a color reference database and an area surrounding a questionable image is subjected to a texture analysis, and images with questionable areas are subjected to a shape analysis.
2. Description of the Prior Art
A variety of methods have been used to deter the display of "objectionable" images in a work site. "Pornography-free" web sites, such as sites targeting families and children, have been set up to shield children from viewing objectionable material. Although a particular site may be free of pornography, and considered acceptable for access by children, it is still possible to gain access to an objectionable web site by starting from an acceptable site. Software applications and Internet services such as Net-Nanny and Cyber-Sitter were created and marketed to help parents prevent their children from accessing objectionable documents by blocking access to specific web sites. One type of protective software is designed to store the addresses of objectionable web sites, and block access to these sites. Another type of protective software blocks access to all "unapproved" sites, permitting access only to a limited selection of sites. These approaches are not highly effective because it is a practical impossibility to manually screen all of the images on all of the web sites that are added each day to the web. They rely on either storing a local database of web site URLs, or referencing the database on the Internet. Many next-generation Internet terminals for the consumer market have limited local storage capability and cannot store the database locally. Where the database is referenced on the Internet, there are two disadvantages: (i) the database must be referenced before each web page is displayed, causing a significant delay to the display of web pages on a browser, and (ii) there is a significant increase in the network bandwidth used by such an Internet terminal because of these database lookups.
Various algorithms have been investigated for use in detecting objectionable media. For example, algorithms have been tested for use in recognizing shapes, such as people in general, and specific body parts.
A detailed summary of work done with algorithms is found in David A. Forsyth and Margaret Fleck, Finding Naked People, Journal Reviewing, 1996; Margaret Fleck, David A. Forsyth, Chris Bregler, Finding Naked People, Proceedings of 4th European Conference on Computer Vision, 1996; and David A. Forsyth et al., Finding Pictures of Objects in Large Collections of Images, Proceedings, International Workshop on Object Recognition, Cambridge, 1996.
In order for an algorithm to be useful for screening objectionable images, it must achieve a very high ratio of the number of objectionable images correctly identified to the total number of objectionable images in a database. This ratio will be referred to as "recall", or otherwise as positive identification. In addition, in order for a system to be useful, it should not misclassify non-objectionable images as objectionable. Performance in this respect is referred to as "precision", and a non-objectionable image that is flagged is referred to as a "false alarm".
A perfect system will have full positive identification (100% of the objectionable images will be flagged) and 100% precision (no images that are not objectionable will be flagged). Of course, no system can be perfect. It is therefore a balancing act to maximize positive identification without overloading the system with false alarms. However, it is important to note that when only a small fraction of the images are objectionable, it is highly important to maximize positive identification, even if the false alarm percentage increases.
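The two measures defined above can be illustrated with a minimal sketch. It assumes the standard information-retrieval definitions (recall as the share of objectionable images that are flagged, precision as the share of flagged images that are objectionable); the function name and the example image identifiers are hypothetical and not part of the disclosure.

```python
def recall_and_precision(flagged, objectionable):
    """Return (recall, precision) for a set of flagged image ids.

    flagged:        ids the detector marked as objectionable
    objectionable:  ids that truly are objectionable (ground truth)
    """
    flagged = set(flagged)
    objectionable = set(objectionable)
    true_positives = len(flagged & objectionable)
    # Recall: share of the objectionable images that were flagged.
    recall = true_positives / len(objectionable) if objectionable else 1.0
    # Precision: share of the flagged images that really are objectionable.
    precision = true_positives / len(flagged) if flagged else 1.0
    return recall, precision


# Hypothetical data: 4 objectionable images exist, the detector flags 5.
truth = {1, 2, 3, 4}
flags = {1, 2, 3, 7, 8}
recall, precision = recall_and_precision(flags, truth)
print(recall, precision)  # 0.75 0.6
```

In this example one objectionable image is missed (lowering recall) and two acceptable images are flagged (false alarms, lowering precision).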
One algorithm system reported by Forsyth had a 43% recall with a 57% precision. According to this report, it took about 6 minutes of analysis per image to determine whether an image pre-selected by a skin filter was an image of a person. To put this in perspective, for a web site that handles 100,000 images per day, such percentages mean that many objectionable images may go undetected, and the algorithm may therefore not be useful.
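The scale of the problem can be made concrete with a rough calculation. The recall, precision, per-image time and daily volume below are the figures quoted above; the number of objectionable images per day is a purely hypothetical assumption, and the compute figure is an upper bound that assumes every image reaches the 6-minute analysis stage (in the reported system only images pre-selected by a skin filter did).

```python
IMAGES_PER_DAY = 100_000      # from the text
RECALL = 0.43                 # from the text
PRECISION = 0.57              # from the text
MINUTES_PER_IMAGE = 6         # from the text
OBJECTIONABLE_PER_DAY = 1_000 # hypothetical: 1% of daily images

# Objectionable images found and missed at the quoted recall.
detected = round(OBJECTIONABLE_PER_DAY * RECALL)
missed = OBJECTIONABLE_PER_DAY - detected

# Total images flagged implied by the quoted precision, and the
# acceptable images among them (false alarms).
flagged = round(detected / PRECISION)
false_alarms = flagged - detected

# Upper bound on analysis time if all images were analyzed.
compute_hours = IMAGES_PER_DAY * MINUTES_PER_IMAGE / 60

print(f"missed per day:       {missed}")
print(f"false alarms per day: {false_alarms}")
print(f"compute hours needed: {compute_hours:.0f}")
```

Under these assumptions more than half of the objectionable images pass undetected each day, and the analysis time for one day of traffic amounts to 10,000 processor-hours.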
It is an object of the present invention to provide an accurate and computationally efficient method of detecting images that may contain pornographic material.
It is a further object of the present invention to provide an accurate and efficient method of detecting images that contain faces for facial recognition purposes.
Briefly, a preferred embodiment of the present invention includes a method of detecting pornographic images, wherein a color reference database is prepared in Luminance-Chrominance space such as the L*a*b* color space, defining a plurality of colors representing relevant portions of a human body. A questionable image is selected, and sampled pixels are compared with the color reference database. The surrounding areas having a matching pixel within accepted variability are subjected to a texture analysis to determine if the pixel is an isolated color or if other comparable pixels surround it; a condition indicating possible skin. If an area of possible skin is found, the questionable image is classified as objectionable. A further embodiment includes preparation of a questionable image reference shape database defining objectionable shapes. An image with a detected area of possible skin is compared with the shape database, and depending on the results of the shape analysis, a predefined percentage of the images are classified for manual review.
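The color comparison and neighborhood test summarized above might be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the image is a list of rows of 8-bit sRGB triples, the "accepted variability" is modeled as a Euclidean distance in L*a*b*, and the texture analysis is replaced by a simple count of matching neighbors. The function names, the tolerance, the sampling step, the window radius, the threshold and the sample colors are all hypothetical.

```python
import math

# D65 reference white for the sRGB to L*a*b* conversion.
WHITE = (0.95047, 1.0, 1.08883)


def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (D65)."""
    def linear(c):
        c /= 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(c) for c in rgb)
    xyz = (
        0.4124564 * r + 0.3575761 * g + 0.1804375 * b,
        0.2126729 * r + 0.7151522 * g + 0.0721750 * b,
        0.0193339 * r + 0.1191920 * g + 0.9503041 * b,
    )

    def f(t):
        d = 6 / 29
        return t ** (1 / 3) if t > d ** 3 else t / (3 * d * d) + 4 / 29

    fx, fy, fz = (f(v / w) for v, w in zip(xyz, WHITE))
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def matches(rgb, reference_lab, tolerance):
    """True if the pixel lies within `tolerance` of any reference color."""
    lab = srgb_to_lab(rgb)
    return any(math.dist(lab, ref) <= tolerance for ref in reference_lab)


def has_possible_skin(image, reference_lab, tolerance=10.0, step=2,
                      radius=1, min_fraction=0.6):
    """Sample pixels; report True if a matching pixel is not isolated."""
    height, width = len(image), len(image[0])
    for y in range(0, height, step):
        for x in range(0, width, step):
            if not matches(image[y][x], reference_lab, tolerance):
                continue
            # Stand-in for the texture analysis: do comparable pixels
            # surround the matching pixel, or is it an isolated color?
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
                if (ny, nx) != (y, x)
            ]
            similar = sum(matches(p, reference_lab, tolerance)
                          for p in neighbours)
            if neighbours and similar / len(neighbours) >= min_fraction:
                return True
    return False


# Hypothetical one-entry reference database and two 4x4 test images.
SAMPLE_TONE = (224, 172, 105)
BACKGROUND = (0, 0, 255)
reference_lab = [srgb_to_lab(SAMPLE_TONE)]
region = [[SAMPLE_TONE] * 4 for _ in range(4)]
isolated = [[BACKGROUND] * 4 for _ in range(4)]
isolated[0][0] = SAMPLE_TONE

print(has_possible_skin(region, reference_lab))    # True
print(has_possible_skin(isolated, reference_lab))  # False
```

The second image contains a matching pixel, but none of its neighbors match, so it is treated as an isolated color rather than an area of possible skin. A practical reference database would hold many colors, and the shape analysis of the further embodiment is not covered by this sketch.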
An advantage of the present invention is that it provides a more accurate method of detecting pornographic images.
Another advantage of the present invention is that it provides a practical method for detecting, classifying and marking images which are suspected of being indecent.
A further advantage is that, due to its computational efficiency, the invention can screen large volumes of images at speeds close to or equal to real time and immediately block them from being viewed.
A still further advantage of the present invention is that, due to multiple detection criteria, the system can reach a low false negative rate.
Another advantage of the present invention is that, due to its ability to not only add images but also eliminate images, the system has a low rate of false positive detection.
Another advantage of the present invention is that it provides a method offering greater speed of detection of pornographic images.
A still further advantage of the method of the present invention is that it can be implemented on the client side, for example as an extension of a browser application.