(1) Field of the Invention
The present invention relates generally to the field of controlling content distributed over a computer network. It relates specifically to content in the form of computer images and to scanning such images for pornography.
(2) Description of Related Art
Computer networks such as the internet are now used to distribute vast amounts of content. Some of that content is objectionable for a variety of reasons, and consequently technology has been developed to control what content is distributed. Systems which perform content control may be implemented in a range of manners and at a range of locations in a computer network. For example, such a system may be located in a gateway at a node of a network which controls the passage of various types of object, or may be associated with a browser for displaying web pages.
There are many types of objectionable content, but pornographic content is of particular significance, there being in practice vast amounts of pornography distributed over computer networks. In order to control distribution, it is necessary first to scan distributed content to detect the objectionable content. Detection of pornographic content in images poses particular technical difficulties, because it is intrinsically difficult for an automated system to distinguish between images which do and do not contain pornographic content.
Such scanning of images faces competing requirements. One requirement is that the scanning is robust and accurate. There must be good performance in detecting pornographic content, for example a high detection rate and a low false positive rate. However, there are practical limitations which tend to compete with performance. One such limitation is latency. In many situations, such as the scanning of web pages, low latency is desired. Another such limitation, although in some situations of less significance than latency, is the cost of the resources (e.g. memory, processing power) needed to perform the scanning. Such practical limitations tend to rule out complicated analysis techniques which might theoretically provide good performance.
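By way of illustration only, the two performance measures mentioned above may be computed as in the following minimal sketch. The function name and the representation of results as lists of booleans are assumptions made for this example and are not taken from any particular system.

```python
from typing import Sequence, Tuple


def detection_and_false_positive_rates(
    labels: Sequence[bool], predictions: Sequence[bool]
) -> Tuple[float, float]:
    """Return (detection rate, false positive rate) for a set of scanned images.

    labels[i] is True if image i really contains objectionable content;
    predictions[i] is True if the scanner flagged image i.
    (Hypothetical helper, for illustration only.)
    """
    tp = fp = tn = fn = 0
    for actual, predicted in zip(labels, predictions):
        if actual and predicted:
            tp += 1  # objectionable image correctly flagged
        elif actual and not predicted:
            fn += 1  # objectionable image missed
        elif not actual and predicted:
            fp += 1  # innocent image wrongly flagged
        else:
            tn += 1  # innocent image correctly passed

    # Detection rate: fraction of objectionable images that were flagged.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # False positive rate: fraction of innocent images that were flagged.
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate
```

A scanner with good performance in the sense used above would give a first value close to 1 and a second value close to 0.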
One type of possible technique uses pixels of an image which represent a flesh-tone as a heuristic indicating a likelihood that an image contains pornography. This is simply because pornographic images frequently contain relatively large amounts of flesh-tone. With such a technique, a heuristic analysis is typically performed which classifies the image as being pornographic or not, using measures of predetermined characteristics of the identified pixels to indicate a likelihood that the identified pixels contain pornographic content.
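A minimal sketch of a flesh-tone heuristic of the general kind described above is given below, purely for illustration. The RGB thresholds are one commonly quoted rule of thumb for flesh-tone under uniform daylight illumination. They, together with the function names and the 0.3 decision threshold, are assumptions made for this example and do not represent any particular prior system.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def is_flesh_tone(pixel: Pixel) -> bool:
    """Rule-of-thumb RGB test for a flesh-tone pixel (assumed thresholds)."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15  # not a grey pixel
        and abs(r - g) > 15
        and r > g and r > b                   # red is the dominant channel
    )


def flesh_tone_fraction(pixels: Iterable[Pixel]) -> float:
    """Fraction of the pixels which pass the flesh-tone test."""
    total = 0
    flesh = 0
    for pixel in pixels:
        total += 1
        if is_flesh_tone(pixel):
            flesh += 1
    return flesh / total if total else 0.0


def is_flagged(pixels: Iterable[Pixel], threshold: float = 0.3) -> bool:
    """Flag an image when its flesh-tone fraction reaches the threshold.

    The threshold of 0.3 is an arbitrary illustrative value.
    """
    return flesh_tone_fraction(pixels) >= threshold
```

In this sketch the only measured characteristic is the proportion of flesh-tone pixels. A practical heuristic analysis would typically combine several such measures.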
However, the performance of any such heuristic analysis is limited. Such heuristic analysis, by its very nature, is not totally accurate and can incorrectly classify images. For example, considering solely the heuristic of flesh-tone, some images containing pornographic content may contain relatively small amounts of flesh-tone and, conversely, some images not containing pornographic content may contain relatively large amounts of flesh-tone. Both cases tend to lead to misclassification of such images.
The present invention is concerned with techniques which improve the performance of such heuristic analysis.