(1) Field of the Invention
The present invention relates to the field of scanning computer image files for pornographic content for the purpose of controlling the distribution of such image files.
(2) Description of Related Art
Computer networks such as the internet are now used to distribute vast amounts of content. Some of the content is objectionable for a variety of reasons and consequently technology has been developed to control what content is distributed. Systems which perform content control may be implemented in various ways and at various locations in a computer network, for example in a gateway at a node of a network which controls the passage of various types of object, or in association with a browser for displaying web pages.
There are many types of objectionable content, but pornographic content in images is of particular significance, there being in practice vast amounts of pornography distributed over computer networks. In order to control distribution, it is necessary first to scan distributed image files to detect the objectionable content. Detection of pornographic content in images poses particular technical difficulties. It is intrinsically difficult for an automated system to distinguish between images which do and do not contain pornographic content.
Typically, the scanning system analyses the image content of the image file to detect the presence of pornographic image content. A variety of algorithms are used, each striking a different balance between providing good detection performance on the one hand and minimising latency and processing requirements on the other.
One possible type of technique treats pixels of an image which represent a flesh-tone as a heuristic indicator of the likelihood that the image contains pornography. This is simply because pornographic images frequently contain relatively large amounts of flesh-tone. With such a technique, a heuristic analysis is typically performed which classifies the image as pornographic or not, using measures of predetermined characteristics of the identified pixels to indicate the likelihood that those pixels contain pornographic content.
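By way of illustration only, the flesh-tone heuristic described above might be sketched as follows. The RGB thresholds are one commonly cited rule of thumb for skin colour under daylight illumination and are an assumption here, as are the function names, the representation of an image as rows of (R, G, B) tuples, and the single measured characteristic (the fraction of flesh-tone pixels) with its threshold of 0.3. None of these is taken from any particular scanning system.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each component in 0..255
Image = List[List[Pixel]]      # rows of pixels (hypothetical representation)


def is_flesh_tone(pixel: Pixel) -> bool:
    """Assumed rule-of-thumb RGB test for a flesh-tone pixel."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def flesh_fraction(image: Image) -> float:
    """Fraction of pixels in the image identified as flesh-tone."""
    total = sum(len(row) for row in image)
    if total == 0:
        return 0.0
    flesh = sum(1 for row in image for p in row if is_flesh_tone(p))
    return flesh / total


def looks_suspect(image: Image, threshold: float = 0.3) -> bool:
    """Classify using one measured characteristic: the flesh-tone fraction.

    The threshold is an illustrative assumption, not a tuned value.
    """
    return flesh_fraction(image) >= threshold


if __name__ == "__main__":
    sample = [
        [(220, 170, 140), (220, 170, 140)],  # two flesh-tone-like pixels
        [(0, 0, 255), (10, 10, 10)],         # blue and near-black pixels
    ]
    print(flesh_fraction(sample), looks_suspect(sample))
```

Even this simple sketch visits every pixel of the image, which illustrates why such analysis scales with the amount of image data. A practical classifier would typically combine several measured characteristics of the identified pixels rather than rely on the fraction alone.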
However, regardless of the algorithm used, such analysis consumes significant processing resources due to the need to process the image content, which consists of a significant amount of data. This is of particular concern in situations where large numbers of images need to be processed, for example in the scanning of emails or the scanning of web pages during internet browsing. It would be desirable to minimise the processing resources required.
One approach to reducing the processing resources required is by careful selection of the algorithm implemented by the scanning system to analyse the image content. However, in very general terms, algorithms which consume lower amounts of processing resources tend to have lower performance in detecting pornographic content, where good performance means, for example, a high detection rate and a low false positive rate. Thus, to achieve any desired level of performance, significant processing resources are still required.
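For clarity, the two performance measures mentioned above can be computed as in the following minimal sketch, which assumes the usual definitions: the detection rate is the proportion of pornographic images correctly flagged, and the false positive rate is the proportion of non-pornographic images incorrectly flagged. The function name and the list-of-booleans inputs are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def detection_and_false_positive_rate(
    labels: Sequence[bool], predictions: Sequence[bool]
) -> Tuple[float, float]:
    """Return (detection rate, false positive rate).

    labels[i] is True if image i really is pornographic;
    predictions[i] is True if the scanner flagged image i.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have equal length")

    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)          # correctly flagged
    fn = sum(1 for y, p in pairs if y and not p)      # missed
    fp = sum(1 for y, p in pairs if not y and p)      # wrongly flagged
    tn = sum(1 for y, p in pairs if not y and not p)  # correctly passed

    # Guard against division by zero when a class is absent.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


if __name__ == "__main__":
    truth = [True, True, False, False, False]
    flagged = [True, False, True, False, False]
    print(detection_and_false_positive_rate(truth, flagged))
```

Comparing candidate algorithms on both figures, alongside their processing cost, makes the trade-off described above explicit.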