1. Field of the Invention
This invention relates generally to the field of workflow-based image analysis and classification, and more particularly to the classification of images suspected of being pornographic in nature or of being subject to copyright.
2. Description of Prior Art
A variety of methods have been used in an attempt to detect and categorize objectionable images. Pornography-free web sites, such as sites targeting families and children, have been set up to shield children from viewing objectionable material. Although a particular site may be pornography-free and considered acceptable for access by children, it is still possible to reach an objectionable web site by starting from an acceptable site. Software applications and Internet services such as Net-Nanny and Cyber-Sitter were created and marketed to help parents prevent their children from accessing objectionable documents by blocking access to specific web sites.
One type of protective software is designed to store the addresses of objectionable web sites and block access to these sites. Examples of such prior art include U.S. Pat. No. 5,678,041 to Baker and Grosse, U.S. Pat. No. 6,049,821 to Theriault et al., and U.S. Pat. No. 6,065,055 to Hughes and Elswick.
Another form of protective software screens the text information accessed by a computer from the network and blocks information sources that are considered objectionable. Examples of such prior art include U.S. Pat. No. 5,832,212 to Cragun & Day, U.S. Pat. No. 5,835,722 to Bradshaw and Shih, U.S. Pat. No. 5,996,011 to Humes, U.S. Pat. No. 6,065,056 to Bradshaw and Shih, and U.S. Pat. No. 6,266,664 to Russell-Falla & Hanson.
Such methods are prone to error: many words have subtle double meanings that such software can easily misinterpret, and other words commonly used in everyday conversation can easily be taken out of context. Further, although such software does have a role to play in content management, it does not address the fundamental issue of determining the nature of graphical content in large image collections such as Internet photo communities.
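The two weaknesses of text screening noted above can be illustrated with a minimal sketch. The word list and the function name `is_blocked` are hypothetical and are not taken from any of the cited patents; actual products presumably used far larger lists and additional rules.

```python
import re

# Hypothetical word list, for illustration only.
BLOCKED_WORDS = {"breast", "naked"}


def is_blocked(text: str) -> bool:
    """Return True if any listed word appears as a whole word in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in BLOCKED_WORDS for word in words)


# An innocent page is blocked because a listed word has a second meaning.
print(is_blocked("Grilled chicken breast recipe"))

# A page consisting mainly of images passes, because only its text is examined.
print(is_blocked("Photo gallery, page 3"))
```

The first call reports a match on harmless text, while the second shows that a text-only screen says nothing about the images a page actually contains.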
Yet another type of protective software blocks access to all URLs except those that are members of a list of manually approved URLs. Examples of such prior art include U.S. Pat. No. 5,784,564 to Camaisa et al. and U.S. Pat. No. 6,286,001 to Walker & Webb.
These approaches are not highly effective because it is a practical impossibility to manually screen all of the images on all of the web sites that are added to the web each day. They rely either on storing a local database of web site URLs or on referencing such a database on the Internet.
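The limitation shared by both list-based approaches can be sketched as follows. The host names and function names are hypothetical placeholders, and the sketch assumes that filtering is done by host name only, which is a simplification of the cited systems.

```python
from urllib.parse import urlparse

# Hypothetical, manually maintained lists (assumptions for illustration).
BLOCKED_HOSTS = {"blocked.example"}
APPROVED_HOSTS = {"kids.example"}


def allowed_by_block_list(url: str) -> bool:
    """Allow every URL whose host is not on the list of objectionable sites."""
    return urlparse(url).hostname not in BLOCKED_HOSTS


def allowed_by_approved_list(url: str) -> bool:
    """Allow only URLs whose host is on the list of manually approved sites."""
    return urlparse(url).hostname in APPROVED_HOSTS


# A site created today appears on neither list.
new_url = "http://new-site.example/photo.jpg"
print(allowed_by_block_list(new_url))     # passes, whatever its content
print(allowed_by_approved_list(new_url))  # refused, whatever its content
```

In both cases the decision for an unlisted site is made without regard to its content, so accuracy depends entirely on how quickly the lists can be updated by hand.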
Another approach, such as that described in U.S. Pat. No. 5,668,897 to Stoflo (Sep. 16, 1997), categorizes images based on a unique image signature and stores them in a database for later retrieval and comparison. Such solutions are limited to a known collection of images, which will always be a subset of all images created.
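A minimal sketch of signature-based lookup shows why such a database can only recognize images it has already seen. A SHA-256 digest of the image bytes is used here purely as a stand-in signature; this is an assumption for illustration and is not the signature scheme of the cited patent. The byte strings and names are likewise hypothetical.

```python
import hashlib
from typing import Optional


def signature(image_bytes: bytes) -> str:
    """Stand-in image signature: a SHA-256 digest of the raw bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


# Database of previously categorized images, keyed by signature.
known_images = {signature(b"previously-reviewed-image"): "objectionable"}


def lookup(image_bytes: bytes) -> Optional[str]:
    """Return the stored category, or None if the image was never catalogued."""
    return known_images.get(signature(image_bytes))


print(lookup(b"previously-reviewed-image"))  # found in the database
print(lookup(b"newly-created-image"))        # unknown, so no category is available
```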
Various image-processing algorithms have been investigated for use in detecting objectionable media. For example, algorithms have been tested for use in recognizing shapes, such as people in general, and specific body parts. A detailed summary of work done with such algorithms is found in David A. Forsyth and Margaret Fleck, Finding Naked People, Journal Reviewing, 1996; Margaret Fleck, David A. Forsyth, Chris Bregler, Finding Naked People, Proceedings of 4th European Conference on Computer Vision, 1996; and David A. Forsyth et al., Finding Pictures of Objects in Large Collections of Images, Proceedings, International Workshop on Object Recognition, Cambridge, 1996. However, all of the above describe individual approaches to analyzing single images using single criteria. None of these publications provides an algorithm or system that approaches the robustness required for practical use.
Several patents have been granted in this field. U.S. Pat. No. 6,148,092 to Qian et al. (Nov. 14, 2000) describes a method of detecting skin tone, and in particular of detecting faces, using a luminance-chrominance algorithm, which is limited to well-defined and full bodies. U.S. Pat. No. 5,638,136 to Kojima et al. (Jun. 7, 1997) describes yet another method of detecting flesh tone, and again, this method is limited to well-defined chrominance information.
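The general idea of chrominance-based skin-tone detection, and its dependence on well-defined chrominance, can be sketched as below. This is not the algorithm of either cited patent. The conversion uses the standard JFIF (ITU-R BT.601, full range) RGB to YCbCr equations, and the Cb and Cr ranges are illustrative values commonly quoted in the skin-detection literature, adopted here as an assumption.

```python
def rgb_to_cbcr(r: int, g: int, b: int) -> tuple[float, float]:
    """Convert 8-bit RGB to the Cb and Cr chrominance components (JFIF)."""
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


# Illustrative chrominance ranges (assumed, not taken from the cited patents).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)


def looks_like_skin(r: int, g: int, b: int) -> bool:
    """Classify one pixel as skin-toned using chrominance alone."""
    cb, cr = rgb_to_cbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


print(looks_like_skin(224, 172, 138))  # a typical light skin tone
print(looks_like_skin(0, 0, 255))      # saturated blue
print(looks_like_skin(210, 180, 140))  # tan, as found in sand or wood
```

Because luminance is discarded, a tan pixel from sand or wood falls inside the same chrominance ranges as skin, which is one reason a per-pixel color test cannot by itself determine the nature of an image.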
In unrelated fields, Japanese patent 09237348A to Hiroshi et al. (Sep. 9, 1997) describes a method of determining the posture of a body. The method of Hiroshi et al. is of limited usefulness because it is, again, dependent on color segmentation of an image. U.S. Pat. No. 6,182,081 to Dietl et al. describes a method for performing an interactive review of the data contents of a computer with a view to the manual screening of objectionable material contained thereon. However, this method is limited to screening text data against a list of objectionable words and collecting all image data in thumbnail form for manual review. Thus it is not suitable for application to very large collections of images.
In order for an algorithm to be useful for screening objectionable images, it is necessary for the algorithm to achieve a very high ratio of the number of objectionable images correctly identified to the total number of objectionable images in a database. Unfortunately, no algorithm can determine with full accuracy whether an image is of a pornographic nature or is simply an artistic nude, an erotic image, or an image with a large amount of skin tone but not of any offensive nature. PCT application of a U.S. application, WO00/67204, to Papazian et al. describes the advantage of using a selection of multiple images to increase the overall likelihood of detection, using the fact that the likelihood of detection is distributed in a Gaussian fashion and that its variance is reduced as a function of the number of samples. However, Papazian et al. do not utilize the cross information that can be obtained from a collection of images, but merely obtain a statistical improvement.
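The purely statistical improvement referred to above can be illustrated with the standard result for the mean of independent samples. The sketch assumes that the per-image detection scores are independent and identically distributed with standard deviation sigma; this assumption, the function name, and the example value of sigma are illustrative and are not taken from the cited application.

```python
import math


def std_of_mean_score(sigma: float, n: int) -> float:
    """Standard deviation of the mean of n independent scores, each with std sigma."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / math.sqrt(n)


# Hypothetical per-image score spread of 0.2.
for n in (1, 4, 16, 64):
    print(n, std_of_mean_score(0.2, n))
```

Under these assumptions the spread of the averaged score shrinks in proportion to the square root of the number of images, so quadrupling the sample only halves the uncertainty. The averaging treats every image as an isolated sample and makes no use of relationships among the images in the collection.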
Similarly, in the field of copyright detection, the research work and patents all relate to different methods and techniques of watermarking images and then detecting the watermarked images. Such techniques are described in EP 1126408 to Wen et al. (Aug. 22, 2001), which describes a method of detecting embedded information in images. U.S. Pat. No. 6,259,801 B1 to Wakasu (Jul. 10, 2001) describes watermarking and detecting of watermarked images using DCT methods. U.S. patent publication US 2001/0002931 A1 to Maes describes means of detecting images that were marked using geometrical shapes. The drawback of such approaches is that individual detection of watermarked images does not easily or practically lend itself to any form of automatic or workflow solution.