1. Field of the Invention
The invention disclosed and claimed herein generally pertains to a method and apparatus for assisting users in rating objects of multimedia content, such as images, videos and audio recordings, for objectionable content or subject matter. More particularly, the invention pertains to a method of the above type wherein discrete or individual content items are respectively scored or rated, in order to determine the rating that they should each be given in a rating scheme or structure. Even more particularly, the invention pertains to a method of the above type wherein a specified multimedia object, comprising a number of discrete content items, is moved through a succession of filtering stages, and different semantic procedures are used at different stages to rate respective content items.
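By way of illustration only, the movement of discrete content items through a succession of filtering stages might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the `ContentItem` type, the tag vocabulary, the two stage functions, the rating labels and the `needs_review` default are all hypothetical names introduced for this example. Each stage either assigns a rating to an item or passes the item on to the next stage.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ContentItem:
    """One discrete content item of a multimedia object (hypothetical type)."""
    item_id: str
    tags: set                     # assumed semantic labels, e.g. {"violence"}
    rating: Optional[str] = None  # filled in by whichever stage resolves the item


# A stage returns a rating, or None to pass the item on to the next stage.
Stage = Callable[[ContentItem], Optional[str]]


def stage_known_safe(item: ContentItem) -> Optional[str]:
    """Hypothetical first stage: resolve items whose tags are all known to be safe."""
    safe_tags = {"landscape", "music"}
    if item.tags and item.tags <= safe_tags:
        return "unrestricted"
    return None


def stage_objectionable(item: ContentItem) -> Optional[str]:
    """Hypothetical second stage: resolve items carrying an objectionable tag."""
    if item.tags & {"violence", "gore"}:
        return "restricted"
    return None


def rate_object(items: List[ContentItem], stages: List[Stage],
                default: str = "needs_review") -> dict:
    """Move items through the stages in order; unresolved items get the default."""
    pending = list(items)
    for stage in stages:
        remaining = []
        for item in pending:
            rating = stage(item)
            if rating is None:
                remaining.append(item)   # not resolved here, goes to the next stage
            else:
                item.rating = rating     # resolved, leaves the pipeline
        pending = remaining
    for item in pending:
        item.rating = default
    return {item.item_id: item.rating for item in items}


if __name__ == "__main__":
    video = [
        ContentItem("seg-1", {"landscape"}),
        ContentItem("seg-2", {"violence", "crowd"}),
        ContentItem("seg-3", {"crowd"}),
    ]
    print(rate_object(video, [stage_known_safe, stage_objectionable]))
```

In this sketch, only the items that a stage cannot resolve are handed to the next stage, so each later stage sees a smaller set of items than the one before it.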
2. Description of the Related Art
Rich media, such as text, audio, images and video, are used to freely communicate messages in computer-based communications. As increasing numbers of people across age groups and with diverse cultural backgrounds access on-line digital media objects, there is a growing need to filter sensitive content. For example, parents need tools for managing their children's access to potentially harmful videos, in an environment where what is “harmful” varies across cultures, but content is available across geographical and cultural boundaries.
Ratings are presently used by the entertainment industry to provide a recommendation system for classifying video content, such as films, television programs, games and the like. However, this approach to ratings is generally manual, time-consuming and inflexible. As TV broadcasting moves toward the Internet Protocol Television (IPTV) model, the boundaries between web content and television content, as well as the boundaries between content created by industry and content created by users, will steadily diminish and ultimately disappear. Moreover, geographical boundaries in content creation and consumption will likewise disappear. That is, videos will be acquired, edited, uploaded and viewed not only locally, but on a global basis as well.
Currently used rating systems are not very adaptable to these anticipated changes. Current technologies protect against access to objectionable websites by using text-based filters and various recommendation systems, and professional video creators have generally been responsible for providing the content descriptors that are the basis of the ratings. However, these systems are limited in completeness, in that the manual descriptor-rating schemes remain incomplete and are frequently not enforced. Such systems are also of limited efficiency. Where very large amounts of data are involved (e.g., all videos on YouTube.com), it is not possible to have reliable ratings in an arrangement wherein both the content descriptors and the ratings are provided manually. In addition, the prior art systems are of limited accuracy, since both the description and the rating are provided for the video as a whole. As a result, it is not possible to guarantee that the ratings are accurate for all segments of the video. Some sensitive content may appear only in the middle of a video clip, and there is no auditing mechanism to check the completeness and accuracy of the descriptors. Finally, it would be desirable for a rating system to be flexible enough to accommodate different international standards, and to adjust to the backgrounds and preferences of video consumers on a global basis. Presently available systems do not provide this flexibility. Moreover, currently employed approaches that depend on human processing do not scale.
While automatic solutions are currently being proposed as alternatives to manual processing, these solutions follow one of two main approaches. These are (1) duplicate detection and removal, exemplified by U.S. Pat. No. 6,381,601, and (2) low-level image analysis operations such as detecting skin color pixels, as exemplified by U.S. Pat. Nos. 6,895,111 and 7,027,645. However, there are a number of drawbacks to these proposed automatic systems: (1) Skin detection, and image filtering based on such operations, are computationally intensive, and are also error prone, with limited accuracy. Moreover, skin detection is best suited for detecting nudity, and does not address other types of sensitive or objectionable content, such as violence, gore or hate. (2) Removing duplicates by matching to known content requires developing and maintaining large databases. It is also impossible to rate new content using a comparison approach, since the system will not contain prior content that matches the new content. (3) Ratings of suitability tend to be based on a very limited assessment of objectionability, which is not related to the semantics of the content. (4) Any rating and filtering schemes that rely on human reviewers are labor intensive, do not scale, and offer a fixed and relatively small number of categories. As an example, the well known rating system of the Motion Picture Association of America (MPAA) is limited to ratings such as G, PG, PG-13 and R.