As is well known in the art, the Internet, often called the “sea of information”, provides so many types of information and so much convenience that it has become part of many people's daily lives in modern society. Apart from the positive effects of the Internet in social, economic, and academic respects, the indiscriminate distribution of harmful information, which exploits the openness, interconnectivity, and anonymity of the Internet, has emerged as a serious social problem.
In particular, minors are exposed to harmful information more frequently than before through the Internet, which can be accessed at any time. Such an environment misleads minors, whose value judgment is still immature and whose self-control is weak, and harms them emotionally and mentally. Thus, a method for blocking harmful information is required in order to prevent minors, and others who do not wish to view it, from being exposed to such harmful information.
Conventional methods for determining a harmful video include a metadata and text information-based blocking technique, a hash and database-based blocking technique, a content-based blocking technique and the like.
The metadata and text information-based blocking technique determines harmfulness of a video by analyzing harmfulness of text included in a title, a file name and descriptions of multimedia. This method has a high over-blocking rate and a high erroneous blocking rate.
The hash and database-based blocking technique calculates hash values of existing harmful videos and stores them in a database, then calculates the hash value of a new input video and compares it with those in the previously established database, thereby determining harmfulness of the input video. This approach is problematic in that the size of the hash value database increases in proportion to the number of harmful videos, and the computational load necessary for the determination of harmfulness increases in proportion to the length of a video. Also, it is difficult to detect even a known harmful video when its hash value has been changed by a slight modification.
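The weakness described above can be illustrated with a minimal sketch. It assumes SHA-256 over the full byte content as the hash and an in-memory set as the database; the byte strings are hypothetical stand-ins for video files, and none of these choices is taken from any particular conventional system.

```python
import hashlib


def file_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the full byte content of a video."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the bytes of a known harmful video.
known_harmful = b"\x00\x01\x02" * 1000

# The "database": one stored digest per known harmful video, so it grows
# linearly with the number of harmful videos.
blocklist = {file_digest(known_harmful)}


def is_blocked(data: bytes) -> bool:
    """Hashing reads every byte, so the cost grows with the video length."""
    return file_digest(data) in blocklist


# A slight modification: only the final byte differs from the known video.
modified = known_harmful[:-1] + b"\xff"

exact_match = is_blocked(known_harmful)   # the unmodified copy is found
modified_match = is_blocked(modified)     # the modified copy is missed
```

Because a cryptographic hash changes completely when a single byte changes, the modified copy is not found in the database even though its content is almost identical to the known video.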
The recently introduced content-based determination technique analyzes the content of harmful videos to extract features, generates a harmfulness classification model from the features, and determines harmfulness of an input video based on the previously generated harmfulness classification model. This method can resolve the high over-blocking rate and high erroneous blocking rate of the metadata and text information-based blocking technique, and can also resolve the problems of database size and computational load exhibited by the hash and database-based blocking technique.
However, most content-based determination techniques take one of two approaches. In the first, the harmfulness of the segments (frames, scenes, shots, clips or the like) constituting a harmful video is analyzed, and the resulting values are learned again to generate a model that serves as a reference for later determination. In the second, a video is determined to be harmful when the occurrence frequency of harmful elements in the video is greater than a threshold value. These approaches appear to be more accurate than other existing methods, but since the underlying video segment determination algorithm cannot produce absolutely reliable results, the problem remains that the entire video may be over-blocked or erroneously blocked.
As an example, it is assumed that a video segment determination algorithm determines that video A composed of 100 segments includes 28 harmful segments, as shown in FIG. 1, and video B composed of 100 different segments includes 20 harmful segments, as shown in FIG. 2.
FIG. 1 is a graph exemplarily showing the results of determination on harmful segments of video A, and FIG. 2 is a graph exemplarily showing the results of determination on harmful segments of video B; both are provided to aid understanding of the conventional methods.
According to the conventional methods, video A will be determined to be more harmful than video B because it has more harmful segments. However, actual experiments and video analysis show that this conclusion is not always correct, owing to errors of the video segment determination algorithm. In particular, the lower the accuracy of the video segment determination algorithm, the higher the error rate of a determination based simply on the occurrence frequencies of harmful segments.
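The frequency-only determination can be sketched as follows. The segment counts (28 of 100 for video A, 20 of 100 for video B) come from the example above; the positions of the harmful segments, the helper names, and the threshold value of 0.25 are assumptions made purely for illustration, since the figures are not reproduced here.

```python
def make_flags(total, harmful_indices):
    """Build a per-segment list of booleans (True = judged harmful)."""
    harmful = set(harmful_indices)
    return [i in harmful for i in range(total)]


def harmful_ratio(flags):
    """Occurrence frequency: fraction of segments judged harmful."""
    return sum(flags) / len(flags)


def is_harmful_by_frequency(flags, threshold):
    """Frequency-only rule: harmful if the ratio exceeds the threshold."""
    return harmful_ratio(flags) > threshold


# Hypothetical layouts. Video A: 28 harmful segments, mostly scattered.
video_a = make_flags(100, list(range(0, 70, 3)) + list(range(80, 84)))
# Video B: 20 harmful segments, grouped into two blocks.
video_b = make_flags(100, list(range(10, 22)) + list(range(60, 68)))

THRESHOLD = 0.25  # hypothetical value, not taken from the source

count_a = sum(video_a)  # 28
count_b = sum(video_b)  # 20
verdict_a = is_harmful_by_frequency(video_a, THRESHOLD)
verdict_b = is_harmful_by_frequency(video_b, THRESHOLD)
```

Under this rule only the counts matter, so video A is always ranked above video B, however its harmful segments are distributed and however many of them are misclassifications.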
However, if information on the occurrence continuity of harmful segments in the video segment determination results is considered, for example, continuity information 302 of video A and continuity information 402 and 404 of video B, as shown in FIGS. 3 and 4, respectively, it may be determined that video B is more harmful than video A. This analysis can be considered accurate in spite of the errors of the video segment determination algorithm, given that most harmful scenes (exposure, masturbation, sex, and the like) appear continuously in actual harmful videos. Here, FIG. 3 is a graph exemplarily showing the determination results on continuity information of harmful segments of video A, and FIG. 4 is a graph exemplarily showing the determination results on continuity information of harmful segments of video B.
Therefore, in order to lower the over-blocking rate and erroneous blocking rate of a method for determining a harmful video based on an error-prone video segment determination algorithm, it is necessary to determine the harmfulness of the entire video by utilizing information such as the occurrence frequencies and occurrence continuities of harmful segments obtained from the video segment determination results.
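One way to extract continuity information from per-segment results is to collect runs of consecutive harmful segments, as in the sketch below. The segment layouts, the run lengths, and the minimum run length of 2 are hypothetical; they are chosen only so that video A yields one run and video B yields two, mirroring continuity information 302 and continuity information 402 and 404.

```python
from itertools import groupby


def make_flags(total, harmful_indices):
    """Build a per-segment list of booleans (True = judged harmful)."""
    harmful = set(harmful_indices)
    return [i in harmful for i in range(total)]


def harmful_runs(flags, min_length=2):
    """Return (start_index, length) for each run of consecutive harmful
    segments that is at least min_length segments long."""
    runs = []
    index = 0
    for value, group in groupby(flags):
        length = len(list(group))
        if value and length >= min_length:
            runs.append((index, length))
        index += length
    return runs


# Hypothetical layouts with the segment counts of the example (28 and 20).
video_a = make_flags(100, list(range(0, 70, 3)) + list(range(80, 84)))
video_b = make_flags(100, list(range(10, 22)) + list(range(60, 68)))

runs_a = harmful_runs(video_a)  # one short run; the rest are isolated hits
runs_b = harmful_runs(video_b)  # two long runs

longest_a = max(length for _, length in runs_a)
longest_b = max(length for _, length in runs_b)
```

In this illustration video A has more harmful segments overall, yet only 4 of them are contiguous, whereas all 20 harmful segments of video B fall into runs of 12 and 8. Isolated hits are more plausibly classifier errors, which is why the continuity view ranks video B above video A.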