In recent years, a large volume of information of mixed quality has circulated through Web pages and electronic bulletin boards on the Internet. As a result, it has become difficult to determine which information on the Internet can be trusted and which cannot.
Transmission information on a proposition, such as “Fermented soybeans are effective for dieting.”, “Green tea prevents cancer.”, “Pluto is a planet.”, or “Tamiflu has a side effect”, can be obtained through the Internet. Many affirmative and negative opinions about such transmission information are posted on Web pages and electronic bulletin boards on the Internet. It is difficult to determine the confidence of the transmission information on those propositions, for example whether or not the transmission information is correct, just by referring to a part of those opinions.
To solve the problem mentioned above, a system has been proposed in which opinion information such as reputation information (information indicating whether transmission information on a proposition is true or not) is gathered from the Web and classified into affirmative opinions and negative opinions, and the confidence of the transmission information is evaluated based on the number of affirmative or negative opinions, an attribute of an information originator, and the like.
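The evaluation outlined above can be illustrated with a minimal sketch. It assumes that each gathered opinion has already been classified as affirmative or negative and that the attribute of the information originator has been reduced to a single non-negative weight. The names `Opinion`, `originator_weight`, and `confidence`, and the weighted-share formula itself, are hypothetical choices made for illustration and are not taken from any cited system.

```python
from dataclasses import dataclass


@dataclass
class Opinion:
    affirmative: bool         # True if the opinion supports the proposition
    originator_weight: float  # hypothetical weight derived from an originator attribute


def confidence(opinions):
    """Return the weighted share of affirmative opinions, in [0, 1].

    Returns 0.5 (no evidence either way) when the total weight is zero.
    This formula is an assumption made for illustration only.
    """
    total = sum(o.originator_weight for o in opinions)
    if total == 0:
        return 0.5
    support = sum(o.originator_weight for o in opinions if o.affirmative)
    return support / total


if __name__ == "__main__":
    sample = [Opinion(True, 2.0), Opinion(False, 1.0), Opinion(True, 1.0)]
    print(confidence(sample))
```

With equal weights the score reduces to the plain fraction of affirmative opinions, which corresponds to evaluating confidence from the number of opinions alone.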
An example of an information analysis system of the related art is described in Patent Document 1. In the information analysis system described in Patent Document 1, a personal opinion is extracted from Web pages, bulletin boards, or the like on the Internet. Then, the confidence measure of the extracted opinion is determined based on a referenced-degree ranking and on the grounds for the opinion or information indicating the identity of the speaker.
Another example of the information analysis system of the related art is described in Non-Patent Document 1. In the information analysis system described in Non-Patent Document 1, the familiarity level (professional level) of an information originator is estimated based on the occurrence frequency of a keyword in an evaluator's blog. Then, using the familiarity level of the information originator and an affirmation or negation measure, the confidence measure of information is calculated.
Patent Document 1: International Publication No. WO2003/046764 Pamphlet
Non-Patent Document 1: Shinsuke Nakajima et al., “Technology for Reliability Improvement of Web Information Retrieval Based on Blog Analysis”, the Japanese Society for Artificial Intelligence, Sixth Semantic Web and Ontology Study Group, SIG-SW0-A401-05, July 2004
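As a rough illustration of the kind of calculation attributed to Non-Patent Document 1 above, the following sketch estimates a familiarity level from keyword frequency and combines it with an affirmation or negation measure. The function names, the per-1000-words normalization, the polarity range of -1 to 1, and the weighted-mean combination are all assumptions made for illustration. The actual formulas of Non-Patent Document 1 are not reproduced here.

```python
import re
from collections import Counter


def familiarity(posts, keyword):
    """Occurrences of `keyword` per 1000 words across a list of blog posts.

    The normalization is a hypothetical choice, not the cited method.
    """
    words = [w for p in posts for w in re.findall(r"[a-z0-9]+", p.lower())]
    if not words:
        return 0.0
    return 1000.0 * Counter(words)[keyword.lower()] / len(words)


def information_confidence(entries):
    """Familiarity-weighted mean polarity.

    `entries` is a list of (familiarity, polarity) pairs, where polarity is
    assumed to lie in [-1, 1] (negative = negation, positive = affirmation).
    Returns 0.0 when the total familiarity is zero.
    """
    total = sum(f for f, _ in entries)
    if total == 0:
        return 0.0
    return sum(f * p for f, p in entries) / total


if __name__ == "__main__":
    level = familiarity(["green tea is good", "tea tea"], "tea")
    print(level)
    print(information_confidence([(level, 1.0), (0.0, -1.0)]))
```

Under these assumptions, an originator who rarely mentions the keyword contributes little to the result, regardless of how strongly the opinion affirms or negates the proposition.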