Sentiment analysis typically applies natural language processing and/or text analytics to determine the overall tone of a body of text. In particular, sentiment analysis can be used to gauge the attitude of the speaker or writer of the text. In aspects, the sentiment analysis can classify the polarity of a given text, that is, whether the text is positive, negative, or neutral. The accuracy of a sentiment analysis can be measured by how well its results agree with human judgments.
Sentiment analysis on social media from sources such as weblogs, websites, social networking sites, bulletin boards, content aggregators, and other outlets can prove difficult because the informal writing prevalent in social media data results in low accuracy. Current sentiment analysis tools used on social media employ a lexicon-based approach instead of a machine learning approach because the machine learning approach requires human-labeled training data, and obtaining enough of it for large-scale and diverse social opinion data is challenging. In the lexicon-based approach, a sentiment dictionary is used to determine opinion polarity; the dictionary can also provide useful features for a supervised learning method of the machine learning approach. However, existing sentiment dictionaries do not cover the numerous informal and spoken words used in social media, which can result in low recall. In addition, the existing sentiment dictionaries are not updated frequently enough to include newly generated words.
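By way of illustration only, the lexicon-based approach and its recall problem can be sketched in a few lines of Python. The dictionary contents, the scores, and the names `SENTIMENT_DICT`, `tokenize`, and `classify_polarity` are hypothetical and are not taken from any existing tool; the sketch simply sums dictionary scores and reports what fraction of the words the dictionary covered.

```python
import re

# Hypothetical toy sentiment dictionary: word -> polarity score.
# Real dictionaries are far larger; these entries are assumptions.
SENTIMENT_DICT = {
    "good": 1.0,
    "great": 1.0,
    "happy": 0.8,
    "bad": -1.0,
    "terrible": -1.0,
    "sad": -0.8,
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def classify_polarity(text, dictionary=SENTIMENT_DICT):
    """Return (label, score, coverage) for a text.

    score    : sum of the dictionary scores of the matched words
    coverage : fraction of tokens that were found in the dictionary
    """
    tokens = tokenize(text)
    matched = [t for t in tokens if t in dictionary]
    score = sum(dictionary[t] for t in matched)
    if score > 0:
        label = "positive"
    elif score < 0:
        label = "negative"
    else:
        label = "neutral"
    coverage = len(matched) / len(tokens) if tokens else 0.0
    return label, score, coverage


formal = classify_polarity("The service was great and the staff were happy")
informal = classify_polarity("this phone is gr8 lol, totally luv it")
print(formal[0], informal[0])
```

In this sketch the formally written sentence is labeled positive, while the informal one falls back to neutral because none of its words ("gr8", "luv") appear in the dictionary, which is the coverage gap described above.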
Therefore, it may be desirable to have systems and methods for automatic sentiment dictionary generation. In particular, it may be desirable to have systems and methods using adjective seed words, thesauruses, and conjunction relationships to build sentiment dictionaries and establish polarity scores for words in the sentiment dictionaries.
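As a rough sketch of the general idea, and not of any particular claimed method, polarity could be propagated outward from a few adjective seed words through thesaurus links and conjunction relationships. The function name `expand_dictionary`, the input formats, and the propagation rules are all assumptions made for illustration: synonyms and words joined by "and" are assumed to share polarity, and words joined by "but" are assumed to have opposite polarity.

```python
from collections import deque


def expand_dictionary(seeds, synonyms, conjunctions):
    """Propagate polarity from seed words to related words.

    seeds        : dict mapping word -> +1 (positive) or -1 (negative)
    synonyms     : list of (word_a, word_b) pairs from a thesaurus
    conjunctions : list of (word_a, conjunction, word_b) triples
                   observed in text, e.g. ("nice", "and", "cheap")
    """
    # Build an undirected graph whose edges carry a sign:
    # +1 keeps the polarity, -1 flips it (assumed rule for "but").
    edges = {}

    def add_edge(a, b, sign):
        edges.setdefault(a, []).append((b, sign))
        edges.setdefault(b, []).append((a, sign))

    for a, b in synonyms:
        add_edge(a, b, +1)
    for a, conj, b in conjunctions:
        add_edge(a, b, -1 if conj == "but" else +1)

    # Breadth-first traversal from the seeds; the first label
    # to reach a word is the one that is kept.
    polarity = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for neighbor, sign in edges.get(word, []):
            if neighbor not in polarity:
                polarity[neighbor] = polarity[word] * sign
                queue.append(neighbor)
    return polarity


result = expand_dictionary(
    seeds={"good": 1, "bad": -1},
    synonyms=[("good", "awesome"), ("bad", "awful")],
    conjunctions=[("awesome", "and", "dope"), ("dope", "but", "pricey")],
)
print(result)
```

Here the informal word "dope" inherits a positive label through "awesome", and "pricey" receives a negative label through the "but" relationship, even though neither word was in the seed set. A fuller system would likely assign graded polarity scores and resolve conflicting evidence rather than keep the first label found.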