1. Field of the Invention
The present invention relates to a method of extracting an experience-revealing sentence from a blog document and a method of classifying activity verbs and state verbs in sentences recorded in a blog document, and more particularly, to a method of classifying sentences of blog text into experience sentences and non-experience sentences using grammatical features such as tense, mood, aspect, modality, experiencer, and verb classes.
2. Discussion of Related Art
Web documents contain various kinds of information such as facts, opinions, and experiences. In particular, experiences play an important role in making decisions or solving problems. Blogs, a kind of web document, contain abundant user experiences, unlike other web documents such as news articles and homepages.
In the field of information extraction, there are methods of mining user experiences from blogs. These methods extract attributes such as who, where, when, what, and why from a blog document, and then structure and store each experience using natural language processing and machine learning technology.
However, a conventional information extraction method has the following problem. For example, when the sentence “Probably, she will laugh and dance in his funeral” appears in a blog document, the structured experience “She, Funeral, Laugh and dance” is extracted. In this way, a hypothetical event that has not actually happened is extracted as an experience. This is because all text in the blog document is assumed to describe experiences.
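The problem described above can be illustrated with a minimal sketch. It is not the method of the present invention, which relies on grammatical features such as tense, mood, aspect, and modality. A small hand-written list of cue words stands in for such features, and the names HYPOTHETICAL_CUES, naive_is_experience, and cue_based_is_experience are hypothetical, introduced only for illustration.

```python
import re

# Assumption: a tiny, hand-picked list of words that often mark an event
# as hypothetical or not yet realized. A real system would use richer
# grammatical analysis instead of keyword matching.
HYPOTHETICAL_CUES = {
    "probably", "maybe", "perhaps", "will", "would", "might", "could", "if",
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def naive_is_experience(sentence):
    """Conventional assumption: every blog sentence describes an experience."""
    return True


def cue_based_is_experience(sentence):
    """Toy filter: reject a sentence that contains any hypothetical cue word."""
    return not (set(tokenize(sentence)) & HYPOTHETICAL_CUES)


sentences = [
    "Probably, she will laugh and dance in his funeral",
    "I visited the museum yesterday",
]

for s in sentences:
    print(s, "|", naive_is_experience(s), "|", cue_based_is_experience(s))
```

Under the conventional assumption both sentences are accepted as experiences, whereas the toy filter rejects the first sentence because it contains "probably" and "will". Keyword matching of this kind is brittle (for example, "will" may also be a noun), which is one reason a classification based on grammatical features is desirable.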