Specialized documents, such as research reports, have a structure and characteristics that make it difficult to extract a meaningful sentiment or position signal using natural language processing. The vocabulary of a research report may be subtle and may not match typical sentiment keywords. For example, the term “buyback” may carry a sentiment signal even though the word itself is not a conventional sentiment keyword. This differs from typical sentiment extraction performed on social media content, which tends to contain obvious sentiment keywords. Standard sentiment libraries are more appropriate for social media because they include basic sentiment expressions such as love, hate, dislike, despise, and adore. Such expressions are typically not found in research reports.
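The gap between general-purpose sentiment keywords and research-report vocabulary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two lexicons, their weights, the example sentences, and the `score` helper are hypothetical and do not correspond to any particular sentiment library.

```python
import re

# Hypothetical general-purpose lexicon of the kind suited to social media.
# The words come from the examples above; the weights are assumed.
GENERIC_LEXICON = {
    "love": 1.0,
    "adore": 1.0,
    "hate": -1.0,
    "dislike": -0.5,
    "despise": -1.0,
}

# Hypothetical domain lexicon for research reports. The weight given to
# "buyback" is an assumption, not a measured value.
DOMAIN_LEXICON = {"buyback": 0.5}


def score(text, lexicon):
    """Sum the lexicon weights of every lowercase word token in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


social_post = "I love this phone and adore the camera"
report_sentence = "The board approved a buyback funded from free cash flow"

# The generic lexicon picks up the social media post ...
generic_social = score(social_post, GENERIC_LEXICON)
# ... but finds no signal at all in the research-report sentence.
generic_report = score(report_sentence, GENERIC_LEXICON)
# Adding the domain term exposes the signal that was missed.
domain_report = score(report_sentence, {**GENERIC_LEXICON, **DOMAIN_LEXICON})
```

Under these assumed lexicons, the report sentence scores zero against the generic keywords and only registers once the domain term is added.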
Furthermore, in many research reports, vocabulary and sentiment signals may be analyst-, geography-, or segment-specific.
These and other deficiencies exist.