This invention relates to the application of Natural Language Processing (NLP) to the detection of deception in written texts.
The critical assumption of all deception detection methods is that people who deceive undergo measurable changes—either physiological or behavioral. Language-based deception detection methods focus on behavioral factors. They have typically been investigated by research psychologists and law enforcement professionals working in an area described as “statement analysis” or “forensic statement analysis”. The development of statement analysis techniques has taken place with little or no input from established language and speech technology communities.
The goal of these efforts has been twofold. Research projects, primarily conducted by experimental psychologists and management information systems groups, investigate the performance of human subjects in detecting deception in spoken and written accounts of a made-up incident. Commercial and government (law enforcement) efforts aim to provide a technique that can be used to evaluate written and spoken statements by people suspected of involvement in a crime. In both cases, investigators look at a mix of factors, e.g., factual content, emotional state of the subject, pronoun use, extent of descriptive detail, and coherence. Only some of these are linguistic. To date, the linguistic analysis in these approaches depends on overly simple language description and lacks sufficient formal detail to be automated: application of the proposed techniques depends largely on human judgment as to whether a particular linguistic feature is present. Moreover, none of the proposed approaches bases its claims on examination of large text or speech corpora.
Two tests for measuring physiological changes are commercially available: polygraphs and computer voice stress analysis. Polygraph technology is the best established and most widely used. In most cases, the polygraph is used to measure hand sweating, blood pressure and respiratory rate in response to Yes/No questions posed by a polygraph expert. The technology is not appropriate for freely generated speech. Fluctuations in response are associated with emotional discomfort that may be caused by telling a lie. Polygraph testing is widely used in national security and law enforcement agencies but is barred from many applications in the United States, including court evidence and pre-employment screening. Computer voice stress analysis (CVSA) measures fundamental frequency (F0) and amplitude values. It does not rely on Yes/No questions and can be used for the analysis of any utterance. The technology has been commercialized and several PC-based products are available. Two of the better known CVSA devices are the Diogenes Group's “Lantern” system and the Trustech “Vericator”. Some law enforcement agencies have adopted CVSA devices as a technology that is less costly than polygraphs and has fewer detractors. Nonetheless, these devices do not seem to perform as well as polygraphs. The article “Investigation and Evaluation of Voice Stress Analysis Technology” (D. Haddad, S. Walter, R. Ratley and M. Smith, National Institute of Justice Final Report, Doc. #193832 (2002)) provides an evaluation of the two CVSA systems described above. The study cautions that even a slight degradation in recording quality can affect performance adversely. The experimental evidence presented indicates that the two CVSA products can successfully detect and measure stress, but it is unclear whether the stress is related to deception. Hence their reliability for deception detection is still unproven.
Current commercial systems for detection of deceptive language require an individual to undergo extensive specialized training. They also require special audio equipment, and their application is labor-intensive. Automated systems that can identify and interpret deception cues are not commercially available.