Detecting Deception Through Linguistic Analysis

Judee K. Burgoon, J. P. Blair, Tiantian Qin, Jay F. Nunamaker, Jr.

Introduction
• Intelligence analysts are required to sift through mountains of information
• Humans are generally bad at detecting deception
• Mounting evidence suggests that computer-mediated communication (CMC) makes humans even less accurate at detecting deception
• Tools are needed to help humans detect deception

Background
• Desert Survival
  • 2 (deceptive vs. truthful) × 2 (face-to-face [FtF] vs. CMC) design
  • All discussions were transcribed
  • Linguistic analysis was conducted on 27 possible indicators of deception
• Mock Theft Pilot
  • 2 (deceptive vs. truthful) × 2 (FtF vs. text) design
  • Transcribed and analyzed

Hypotheses
• Deceivers will display higher
  • Quantity
  • Expressiveness

Hypotheses
• Deceivers will display less
  • Vocabulary complexity
  • Grammatical complexity

Method
• Students were recruited from a multi-sectioned communications class
• Half were randomly assigned to be thieves and half to be innocents
• Thieves “stole” a wallet that was left in their classroom
• Innocents were told that a “theft” would occur on a given day

Method
• Subjects were motivated to do well: they were told that they could earn $10 if they convinced the interviewer that they were innocent
• An additional $50 was to be awarded to the person who was most successful at convincing the interviewer

Method
• Participants were interviewed by trained interviewers using a standardized BAI format in one of three modalities
  • FtF
  • Text/chat
  • Audio conference
• All interviews were recorded and transcribed

Method
• Analysis was conducted using shallow parsers (Grok and Iskim) or look-up dictionaries
• Classes of cues
  • Quantity (words, syllables, sentences)
  • Vocabulary complexity (big words and syllables per word)
  • Grammatical complexity (short and long sentences, Flesch-Kincaid, and others)
  • Expressiveness (rate of adjectives and adverbs, emotiveness index, affective terms)
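The cue classes above can be illustrated with a small Python sketch. It is not the parsers or dictionaries used in the study: the syllable counter is a rough vowel-group heuristic, the `AFFECT_TERMS` lexicon and the big-word and sentence-length thresholds are hypothetical placeholders, and cues that need part-of-speech tagging (adjective and adverb rates, emotiveness index) are left out. The Flesch-Kincaid grade uses the standard published formula.

```python
import re

# Hypothetical mini-lexicon; the study's look-up dictionaries are not given in the slides.
AFFECT_TERMS = {"afraid", "angry", "happy", "nervous", "sorry", "upset", "worried"}

WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def count_syllables(word):
    """Rough heuristic: count vowel groups, dropping a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def extract_cues(text, big_word_syllables=3, short_sentence_words=5,
                 long_sentence_words=15):
    """Compute a few quantity, complexity, and expressiveness cues.

    All three thresholds are assumptions chosen for illustration.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text)
    if not words or not sentences:
        raise ValueError("text must contain at least one word")
    syllables = [count_syllables(w) for w in words]
    sent_lengths = [len(WORD_RE.findall(s)) for s in sentences]
    n_words, n_sent, n_syll = len(words), len(sentences), sum(syllables)
    return {
        # Quantity
        "words": n_words,
        "syllables": n_syll,
        "sentences": n_sent,
        # Vocabulary complexity
        "big_word_rate": sum(s >= big_word_syllables for s in syllables) / n_words,
        "syllables_per_word": n_syll / n_words,
        # Grammatical complexity
        "short_sentences": sum(n <= short_sentence_words for n in sent_lengths),
        "long_sentences": sum(n >= long_sentence_words for n in sent_lengths),
        "flesch_kincaid_grade": (0.39 * n_words / n_sent
                                 + 11.8 * n_syll / n_words - 15.59),
        # Expressiveness (dictionary look-up only)
        "affect_rate": sum(w.lower() in AFFECT_TERMS for w in words) / n_words,
    }


cues = extract_cues("I was nervous. I did not take the wallet!")
```

Each transcript would yield one such dictionary, which can then serve as a feature row for a classifier.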

Mock Theft Results: Decision Tree Analysis
[Figure: sample decision tree from the text modality, with no duplicated cues]

Mock Theft Results: Decision Tree Analysis
[Figure: sample decision tree from the text modality, significant cues only]

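Decision Tree Improvement
[Figure: chart of five percentages (78.58%, 75%, 62.5%, 60.42%, 58.33%) across five conditions (Original; Significant Only; Text, Significant; No Duplicate Only; Text, No Duplicate). The pairing of values with conditions is not recoverable from the extracted text.]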

Conclusions
• We were able to identify some linguistic indicators of deception
• Modality also appears to affect several indicators
• These indicators could be subjected to a pruned tree algorithm to classify subjects as truthful or deceptive
• Future research will serve to further improve modeling
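To make the classification idea concrete, here is a minimal Python sketch of a decision tree over linguistic cues. It is not the algorithm used in the study, which the slides do not specify: this version uses Gini impurity and limits tree depth and node size (a simple form of pre-pruning) rather than pruning a fully grown tree. The six training rows are invented toy values chosen only to match the direction of the hypotheses; they are not study data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature, threshold, weighted_gini) for the best split, or None."""
    best = None
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[feature] <= t]
            right = [y for r, y in zip(rows, labels) if r[feature] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (feature, t, score)
    return best


def build_tree(rows, labels, max_depth=2, min_samples=2, depth=0):
    """Grow a tree; max_depth and min_samples act as pre-pruning limits."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(labels) < min_samples or gini(labels) == 0.0:
        return majority  # leaf node
    split = best_split(rows, labels)
    if split is None or split[2] >= gini(labels):
        return majority  # no split reduces impurity
    feature, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[feature] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[feature] > t]
    return {
        "feature": feature,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left],
                           max_depth, min_samples, depth + 1),
        "right": build_tree([r for r, _ in right], [y for _, y in right],
                            max_depth, min_samples, depth + 1),
    }


def classify(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Invented toy data (NOT from the study): more words and simpler vocabulary
# for deceivers, in line with the stated hypotheses.
rows = [
    {"words": 120, "syllables_per_word": 1.3},
    {"words": 140, "syllables_per_word": 1.2},
    {"words": 110, "syllables_per_word": 1.4},
    {"words": 60, "syllables_per_word": 1.6},
    {"words": 75, "syllables_per_word": 1.7},
    {"words": 50, "syllables_per_word": 1.5},
]
labels = ["deceptive"] * 3 + ["truthful"] * 3

tree = build_tree(rows, labels)
prediction = classify(tree, {"words": 130, "syllables_per_word": 1.25})
```

A shallow depth limit keeps the tree small and readable, which matters when the output is meant to support a human analyst rather than replace one.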

Future Research
• The linguistic model will be improved by
  • Adding more data
  • Improving dictionaries
  • Focusing on different models for different communication contexts
  • Adding subjective operator evaluations
