Linguistic Credibility Assessment • Emma – general comments on language • Matt – tools for linguistic analysis • Mary – case study
The Federalist Papers • Series of 85 short essays • urged ratification of US Constitution • Pseudonymously published, most were eventually claimed • Alexander Hamilton • James Madison • John Jay • 12 remain of disputed authorship • Presumed to be by Madison or Hamilton
Automated Text Analysis • Fung (2003) • Classification problem, using SVM • Used relative frequency of 70 most common words as features
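The feature step on this slide can be sketched as follows. This is a minimal illustration, not the code used in the cited study: the word list is a short hypothetical stand-in for the 70 most common words, and the function name `relative_frequencies` and the per-1000-token scaling are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical short vocabulary; the slide refers to the 70 most common words.
FUNCTION_WORDS = ["to", "upon", "would", "the", "of", "and"]


def relative_frequencies(text, vocabulary=FUNCTION_WORDS):
    """Return occurrences per 1000 tokens for each word in `vocabulary`."""
    # Lowercase the text and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [1000.0 * counts[word] / total for word in vocabulary]


# Invented sample sentence, not a quotation from the essays.
sample = "Upon the whole, it would be wise to consider the plan and to adopt it."
features = relative_frequencies(sample)
```

Each document becomes one fixed-length vector, which is the form a classifier such as an SVM expects as input.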
Classification • Used machine learning to find the 3 features that best separate Madison & Hamilton in documents of known authorship • to, upon, would • Plotted the 12 unknown documents along those 3 dimensions
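A rough sketch of placing a disputed document in the 3-dimensional space (to, upon, would) is shown here. It uses a nearest-centroid rule as a simple stand-in for the SVM named in the slides, and every number is invented for illustration; none of it is real Federalist data. The names `centroid` and `nearest_author` are hypothetical.

```python
import math

# Made-up rates per 1000 words for (to, upon, would); NOT real measurements.
known = {
    "Hamilton": [(40.0, 3.2, 9.0), (38.5, 2.9, 8.1)],
    "Madison": [(35.0, 0.2, 5.5), (33.8, 0.4, 6.0)],
}


def centroid(points):
    """Average a list of equal-length feature tuples, dimension by dimension."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def nearest_author(doc, centroids):
    """Return the author whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda author: math.dist(doc, centroids[author]))


centroids = {author: centroid(points) for author, points in known.items()}

# An invented "disputed" document, placed in the same 3 dimensions.
disputed = (34.5, 0.3, 5.8)
guess = nearest_author(disputed, centroids)
```

An SVM would instead learn a separating plane between the two sets of known documents; the centroid rule is used here only because it fits in a few lines without external libraries.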
Language • Complex system of communication unique to humanity • used for expressing thoughts • systematic • flexible • allows for infinite combinations • multiple ways to convey the same idea • not completely predictable
Patterns in Language & Language Use • We make use of patterns in language for our purposes of communication • e.g. statement vs. question • Mary sang at the concert. • Did Mary sing at the concert? • Mary sang at the concert? • e.g. Word order in conversation vs. poetry • Soldiers brave were on the march. • This information is used to classify types of language usage • e.g. genre, style, dialect, etc
Similarities & Differences • What are factors that affect how language is used? • language in use (or dialect) • culture, social identity • situation • purpose, topic domain, genre, social relationship between speakers, conversation type, etc • medium • oral: in-person, by phone • written: letter, chat, texts, financial documents • deceptive or truthful
From Theory to Cue • Use theoretical predictions as basis for selecting cues to explore • 5 domains • Arousal: e.g. expect quick rate of speech • Emotion: e.g. (for nervousness) expect more stuttering • Memory: e.g. expect fewer descriptive words • Cognition: e.g. expect less complex sentences • Communication: e.g. less likely to admit forgotten information
Manual Coding Systems • Criteria-Based Content Analysis (CBCA) & Statement Validity Analysis (SVA) • Assumes statements derived from real memories will differ from invented ones in both content and quality • Score statements on the presence or absence of 19 criteria • Reality Monitoring (RM) • Truthful memories are more likely to contain perceptual, contextual, & affective information • Scientific Content Analysis (SCAN) • Used in criminal investigation statements
Some generalized linguistic cues (from DePaulo et al. 2003) • Liars are less forthcoming than truth tellers • Respond less (shorter responses, less elaboration), seem to hold back, speak at slower rate, longer response latency (less if planned) • Tell stories that are less plausible • More discrepancies, less engaging (more repetitions), behavior is less immediate (more indirect, fewer self-references), more uncertain, less fluent (more hesitations, errors, pauses), less active (gestures). Want story to be without error (fewer spontaneous corrections, less likely to admit can’t remember detail) • Make a more negative impression • Less cooperative, use more negative statements, words denoting anger & fear, offensive language, smile less, seem more defensive • Are more tense • Higher pitch, fidget more, pupils dilated for longer periods • BUT! Remember cues are affected by context.
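A few of the text-level cues above (response length, self-references, hesitations) can be counted automatically. The sketch below is only an illustration of that counting step: the word lists and the function name `surface_cues` are hypothetical, and the resulting numbers are not a deception score, since the cues depend on context.

```python
import re

# Hypothetical, deliberately small word lists for illustration only.
SELF_REFERENCES = {"i", "me", "my", "mine", "myself"}
HESITATIONS = {"um", "uh", "er"}


def surface_cues(response):
    """Count a few surface cues in one transcribed response."""
    tokens = re.findall(r"[a-z']+", response.lower())
    total = len(tokens)

    def rate(words):
        # Share of tokens that fall in the given word list.
        return sum(token in words for token in tokens) / total if total else 0.0

    return {
        "word_count": total,
        "self_reference_rate": rate(SELF_REFERENCES),
        "hesitation_rate": rate(HESITATIONS),
    }


# Invented example response.
cues = surface_cues("Um, I went home and, uh, I watched a film.")
```

Counts like these are only meaningful when compared across responses collected in the same situation and medium.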
Research aims of CMI • Discovering linguistic cues that are • reliable indicators of deception (or truthfulness) • as context-independent as possible • Note: Beware unrealistic claims of accuracy in detection • e.g. Cain’s innocence “proved” through LVA • Consider the perspective & intentions: • Researcher • User’s understanding • Business’ marketing • Politician’s results-reporting • risk assessment vs. authoritative decision-making