OLD 3-STIMULUS, P300-BASED CIT (GKT)

The New Complex Trial Protocol for Deception Detection with P300: Mock Crime Scenario and Enhancements
J. Peter Rosenfeld, John Meixner, Michael Winograd, Elena Labkovsky, Alex Sokolovsky, Xiaoxing Hu, Alex Haynes, Northwestern University

OLD 3-STIMULUS, P300-BASED CIT (GKT)
• PROBE: guilty knowledge item: $5000. Press non-target button.
• IRRELEVANT: other amount: $200. Press non-target button.
• TARGET: other amount: $3000. Press target button.

Previous P300 deception detection (DD) protocols used separate Probe (P), Irrelevant (I), and Target (T) trials.
• 80% to 95% correct detection rates… but…
• Rosenfeld et al. (2004) and Mertens, Allen et al. (2008): these methods are vulnerable to countermeasures (CMs) that turn I’s into covert T’s.

Old P300 Protocol
• 1 of 3 stimuli on each trial: Probe (P), Irrelevant (I), or Target (T). Subject presses either the Target or the Non-Target (NT) button. Both P and I can be Non-Targets. A special I is defined as T.
• This leads to 2 tasks for each stimulus:
• 1. implicit probe recognition vs.
• 2. explicit Target/Non-Target discrimination
Possible result: mutual interference: more task demand → reduced P300 to P.
CMs hurt the old test. A CM is an attempt to defeat the test by converting irrelevants into covert targets.

How to do CMs:
• When you see a specific irrelevant, SECRETLY make some response, mental or physical.
• After all, if you can make a special response to the TARGET on instruction from the operator, you can secretly instruct yourself.
• The irrelevant becomes a secret target. It makes a big P300. If P = I, no diagnosis.

Results from Rosenfeld et al. (2004): Farwell-Donchin paradigm (BAD and BC-AD are 2 analysis methods.)
Diagnoses of Guilty
Method                              Innocent Group   Guilty Group   CM Group
Amplitude Difference (BAD), p=.1    1/11 (9%)        9/11 (82%)     2/11 (18%)
Cross-Correlation (BC-AD), p=.1     0/11 (0%)        6/11 (54%)     6/11 (54%)

Results (hit rates) from Rosenfeld et al. (2004): Rosenfeld paradigm
Week         BAD*          BC-AD*
1: no CM     12/13 (.92)   9/13 (.69)
2: CM        6/12 (.50)    3/12 (.25)
3: no CM     7/12 (.58)    3/12 (.25)
*Note: BC-AD and BAD are 2 kinds of analytic bootstrap procedures.

Farwell’s response:

What would have happened.. • …If somebody beat the test? • Would he pay the $100,000? • No worries about that if you are 100% confident that it can’t happen (‘cuz you rigged it!)

Anyway, that was the abstract he sent to SPR ’08, published in the program which we all eagerly awaited. But here is what was posted:

NEW COMPLEX TRIAL PROTOCOL (CTP)

New Complex Trial Protocol (CTP)
• 2 stimuli per trial, separated by about 1 s.
• S1: either P or I… then… S2: either T or NT.
• There is no conflicting discrimination task when P is presented, so the P300 to the probe is expected to be as large as possible due to P’s salience, which should lead to good detection: 90-100% in Rosenfeld et al. (2008) with autobiographical information.
• It is also CM resistant. (The delayed T/NT task still holds attention.)
• “I saw it” response to S1. RT indexes CM use.
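The two-stimulus trial structure described above can be sketched in code. This is only an illustration of the S1-then-S2 sequence: the class and function names, the probe and target probabilities, and the fixed 1 s delay are all hypothetical choices, not values taken from the studies.

```python
import random
from dataclasses import dataclass


@dataclass
class CTPTrial:
    s1: str            # first stimulus: "P" (probe) or "I" (irrelevant)
    s2: str            # second stimulus: "T" (target) or "NT" (non-target)
    s1_to_s2_s: float  # delay between S1 and S2, in seconds


def make_trial(rng, p_probe=0.2, p_target=0.2, delay_s=1.0):
    """Draw one trial. The probabilities are hypothetical placeholders."""
    s1 = "P" if rng.random() < p_probe else "I"
    s2 = "T" if rng.random() < p_target else "NT"
    return CTPTrial(s1, s2, delay_s)


def make_block(n_trials, seed=0, **kwargs):
    """Build a reproducible block of trials from a seeded generator."""
    rng = random.Random(seed)
    return [make_trial(rng, **kwargs) for _ in range(n_trials)]
```

The point of the structure is that the subject's only response to S1 is "I saw it"; the explicit T/NT decision is deferred to S2, so it does not compete with probe recognition.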

Main Study. Within-subject correct detections of guilty subjects based on bootstrap comparison of probe P300 against the average of all irrelevant P300s over 3 weeks.
Week             Hit Rate       [Hit Rate]
Week 1 (no CM)   11/12 (92%)    [12/12* (100%)]
Week 2 (CM)      10/11 (91%)    [11/12* (92%)]
Week 3 (no CM)   11/12 (92%)    [12/12* (100%)]
Main Study: With false positive (FP) group.
         Confidence=.9          Confidence=.95
Test     FPs    Hits   A’       FPs    Hits   A’
Iall     .08    .92    .95      0      .92    .98
Imax     0      .92    .98      0      .92    .98
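The A’ values tabulated above combine a hit rate and a false positive rate into one discrimination score. The slides do not state the formula; the sketch below assumes Grier's (1970) nonparametric A’, which reproduces the tabulated values when the rates are read as 11/12 hits and 1/12 or 0/12 false positives (those fractions are also an assumption). The function name is illustrative.

```python
def a_prime(hit_rate, fp_rate):
    """Grier's nonparametric A' (assumed formula; not stated in the slides).

    0.5 means chance discrimination, 1.0 means perfect discrimination.
    """
    h, f = hit_rate, fp_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance case: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


print(round(a_prime(11 / 12, 1 / 12), 2))  # 0.95
print(round(a_prime(11 / 12, 0.0), 2))     # 0.98
```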

EXP 1: How does this CTP do in detecting incidental mock crime details?
• Subjects were divided into three groups (n=12 each): Simple Guilty (SG), Countermeasure (CM), and Innocent Control (IC).
• All subjects first participated in a baseline reaction time (RT) test in which they chose a playing card and then completed the CTP using cards as stimuli.
• SG and CM subjects then committed a mock crime.
• Subjects stole a ring out of an envelope in a professor’s mailbox. Subjects were never told what the item would be, to ensure any knowledge would be incidentally acquired through the commission of the mock crime.
• All subjects were then tested for knowledge of the item that was stolen. There was 1 P (the ring) and 6 I (necklace, watch, etc.).
• CM subjects executed covert assigned responses to irrelevant stimuli in an attempt to evoke P300s to these stimuli and so beat the Probe vs. Irrelevant P300 comparison.

A CTP Trial

Results: Grand Averages: SG, CM, IC, all P

Guilty Diagnoses
Condition   Detections   Percentage
SG          10/12        83
CM          12/12        100
IC          1/12         8

RTs to S1 (P or I)

Conclusions
• As with autobiographical information, the CTP was found to be highly sensitive at detecting incidentally acquired concealed knowledge in a mock-crime scenario.
• Detection rates using the CTP compare favorably to similar polygraph CITs. The main advantage of the CTP over the old P300 or polygraph CIT is its resistance to CM use. The traditional covert-response CMs used to defeat past P300 CITs were found to be ineffective against the CTP, and actually led to larger Probe-Irrelevant amplitude differences and detection rates.
• CM use was also easily identified by a large increase in RT between the baseline and experimental blocks.
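The RT check in the last conclusion can be sketched as a simple comparison of block means. The slides report only that the baseline-to-experimental RT increase was large in CM users; the function names and the 150 ms cutoff below are hypothetical placeholders, not a criterion from the study.

```python
from statistics import mean


def rt_increase_ms(baseline_rts, experimental_rts):
    """Mean "I saw it" RT in the experimental block minus the baseline block."""
    return mean(experimental_rts) - mean(baseline_rts)


def flags_cm_use(baseline_rts, experimental_rts, threshold_ms=150.0):
    """Flag likely CM use when the RT increase exceeds a cutoff.

    threshold_ms is a hypothetical value chosen only for illustration.
    """
    return rt_increase_ms(baseline_rts, experimental_rts) > threshold_ms
```

For example, baseline RTs of about 400 ms and experimental RTs of about 900 ms would be flagged, while a rise to about 430 ms would not.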

New study with autobiographical info: 2 mental CMs among 4 irrelevants.
• So now we have a 5-button box for the left hand. The subject is instructed to press, at random*, one of the 5 buttons as the “I saw it” response to S1 on each trial, with no repeats. T and NT (S2) stimuli and responses are as before.
• We also hoped that this would make CMs harder to do. It didn’t, but we caught the CM users anyway.
• *We have also done other studies with non-random, explicitly assigned responses.

Design:
• Autobiographical information (birthdates): one P and 4 I (other, non-meaningful dates).
• 3 groups as before: SG, CM, IC.
• NEW: mental CMs to only 2 of the 4 irrelevants: silently saying your first name was CM1, your last name CM2. These are assigned prior to the run.
• Why 2 irrels? Meixner & Rosenfeld (2010) showed that countering all irrels but not the probe gives the probe extra, special significance. They did a study with only 5 irrels, one of which was not countered. It had a big P300. So doing CMs to all irrels is not a good strategy from the perp’s perspective.
• Why mental CMs? They should be faster and a bigger challenge for our CTP.
• Only one block per group (no baseline).

Results: Grand Averages (Pz, 2 uV/ division)

Detection rates:
Group   BT/Iall (.9)    BT/Imax (.9)
SG      13/13 (100%)    13/13 (100%)
IC      1/13 (7.6%)     1/13 (7.6%)
CM      12/12 (100%)    10/12 (83%)*
*These are screened via RT, which still nicely represents CM use within a block.

RTs (to “I saw it”) in this study clearly index use of CMs:

New ERP: “P900, the CM potential”: largest at Fz, Cz (P = black, Iall = red, 2 uV/division)

New study: Effects of various numbers of CMs, 1-5, with 5 total stimuli Elena Labkovsky & Peter Rosenfeld

GAs: SG, IN, and 1, 2, 3, 4, and 5 CM groups (panels: SG, 1CM, 2CM, 3CM, 4CM, 5CM)

A Mock Terrorism Study
John Meixner & Peter Rosenfeld
How do you catch bad guys before crimes are committed, and before you know what was done, where, when?

A Mock Terrorism Application of the P300-based Concealed Information Test Department of Psychology, Northwestern University, Evanston, IL 60208-2700

Table 1. Individual bootstrap detection rates. Numbers indicate the average number of iterations (across all three blocks) of the bootstrap process in which probe was greater than Iall or Imax. Blind Imax numbers indicate the average number of iterations in which the largest single item (probe or irrelevant) was greater than the second largest single item. Mean values for each column are displayed in bold above detection rates.
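The iteration counts described in the Table 1 caption come from a bootstrap test. A minimal sketch of the idea follows, under a simplifying assumption: each trial is reduced to a single P300 amplitude number, whereas the actual procedure resamples single-trial waveforms, averages them, and then measures P300. All function names are illustrative, and pooling the irrelevant trials is only one way to form Iall.

```python
import random
from statistics import mean


def bootstrap_proportion(probe, irrelevant, iterations=1000, seed=0):
    """Fraction of iterations in which the resampled probe mean
    exceeds the resampled irrelevant mean."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(iterations):
        # Resample trials with replacement, keeping the original trial counts.
        p = mean(rng.choices(probe, k=len(probe)))
        i = mean(rng.choices(irrelevant, k=len(irrelevant)))
        if p > i:
            wins += 1
    return wins / iterations


def iall(irrelevants):
    """Pool the trials of all irrelevant items (the Iall comparison)."""
    return [amp for amps in irrelevants.values() for amp in amps]


def imax(irrelevants):
    """Trials of the irrelevant item with the largest mean amplitude (Imax)."""
    return max(irrelevants.values(), key=mean)
```

A subject would then be called guilty when the proportion reaches the chosen confidence level (for example .9), as in the detection tables elsewhere in this deck.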

So…
• The CTP is a promising, powerful paradigm against any number of CMs, mental and/or physical, and RT reliably indicates CM use. The new “P900” might also.
• jp-rosenfeld@northwestern.edu

So far, all CMs have been done separately from, and before, the “I saw it” response.
• CMs separated or split away from the response are called “splitting CMs.”
• What happens if subjects are instructed to do the CM and the “I saw it” response at the same time? They lump these acts together. These are called “lumping CMs.”

Here’s what happens: P3 still detects (83%) P vs. Iall (b), but RT no longer indicates CMs!!

Note that this means you can no longer screen irrelevant comparison waves associated with large RTs.
• Xiaoxing Hu to the rescue! (with Dan Hegeman and Elizabeth Landry).
• He simply increased irrelevants from 4 to 8, which should increase demand and RT…

Here are RT results with 8 irrels; the 2-, 4-, and 6-CM lumping groups are combined here.

RTs sorted by lumping CM groups.

Tabulated data…the more CMs you do, the harder the task, the more likely that RT will expose even lumping CM use…

P300 still catches CM users…

We were actually able to do some screening with 6-CM subjects which improved hit rate to 77%, A’ to .91

Remember, Allen Hu gave the CMs to Ss in advance and let them rehearse. • And his subjects were geniuses, like you all…

So we are now working with 10 Irrelevant items… and 3,5,7 CMs.

BUT…
• … it is obvious that having to form, on the spot, and hold in your head 6 CMs for 6 of 8 irrels, as must happen in the field, is probably too hard for most bad guys to do.

New Enhancements for incidental information detection.

One Enhancement:
• Effects of feedback that focuses attention on the probe-irrelevant dimension.
• First we tried the 3SP to follow up Verschuere et al. (2009).
• Probes were home towns.
• Two groups: one got “You are lying” feedback (deception group).
• The other (control group) received feedback about button pressing.

3SP-effect of deception awareness. P-I bigger in deception group.

3SP-effect of deception awareness

OK, but is 3SP the ideal protocol? Why not? What is? Yes, the CTP. So in the next study, we use the CTP with home town names (less salient), and feedback is about recognition in the high awareness group (like the previous “deception group”). In the control group, feedback is about irrelevancies: whether they are holding still, not blinking, etc.
