Valuating Privacy

Valuating Privacy Eytan Adar (with Bernardo Huberman and Leslie Fine) WEIS’05, June 3, 2005

How this started… • Well duh… • SHOCK • Social Harvesting of Community Knowledge • Locally built profile, used for targeting messages • Users demanded more control • Add and remove profile components • But, they never touched the feature • What people say versus what they do

Before I go any further… • For those of you who haven’t read the paper yet: • How much would I have to pay you for your weight? • Your age?

The Goals: What people say and do • Track what people say, tie it to what they really do • Who cares about privacy? • How much do they care? • Can we really figure out how much information is worth? • Why do they care? • Can we figure out why people price information in a certain way?

Related Work • Survey/modeling based techniques • Jourard self-disclosure test (Jourard, Cozby, Dindia) • Petronio, 2000 • Acquisti and Grossklags (various) • Hann et al. 2003 (Internet survey) • Jupiter, 2002: 70% consider privacy important • Wolfgang et al. “Exploration of Attitudes via Physical Interpersonal Distance Towards the Obese, Drug Users, Homosexuals, Police, and Other Marginal Figures.” (1971) • Many more (see paper) • But, surveys don’t tell you what people do • Just what they think they’ll do • Usually just have a reward and no cost • Here’s $n for doing the survey

Related Work • Data based studies • Give me your data, I’ll give you a cookie • Hard to find good data • Not verified/verifiable • Too many confounders – confuse the why • Self-selecting • Did they have something happen to them before? • Did something happen to their friends? • Did they see something on TV?

Our Approach • Remove as many confounders as possible • Introduce reward and cost • Privacy calculus • Houston et al. 1987, Altman et al. 1973, etc. • Reveal if reward – cost > 0 • Pay for information • Force revelation of that information • Adapt approach from behavioral economics/psychology to find where reward really = cost for individuals
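The privacy-calculus rule on this slide can be written as a one-line decision function. This is only a minimal sketch: the function name `should_reveal` and the dollar figures are illustrative assumptions, not taken from the study.

```python
def should_reveal(reward: float, cost: float) -> bool:
    """Privacy calculus: reveal only when reward - cost > 0."""
    return reward - cost > 0


# Illustrative (made-up) numbers: someone whose private cost of
# revealing is $5 accepts an offer of $8 and declines an offer of $3.
print(should_reveal(reward=8.0, cost=5.0))  # True
print(should_reveal(reward=3.0, cost=5.0))  # False
```

The point where the answer flips (reward = cost) is the quantity the experiment tries to locate for each person.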

Our Approach • Want to study the “why” • Hypothesis: The further people are from the mean, the more they will demand for private information • Use real-valued information with a notion of relativity • Not SSN, CC#, etc.; these are binary (more or less) • Instead: Weight, Age, GPA, Salary, etc. • Original title: Privacy and Deviance

Deviation Hypothesis • [Figure: axis of continuous private information, with the mean marked]

Deviation Hypothesis • [Figure: price plotted against continuous private information, with the mean marked]

Our Approach • The Why: • Hypothesis: The further people are from the mean, the more they will demand for private information • Use real-valued information with a notion of relativity • Not SSN, CC#, etc.; these are binary (more or less) • Instead: Weight, Age, GPA, Salary, etc. • Original title: Privacy and Deviance • Not quite right

The Experiment • Groups of 10 - 15 • $25 for showing up • Put them in the same room (around a table) • Perform a reverse, second-price, sealed-bid auction • [Figure: example bids …, $5, $3, $2, $1, $.01, with labels “Winner” and “Paid”]

The Experiment • Groups of 10 - 15 • $25 for showing up • Put them in the same room (around a table) • Perform a reverse, second-price, sealed-bid auction • Data: Weight, Age, GPA • Winner stands up and reveals information and receives payment (range $0-$100, infinity) • Information is validated (scale, driver license, login) • Survey
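The auction mechanics can be sketched in a few lines. This is a sketch under stated assumptions rather than the procedure actually used in the sessions: the function name, the participant IDs, the tie-breaking rule, and the handling of “infinity” bids are all hypothetical. The lowest bidder wins and is paid the second-lowest bid.

```python
import math


def reverse_second_price_auction(bids):
    """Run a sealed-bid reverse second-price auction.

    bids maps a participant ID to the price demanded; math.inf means
    "will not sell at any price". The lowest bidder wins and is paid
    the second-lowest bid. Returns (winner_id, payment), or None when
    no finite payment exists.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # Assumption: ties are broken by submission order (sorted is stable).
    ranked = sorted(bids.items(), key=lambda item: item[1])
    winner_id = ranked[0][0]
    payment = ranked[1][1]
    if math.isinf(payment):
        return None  # assumption: nobody is paid an infinite amount
    return winner_id, payment


# Bid amounts from the slide's example; the IDs are made up.
bids = {"a": 5.00, "b": 3.00, "c": 2.00, "d": 1.00, "e": 0.01}
print(reverse_second_price_auction(bids))  # ('e', 1.0)
```

Because the winner's payment is set by someone else's bid, inflating your own bid cannot raise what you are paid; it can only cost you the win. That is the usual argument for why second-price auctions elicit honest valuations.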

The Experiment • [Figure: sample bid slips, each coded with the same ID (8282828); one with fields Weight, Height, Price, Gender; one with fields Age, Gender, Price] • Only have 1 winner per session, but 10+ data points

Survey Questions • Also coded with ID • Sanity check: • If you are reluctant to reveal information in this study, you should: • A) List a very low price • B) List a very high price • C) Leave the price blank • General privacy attitude (how important is privacy to you?) • Rank different kinds of information (financial, medical, etc.) by importance.

Survey Questions • Weight • Do you feel very underweight, slightly underweight, average, slightly overweight, overweight given others in the room? • What do you think the average weight is? (your gender) • Do you think the average (for your gender) is very underweight, slightly underweight, average, slightly overweight, overweight given others in the room? • Same for age • Familiarity with others in the room • How many do you know well, recognize, etc.

Survey Questions (Simulated Exp)

The Subjects • 127 subjects • 59% male • 10 sessions • Recruited from HP and from an area experimental economics mailing list (mostly Stanford students) • Allowed to leave at any point • Signed a waiver • Sessions were mixed, male only, female only, male proctor, female proctor

The Good Stuff… • First of all (mea culpa) • GPA was too hard • Hard to prove • Stanford students • Pretty smart anyway • Grade inflation? • Didn’t really care • Weight (actually BMI=function(weight,height)) • Best example • Only 7 individuals demanded “infinity” • 6 were women
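The slide only says that BMI is a function of weight and height. The standard definition is sketched below as an illustration; the function names are hypothetical, and the slides do not say which units the study recorded.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms over height in metres squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """Same index from pounds and inches, using the usual 703 factor."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    return 703.0 * weight_lb / height_in ** 2


# Illustrative values only.
print(round(bmi(70.0, 1.75), 1))  # 22.9
```

Normalizing by height is what makes two people's numbers comparable; raw weight alone would mix body size with stature.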

Results: Log Price versus BMI • [Figure: log of price bid (y-axis, 0 to 2) by BMI percentile (x-axis, 20th to 100th)] • p-value = 0.018
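One way to build a plot like this is to rank each participant's BMI within the group, bucket the ranks into 20-point percentile bands, and average the log of the bids in each band. The sketch below assumes base-10 logs and drops zero and “infinity” bids; both choices are assumptions, as are the function names. The data is made up purely to exercise the code and carries no finding, and the significance test behind the reported p-value is not reproduced.

```python
import math
from collections import defaultdict


def percentile_rank(values):
    """Percent of values at or below each entry, in (0, 100]."""
    n = len(values)
    return [100.0 * sum(v <= x for v in values) / n for x in values]


def mean_log_price_by_quintile(bmis, prices):
    """Average log10(price) within each 20-point BMI percentile bucket.

    Non-positive and infinite prices are skipped, since their log is
    undefined or unbounded (an assumption about data cleaning).
    """
    buckets = defaultdict(list)
    for rank, price in zip(percentile_rank(bmis), prices):
        if price <= 0 or math.isinf(price):
            continue
        upper = min(100, 20 * math.ceil(rank / 20))  # 20, 40, ..., 100
        buckets[upper].append(math.log10(price))
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}


# Made-up illustration data, not the study's.
bmis = [18.0, 21.0, 23.0, 26.0, 31.0]
prices = [10.0, 1.0, 100.0, math.inf, 10.0]
print(mean_log_price_by_quintile(bmis, prices))
```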

The Good Stuff… • Encouraging for deviance argument • But… • Recall perception question • Do you feel very underweight, slightly underweight, average, slightly overweight, overweight given others in the room?

Results: Perceived Weight versus Price • People who are very underweight don’t consider themselves to be that • [Figure: log of price bid (y-axis, 0 to 2) by perceived weight category: Very Underweight, Somewhat Underweight, Average, Somewhat Overweight] • p-value = 0.0038

Deviance • So deviation from “mean” is not quite right • Something underlying • Desirable versus undesirable • Self-perception • Consistent with sociological work • Goffman (self-representation, stigmas) • Simmel • See paper for more.

The Good Stuff… Age • Not so encouraging... • Maybe students don’t care about privacy? • $57 vs $74 • But… lowest bucket ($3.62) versus highest ($18.05), p = .0297 • Might mean that middle age groups don’t care… • [Figure: log price by binned age group (20th to 100th percentile), 88 participants; p = 0.17]

What about gender? • Suggestive, but weak stats • Take these with a grain of salt • Mixed sessions versus single sex • Men have slightly higher prices for single sex sessions • p = 0.24 • Women have slightly higher prices in mixed sessions • p = 0.39 • In general, we can’t tell the difference • mixed versus all women versus all men • p = 1

Gender Differences – Men • [Figure: log price (y-axis, 0 to 3) vs BMI percentile (x-axis, 20% to 100%); p-value = .01]

Gender Differences – Women • [Figure: log price (y-axis, 0 to 3) vs BMI percentile (x-axis, 20% to 100%); p-value = .42] • Top 50% versus bottom, clearer • p-value = .16

Gender Differences – Men • [Figure: log price (y-axis, 0 to 3) by perceived weight category: Very Under, Somewhat Under, Average, Somewhat Over]

Gender Differences – Women • [Figure: log price (y-axis, 0 to 3) by perceived weight category: Somewhat Under, Average, Somewhat Over] • p-value = .2 • Just somewhat over and somewhat under, p-value = .099 • Lesson: women demand more across most categories • Need more than $100?

Friends versus strangers in the room (BMI) • Coded up friendship with scores • Know well = 4, • Acquainted = 3, etc. • More people you know, the more you charge • Not so strong, p = .34 • Looking at top 50% and bottom much stronger • 36% versus 23% • p = .05 • “Phenomena of the stranger” (Simmel) • [Figure: y-axis 0 to 3, x-axis 20% to 100%]

Salary, credit rating, savings • Much more sensitive • Hard to say something concrete • If we did it again: • should have expressed no framing limits

Price versus Privacy Attitude • Not too bad, p-value = .054 • Should be taken with a grain of salt • Question asked after auction • Subjects may have been matching answer to their behavior (priming) • [Figure: log of price bid (y-axis, 0 to 2) by privacy attitude: Critical, Very Important, Somewhat Important, Not Important]

Observations (if you want to try this) • Auction is always cheap… • max payment was <$1 • Usually a few pennies • Normalization is important • Weight versus BMI • Years on job, occupation, etc. • Critical to know who subjects are comparing themselves against • Lots of ways to cut up the data • Worry about data dilution • Fix more variables • Cultural issues? • Would love to see this study in some other country

Summary and Afterwards… • Well duh… • We didn’t set out to prove the obvious… • Demonstrate a technique that yields actual numbers and removes confounders • Experiments with real-valued data demonstrate something about certain kinds of private information • Those influenced by group norms/attitudes • Possibly important in the design of survey-based studies • Results indicate correct amount to offer • Offering less means you get a biased sample • Raises new privacy issues? • Maybe possible to reverse • Given that you want $x, the probability is that you weigh y

Summary and Afterwards… • If you write a paper on weight you will end up on pro-ana (anorexia) blogs • If you write a paper like “Privacy and Deviance” your work will appear on fetish websites
