Valuating Privacy Eytan Adar (with Bernardo Huberman and Leslie Fine) WEIS’05, June 3, 2005
How this started… • Well duh… • SHOCK • Social Harvesting of Community Knowledge • Locally built profile, used for targeting messages • Users demanded more control • Add and remove profile components • But, they never touched the feature • What people say versus what they do
Before I go any further… • For those of you who haven’t read the paper yet: • How much would I have to pay you for your weight? • Your age?
The Goals: What people say and do • Track what people say, tie it to what they really do • Who cares about privacy? • How much do they care? • Can we really figure out how much information is worth? • Why do they care? • Can we figure out why people price information in a certain way?
Related Work • Survey/modeling based techniques • Jourard self-disclosure test (Jourard, Cozby, Dindia) • Petronio, 2000 • Acquisti and Grossklags (various) • Hann et al. 2003 (Internet survey) • Jupiter, 2002: 70% consider privacy important • Wolfgang et al. “Exploration of Attitudes via Physical Interpersonal Distance Towards the Obese, Drug Users, Homosexuals, Police, and Other Marginal Figures.” (1971) • Many more (see paper) • But surveys don’t tell you what people do • Just what they think they’ll do • Usually just have a reward and no cost • Here’s $n for doing the survey
Related Work • Data-based studies • Give me your data, I’ll give you a cookie • Hard to find good data • Not verified/verifiable • Too many confounders: they confuse the why • Self-selecting • Did they have something happen to them before? • Did something happen to their friends? • Did they see something on TV?
Our Approach • Remove as many confounders as possible • Introduce reward and cost • Privacy calculus • Houston et al. 1987, Altman et al. 1973, etc. • Reveal if reward – cost > 0 • Pay for information • Force revelation of that information • Adapt approach from behavioral economics/psychology to find the point where reward really equals cost for individuals
Our Approach • Want to study the “why” • Hypothesis: The further people are from the mean, the more they will demand for private information • Use real-valued information with a notion of relativity • Not SSN, CC#, etc.; these are binary (more or less) • Instead: Weight, Age, GPA, Salary, etc. • Original title: Privacy and Deviance
Deviation Hypothesis [Figure: price plotted against continuous private information, with the mean marked]
Our Approach • The Why: • Hypothesis: The further people are from the mean, the more they will demand for private information • Use real-valued information with a notion of relativity • Not SSN, CC#, etc.; these are binary (more or less) • Instead: Weight, Age, GPA, Salary, etc. • Original title: Privacy and Deviance • Not quite right
The Experiment [Figure: example sealed bids ($5, $3, $2, $1, $.01) with “Winner” and “Paid” labels]
The Experiment • Groups of 10 to 15 • $25 for showing up • Put them in the same room (around a table) • Perform a reverse, second-price, sealed-bid auction • Data: Weight, Age, GPA • Winner stands up, reveals the information, and receives payment (bids range from $0 to $100, or “infinity”) • Information is validated (scale, driver’s license, login) • Survey
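The auction format named on this slide can be sketched in a few lines. This is a minimal illustration of a standard reverse second-price sealed-bid auction (lowest asking price wins, winner is paid the second-lowest asking price), not the authors' actual procedure. The subject IDs, the bid values, the tie-breaking rule, and the use of `float("inf")` for an "infinity" bid are all assumptions made for the example.

```python
from typing import Dict, Tuple


def reverse_second_price(bids: Dict[str, float]) -> Tuple[str, float]:
    """Return (winner_id, payment) for a reverse second-price sealed-bid auction.

    The lowest asking price wins and is paid the second-lowest asking price.
    Ties are broken by subject ID (an assumption; the slides do not say).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    # Sort by price first, then by ID so that ties resolve deterministically.
    ranked = sorted(bids.items(), key=lambda item: (item[1], item[0]))
    winner_id = ranked[0][0]
    payment = ranked[1][1]  # second-lowest asking price
    return winner_id, payment


# Hypothetical sealed bids; "infinity" is modeled as float("inf") and sorts last.
bids = {"s1": 5.00, "s2": 3.00, "s3": 2.00, "s4": 1.00, "s5": 0.01, "s6": float("inf")}
winner, payment = reverse_second_price(bids)
print(winner, payment)
```

Because the payment does not depend on the winner's own bid, a subject has no incentive to shade the asking price, which is the usual reason for choosing a second-price design.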
The Experiment [Figure: sample bid forms, each coded with the subject’s ID (e.g., 8282828); one form with fields Weight, Height, Price, Gender; one form with fields Age, Gender, Price] • Only 1 winner per session, but 10+ data points
Survey Questions • Also coded with ID • Sanity check: • If you are reluctant to reveal information in this study, you should: • A) List a very low price • B) List a very high price • C) Leave the price blank • General privacy attitude (how important is privacy to you?) • Rank different kinds of information (financial, medical, etc.) by importance.
Survey Questions • Weight • Do you feel very underweight, slightly underweight, average, slightly overweight, or overweight, given others in the room? • What do you think the average weight is? (your gender) • Do you think the average (for your gender) is very underweight, slightly underweight, average, slightly overweight, or overweight, given others in the room? • Same for age • Familiarity with others in the room • How many do you know well, recognize, etc.
The Subjects • 127 subjects • 59% male • 10 sessions • Recruited from HP and from an area experimental economics mailing list (mostly Stanford students) • Allowed to leave at any point • Signed a waiver • Sessions were mixed, male only, female only, male proctor, female proctor
The Good Stuff… • First of all (mea culpa) • GPA was too hard • Hard to prove • Stanford students • Pretty smart anyway • Grade inflation? • Didn’t really care • Weight (actually BMI=function(weight,height)) • Best example • Only 7 individuals demanded “infinity” • 6 were women
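The slide notes that weight was analyzed as BMI, a function of weight and height. A minimal sketch of the standard BMI formula follows; the slides do not say which units were recorded, so both a metric and an imperial version are shown, and the function names are illustrative.

```python
def bmi_metric(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_imperial(weight_lb: float, height_in: float) -> float:
    """Same index from pounds and inches, using the usual 703 conversion factor."""
    if weight_lb <= 0 or height_in <= 0:
        raise ValueError("weight and height must be positive")
    return 703.0 * weight_lb / height_in ** 2


print(round(bmi_metric(70.0, 1.75), 2))
print(round(bmi_imperial(150.0, 65.0), 2))
```

Normalizing by height is what makes weights comparable across subjects, which is the normalization point raised later in the observations slide.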
Results: Log Price versus BMI [Figure: log of price bid (0 to 2) versus BMI percentile (20th to 100th)] • p-value = 0.018
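One plausible way to prepare data for a plot like this is to rank each subject's BMI within the group, bin the ranks into 20-point percentile buckets, and average the log of the bid in each bucket. The sketch below is not the authors' analysis code. The base-10 logarithm, the exact binning rule, the exclusion of zero and "infinity" bids, and all data values are assumptions.

```python
import math
from typing import Dict, List, Optional


def percentile_ranks(values: List[float]) -> List[float]:
    """Percent of the group whose value is less than or equal to each value."""
    n = len(values)
    return [100.0 * sum(v <= x for v in values) / n for x in values]


def mean_log_price_by_quintile(
    bmis: List[float], prices: List[float]
) -> Dict[int, Optional[float]]:
    """Average log10(price) per 20-point BMI percentile bucket (20, 40, ..., 100)."""
    buckets: Dict[int, List[float]] = {20: [], 40: [], 60: [], 80: [], 100: []}
    for rank, price in zip(percentile_ranks(bmis), prices):
        # Zero and "infinity" bids have no finite logarithm; skip them (assumption).
        if price <= 0 or math.isinf(price):
            continue
        upper = min(100, 20 * math.ceil(rank / 20))
        buckets[upper].append(math.log10(price))
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}


# Hypothetical data, not taken from the study.
bmis = [18, 20, 22, 24, 26, 28, 30, 32, 34, 36]
prices = [1, 1, 1, 1, 10, 10, 10, 10, 100, 100]
result = mean_log_price_by_quintile(bmis, prices)
print(result)
```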
The Good Stuff… • Encouraging for deviance argument • But… • Recall perception question • Do you feel very underweight, slightly underweight, average, slightly overweight, overweight given others in the room?
Results: Perceived Weight versus Price • People who are very underweight don’t consider themselves to be that [Figure: log of price bid (0 to 2) by perceived weight category: Very Underweight, Somewhat Underweight, Average, Somewhat Overweight] • p-value = 0.0038
Deviance • So deviation from “mean” is not quite right • Something underlying • Desirable versus undesirable • Self-perception • Consistent with sociological work • Goffman (self-representation, stigmas) • Simmel • See paper for more.
The Good Stuff… Age • Not so encouraging... • Maybe students don’t care about privacy? • $57 vs $74 • But… lowest bucket ($3.62) versus highest ($18.05), p = .0297 • Might mean that middle age groups don’t care… [Figure: log price (0 to 3) with age groups binned by percentile (20th to 100th), 88 participants; p = 0.17]
What about gender? • Suggestive, but weak stats • Take these with a grain of salt • Mixed sessions versus single sex • Men have slightly higher prices for single sex sessions • p = 0.24 • Women have slightly higher prices in mixed sessions • p = 0.39 • In general, we can’t tell the difference • mixed versus all women versus all men • p = 1
Gender Differences – Men [Figure: log price (0 to 3) versus BMI percentile (20% to 100%)] • p-value = .01
Gender Differences – Women [Figure: log price (0 to 3) versus BMI percentile (20% to 100%); p-value = .42] • Top 50% versus bottom, clearer • p-value = .16
Gender Differences – Men [Figure: log price (0 to 3) by perceived weight category: Very Under, Somewhat Under, Average, Somewhat Over]
Gender Differences – Women [Figure: log price (0 to 3) by perceived weight category: Somewhat Under, Average, Somewhat Over] • p-value = .2 • Just somewhat over and somewhat under, p-value = .099 • Lesson: women demand more across most categories • Need more than $100?
Friends versus strangers in the room (BMI) [Figure: vertical axis 0 to 3, percentile bins 20% to 100%] • Coded friendship with scores • Know well = 4 • Acquainted = 3, etc. • The more people you know, the more you charge • Not so strong, p = .34 • Comparing the top 50% with the bottom is much stronger • 36% versus 23% • p = .05 • “Phenomena of the stranger” (Simmel)
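The friendship coding described on this slide could be turned into a per-subject familiarity score along the following lines. Only "know well = 4" and "acquainted = 3" come from the slide; the remaining levels and their scores, the category names, and the choice to sum scores over everyone in the room are hypothetical choices made for this sketch.

```python
from typing import Dict

# "know_well" and "acquainted" scores are from the slide.
# "recognize" and "stranger" and their scores are HYPOTHETICAL placeholders,
# since the slide only says "etc." for the remaining levels.
FRIENDSHIP_SCORES: Dict[str, int] = {
    "know_well": 4,
    "acquainted": 3,
    "recognize": 2,  # hypothetical
    "stranger": 1,   # hypothetical
}


def familiarity_score(counts: Dict[str, int]) -> int:
    """Sum the coded scores over the other people in the room.

    `counts` maps a familiarity level to how many people in the room the
    subject placed at that level. Summing is an assumption about aggregation.
    """
    return sum(FRIENDSHIP_SCORES[level] * n for level, n in counts.items())


# Hypothetical subject who knows two people well and is acquainted with one.
score = familiarity_score({"know_well": 2, "acquainted": 1})
print(score)
```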
Salary, credit rating, savings • Much more sensitive • Hard to say something concrete • If we did it again: • should have expressed no framing limits
Price versus Privacy Attitude • Not too bad, p-value = .054 • Should be taken with a grain of salt • Question asked after auction • Subjects may have been matching answer to their behavior (priming) [Figure: log of price bid (0 to 2) by privacy attitude: Critical, Very Important, Somewhat Important, Not Important]
Observations (if you want to try this) • Auction is always cheap… • Max payment was <$1 • Usually a few pennies • Normalization is important • Weight versus BMI • Years on job, occupation, etc. • Critical to know who subjects are comparing themselves against • Lots of ways to cut up the data • Worry about data dilution • Fix more variables • Cultural issues? • Would love to see this study in some other country
Summary and Afterwards… • Well duh… • We didn’t set out to prove the obvious… • Demonstrate a technique that yields actual numbers and removes confounders • Experiments with real-valued data demonstrate something about certain kinds of private information • Those influenced by group norms/attitudes • Possibly important in the design of survey-based studies • Results indicate correct amount to offer • Offering less means you get a biased sample • Raises new privacy issues? • Maybe possible to reverse • Given that you want $x, you probably weigh y
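The remark that offering less yields a biased sample can be illustrated with a small sketch. If each person has a minimum price for revealing a value, a fixed offer only recruits those whose price is at or below it, so the recruited group can differ systematically from the full group. The records below are hypothetical and are not data from the study; the field names and the acceptance rule are assumptions.

```python
from statistics import mean
from typing import Dict, List


def sample_at_offer(subjects: List[Dict], offer: float) -> List[Dict]:
    """Keep only subjects whose demanded price is at or below the offer."""
    return [s for s in subjects if s["price"] <= offer]


# Hypothetical subjects: demanded price in dollars and the private value (BMI).
subjects = [
    {"id": "a", "price": 0.50, "bmi": 21.0},
    {"id": "b", "price": 2.00, "bmi": 22.5},
    {"id": "c", "price": 10.00, "bmi": 27.0},
    {"id": "d", "price": 75.00, "bmi": 31.0},
]

low_offer_sample = sample_at_offer(subjects, 5.00)
full_sample = sample_at_offer(subjects, 100.00)

mean_low = mean(s["bmi"] for s in low_offer_sample)
mean_full = mean(s["bmi"] for s in full_sample)
print(len(low_offer_sample), mean_low, mean_full)
```

In this made-up example the low offer recruits only the two subjects with the lowest prices, and their average BMI is lower than that of the full group.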
Summary and Afterwards… • If you write a paper on weight you will end up on pro-ana (anorexia) blogs • If you write a paper like “Privacy and Deviance” your work will appear on fetish websites