About Survey Design • JJP Teen Insight Survey on Parental Authority • August-September 2013
Genesis • Surveys are important. Most decisions are made with the insights gained by surveys • Teens rarely get the opportunity to design their own survey • I collaborated with two teenage girls, 16 & 13, in CT on a survey project meant to be a learning experience for them • They selected the subject of the survey, Parental Authority • Together we designed the questions • I coded the on-line survey using esurveyspro.com • Recruitment was difficult, but we managed to get 200 completes
Some Statistical Terms and Concepts: Categorical vs Measured Data • Surveys can collect three kinds of data • Categorical data: counts of the number of respondents that selected a specific answer from the set of possible answers to a question. Also known as count data • Measured data: an answer that is a measurement, e.g., height in inches, weight in grams, etc. Note: if measurements are grouped into ranges, e.g., 2.00 to 2.99 inches, then the data becomes categorical • Verbatim: limited-size, free-form text composed by the respondent • Population size vs Sample size • A survey “samples” a small percentage of a much larger population with the intent of characterizing the larger population. Example: a survey of 200 teens is a small sample of the millions of teenagers in the U.S.
Examples: Typical Questions • Categorical Data: Age expressed as integers, Gender, Religion, Marital status, Color, Political party, Parental style, Industry, Level of education • Measured Data: Height, Weight, Time of day, Distance, Depth, Speed, Elapsed time, Wavelength, Temperature
Types of Statistics • Categorical Data: Frequencies (counts), Percentage distribution, Median and Mode, Cross tabs (chi-squared test) • Measured Data: Decimal numbers, Mean (Average), Median, Variance, Correlation
Example: Frequencies of Categorical Data • Data as number of respondents • Source: JJP Teen Insights 2013-1 Q2, n=197
Example: Frequencies of Categorical Data • Data as % of n • Source: JJP Teen Insights 2013-1 Q2, n=197
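The two views above (counts and % of n) can be produced from the same list of raw answers. The following is a minimal sketch in Python; the answer values and the `frequency_table` helper are hypothetical and are not the survey's actual Q2 data.

```python
from collections import Counter

# Hypothetical answers to one categorical question (NOT the survey's real data).
answers = [
    "Strict", "Lenient", "Strict", "Balanced", "Balanced",
    "Strict", "Lenient", "Balanced", "Balanced", "Strict",
]

def frequency_table(responses):
    """Return {answer: (count, percent of n)} for one categorical question."""
    n = len(responses)
    counts = Counter(responses)
    return {
        answer: (count, 100.0 * count / n)
        for answer, count in counts.most_common()
    }

table = frequency_table(answers)
for answer, (count, pct) in table.items():
    print(f"{answer:10s} {count:3d} {pct:5.1f}%")
print(f"n={len(answers)}")  # always state the sample size with the table
```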
Confidence Levels • The confidence level is chosen by the analyst, e.g., 95% (the most common choice) • A 95% confidence level means that if the same survey were run on 20 independent samples, about 19 of them would give results within the margin of error of the true population value, but you would not know which one missed. You are only drawing one sample, but you know there is a 95% chance that it is one of the reliable ones • The choice of confidence level affects the relationship between sample size and margin of error described in the next slide
Margin of Error, aka Confidence Interval • Statistically, “margin of error” is a function of sample size, population size and confidence level. The confidence interval is the reported value plus or minus the margin of error • For a very large population, we can discuss just sample size and confidence level. For a 95% confidence level (assuming the worst case of a 50/50 split in the answers): • Sample 150, margin of error is plus or minus 8.00% • Sample 200, margin of error is plus or minus 6.93% • Sample 300, margin of error is plus or minus 5.66% • Sample 600, margin of error is plus or minus 4.00% • Sample 2400, margin of error is plus or minus 2.00% • Example: A sample of 600 shows that Obama is ahead of Romney with 44.6% of the vote plus or minus 4%, meaning that the actual value is somewhere between 40.6% and 48.6% • http://americanresearchgroup.com/moe.html
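The figures in this slide are consistent with the standard large-sample formula for a proportion, z × sqrt(p(1 − p)/n), with z = 1.96 for 95% confidence and the worst-case p = 0.5. A minimal Python sketch follows; the function name `margin_of_error` and the optional `population` argument (a finite population correction) are illustrative additions, not something taken from the slides.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level

def margin_of_error(n, z=Z_95, p=0.5, population=None):
    """Margin of error for a proportion estimated from a sample of size n.

    p=0.5 is the worst case (widest margin). If `population` is given,
    the finite population correction is applied; for a very large
    population it makes almost no difference.
    """
    standard_error = math.sqrt(p * (1 - p) / n)
    if population is not None:
        standard_error *= math.sqrt((population - n) / (population - 1))
    return z * standard_error

for n in (150, 200, 300, 600, 2400):
    print(f"Sample {n}: plus or minus {100 * margin_of_error(n):.2f}%")
```

Because the margin shrinks with the square root of n, halving the margin of error requires roughly four times the sample (600 versus 2400 in the list above).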
Cross Tabulation, aka Cross Tabs • Cross Tabs provide a powerful analytic tool • Show how the answers to the first question affect the distribution of answers to the second question • Can only be done with categorical data (very important) • Combines two questions, one for row headings and one for column headings • The count in each cell is the number of respondents who chose both the row’s answer to the first question and the column’s answer to the second question
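As a rough illustration of how the cells are counted, here is a small Python sketch that builds a cross tab from pairs of answers. The respondents, the two questions, and the `crosstab` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical respondents: (answer to the row question, answer to the column question).
responses = [
    ("Female", "Yes"), ("Female", "No"), ("Male", "Yes"),
    ("Female", "Yes"), ("Male", "No"), ("Male", "No"),
    ("Female", "Yes"), ("Male", "Yes"),
]

def crosstab(pairs):
    """Count respondents for every (row answer, column answer) combination."""
    cells = Counter(pairs)
    rows = sorted({row for row, _ in pairs})
    cols = sorted({col for _, col in pairs})
    table = [[cells[(row, col)] for col in cols] for row in rows]
    return rows, cols, table

rows, cols, table = crosstab(responses)
print(" " * 8 + "".join(f"{col:>6s}" for col in cols))
for row, counts in zip(rows, table):
    print(f"{row:8s}" + "".join(f"{count:6d}" for count in counts))
print(f"n={len(responses)}")
```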
n=168 Source: JJP Teen Insights 2013-1 Survey on Parental Authority
Chi-squared Test on Crosstabs • Tests for statistical significance • Chi-squared test: p=.00065
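For readers who want to see what the test computes, the sketch below calculates the Pearson chi-squared statistic for a table of counts in plain Python. The observed counts are hypothetical (they are not the survey's table, and the slide's p=.00065 is not reproduced here). The `p_value` helper uses closed forms that hold only for 1 or 2 degrees of freedom; a statistics library would be needed for larger tables. The sketch assumes every row and column total is non-zero and applies no continuity correction.

```python
import math

def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two questions were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value(stat, df):
    """Upper-tail probability; closed forms exist for df of 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")

observed = [[30, 10], [20, 40]]  # hypothetical 2x2 cross tab
stat, df = chi_squared(observed)
p = p_value(stat, df)
print(f"chi-squared={stat:.2f}, df={df}, significant at 95%: {p < 0.05}")
```

A p-value below 0.05 means the relationship between the two questions is statistically significant at the 95% confidence level.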
Survey Design: Analysis Plan • The Analysis Plan is the most important part of the process • What are the theme and the key issues? • What population are we sampling? • How important is the precision? What does that imply with regard to “margin of error,” “confidence level,” and sample size? • What form (online, telephone, face-to-face)? • What cost for sample size and form? • Banner values (key variables for crosstabs) • Use Google to find prior research to get ideas for questions
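The precision question in the plan can be turned around: given a target margin of error, how many completes are needed? A minimal Python sketch, assuming a very large population, the worst-case 50/50 split, and a 95% confidence level (z = 1.96); the function name `required_sample_size` is illustrative.

```python
import math

def required_sample_size(margin, z=1.96, p=0.5):
    """Smallest sample size whose margin of error does not exceed `margin`.

    `margin` is a fraction (0.05 means plus or minus 5%). Assumes a very
    large population and the worst-case proportion p=0.5 by default.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

for target in (0.10, 0.05, 0.04):
    n = required_sample_size(target)
    print(f"plus or minus {100 * target:.0f}% needs about {n} completes")
```

Rounding up gives 601 for a 4% margin, one more than the round figure of 600, whose margin is 4.00% only after rounding to two decimals.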
Survey Design: Basics • Start with a welcome and a statement of intent, and possibly a statement to motivate the respondent • Get the most important demographic data early in the survey • Mix the form of the questions to prevent monotony • Limit the use of matrix questions • They fatigue respondents, causing drop out • They complicate subsequent analysis, especially crosstabs • Add “don’t know” and “other” to answer list when appropriate • Use progress bar and avoid excessive length (test the elapsed time to take the survey) • Use built-in, real-time answer validation checks • Carefully edit the survey for spelling, grammar, etc. • Test the survey multiple times before opening to public
Survey Design: Getting Beyond Yes/No • Do you use Facebook? Yes/No
Fine tuning questions • Don’t know and Other may be added to most questions • Questions can be made mandatory • On some questions, a quality test may be applied in real time to freeform (verbatim) answers, e.g., checking for a valid date, email address, decimal, or integer • Presentation of answer choices may be randomized to eliminate bias. E.g., “Christian, Roman Catholic” doesn’t always appear at the same place in the list.
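Survey tools normally handle randomization for you, but the idea can be sketched in a few lines of Python. Everything here is an illustration: the answer list (apart from “Christian, Roman Catholic”) is hypothetical, and keeping “Other” and “Don't know” pinned at the end is an assumption about common practice, not something the slides specify.

```python
import random

# Assumption: catch-all choices stay at the end so respondents can find them.
PINNED = ("Other", "Don't know")

def randomized_choices(choices, rng):
    """Return a new list with the substantive choices shuffled and pinned ones last."""
    movable = [c for c in choices if c not in PINNED]
    pinned = [c for c in choices if c in PINNED]
    rng.shuffle(movable)
    return movable + pinned

# Hypothetical answer list for a religion question.
choices = [
    "Christian, Roman Catholic", "Christian, Protestant",
    "Jewish", "Muslim", "Other", "Don't know",
]

rng = random.Random(2013)  # seeded only so the example is repeatable
order = randomized_choices(choices, rng)
print(order)
```

In a live survey each respondent would get a fresh shuffle, so no single choice benefits from always being listed first.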
Recruitment • If there is money in the budget, it can be used to buy appropriate lists from which random samples can be drawn • For on-line surveys, messages to Facebook “friends” can be used. Also, posting on Craigslist or on blogs, depending on the subject
Analysis and Reporting • Analysis should be organized around specific hypotheses. E.g., Parents are dictators. E.g., Most teenagers will obey their parents. E.g., Parenting styles have an effect on the child’s success and failure • Crosstab “theme-specific questions” with a standard set of banner variables, typically demographic info. Use the Chi-squared test on all crosstabs • Contrast simple variables, e.g., trust vs high-risk behavior • Clearly note the flaws in the sample composition • Always state the sample size for each question • Only report the more interesting findings. Don’t present so much data that it obscures the key findings