Welcome to the 5th session of the Social Science & Conservation Training Series. WebEx, June 25, 2012
Quantitative Data Analysis in Social Science I & II Session I: June 25 Session II: July 25 Americas: 10 am HT | 1 pm PT | 2 pm MT | 3 pm CT | 4 pm ET Asia-Pacific: 9 am Beijing | 10 am Palau | 11 am Brisbane
Your hosts Rebecca Shirer Social Science Coda Fellow Conservation Scientist, Eastern New York Chapter Supin Wongbusarakum Senior Social Scientist Conservation Methods Central Science
Today’s Flow Important elements and considerations before data analysis begins Best practices in data management Resources and Q&A
Learners’ Objectives • Understand how quantitative data can be used • Know what is important to assess a dataset • Apply best practices in working with data
Questions To ask the presenters a question, please send your question • via the WebEx chat window to Supin, OR • email to swongbusarakum@tnc.org If your question is directed to a specific speaker, please indicate Supin or Becky.
Data Groups of observations are called data, which may be qualitative or quantitative.
Ethical principles in obtaining data • Respect: "Free, prior and informed consent" (FPIC) • No harm may be done to the participants: protect anonymity and confidentiality • Justice: subjects should not be selected based on a compromised or manipulated position Oxfam Australia. 2010. Guide to Free, Prior and Informed Consent. http://www.culturalsurvival.org/files/guidetofreepriorinformedconsent_0.pdf National Institutes of Health (NIH), Office of Human Subject Research. 1979. The Belmont Report: Ethical Principles and Guidelines for the Protection of Human Subjects of Research. http://videocast.nih.gov/pdf/ohrp_belmont_report.pdf
What is data analysis? “The goal is to transform data into information, and information into insight.” –Carly Fiorina • Data analysis is the process of turning data into information • Good analysis communicates something meaningful about the world • What we measure and report is a statement about our beliefs and values
Considerations in Data Analysis • Begin early, at the design of the study • Prioritize learning and quality • Verify and validate results when possible • Share findings with those you collect data from • Do not manipulate the results to match the end-users' expectations • Manage data
The quantitative method • Numbers • Data matrix • Analyzed by statistical methods
Strengths of Quantitative Research: • Precise, quantitative, numerical data • Testing hypotheses/confirming theories • Large populations and numbers of variables possible • Findings can be generalized when based on random samples of sufficient size • Facilitates systematic comparisons across groups, categories and over time • Comparatively quick data collection • Less time-consuming analysis (using statistical software) • May minimize personal bias • Explanation of (causal) relationships between social phenomena
Weaknesses of Quantitative Research: • Only applicable for measurable (quantifiable) phenomena • Confirmation bias • Simplifies and "compresses" complex reality; lacks detailed narrative • Theories or categories might not reflect local constituencies' understandings • Too general for direct application to specific local situations, contexts, and individuals
Telling stories with data • Facts: TNC has protected 113,283,164 acres of land, which is greater than the entire state of California • Trends and patterns: The number of acres TNC has protected has increased by 13% since 2006 • Comparisons: TNC has protected more land in Meso-/South America than in the U.S. and Canada combined • Relationships: Protected areas increase the value and desirability of surrounding lands (McConnell and Walls, 2005)
Data can take many forms: Numbers or statements • The U.S. population (2010) is 310,232,863 people • The U.S. is the third most populous country in the world • The U.S. population growth rate is 0.963% • Roughly 8 children are born in the U.S. every minute
Data can take many forms: Lists The ten most populous countries: • China • India • United States • Indonesia • Brazil • Pakistan • Bangladesh • Nigeria • Russia • Japan Photo Credit: Robert C. Hass
Data can take many forms: further forms shown as figures on the original slides • Tables • Pie charts • Bar graphs • Line graphs • Scatterplots • Maps • Infographics
Before you begin • What information do you need? • Why do you need it? How will it be used? • Who is the audience?
Get to know your data • What kind of data is it? • Who collected it? Why? • How was it collected? • What was the sampled population? Sample size? • Is it clean or dirty?
Data type limits your analysis options • Continuous: Can take any value on the number line, with meaningful scale and ratios (e.g. years worked at TNC). Report mean and variance; analyze with t-tests, regression • Categorical: Can only take a fixed set of possible values • Nominal: Categories cannot be ordered (e.g. favorite beverage) • Binary: Has only two categories (e.g. Yes/No) • Ordinal: Categories have an order but do not describe degree of difference (e.g. rating on a scale). For categorical data, report mode, percentages, contingency tables, and (for ordinal data only) the median
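A minimal sketch of how the data type drives the summary you report, using only the Python standard library. The response values and the helper names (summarize_continuous, summarize_categorical) are made up for illustration and are not from the training materials.

```python
from collections import Counter
from statistics import mean, median, variance

# Hypothetical survey responses (invented values, for illustration only).
years_worked = [1.5, 3.0, 7.0, 12.0, 4.5]                   # continuous
beverage = ["coffee", "tea", "coffee", "water", "coffee"]   # nominal
excel_rating = [2, 3, 3, 5, 4]                              # ordinal (1-5 scale)

def summarize_continuous(values):
    """Mean and sample variance are meaningful for continuous data."""
    return {"mean": mean(values), "variance": variance(values)}

def summarize_categorical(values):
    """Percentage of responses per category (nominal, binary, or ordinal)."""
    counts = Counter(values)
    total = len(values)
    return {category: 100 * n / total for category, n in counts.items()}

continuous_summary = summarize_continuous(years_worked)
beverage_pct = summarize_categorical(beverage)
# The median uses the order of the ratings without assuming equal spacing.
rating_median = median(excel_rating)

print(continuous_summary)
print(beverage_pct)
print(rating_median)
```

Taking the mean of the beverage list would be meaningless, which is the point of the slide: decide the data type first, then pick the summary.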
Sources of data • Self • Other parts of TNC • Consultants • Partners • Government agencies • Research institutions • Internet
Study designs • Experimental • Randomized treatment/control • Observational • Quasi-experimental • Pre-/Post-treatment • Paired or un-paired • BACI • Longitudinal • Following a cohort over time
Data collection methods • Direct • Measurement • Survey • Interview • Observation • Precision vs Accuracy • Know what tool or instrument was used • Indirect • Calculation • Document analysis • Inference • Manipulation
A note about statistics • Descriptive statistics = analysis that helps describe or show the data in a meaningful way • Inferential statistics = drawing generalizable conclusions from data subject to random variation • Statistical significance = a result that would be unlikely to occur by chance alone if there were no real effect
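One way to make "unlikely to occur by chance" concrete is an exact permutation test, sketched below under stated assumptions: the two groups of scores are invented, and permutation_p_value is a hypothetical helper, not a method named in the session. It asks how often a random relabeling of the pooled values gives a difference in group means at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    n_a = len(group_a)
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical scores for two groups (invented values).
treated = [5, 6, 7, 8]
control = [1, 2, 3, 4]
p = permutation_p_value(treated, control)
print(p)
```

With these made-up values only 2 of the 70 possible relabelings are as extreme as the observed split, so p = 2/70, about 0.03. Full enumeration is only practical for small samples; larger datasets would normally use the statistics software listed later in the session.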
Sample size • Sample size = the number of sample units • How big a sample you need depends on the variance in the population and the precision you need • Larger samples increase the chances that the estimated value is within a certain range of the true value
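As a rough illustration of how precision and population variance set the sample size, the sketch below uses the standard formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. It assumes simple random sampling from a large population; the function names are illustrative, and z = 1.96 corresponds to a 95% confidence level.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the confidence interval for a proportion p with sample size n."""
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(e, p=0.5, z=1.96):
    """Smallest n giving a margin of error of at most e.

    p * (1 - p) is the variance of a yes/no variable; p = 0.5 is the
    most conservative (highest-variance) assumption.
    """
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

n_needed = required_sample_size(0.05)   # +/- 5 percentage points
moe_100 = margin_of_error(0.5, 100)
moe_400 = margin_of_error(0.5, 400)
print(n_needed, moe_100, moe_400)
```

Quadrupling the sample from 100 to 400 only halves the margin of error, which is why the precision you actually need should be decided before data collection.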
Defining the population Population = the group about which you want to make an inference Examples: residents of a town, foresters, urban teens Sample frame = the actual set of units the sample was drawn from Example: randomly picking names from a phone book Who might be missed? • Cell phone users • Unlisted numbers • Homeless people [Diagram: population and sample frame]
Sampling methods • Census - Every unit counted • Random - Units randomly selected from the population • Stratified • Clustered • Non-random • Purposive • Convenience
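The random and stratified methods above can be sketched with Python's random module. The household list, the village labels, and the 20% sampling fraction are all hypothetical; a fixed seed is used only so the draw is repeatable.

```python
import random

def simple_random_sample(units, n, seed=0):
    """Every unit in the sample frame has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(units, n)

def stratified_sample(units, stratum_of, fraction, seed=0):
    """Draw the same fraction at random from each stratum (at least one unit)."""
    rng = random.Random(seed)
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical sample frame: 20 households, 15 in village A and 5 in village B.
households = [{"id": i, "village": "A" if i < 15 else "B"} for i in range(20)]

srs = simple_random_sample(households, 5)
strat = stratified_sample(households, lambda h: h["village"], 0.2)
print([h["id"] for h in srs])
print([(h["id"], h["village"]) for h in strat])
```

A simple random sample of 5 could miss village B entirely; stratifying by village guarantees that the smaller group is represented.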
Data quality and sources of error • Sample bias • Not every unit in the population had an equal chance of being included in the sample • Non-response error • Non-respondents may differ from respondents • Measurement error • Problems with the measurement instrument • Recording and transcription error • May occur during initial collection or subsequent transfer of data • Calculation or coding error • A derived value is incorrectly generated
Data clean-up Things to look for: • Missing or duplicate records • Missing data • Values out of range • Varying format • Inconsistencies in text fields • Data in the wrong field
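The checks listed above can be automated before any analysis begins. The following is a small sketch: the records, field names, and the plausible age range are all invented, and find_problems is a hypothetical helper rather than a tool from the session.

```python
# Hypothetical survey records (invented values, for illustration only).
records = [
    {"id": "R1", "age": 34, "village": "Riverside"},
    {"id": "R2", "age": None, "village": "riverside "},
    {"id": "R2", "age": 41, "village": "Hilltop"},
    {"id": "R3", "age": 430, "village": "Hilltop"},
]

def find_problems(rows, age_range=(0, 120)):
    """Return (row index, problem) pairs without changing the data."""
    problems = []
    seen_ids = set()
    for i, row in enumerate(rows):
        # Duplicate records share an identifier with an earlier row.
        if row["id"] in seen_ids:
            problems.append((i, "duplicate id"))
        seen_ids.add(row["id"])
        # Missing data versus values out of the plausible range.
        if row["age"] is None:
            problems.append((i, "missing age"))
        elif not age_range[0] <= row["age"] <= age_range[1]:
            problems.append((i, "age out of range"))
        # Inconsistent text: stray spaces or varying capitalization.
        if row["village"] != row["village"].strip().title():
            problems.append((i, "inconsistent village text"))
    return problems

problems = find_problems(records)
for index, problem in problems:
    print(index, problem)
```

The function only reports problems. Deciding whether 430 is a typo for 43 or an error to discard is a judgment call that belongs in the analysis log.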
Data management best practices • Unique records • Every piece of data should be linked to a unique identifier • Example: Interviewee initials are not unique - John Adams and Jane Andrews are both "JA" • Real zeros vs no response • Absence of data or data on absence? • Backups • Copy paper forms, back up digital copies before making edits • Metadata • Keep information on collection methods with the data • Analysis log • You WILL forget what you did, so write it down!
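Two of these practices are easy to show in a few lines: why initials fail as identifiers, and why a real zero must be stored differently from a missing answer. The names come from the slide's example plus one invented name; the catch values and the "INT-001" identifier scheme are hypothetical.

```python
from statistics import mean

interviewees = ["John Adams", "Jane Andrews", "Maria Lopez"]

def initials(name):
    """First letter of each part of the name."""
    return "".join(part[0] for part in name.split())

# Initials collide: two different people end up with the same key.
initial_ids = [initials(name) for name in interviewees]

# A sequential identifier (hypothetical scheme) is unique by construction.
unique_ids = [f"INT-{i:03d}" for i, _ in enumerate(interviewees, start=1)]

# Hypothetical answers to "How many fish did you catch yesterday?"
# 0 is a real zero (data on absence); None is no response (absence of data).
catch = [3, 0, None, 5]

answered = [c for c in catch if c is not None]
mean_catch = mean(answered)                               # non-responses excluded
wrong_mean = mean(0 if c is None else c for c in catch)   # non-response treated as zero

print(initial_ids, unique_ids)
print(mean_catch, wrong_mean)
```

Coding the non-response as 0 pulls the average down from about 2.67 to 2, which is why blank cells and zeros need distinct codes from the moment of data entry.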
Tools and resources • Excel: covered in the next session (Session II: July 25. Americas: 10 am HT | 1 pm PT | 2 pm MT | 3 pm CT | 4 pm ET. Asia-Pacific: 9 am Beijing | 10 am Palau | 11 am Brisbane) • Relational data: Access, Oracle • Statistics software: JMP, R, SAS, SPSS, Minitab • Spatial data: ArcGIS, Google Earth
http://www.conservationgateway.org/subtopic/integrating-human-well-being-conservation
Questions and Answers Thank you!
Participant Survey • How many years have you worked at the Conservancy? • Open-ended • How often do you work with data in your job? • Never; Rarely; Sometimes; Often; Very Often • On a scale of 1-5 (1 = novice, 5 = expert), how proficient are you in using Excel? • What is your favorite work-time beverage? • Coffee; Decaf coffee; Tea; Soda; Water • How did you commute to work this morning? • Open-ended