Data for Decision-Making Module 3: Introduction to the Data Lifecycle – Collecting Data
Overview • Review key concepts • Introducing the data life-cycle • Case studies • Activity 3.1 • Understanding data collection methods • Key considerations in designing a data collection plan • Lying with data • Debrief
Review • Data • Data for decision-making • Stakeholders • Assessment • Mission • Vision • Data producer • Data consumer
Steps to Using Data for Decision-Making • Identify a problem or research question • Assess the data available to you and your data needs • Identify stakeholders • Plan how data will be used, analyzed, and shared
Introducing key concepts • Lifecycle
Introducing key concepts • Collecting data • Analyzing data • Sharing data
Introducing key concepts • Data collection: the process of gathering information in a systematic way. Collected data are generally intended to answer questions and/or evaluate outcomes. • Data analysis: the process of inspecting, cleaning, transforming, and visualizing data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making. • Data analysis is made up of several stages: • Inspecting the data • Cleaning the data • Transforming the data • Visualizing the data
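The four stages of data analysis listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the survey rows, the field names (`district`, `hours`), and the helper functions are hypothetical and are not part of the course material.

```python
# Hypothetical raw survey responses (illustrative values only).
raw_rows = [
    {"district": "North", "hours": "40"},
    {"district": "north ", "hours": "38"},   # inconsistent spelling
    {"district": "South", "hours": ""},      # missing value
    {"district": "South", "hours": "42"},
]

def inspect(rows):
    """Inspecting: count rows and how many lack an 'hours' value."""
    missing = sum(1 for r in rows if not r["hours"].strip())
    return {"rows": len(rows), "missing_hours": missing}

def clean(rows):
    """Cleaning: drop rows with missing hours and normalize district names."""
    return [
        {"district": r["district"].strip().title(), "hours": int(r["hours"])}
        for r in rows
        if r["hours"].strip()
    ]

def transform(rows):
    """Transforming: compute average hours per district."""
    grouped = {}
    for r in rows:
        grouped.setdefault(r["district"], []).append(r["hours"])
    return {d: sum(v) / len(v) for d, v in grouped.items()}

def visualize(averages):
    """Visualizing: a plain-text bar chart, one '#' per hour."""
    return [f"{d:<6} {'#' * round(avg)} {avg:.1f}"
            for d, avg in sorted(averages.items())]

summary = inspect(raw_rows)
averages = transform(clean(raw_rows))
for line in visualize(averages):
    print(line)
```

Keeping each stage in its own function makes it easier to document what was done to the data at every step, which matters when the results are later shared.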
Introducing key concepts • Data sharing: data sharing is the process of making data that are used in problem solving, research, or evaluation available to others.
Understanding Data Collection Methods • Primary data: information collected by you or your team. • Secondary data: information that is collected by a third party.
Primary Data Sources • Quantitative • Surveys • Experiments • Observation • Qualitative • Interviews • Focus groups • Observation • Case studies
Secondary Data Sources • Journals • Books • Newspapers • Records • Previous reports and analyses
Introducing key concepts • Protocols are systematic plans for how a set of operations is to be carried out • Data protocols are systematic plans for how data are to be collected, stored, and described
Introducing key concepts • Metadata: information that describes, explains, or gives context for other data. They are provided to make it easier to interpret, use, and manage data.
Introducing key concepts • Metadata are important because they add context to data. Metadata are the key to making primary data usable as secondary data. Metadata can be: • Descriptive metadata (such as who created the data, what the data were created for, where the data were collected, and when the data were collected) • Administrative metadata (why these data were collected)
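As a rough sketch of what a metadata record might look like in practice, the Python snippet below stores descriptive and administrative metadata next to a data set and checks that no descriptive field is left empty. The field names, the values, and the `missing_fields` helper are all assumptions made for illustration; they do not follow any particular metadata standard.

```python
import json

# Hypothetical metadata record for a primary data set (all values are made up).
metadata = {
    "descriptive": {
        "creator": "Example Field Team",
        "purpose": "Measure weekly hours worked at main job",
        "location": "Example District",
        "collected_on": "2020-01-15",
    },
    "administrative": {
        "reason_collected": "Annual program assessment",
    },
}

# Descriptive fields this sketch treats as required (an assumption).
REQUIRED_DESCRIPTIVE = ("creator", "purpose", "location", "collected_on")

def missing_fields(record):
    """Return the required descriptive fields that are absent or empty."""
    descriptive = record.get("descriptive", {})
    return [f for f in REQUIRED_DESCRIPTIVE if not descriptive.get(f)]

# Serialize the record so it can be saved alongside the data file.
metadata_json = json.dumps(metadata, indent=2)
print(missing_fields(metadata))
```

A check like this can be run before data are shared, so that someone using them later as secondary data has the context needed to interpret them.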
Key Considerations in Designing a Data Collection Plan • What questions or problems are you trying to address? • What do you need to know? • When should you collect new data (primary), and when should you use existing data (secondary)? • What instruments will you need to create? • Who will be involved in data collection, and for how long? • What documentation will be needed to use the data again?
Introducing key concepts Sample resources: • Data Management Plan tool: https://dmponline.dcc.ac.uk/ • Following best practices in choosing a sample (size, diversity, relevant population, etc.) https://resolutionresearch.com/page/results-calculate/ • Searching for secondary data http://datasupport.researchdata.nl/en/start-de-cursus/iv-gebruiksfase/zoeken-naar-data/ • What are databases? How to design one? www.dartmouth.edu/~bknauff/dwebd/2004-02/DB-intro.pdf
Lying with Data • The same data can easily be manipulated and used to tell different, opposing, or inaccurate stories • Misuse can be intentional or unintentional • We can face many pressures in our work • Talk respectfully with your colleagues and supervisors if you feel data are being used incorrectly or inappropriately
Lying with Data • There are many ways data can be used to mislead • Always keep your data radar active! • A few examples:
Lying with Data • Correlation vs. causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • A correlation describes a relationship between two or more variables. It does not, however, mean that one variable affects the other. • Causation means that a change in one variable is the result of a change in the other. In other words, a change in one causes a change in the other.
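To make the distinction concrete, the sketch below computes a Pearson correlation coefficient for two made-up series that have nothing to do with each other but both rise over time. The numbers and variable names are hypothetical; the only point is that a coefficient close to 1 can appear without any causal link.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up yearly figures for two unrelated quantities that both happen to grow.
ice_cream_sales = [10, 12, 15, 18, 22]
library_visits = [200, 210, 230, 260, 290]

r = pearson(ice_cream_sales, library_visits)
print(f"r = {r:.2f}")  # a value close to 1, despite no causal connection
```

Both series simply trend upward, so the coefficient is high. Deciding whether one actually causes the other would require more than this calculation, such as a controlled experiment or an examination of other factors.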
Lying with Data Source: http://www.tylervigen.com/spurious-correlations
Lying with Data • Correlation vs causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • Figure: Average number of weekly hours worked at main job • Source: http://callingbullshit.org/tools/tools_misleading_axes.html
Lying with Data Source: http://callingbullshit.org/tools/tools_misleading_axes.html
Lying with Data • Misleading visualizations • A good resource is http://callingbullshit.org/tools/tools_misleading_axes.html
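One common way a bar chart can mislead is an axis that does not start at zero. The short sketch below, using made-up weekly-hours values and a hypothetical helper `apparent_ratio`, compares how much taller one bar looks than another when the axis starts at zero and when it is truncated.

```python
def apparent_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at axis_start."""
    if axis_start >= min(a, b):
        raise ValueError("axis_start must be below both values")
    return (a - axis_start) / (b - axis_start)

# Made-up average weekly hours for two groups.
group_a, group_b = 38.0, 36.0

honest = apparent_ratio(group_a, group_b)            # axis starts at zero
truncated = apparent_ratio(group_a, group_b, 35.0)   # axis starts at 35

print(f"zero baseline: bar A is {honest:.2f}x the height of bar B")
print(f"axis from 35:  bar A is {truncated:.2f}x the height of bar B")
```

With a zero baseline the bars differ by about 6 percent, matching the data. With the axis starting at 35, bar A is drawn three times as tall as bar B, although the underlying values have not changed.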
Lying with Data • Correlation vs causation • Misleading visualizations • Using bad data • Selective storytelling
Lying with Data • Key takeaways • Misusing data can be intentional or unintentional • Be careful with your data and the conclusions you draw from them • Do not manipulate your data to fit the story you want to tell; stay open to other stories • Be critical of how others use data • Be honest if you make a mistake