Introduction to Basic Statistical Concepts for Science Teachers and Applications for Student Research Projects. Ryan Tolman. March 9th, 2013. Workshop presented at The Kohala Center HI-MOES Teachers Meeting, Waimea, HI.
Overview of Workshop Contents Introduction to Basic Statistical Concepts for Student Science Class Projects In-Class Examples of Teaching Statistical Concepts Resources for Applying Statistical Decision-Making to Student Research Projects Resources and References
I. Introduction to Basic Statistical Concepts for Student Science Class Projects Purpose and Goals of the Workshop What are Statistics? (Definitions? Uses? Etc.) Review of Foundational Concepts in Statistics Statistics Throughout the Research Process
A. Purpose of the Workshop • “Science isn’t show and tell. It’s a test or an experiment where you get repeatable, demonstratable results.” • “How do we determine if the results are statistically significant?”
A. Goals of the Workshop • Learn basic concepts in statistics that are important to the research process. • Learn how statistics are applied throughout the stages of the scientific research method. • Provide hands-on examples of doing statistics to learn statistical concepts. • Determine what statistical analysis to use based on the research design. • Apply statistical analyses to examples of HI-MOES student research projects.
What Are Statistics? • Mathematical Statistics: procedures for dealing with numbers.
Much of Statistics is Actually Non-Mathematical • Study of the collection, organization, analysis, interpretation, and presentation of data. • Statistics deals with all aspects of the research process. • Planning of data collection in terms of the design of surveys and experiments.
Descriptive and Inferential Statistics • Descriptive Statistics: Methods to summarize or describe a collection of data. • Inferential Statistics: Statistical models that are used to draw inferences about the process or population under study. • Provides a way to draw conclusions from data that are subject to random variation. • Conclusions are tested as part of the scientific method.
Statistics and Probability Theory • Probability Theory: starts from the given parameters of a total population to deduce probabilities that pertain to samples. • Statistical Inference: moves in the opposite direction—inductively inferring from samples to the parameters of a larger or total population.
What Statistics Are to Me: • Problem-solving • A set of tools • Story telling
Terminology: Populations & Samples • Population: the complete set of individuals, objects or scores of interest. • Often too large to sample in its entirety. • It may be real or hypothetical (e.g. the results from an experiment repeated ad infinitum). • Sample: a subset of the population. • A sample may be classified as random (each member has an equal chance of being selected from the population) or convenience (what’s available). • Random selection attempts to ensure the sample is representative of the population.
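The difference between a random and a convenience sample can be sketched in a few lines of Python. The population of 500 numbered individuals, the sample size of 20, and the seed are all assumptions made for illustration; they are not from the workshop.

```python
import random

# Hypothetical population: 500 numbered individuals (an assumption for illustration).
population = list(range(1, 501))

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Simple random sample: every member has the same chance of being selected.
random_sample = rng.sample(population, 20)

# Convenience sample: whatever is easiest to reach, here the first 20 members.
convenience_sample = population[:20]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


print("population mean:        ", mean(population))
print("random sample mean:     ", mean(random_sample))
print("convenience sample mean:", mean(convenience_sample))
```

The convenience sample only ever sees the lowest-numbered members, so its mean sits far from the population mean, while the random sample has no such built-in bias.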
Variables • Variables are the quantities measured in a sample. They may be classified as: • Quantitative • Interval, i.e. numerical • Categorical • Nominal (e.g. gender, blood group) • Ordinal (ranked, e.g. mild, moderate or severe illness). Ordinal variables are often re-coded to be quantitative.
Variables • Variables can be further classified as: • Dependent/Response: the variable of primary interest (e.g. blood pressure in an antihypertensive drug trial). Not controlled by the experimenter. • Independent/Predictor: • Called a factor when controlled by the experimenter. It is often nominal (e.g. treatment). • Called a covariate when not controlled. • If the value of a variable cannot be predicted in advance, the variable is referred to as a random variable.
Parameters & Statistics • Parameters: Quantities that describe a population characteristic. They are usually unknown and we wish to make statistical inferences about parameters. • Descriptive Statistics: Quantities and techniques used to describe a sample characteristic or illustrate the sample data e.g. mean, standard deviation, box-plot
Measures of Central Tendency (Location) • Measures of location indicate where on the number line the data are to be found. • Common measures of location are: (i) the arithmetic mean, (ii) the median, and (iii) the mode.
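As a small illustration, the three measures of location can be computed with Python's standard `statistics` module. The data values are made up for the example.

```python
import statistics

# Hypothetical measurements (made up for illustration).
data = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(data)      # sum of the values divided by their count
median_value = statistics.median(data)  # middle value (average of the two middle values here)
mode_value = statistics.mode(data)      # most frequent value

print("mean:  ", mean_value)
print("median:", median_value)
print("mode:  ", mode_value)
```

In this data set the single large value (10) pulls the mean above the median, which is one reason to report more than one measure of location.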
Measures of Dispersion • Measures of dispersion characterise how spread out the distribution is, i.e., how variable the data are. • Commonly used measures of dispersion include: • Range • Variance & Standard deviation • Coefficient of Variation (or relative standard deviation) • Inter-quartile range
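The measures of dispersion listed above can be sketched the same way. The data are hypothetical, the variance and standard deviation use the sample (n − 1) formulas, and the quartiles use the default method of `statistics.quantiles`; other software may compute quartiles slightly differently.

```python
import statistics

# Hypothetical measurements (made up for illustration).
data = [2, 3, 3, 5, 7, 10]

data_range = max(data) - min(data)

# Sample variance and standard deviation (divide by n - 1).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean.
coefficient_of_variation = std_dev / statistics.mean(data)

# Inter-quartile range: distance between the first and third quartiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print("range:                   ", data_range)
print("variance:                ", round(variance, 2))
print("standard deviation:      ", round(std_dev, 2))
print("coefficient of variation:", round(coefficient_of_variation, 2))
print("inter-quartile range:    ", iqr)
```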
Statistical Inference • Statistical Inference – the process of drawing conclusions about a population based on information in a sample
Statistical Inference • Population (parameters, e.g. μ and σ) → select sample at random → Sample → collect data from individuals in sample → Data → analyse data (e.g. estimate μ and σ) to make inferences about the population
The Normal Distribution • The Normal distribution is considered to be the most important distribution in statistics • It occurs in “nature” from processes consisting of a very large number of elements acting in an additive manner • However, it would be very difficult to use this argument to assume normality of your data • Later, we will see exactly why the Normal is so important in statistics
Normal curve [Figure: normal density curve; the area within μ ± σ is 0.68, within μ ± 1.96σ is 0.95, and within μ ± 3σ is 0.997]
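The three areas marked on the normal curve can be checked numerically. This sketch uses the identity P(|Z| < k) = erf(k/√2) for a standard normal variable Z; the function name `central_area` is just a label chosen for this example.

```python
import math


def central_area(k):
    """Probability that a normal variable falls within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 1.96, 3):
    print(f"within {k} standard deviations: {central_area(k):.4f}")
```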
Sampling distribution of Sample Means • 95% of the x̄’s lie between μ − 1.96σ/√n and μ + 1.96σ/√n [Figure: sampling distribution of x̄ with the central 95% shaded]
How close is Sample Statistic to Population Parameter? • Population parameters, e.g. μ and σ, are fixed. • Sample statistics, e.g. x̄ and s, vary from sample to sample. • How close is the sample mean to the population mean? • Cannot answer the question for a particular sample. • Can answer if we can find out about the distribution that describes the variability in the random variable x̄.
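A short simulation can make the idea of a sampling distribution concrete. The sketch below assumes a normal population with mean 50 and standard deviation 10, samples of size 25, and 5000 repeated samples; all of these values are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics

# Hypothetical population parameters and design (assumptions for illustration).
MU, SIGMA = 50.0, 10.0
N = 25         # size of each sample
DRAWS = 5000   # number of repeated samples

rng = random.Random(2013)  # fixed seed so the simulation is repeatable

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(DRAWS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    sample_means.append(statistics.fmean(sample))

# Theory: sample means vary around MU with standard error SIGMA / sqrt(N).
standard_error = SIGMA / math.sqrt(N)
lower = MU - 1.96 * standard_error
upper = MU + 1.96 * standard_error
fraction_inside = sum(lower <= m <= upper for m in sample_means) / DRAWS

print("theoretical standard error:", standard_error)
print("observed sd of sample means:", round(statistics.stdev(sample_means), 3))
print("fraction of sample means inside the 95% limits:", fraction_inside)
```

Individual sample means differ from one another, but their spread is predictable, which is what allows a statement about how close a sample mean is likely to be to the population mean.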
Statistical Models • Statistical Models: • Fitting statistical models to data that represent the hypotheses that we want to test. • Use probability to see whether scores are likely to have happened by chance. • Testing Statistical Models: • Compare the systematic variation against the unsystematic variation. • In other words, how good the model/hypothesis is at explaining the data against how bad it is (the error): • Outcome = Model + error
Test Statistic = Explained Variance / Unexplained Variance • Systematic and Unsystematic Variance • Systematic variation: variation due to some genuine effect. • Unsystematic variation: variation that isn’t due to the effect in which the researcher is interested, i.e. variation that can’t be explained by the model. • Test statistic = [variance explained by the model / variance not explained by the model] = [effect / error] • Essentially, most statistical tests compare the amount of variance explained by the model we’ve fitted to the data with the variance that the model can’t explain. • If the model is good, we would expect it to explain more of the variance in the data than it leaves unexplained.
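One concrete instance of this ratio is the F statistic from a one-way analysis of variance, where the "model" is simply "each group has its own mean". The sketch below computes it by hand; the three groups and their values are hypothetical.

```python
# Hypothetical measurements for three treatment groups (made up for illustration).
groups = {
    "A": [4, 5, 6],
    "B": [7, 8, 9],
    "C": [10, 11, 12],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Systematic variation: how far each group mean is from the grand mean.
ss_between = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Unsystematic variation: how far each value is from its own group mean.
ss_within = sum(
    (x - sum(values) / len(values)) ** 2
    for values in groups.values()
    for x in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# Test statistic = variance explained by the model / variance not explained.
f_statistic = (ss_between / df_between) / (ss_within / df_within)

print("explained variance (mean square between):", ss_between / df_between)
print("unexplained variance (mean square within):", ss_within / df_within)
print("F =", f_statistic)
```

A large ratio means the differences between group means are large relative to the scatter within groups.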
Workshop Activity #1: What Statistical Questions Are Asked During Each Stage of the Research Process?
Workshop Activity #2: Applying Statistics to Each Stage of the Research Process
What Have We Learned So Far? • What Statistics Are • Deals with all stages of the research process • Statistical Inference • Key Concepts in Statistics • Sampling from a Population • Types of Variables • Measures of Central Tendency and Dispersion • Normal Distribution • Statistical Model and Test Statistic • The Role of Statistics Throughout the Research Process • Questions asked by statisticians in research • Applying statistics throughout the research process
II. In-Class Examples of Teaching Statistical Concepts Random Sampling w/ M&M’s Using Statistics to Test Hypotheses in Excel
A. Random Sampling w/ M&M’s • Why do researchers collect samples instead of measuring the entire population? • Why is it important that researchers collect samples randomly? • What is the connection between random sampling and statistics?
B. Using Statistics to Test Hypotheses in Excel • When a difference is observed in the random samples collected by researchers, how can they tell whether the difference is statistically significant? • Use the chi-square goodness-of-fit statistic to test a hypothesis about the frequency distribution of the different colors of M&M’s.
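The workshop carries out this test in Excel; the same arithmetic can be sketched in Python. The color counts below are hypothetical, and the null hypothesis of equal proportions for all six colors is an assumption made for the example, not a published color mix. The critical value 11.07 is the standard chi-square cutoff for 5 degrees of freedom at the 0.05 level.

```python
# Hypothetical color counts from one bag (made up for illustration).
observed = {
    "blue": 12,
    "orange": 10,
    "green": 8,
    "yellow": 11,
    "red": 9,
    "brown": 10,
}

total = sum(observed.values())

# Assumed null hypothesis: every color is equally likely.
expected_count = total / len(observed)

# Chi-square goodness-of-fit statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (count - expected_count) ** 2 / expected_count
    for count in observed.values()
)

degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE_05 = 11.07  # chi-square critical value for df = 5, alpha = 0.05

reject_null = chi_square > CRITICAL_VALUE_05

print("chi-square statistic:", chi_square)
print("degrees of freedom:  ", degrees_of_freedom)
print("reject equal-proportions hypothesis at 0.05?", reject_null)
```

With these made-up counts the statistic is far below the critical value, so the observed differences between colors are consistent with random sampling variation.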
What Did We Learn in This Example? • Association between concepts of random sampling in statistics and applications in research. • Difference between “descriptive” and “inferential” statistics. • Make the association between different stages of the research process and the application of statistics. • Learning statistical applications through hands-on examples.
III. Resources for Applying Statistical Decision-Making to Student Research Projects Statistical Decision Tree Statistics Calculators
A. Statistical Decision Tree • Statistical analyses can be thought of as a set of tools. • One must select the right tool for the job. • What information do you need to know to decide what statistical analysis to use?
What Information is Needed to Decide What Statistical Analysis to Use? • What type of research question are you asking (e.g., descriptive, test of association, testing differences)? • How many variables are being measured? • How many of the variables are independent or dependent variables? • What type of measurement data is being collected (e.g., nominal, ordinal, interval)? • How is the data structured? • How many samples are being collected? • Are the data normally distributed? • What is the sample size?
Basic Steps in Deciding What Statistics to Use • Determine what type of research question you are asking. • Determine how many variables you have, and which are independent and which are dependent variables. • Determine what type of measurement scale your data are on.
If you know what your research question is asking, you can often determine the statistical analysis: • Descriptive: describing a sample or a population. • Comparing groups: testing for differences between two or more groups. • Associations: examining the relationships or links between two constructs of interest. • Predictive: does increasing (or decreasing) the value of one measure affect the value of another measure?
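The decision steps above can be sketched as a small lookup function. This is a deliberately simplified illustration covering only a few common textbook cases; the function name, the argument names, and the mapping itself are assumptions for this example and do not reproduce any of the decision trees referenced in the workshop.

```python
def suggest_analysis(question, outcome_scale, n_groups=2):
    """Suggest a common analysis from the question type and measurement scale.

    question:      "descriptive", "compare groups", "association", or "predictive"
    outcome_scale: "nominal", "ordinal", or "interval"
    n_groups:      number of groups (only used when comparing groups)
    """
    if question == "descriptive":
        return "descriptive statistics"

    if question == "compare groups":
        if outcome_scale == "nominal":
            return "chi-square test"
        if outcome_scale == "ordinal":
            return "Mann-Whitney U test" if n_groups == 2 else "Kruskal-Wallis test"
        if outcome_scale == "interval":
            return "two-sample t-test" if n_groups == 2 else "one-way ANOVA"

    if question == "association":
        if outcome_scale == "nominal":
            return "chi-square test of independence"
        if outcome_scale == "ordinal":
            return "Spearman rank correlation"
        if outcome_scale == "interval":
            return "Pearson correlation"

    if question == "predictive":
        if outcome_scale == "interval":
            return "linear regression"
        if outcome_scale == "nominal":
            return "logistic regression"

    return "no suggestion: consult a full decision tree"


print(suggest_analysis("compare groups", "interval", n_groups=2))
print(suggest_analysis("association", "ordinal"))
```

A real decision also depends on the other items in the checklist, such as sample size, whether the data are normally distributed, and whether the samples are independent or paired.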
Student Research Example • Research Question: Is there a difference in the abundance and diversity of fish close to shore and farther from shore at Kahalu’u Bay? • Hypothesis: We think there will be more fish species in the water farther from shore because there is less human activity and more coral, providing a greater food source.
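A research question like this one compares two groups on a numerical outcome, so one possible analysis is a two-sample t-test. The sketch below computes Welch's t statistic by hand. The species counts are entirely hypothetical (they are not data from the student project), and the critical value 2.306 applies only because these made-up data happen to give 8 degrees of freedom.

```python
import math
import statistics

# Hypothetical number of fish species seen per survey (NOT real project data).
near_shore = [5, 7, 6, 4, 8]
far_from_shore = [9, 11, 8, 10, 12]

mean_near = statistics.mean(near_shore)
mean_far = statistics.mean(far_from_shore)

# Squared standard error contributed by each group (sample variance / n).
se2_near = statistics.variance(near_shore) / len(near_shore)
se2_far = statistics.variance(far_from_shore) / len(far_from_shore)

# Welch's t statistic: difference in means divided by its standard error.
t_statistic = (mean_far - mean_near) / math.sqrt(se2_near + se2_far)

# Welch-Satterthwaite approximation to the degrees of freedom.
degrees_of_freedom = (se2_near + se2_far) ** 2 / (
    se2_near ** 2 / (len(near_shore) - 1)
    + se2_far ** 2 / (len(far_from_shore) - 1)
)

CRITICAL_T_05 = 2.306  # two-sided critical value for df = 8, alpha = 0.05

print("difference in means:", mean_far - mean_near)
print("t statistic:        ", t_statistic)
print("degrees of freedom: ", degrees_of_freedom)
print("significant at 0.05?", abs(t_statistic) > CRITICAL_T_05)
```

Counts of species are often small and skewed, so students should also check whether a t-test is appropriate for their own data or whether a rank-based test is a better fit.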
Online Resources for Deciding Which Statistical Analysis to Use • Tables • “Review Of Available Statistical Tests” http://www.graphpad.com/support/faqid/1790/ • UCLA Stata: What statistical test should I use? http://www.ats.ucla.edu/STAT/stata/whatstat/default.htm • Decision Trees • The Decision Tree for Statistics: http://www.microsiris.com/Statistical%20Decision%20Tree/default.htm • Social Research Methods Selecting Statistics Decision Tree: http://www.socialresearchmethods.net/selstat/ssstart.htm