Basic Statistics and Correlational Research. by Dr. Daniel Churchill. What is statistics?. Statistics is a body of mathematical techniques or processes for gathering, organizing, analyzing, and interpreting numerical data. Basic Concepts.
Basic Statistics and Correlational Research by Dr. Daniel Churchill
What is statistics? • Statistics is a body of mathematical techniques or processes for gathering, organizing, analyzing, and interpreting numerical data.
Basic Concepts • Measurement – assigning a number to an observation according to certain rules • Variable – a measured characteristic (e.g., age, grade level, test score, height, gender) • Constant – a measure that has only one value • Continuous variable – can take any value within a range (e.g., height) • Discrete variable – has a finite number of distinct values between any two given points (e.g., age in whole years between 30 and 50)
Basic Concepts • Independent variables -- purported causes • Dependent variables -- purported effects • Example: two instructional strategies, co-operative groups and traditional lectures, were used during a three-week social studies unit. Students’ exam scores were analyzed for differences between the groups. • The independent variable is the instructional approach (of which there are two levels) • The dependent variable is the students’ achievement Obj. 2.3
Basic Concepts • A population – the entire group of elements that have at least one characteristic in common • A sample – a small group of observations selected from the total population • A parameter – a measure of a characteristic of an entire population • A statistic – a measure of a characteristic of a sample • Statistics – the method, i.e., the body of techniques for analyzing data (as opposed to a single statistic)
Basic Concepts • Descriptive statistics – classify, organize, and summarize numerical data about a particular group of observations (e.g., the number of students in HK, the mean maths grade, the ethnic make-up of students) • Inferential statistics – involve selecting a sample from a defined population and studying it in order to draw conclusions about that population • These two types of statistics are not mutually exclusive
Probability and Level of Significance • Studies yield statistical results which are used to decide whether to retain or reject the null hypothesis • The decision is made in terms of probability, not certainty • Once we obtain a sample statistic, we compare the obtained value to the appropriate critical value (from tables) • Most commonly, a probability level of 5% (p < .05) is used as the threshold for statistical significance
Data Collection • Measurement scales • Nominal – categories (gender, ethnicity, etc.) • Ordinal – ordered categories (rank in class, order of finish, etc.) • Interval – equal intervals (test scores, attitude scores, etc.) • Ratio – absolute zero (time, height, weight, etc.) Obj. 2.1
Watch videos from Learner.org: http://learner.org/resources/series158.html • Watch Video 5, Variation About the Mean
Statistical measures • Measures of central tendency or averages • Mean • Median -- a point in an array, above & below which one-half of the scores fall • Mode -- the score that occurs most frequently in a distribution
Organizing Data • Source: http://www.learnactivity.com/lo/
Example • Here is a set of maths test scores (raw scores) for a class of 31 students: 37, 58, 74, 54, 67, 78, 48, 42, 61, 42, 57, 61, 45, 63, 52, 65, 39, 59, 51, 63, 48, 58, 73, 56, 69, 56, 72, 54, 66, 72, 63
Organizing measurements – Frequency tables
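A frequency table for these scores can be rebuilt from the raw data. The sketch below is one possible way to do it, assuming class intervals of width 5 starting at 36 (the intervals shown on the histogram's axis in this example); the helper name `interval_label` is hypothetical.

```python
from collections import Counter

# Raw maths test scores from the example (31 students).
scores = [37, 58, 74, 54, 67, 78, 48, 42, 61, 42, 57, 61, 45, 63, 52, 65,
          39, 59, 51, 63, 48, 58, 73, 56, 69, 56, 72, 54, 66, 72, 63]

# Assumed class intervals of width 5: 36-40, 41-45, ..., 76-80.
LOW, WIDTH = 36, 5


def interval_label(score):
    """Return the label of the class interval that contains the score."""
    start = LOW + ((score - LOW) // WIDTH) * WIDTH
    return f"{start}-{start + WIDTH - 1}"


# Count how many scores fall into each interval.
frequency = Counter(interval_label(s) for s in scores)

# Print the table in ascending order of interval.
for start in range(LOW, 81, WIDTH):
    label = f"{start}-{start + WIDTH - 1}"
    print(f"{label:>6}  {frequency[label]}")
```

The frequencies always add up to the number of students (31), which is a quick check that no score was left out of the table.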
Organizing measurements – Histogram [Figure: histogram of the test scores. X-axis: Test Score, in intervals 36-40, 41-45, 46-50, 51-55, 56-60, 61-65, 66-70, 71-75, 76-80. Y-axis: Frequency, 0 to 7.]
Organizing measurements – Mean • $\bar{X} = \frac{\sum X}{n} = \frac{37 + 58 + 74 + \dots + 72 + 63}{31} \approx 58$
Organizing measurements – Mode and Median • Mode -- the score that occurs most frequently in a distribution: 63 (it occurs three times) • Median -- a point in an array, above & below which one-half of the scores fall: 58 (the 16th of the 31 scores when they are sorted)
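The three measures of central tendency can be checked directly from the raw scores. This is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only.

```python
import statistics

# Raw maths test scores from the example (31 students).
scores = [37, 58, 74, 54, 67, 78, 48, 42, 61, 42, 57, 61, 45, 63, 52, 65,
          39, 59, 51, 63, 48, 58, 73, 56, 69, 56, 72, 54, 66, 72, 63]

mean = statistics.mean(scores)      # sum of the scores divided by their number
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
```

Because there are 31 scores, the median is simply the 16th value once the scores are sorted.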
Organizing measurements – Histogram [Figure: the same histogram of the test scores, with the positions of the median, mode, and mean marked.]
Statistical measures • Measures of spread or dispersion • Range -- the difference between the highest and the lowest scores, plus one • Standard deviation -- roughly the average distance of the scores from the mean • Variance -- the standard deviation squared • Z-score -- the number of standard deviations a score lies from the mean: Z = (score - mean) / SD
Basic Formulas for Sample • Variance: $S^2 = \frac{\sum (X - \bar{X})^2}{n}$ • Standard Deviation: $S = \sqrt{S^2}$
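As a rough illustration, the measures of spread can be computed for the maths test scores from the earlier example. The sketch follows the slide's formulas, which divide by n, and the slide's definition of the range (highest minus lowest, plus one). The cross-check uses `statistics.pvariance` and `statistics.pstdev`, which also divide by n.

```python
import math
import statistics

# Raw maths test scores from the earlier example (31 students).
scores = [37, 58, 74, 54, 67, 78, 48, 42, 61, 42, 57, 61, 45, 63, 52, 65,
          39, 59, 51, 63, 48, 58, 73, 56, 69, 56, 72, 54, 66, 72, 63]

n = len(scores)
mean = sum(scores) / n

# Variance: mean of the squared deviations from the mean (denominator n).
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

# Range as defined on the slide: highest score minus lowest score, plus one.
score_range = max(scores) - min(scores) + 1

# Cross-check against the standard library's n-denominator versions.
library_variance = statistics.pvariance(scores)
library_sd = statistics.pstdev(scores)

print(f"range = {score_range}, variance = {variance:.2f}, SD = {sd:.2f}")
```

Note that many textbooks and calculators divide by n - 1 when estimating a population variance from a sample; that variant is not what the slide shows, so it is not used here.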
Normal Distribution Source: http://en.wikipedia.org/wiki/Normal_distribution
Source: http://noppa5.pc.helsinki.fi/koe/flash/histo/histograme.html
Z-Score Example • Example: compare a student’s performance on Maths and English tests when the student’s scores, the class means, and the class standard deviations are known • $z = \frac{X - \bar{X}}{S}$ • $z_{English} = \frac{50 - 45}{5} = +1$ • $z_{Maths} = \frac{68 - 56}{6} = +2$
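The comparison can be reproduced in a few lines. The sketch reads the slide's figures as English: score 50, class mean 45, SD 5, and Maths: score 68, class mean 56, SD 6; the function name `z_score` is hypothetical.

```python
def z_score(score, mean, sd):
    """Number of standard deviations that a score lies from the mean."""
    return (score - mean) / sd


# Values read from the slide's example.
z_english = z_score(50, 45, 5)
z_maths = z_score(68, 56, 6)

print(f"z_English = {z_english:+.1f}, z_Maths = {z_maths:+.1f}")

# The larger z-score marks the stronger performance relative to the class.
better_subject = "Maths" if z_maths > z_english else "English"
print(f"Relatively better performance: {better_subject}")
```

Although the two raw scores come from tests with different means and spreads, the z-scores put them on a common scale, which is what makes the comparison meaningful.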
Z-Score Example [Figure: the positions of zEnglish and zMaths compared.]
Correlational Studies • Attempt to describe the predictive relationships between or among variables • The predictor variable is the variable from which the researcher is predicting • The criterion variable is the variable to which the researcher is predicting Objectives 10.1 & 10.2
Relationship Studies • General purpose • Gain insight into variables that are related to other variables relevant to educators • Achievement • Self-esteem • Self-concept • Two specific purposes • Suggest subsequent interest in establishing cause and effect between variables found to be related • Control for variables related to the dependent variable in experimental studies Objectives 5.1 & 5.2
Correlation Coefficients • The general rule • +.95 is a strong positive correlation • +.50 is a moderate positive correlation • +.20 is a low positive correlation (small correlation) • -.26 is a low negative correlation • -.49 is a moderate negative correlation • -.95 is a strong negative correlation • Predictions • Between .60 and .70 are adequate for group predictions • Above .80 is adequate for individual predictions Objective 3.3 & 3.5
Conducting a Prediction Study • Identify a set of variables • Limit to those variables logically related to the criterion • Identify a population and select a sample • Identify appropriate instruments for measuring each variable • Ensure appropriate levels of validity and reliability • Collect data for each instrument from each subject • Typically data is collected at different points in time • Compute the results • Regression coefficient • Regression equation
Hypotheses for Correlation • $H_0: r = 0$ • $H_A: r \neq 0$
Collecting Measurement • Instrument – a tool used to collect data • Test – a formal, systematic procedure for gathering information • Assessment – the general process of collecting, synthesizing, and interpreting information Obj. 3.1 & 3.2
The Process • Participant and instrument selection • Minimum of 30 subjects • Instruments must be valid and reliable • Higher validity and reliability allow smaller samples • Lower validity and reliability require larger samples • Design and procedures • Collect data on two or more variables for each subject • Data analysis • Compute the appropriate correlation coefficient Objectives 2.2 & 2.3
Selection of a Test • Sources of test information, e.g.: • Mental Measurements Yearbooks (MMY) • Buros Institute • ETS Test Collection
Types of Correlation Coefficients • The type of correlation coefficient depends on the measurement level of the variables • Pearson r - continuous predictor and criterion variables • Math attitude and math achievement • Spearman rho – ranked or ordinal predictor and criterion variables • Rank in class and rank on a final exam • Phi coefficient – dichotomous predictor and criterion variables • Gender and pass/fail status on a high stakes test Objectives 7.1, 7.2, & 7.3
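For ranked data, Spearman's rho can be sketched as follows. The formula $\rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$ is the usual textbook shortcut and is not given on the slide; it assumes there are no tied ranks. The ranks below are hypothetical, not taken from the slides.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho from two lists of ranks (assumes no tied ranks)."""
    n = len(ranks_x)
    # d is the difference between the two ranks of each subject.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical data: five students' rank in class and rank on a final exam.
rank_in_class = [1, 2, 3, 4, 5]
rank_on_exam = [2, 1, 4, 3, 5]

rho = spearman_rho(rank_in_class, rank_on_exam)
print(f"Spearman rho = {rho:.2f}")
```

When ranks are tied, a more general approach is to compute the Pearson coefficient on the ranks instead of using this shortcut.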
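Calculating Pearson Correlation Coefficient • Z-score formula: $r = \frac{\sum z_x z_y}{N}$ • Raw score formula: $r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{\left(N \sum X^2 - (\sum X)^2\right)\left(N \sum Y^2 - (\sum Y)^2\right)}}$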
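Both formulas should give the same value of r. The sketch below computes each one for a small set of hypothetical paired scores (not from the slides); the z-scores use a standard deviation with N in the denominator, which is what makes the two formulas agree. The function names are illustrative.

```python
import math

# Hypothetical paired scores for five subjects (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def pearson_raw(x, y):
    """Pearson r from the raw score formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def z_scores(values):
    """Convert raw scores to z-scores, using an SD with N in the denominator."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def pearson_z(x, y):
    """Pearson r as the mean of the products of paired z-scores."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


r_raw = pearson_raw(x, y)
r_z = pearson_z(x, y)
print(f"raw score formula: r = {r_raw:.4f}")
print(f"z-score formula:   r = {r_z:.4f}")
```

The raw score formula is usually easier for hand calculation because it avoids computing the means and standard deviations first.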
Just for information: Critical Values of the Pearson Product-Moment Correlation Coefficient • First, determine the degrees of freedom (df). For a correlation study, df is 2 less than the number of subjects; with 27 subjects, for example, df = 25. Use the critical value table to find the intersection of alpha .05 (columns) and 25 degrees of freedom (rows). The value at the intersection (.381) is the minimum correlation coefficient needed to state, with 95% confidence, that the relationship found among your subjects exists in the population from which they were drawn. • If the absolute value of your correlation coefficient is above .381, you reject the null hypothesis (there is no relationship) and accept the alternative hypothesis: e.g., there is a statistically significant relationship between arm span and height, r(25) = .87, p < .05. • If the absolute value of your correlation coefficient is less than .381, you fail to reject the null hypothesis: there is not a statistically significant relationship between arm span and height, r(25) = .12, p > .05. Source: http://www.gifted.uconn.edu/siegle/research/Correlation/alphaleve.htm
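The decision rule described here can be written as a short check. The sketch hard-codes the single critical value quoted on the slide (.381 for df = 25 at alpha .05); for any other df the value would have to be looked up in a table. The function and constant names are hypothetical.

```python
# Critical value quoted on the slide for df = 25 at alpha = .05.
CRITICAL_R_DF25_ALPHA05 = 0.381


def degrees_of_freedom(n_subjects):
    """For a correlation study, df is 2 less than the number of subjects."""
    return n_subjects - 2


def is_significant(r, critical_value):
    """Reject the null hypothesis when |r| is above the critical value."""
    return abs(r) > critical_value


df = degrees_of_freedom(27)  # 27 subjects give df = 25

# The two correlation coefficients used as examples on the slide.
for r in (0.87, 0.12):
    if is_significant(r, CRITICAL_R_DF25_ALPHA05):
        print(f"r({df}) = {r:.2f}, p < .05: reject the null hypothesis")
    else:
        print(f"r({df}) = {r:.2f}, p > .05: fail to reject the null hypothesis")
```

The absolute value matters because a strong negative correlation is just as significant as a strong positive one.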
Prediction and Regression • The position of the line is determined by “b”, the slope (the angle of the line), and “a”, the intercept (the point where the line crosses the Y-axis): Y = bX + a • Source: http://noppa5.pc.helsinki.fi/koe/corr/index.html
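One common way to obtain b and a is ordinary least squares, which the slide does not name, so treat it as an assumption here. The sketch fits the line to hypothetical paired scores (not from the slides) and then uses it to predict Y for a new X; `fit_line` is an illustrative name.

```python
# Hypothetical paired scores for five subjects (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit_line(x, y):
    """Least-squares slope b and intercept a for the line Y = bX + a."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    # Slope: how much Y changes for a one-unit change in X.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the line passes through the point (mean of X, mean of Y).
    a = sum_y / n - b * (sum_x / n)
    return b, a


b, a = fit_line(x, y)
print(f"Y = {b:.2f}X + {a:.2f}")

# Predict the criterion value for a new predictor value.
new_x = 6
predicted_y = b * new_x + a
print(f"predicted Y for X = {new_x}: {predicted_y:.2f}")
```

In a prediction study, X would be the predictor variable and Y the criterion variable.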
Other Correlation Analyses • Multiple regression • Two or more predictor variables are used to predict one criterion variable • Canonical correlation • An extension of multiple regression in which more than one predictor variable and more than one criterion variable are used • Factor analysis • A correlational analysis used to take a large number of variables and group them into a smaller number of clusters of similar variables called factors
References • Gay, L. R., Mills, G. E., & Airasian, P. (2006). Educational Research: Competencies for Analysis and Applications. Upper Saddle River, N.J. : Pearson/Merrill Prentice Hall. • Ravid, R. (2000). Practical statistics for educators. (2nd ed). New York, NY.: University Press of America, Inc.