Experimental Research Methods in Language Learning Chapter 9 Descriptive Statistics
Leading Questions • Do you think statistics is difficult to understand? Will it be difficult to learn? Why do you think so? • What do you know is involved in performing a statistical analysis of experimental data? • Can you give an example of descriptive statistics? What does it tell us about language learners or research participants?
Checking and Organizing Data • Check whether all participants’ data are complete • Some participants may not have answered some questionnaire or test items. • Incomplete data are missing data and we need to make a decision on how to deal with them. • The best strategy to organize data is to assign an identity number (ID) to each participant.
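The completeness check described above can be sketched in a few lines of Python. This is only an illustration: the records, the item names (item1 to item3) and the use of None to mark an unanswered item are assumptions, not part of the chapter.

```python
# Hypothetical records: one dict per participant, keyed by an ID number.
# None marks an item the participant did not answer (an assumed convention).
records = [
    {"id": 1, "item1": 4, "item2": 5, "item3": 3},
    {"id": 2, "item1": None, "item2": 2, "item3": 4},
    {"id": 3, "item1": 5, "item2": None, "item3": None},
]


def missing_items(record):
    """Return the names of the items this participant left unanswered."""
    return [key for key, value in record.items() if key != "id" and value is None]


# Map each participant ID to its missing items; complete cases are left out.
incomplete = {r["id"]: missing_items(r) for r in records if missing_items(r)}
print(incomplete)
```

Because every record carries an ID, the incomplete cases can be traced back to the original questionnaires before deciding how to handle them.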
Coding Data • The process of classifying or grouping data sets. • In some sense, coding data is closely related to organizing data so that we know how to statistically analyze them meaningfully. • Quantitative data are coded through scales (nominal, ordinal, interval and ratio). • How a test or measure is scored needs to be clearly stated/described. • Some qualitative data such as standardized think-aloud, performance assessment, or interview data can be coded for quantitative data analysis.
Entering Data • Once the data have been coded and numerical values have been assigned to each participant, we can key them into a statistical software program (e.g., SPSS, Excel). • In some cases, we can code data as missing. In other cases, we may have to remove participants who have too many missing data.
Screening and Cleaning Data • Checking for accuracy in data entry. • Use of descriptive statistics to check for incorrectly entered data. • Examine abnormal or impossible values in the data set (e.g., by looking at the minimum and maximum scores; using visual diagrams such as histograms and pie charts).
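As a rough sketch of this screening step, the minimum and maximum can be compared with the range of scores the instrument allows. The scores and the 0 to 20 scoring range below are made up for illustration.

```python
# Hypothetical test scores keyed by participant ID.
# The test is assumed to be scored from 0 to 20.
scores = {1: 15, 2: 18, 3: 150, 4: 12, 5: -1}
MIN_POSSIBLE, MAX_POSSIBLE = 0, 20


def out_of_range(data, low, high):
    """Return the entries whose score falls outside the possible range."""
    return {pid: score for pid, score in data.items() if not low <= score <= high}


# The minimum and maximum already hint that something was mistyped.
print(min(scores.values()), max(scores.values()))

# Listing the offending IDs tells us which records to check against the raw data.
suspect = out_of_range(scores, MIN_POSSIBLE, MAX_POSSIBLE)
print(suspect)
```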
Computing Descriptive Statistics • Descriptive statistics provide basic information about the data (e.g., mean scores, minimum and maximum scores, standard deviations). • They can tell us whether we need to employ a parametric test for normally distributed data, or a non-parametric test for non-normally distributed data.
Estimating Data Reliability • To check that the data to be analyzed are reliable and valid. • The reliability of a research instrument is related to its consistency of measurement. • The validity of a research instrument refers to the extent to which the instrument actually measures what it is intended to measure.
Reducing Data • To summarize the score for each test section (or sometimes for an overall test) for data entry and statistical analysis. • To compute a score for each sub-scale in a questionnaire (e.g., Likert-scale), i.e., using composites. • To perform a reliability analysis to see whether some items negatively affect the reliability of the instrument; if so, those items can be removed. • To perform a confirmatory factor analysis.
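A composite score for a questionnaire sub-scale is often just the mean of the items that belong to it. The sketch below assumes a 5-point Likert questionnaire; the sub-scale names and item labels (q1 to q5) are hypothetical.

```python
from statistics import mean

# Hypothetical mapping of sub-scales to questionnaire items.
subscales = {"planning": ["q1", "q2", "q3"], "monitoring": ["q4", "q5"]}

# One participant's responses on an assumed 5-point Likert scale.
responses = {"q1": 4, "q2": 5, "q3": 3, "q4": 2, "q5": 4}


def composite_scores(answers, scales):
    """Average the item responses within each sub-scale."""
    return {
        name: mean(answers[item] for item in items)
        for name, items in scales.items()
    }


composites = composite_scores(responses, subscales)
print(composites)
```

Averaging rather than summing keeps every composite on the original 1 to 5 scale, which makes sub-scales with different numbers of items easier to compare.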
Computing Inferential Statistics • Inferential statistics are key statistical analyses that can yield answers to research questions. • Statistics are probabilistic. • Inferential statistics involve testing hypotheses, examining effect sizes and so on.
Addressing Research Questions • Use of inferential statistics, such as a t-test, to answer a research question. • We consider whether the statistical findings make sense or are meaningful, and how best to report and discuss them. • It is strategic to answer the research questions (informally) during data analysis because it helps facilitate the task of writing up the findings.
Descriptive Statistics • Descriptive statistics provide the basic characteristics of quantitative data (e.g., frequencies, average scores, most frequent scores). • Descriptive statistics provide measures of quantitative data (e.g., measures of central tendency, measures of variability, and measures of relative position).
Measures of Central Tendency • The mean = simply the average of the data/scores. • The median = the value that divides the data set exactly into two sets: half the scores are smaller than the median and half the scores are larger. • The mode = the value that occurs most frequently in the data.
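The three measures can be computed with Python's standard statistics module. The scores below are invented for illustration.

```python
from statistics import mean, median, mode

# Hypothetical test scores for seven learners.
scores = [12, 15, 15, 17, 18, 20, 22]

average = mean(scores)        # sum of the scores divided by their number
middle = median(scores)       # middle value once the scores are sorted
most_common = mode(scores)    # value that occurs most often

print(average, middle, most_common)  # 17 17 15
```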
Skewness and Kurtosis Statistics • Skewness statistics tell us the extent to which the data set is symmetrical. A perfectly symmetrical data set has a skewness statistic of zero. • Kurtosis statistics show the extent to which the shape of the distribution is peaked (pointy). A normally distributed data set has a kurtosis value of zero (when kurtosis is reported as excess kurtosis, as in SPSS). • Ideally, skewness and kurtosis statistics should be within ± 1 for a data set to be considered normally distributed.
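One simple way to compute these statistics is from the central moments of the data. The sketch below uses the plain population (moment-based) formulas, which is an assumption made here: packages such as SPSS apply a small-sample correction, so their values will differ slightly. The data sets are invented.

```python
def central_moment(data, k):
    """Average of the k-th powers of the deviations from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness: 0 for a perfectly symmetrical data set."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]          # hypothetical, perfectly symmetrical
right_skewed = [1, 1, 2, 2, 3, 10]   # hypothetical, one very high score

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed), excess_kurtosis(right_skewed))
```

A positive skewness value, as for the second data set, indicates a long tail of high scores; a negative excess kurtosis indicates a distribution that is flatter than the normal curve.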
Measures of Dispersion • Dispersion = the extent to which the data set is spread out. Measures of dispersion are interchangeably known as measures of variability. • The range = simply the difference between the highest and lowest scores in the data set. • The variance and standard deviation are commonly used measures of dispersion. • The standard deviation indicates how much, on average, the individual values differ from the mean (see Table 9.4). • The variance = the average of the squared deviations from the mean; the standard deviation is the square root of the variance.
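The relationship between these measures can be checked with Python's statistics module. The scores are invented. Both the population versions (dividing by N) and the sample versions (dividing by N - 1) are shown, since statistical packages usually report the latter.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

score_range = max(scores) - min(scores)   # highest minus lowest score

pop_variance = pvariance(scores)   # mean of the squared deviations (divide by N)
pop_sd = pstdev(scores)            # square root of the population variance

sample_variance = variance(scores)  # divides by N - 1 instead of N
sample_sd = stdev(scores)           # square root of the sample variance

print(score_range, pop_variance, pop_sd)  # 7 4 2.0
print(sample_variance, sample_sd)
```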
Measures of Relative Standing • Percentile rank = a statistic that tells us the percentage of scores in the distribution that are below a given score. • For example, a score with a percentile rank of 40 has 40% of scores below it. It is quite simple to calculate a percentile rank as follows: (rank of a score ÷ [total number of scores + 1]) × 100, with the scores ranked from lowest to highest.
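The rank-based calculation above can be sketched as follows. Two assumptions are made here: scores are ranked from lowest (rank 1) to highest, and the ratio is multiplied by 100 so that the result reads as a percentage. Tied scores take the lowest rank they share, which is only one of several possible conventions. The scores are invented.

```python
def percentile_rank(score, all_scores):
    """Percentile rank as 100 * rank / (N + 1), ranking from lowest to highest.

    Assumes `score` occurs in `all_scores`; tied scores take the lowest
    rank they share (an assumption, other conventions exist).
    """
    ordered = sorted(all_scores)
    rank = ordered.index(score) + 1   # 1-based rank of the score
    return 100 * rank / (len(ordered) + 1)


# Hypothetical scores for nine learners.
scores = [90, 55, 70, 85, 60, 95, 65, 80, 75]

print(percentile_rank(70, scores))  # 40.0
```

Here 70 is the fourth-lowest of nine scores, so its percentile rank is 4 ÷ 10 × 100 = 40.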
The z-scores • The z-scores allow us to see how an individual’s score can be placed in relation to the rest of the participants’ scores. • A z-score is basically a raw score that has been converted to standard deviation units (see Figure 9.3 above). • The T-score is an extension of the z-score which allows us to avoid the use of negative values. The T-score is calculated as follows: [10 x z-score] + 50.
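A z-score is conventionally computed as (raw score - mean) ÷ standard deviation, and the T-score then follows from the formula above. The sketch below uses the population standard deviation, which is an assumption; using the sample standard deviation would give slightly smaller z-scores. The scores are invented.

```python
from statistics import mean, pstdev


def z_scores(data):
    """Convert raw scores to z-scores: (score - mean) / standard deviation."""
    m, sd = mean(data), pstdev(data)   # population SD (an assumption)
    return [(x - m) / sd for x in data]


def t_scores(data):
    """Convert raw scores to T-scores: 10 * z + 50."""
    return [10 * z + 50 for z in z_scores(data)]


# Hypothetical scores with a mean of 5 and a standard deviation of 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

z = z_scores(scores)
t = t_scores(scores)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(t)  # [35.0, 45.0, 45.0, 45.0, 50.0, 50.0, 60.0, 70.0]
```

A learner who scored 2 is 1.5 standard deviations below the mean (z = -1.5), which becomes a T-score of 35, with no negative sign to interpret.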
Discussion • What are the purposes of descriptive statistics for experimental research? • Can you think of an example of quantitative data that are normally distributed? • What are common types of measures of central tendency? Can you explain what they are and how they are calculated? • What is the most difficult concept of descriptive statistics we have discussed in this chapter?