Descriptive statistics

misha
Presentation Transcript


  1. Descriptive statistics 922

  2. What do we need to run an experiment? Hypothesis (Linguistic) Participants Task (stimuli = questions, responses = answers) Results Conclusions Key terms: stimulus design, response measure

  3. Example • Show me the cat that bit the dog • Show me the cat that the dog bit Picture from: Friedmann & Novogrodsky (2001)

  4. Design Number of conditions Within subject / between subject How many items to each participant Order of items

  5. Measure Response • Variables • Scales • Analysis • Descriptive • Inferential

  6. Variables Any experimental category that has a value that can vary. Anything that is not constant and can change over time, or be different in different people is a variable Variables can take many forms Variables can be manipulated and observed

  7. Properties of Variables Continuous variables – along a continuum with equal intervals (e.g., age, height, weight, grade in a test) Ordinal variables – rating along a continuum with estimated intervals (e.g., evaluation) Discrete variables (categorical, nominal) – divided into categories (e.g., language, yes/no, correct/incorrect)

  8. Types of Variables • Independent variables: • Characteristics of the subject (participant variable) • Conditions chosen by the experimenter • Dependent variables – what the experiment measures (e.g., degree of success) • Intervening variables – variables which are not measured or manipulated, but could influence the results (e.g., concentration, intelligence) Field, A. & G. Hole. 2003. How to Design and Report Experiments. London: Sage Publications.

  9. Scales • Nominal • Ordinal • Interval • Ratio

  10. Scales • Nominal • Ordinal • Interval • Ratio • Two things with the same number are similar (same name)

  11. Scales • Nominal • Ordinal • Interval • Ratio • Four is more than three (but the step from three to four is not necessarily the same size as the step from two to three)

  12. Scales • Nominal • Ordinal • Interval • Ratio • Four is more than two (but not twice)

  13. Scales • Nominal • Ordinal • Interval • Ratio • Four is more than three, same as three from two, and is twice two

  14. Which scale are the following variables rated on? • Height • Celsius degrees • TV channel number • Grades in an exam (1-100) • Psychological rating (anxiety on a scale of 1-10) • Time (13:00, 14:00) • Time (one hour, two hours, three hours) • Phone number • Rating places in a race

  15. Variables and Scales: summary Choose an appropriate task Measure responses Be aware of the variables and their properties Choose the mathematical operations appropriate for the scale

  16. Factorial design Tests all possible combinations, e.g., a 2x2 design – one participant variable and one independent variable with two conditions.

  17. Practical questions for offline tasks • How many subjects? At least 25 • How many categories? 2x2 • How many items? More subjects >> fewer items. • For 25 – 6 items per category • For 50 – 3 is enough • For case studies and within subject analysis at least 10.

  18. Simple Numerical computations

  19. Ratio • The relation between two nominal variables • V/N ratio: 60/80=3/4 • N/V ratio: 80/60=4/3
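
The ratio arithmetic on this slide can be checked with a short sketch. It uses the counts from the slide (60 verbs, 80 nouns); the variable names are chosen here for illustration only.

```python
from fractions import Fraction

# Counts taken from the slide: 60 verbs and 80 nouns.
verbs, nouns = 60, 80

# Fraction reduces each ratio to lowest terms automatically.
vn_ratio = Fraction(verbs, nouns)  # verbs per noun
nv_ratio = Fraction(nouns, verbs)  # nouns per verb

print(vn_ratio, nv_ratio)  # 3/4 4/3
```

A ratio compares two separate categories with each other, so neither count is part of the other.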

  20. Example • Goofy said that the Troll had to put two hoops on the pole to win. • Does the Troll win? • Musolino (2004)

  21. Ratio • Yes/no ratio: • 8/12=2/3

  22. Proportion • Relation between a group and its part (Verb/Word, Pronouns/Subject position). Ratio out of the total • Verb/Word proportion: 60/190 ≈ 0.316 (a little under 1/3)

  23. Percentage (%) • Relative proportion out of a hundred • Verb percentage (out of all words): 100*(60/190) ≈ 31.6%
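
As a quick check of the proportion and percentage arithmetic, here is a minimal sketch using the counts from the slides (60 verbs out of 190 words). The variable names are illustrative.

```python
# Counts taken from the slides: 60 verbs among 190 words in total.
verbs, total_words = 60, 190

# A proportion relates a part to the whole it belongs to.
proportion = verbs / total_words

# A percentage is the same proportion expressed out of a hundred.
percentage = 100 * proportion

print(round(proportion, 3), round(percentage, 1))  # 0.316 31.6
```

Unlike the V/N ratio, the denominator here is the total, so a proportion always lies between 0 and 1.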

  24. Rate The relative frequency (for a population, out of 1,000) • 7% of children have SLI • >> 0.07 * 1000 = 70 • 70 children out of 1,000 have SLI

  25. Frequency • Count the number of times a score occurs. • How many times a value of a variable occurs?

  26. Example • Show 10 pictures, and check for number of “correct” response • Is every bunny eating a carrot? Roeper, Strauss and Zurer Pearson (2004)

  27. Frequency • Count the number of times a score occurs

  28. Frequency Raw score Frequency 2 4 2 Frequency=how many children got this score
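
Counting frequencies can be sketched as follows. The scores are hypothetical (the raw scores from the slide's table did not survive in this transcript); they stand for the number of correct responses out of 10 pictures for eight children.

```python
from collections import Counter

# Hypothetical data: number of correct responses (out of 10) for 8 children.
scores = [8, 9, 9, 10, 9, 8, 10, 9]

# Counter maps each raw score to how many children obtained it.
freq = Counter(scores)

# Print the frequency table in order of raw score.
for raw_score in sorted(freq):
    print(raw_score, freq[raw_score])
```

Plotting the raw scores on the X-axis against these counts on the Y-axis gives the frequency graph.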

  29. Frequency graph • Score on the test is the horizontal axis (X-axis) • Frequency is on the vertical axis (Y-axis)

  30. Percentile • The cumulative frequency - how many scores are below a particular point in the distribution • Percentile = 100 × (Cumulative Frequency / Total N)
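
The percentile formula can be sketched as below. Two assumptions are made here that the slide does not state: the scores are hypothetical, and the cumulative frequency counts scores at or below the given score (some textbooks count only the scores strictly below it). The function name `percentile_rank` is illustrative.

```python
def percentile_rank(scores, score):
    """Return 100 * (cumulative frequency / total N) for the given score."""
    # Assumption: cumulative frequency = number of scores at or below `score`.
    cumulative_frequency = sum(1 for s in scores if s <= score)
    return 100 * cumulative_frequency / len(scores)


# Hypothetical data: 8 children, scores out of 10.
scores = [8, 9, 9, 10, 9, 8, 10, 9]

print(percentile_rank(scores, 9))  # 75.0
```

Here 6 of the 8 scores are at or below 9, so a score of 9 sits at the 75th percentile under this convention.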

  31. Frequency polygon (the curve) The frequency polygon (the curve) is a picture of the data

  32. Types of distributions (Fig. 4.3 & 4.4, pp. 113-116) Peak Tails A bell shaped curve - a symmetric distribution, a unimodal distribution (one midpoint, one peak), normal distribution

  33. Pointy distribution (Leptokurtic) Flat distribution (Platykurtic)

  34. In a skewed distribution the tail extends in one direction: Positively skewed distribution - most scores are low, and the tail points towards the high (positive) scores Negatively skewed distribution - most scores are high, and the tail points towards the low (negative) scores

  35. Bimodal distribution - a double peaked curve

  36. Descriptive Statistics - Some definitions • Min (the lowest score) and Max (the highest score) • Range – the range of observed values. Range = Max-Min • But the range changes with the extreme scores (unstable but useful informal measure).

  37. Mode - most frequently obtained score • Mean (average) – average of a set of numbers • Median – the middle score of a group (when odd) or the average of the two middle scores (when even) In a bell curve (normal) distribution mode, mean and median will be the same

  38. Mode • Which grade is most frequent? • Highest in “frequency” column

  39. Mean (average) • Compute a sum of all grades • Divide by number of grades

  40. Mean (average)

  41. Median • Order all grades in a row according to value • The grade in “the middle” of the row is the median

  42. Median • We have a row of 30 grades: 50,60,60,60,60,70… • Half of 30 is 15 • The grade in the 15th position is the median

  43. Median • Slight complication: with an even number of grades (30) there is no single middle grade, because 15 grades lie on each side of the midpoint • Compute the mean of the grades in the 15th and 16th positions
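
Mode, mean and median can be computed with Python's standard `statistics` module. The grades below are hypothetical and have an even count, so the median is the mean of the two middle grades, as described on the slide.

```python
import statistics

# Hypothetical data: 8 exam grades (an even number of grades).
grades = [50, 60, 60, 60, 70, 80, 90, 90]

mode = statistics.mode(grades)      # most frequent grade
mean = statistics.mean(grades)      # sum of grades / number of grades
median = statistics.median(grades)  # mean of the 4th and 5th sorted grades

print(mode, mean, median)
```

With these grades the mode is 60, the mean is 70 and the median is 65, so the three measures of the centre need not coincide when the distribution is not symmetric.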

  44. Variability (Figure from Hatch & Farhady 1982, p.56) Questions: Are both curves the same? How? Are they different? How? We need to measure the accuracy of the mean.

  45. Coming attractions • How to draw valid statistical inferences? • We have to look at the relation between our sample and the population • Today we looked at where the ‘center’ of the data is – what is the big picture • Look at variance, how the data is distributed

  46. Deviation The distance between a score and the Mean (see Table 4.2, p. 125), how much a score deviates from the average Sum of squared errors (SS): the sum of the squared deviations of all scores from the Mean

  47. Variance • Average error in the sample, average error in the population • Variance in the sample = SS/N: 33.7143/7 = 4.8163 • Variance in the population (estimated from the sample) = SS/(N-1): 33.7143/6 = 5.6191 • Why N-1? Degrees of freedom (read box 4.5, page 129)
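
The two variance formulas can be sketched in a few lines. The data set is hypothetical (the scores behind SS = 33.7143 are not given in this transcript), and the results are cross-checked against the standard `statistics` module.

```python
import statistics

# Hypothetical data: 8 scores with a mean of 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Sum of squared errors (SS): squared deviation of each score from the mean.
ss = sum((x - mean) ** 2 for x in data)

variance_of_sample = ss / n          # SS / N
population_estimate = ss / (n - 1)   # SS / (N - 1)

print(ss, variance_of_sample, population_estimate)

# The standard library implements the same two formulas.
print(statistics.pvariance(data), statistics.variance(data))
```

Dividing by N-1 instead of N gives a slightly larger value, which compensates for the fact that deviations are measured from the sample mean rather than from the unknown population mean.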

  48. Standard deviation (SD) • The average distance between a score and the Mean (square root of the Variance) SD= √5.6191 = 2.37 • What can SD tell us about the distribution (pointy distribution vs. flat distribution)?

  49. Standard Error (SE) • How well does the sample represent the population? • Different samples of the population might yield different means. The SE is the standard deviation of the means of several samples. Large value - big difference between sample means, small value - small difference. SE = SD/√N
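
A minimal sketch of the last two steps, using the numbers from the slides (variance 5.6191). Taking N = 7 from the variance slide is an assumption made here; the slides do not work out the SE for this sample.

```python
import math

# Values from the slides: SS / (N - 1) = 5.6191.
variance = 5.6191
n = 7  # assumption: the same sample of 7 scores as on the variance slide

sd = math.sqrt(variance)  # standard deviation = square root of the variance
se = sd / math.sqrt(n)    # standard error = SD / sqrt(N)

print(round(sd, 2), round(se, 2))
```

Because N sits in the denominator, a larger sample gives a smaller standard error, which means the sample mean is a more stable estimate of the population mean.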
