Descriptive statistics
What do we need to run an experiment? Hypothesis (Linguistic) Participants Task (stimuli = questions, responses = answers) Results Conclusions Key terms: stimulus design, response measure
Example • Show me the cat that bit the dog • Show me the cat that the dog bit • Picture from: Friedmann & Novogrodsky (2001)
Design Number of conditions Within subject / between subject How many items to each participant Order of items
Measure Response • Variables • Scales • Analysis • Descriptive • Inferential
Variables Any experimental category that has a value that can vary. Anything that is not constant and can change over time, or be different in different people is a variable Variables can take many forms Variables can be manipulated and observed
Properties of Variables • Continuous variables – lie along a continuum with equal intervals (e.g., age, height, weight, grade in a test) • Ordinal variables – ratings along a continuum with estimated intervals (e.g., evaluation) • Discrete variables (categorical, nominal) – divide into categories (e.g., language, yes/no, correct/incorrect)
Types of Variables • Independent variables – • Characteristics of the subject (participant variable) • Conditions chosen by the experimenter • Dependent variables – what the experiment measures (e.g., degree of success) • Intervening variables – variables that are not measured or manipulated but could influence the results (e.g., concentration, intelligence) Field, A. & G. Hole. 2003. How to Design and Report Experiments. London: Sage Publications.
Scales • Nominal • Ordinal • Interval • Ratio
Scales • Nominal • Ordinal • Interval • Ratio • Two things with the same number are similar (same name)
Scales • Nominal • Ordinal • Interval • Ratio • Four is more than three (but the distance between them is not necessarily the same as the distance between three and two)
Scales • Nominal • Ordinal • Interval • Ratio • Four is more than two (but not twice as much)
Scales • Nominal • Ordinal • Interval • Ratio • Four is more than three, the distance between them is the same as between three and two, and four is twice two
Which scale are the following variables rated on? • Height • Celsius degrees • TV channel number • Grades in an exam (1-100) • Psychological rating (anxiety on a scale of 1-10) • Time (13:00, 14:00) • Time (one hour, two hours, three hours) • Phone number • Rating places in a race
Variables and Scales: summary Choose an appropriate task Measure responses Be aware of the variables and their properties Choose the mathematical operations appropriate for the scale
Factorial design Tests all possible combinations, e.g., a 2x2 design – one participant variable and one independent variable with two conditions.
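The crossing of factors in a 2x2 design can be sketched in a few lines of Python. The factor names and levels below are hypothetical illustrations (one participant variable, one condition chosen by the experimenter); they are not taken from a specific study.

```python
from itertools import product

# Hypothetical factors, for illustration only.
group = ["typical development", "SLI"]                    # participant variable
sentence_type = ["subject relative", "object relative"]   # manipulated condition

# A full factorial design crosses every level of one factor
# with every level of the other.
cells = list(product(group, sentence_type))

for g, s in cells:
    print(f"{g} x {s}")

print("number of cells:", len(cells))
```

Two factors with two levels each give four cells; adding a third level to either factor would give six.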
Practical questions for offline tasks • How many subjects? At least 25 • How many categories? 2x2 • How many items? The more subjects, the fewer items are needed. • For 25 subjects – 6 items per category • For 50 subjects – 3 are enough • For case studies and within-subject analysis – at least 10.
Ratio • The relation between two nominal variables • V/N ratio: 60/80=3/4 • N/V ratio: 80/60=4/3
Example • Goofy said that the Troll had to put two hoops on the pole to win. • Does the Troll win? • Musolino (2004)
Ratio • Yes/no ratio: • 8/12=2/3
Proportion • Relation between a group and its part (Verb/Word, Pronouns/Subject position). Ratio out of the total • Verb/Word proportion: 60/190 ≈ 0.32 (roughly 1/3)
Percentage (%) • Relative proportion out of a hundred • Verb percentage (out of all words): 100*(60/190) ≈ 32%
Rate • The relative frequency (per 1,000 in the population) • 7% of children have SLI • 0.07 * 1000 = 70 • 70 children out of 1,000 have SLI
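The arithmetic behind ratio, proportion, percentage and rate can be checked with a short Python sketch. The counts (60 verbs, 80 nouns, 190 words, 7% SLI) are the ones used on the slides; the variable names are ours.

```python
from fractions import Fraction

# Counts taken from the slides.
verbs, nouns, total_words = 60, 80, 190

# Ratio: one category relative to another.
vn_ratio = Fraction(verbs, nouns)   # reduces to 3/4
nv_ratio = Fraction(nouns, verbs)   # reduces to 4/3

# Proportion: a part relative to the whole.
verb_proportion = verbs / total_words

# Percentage: the proportion scaled to 100.
verb_percentage = 100 * verb_proportion

# Rate: the proportion scaled to 1,000 (7% of children have SLI).
sli_per_thousand = round(0.07 * 1000)

print(vn_ratio, nv_ratio)
print(round(verb_proportion, 2), round(verb_percentage, 1))
print(sli_per_thousand)
```

`Fraction` keeps the ratios exact, while the proportion and percentage are rounded only when printed.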
Frequency • Count the number of times a score occurs. • How many times a value of a variable occurs?
Example • Show 10 pictures, and check for number of “correct” response • Is every bunny eating a carrot? Roeper, Strauss and Zurer Pearson (2004)
Frequency table (columns: Raw score, Frequency; values shown: 2, 4, 2) • Frequency = how many children got this score
Frequency graph • Score on the test is the horizontal axis (X-axis) • Frequency is on the vertical axis (Y-axis)
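A frequency table can be built by counting how often each score occurs. The sketch below uses made-up scores (number of correct responses out of 10 pictures for ten children); they are not data from the slides.

```python
from collections import Counter

# Made-up scores for illustration: correct responses out of 10 pictures.
scores = [7, 8, 8, 9, 9, 9, 9, 10, 10, 8]

# Frequency = how many children got each score.
freq = Counter(scores)

# Print the table in score order, with a crude text bar per score.
for score in sorted(freq):
    print(score, freq[score], "#" * freq[score])
```

In a proper frequency graph the scores would run along the X-axis and the frequencies up the Y-axis; the text bars here simply turn that picture on its side.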
Percentile • The cumulative frequency - how many scores are below a particular point in the distribution Percentile = 100(Cumulative Frequency/Total N)
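The percentile formula can be sketched as a small Python function. It assumes, following the wording above, that the cumulative frequency counts scores strictly below the given point; some textbooks count scores at or below it instead. The scores are made up.

```python
def percentile_of(score, scores):
    """Percentile = 100 * (cumulative frequency / total N).

    Assumption: cumulative frequency = number of scores strictly below `score`.
    """
    cumulative_frequency = sum(1 for s in scores if s < score)
    return 100 * cumulative_frequency / len(scores)


# Made-up scores for illustration.
scores = [7, 8, 8, 9, 9, 9, 9, 10, 10, 8]

print(percentile_of(9, scores))   # 4 of the 10 scores are below 9
print(percentile_of(10, scores))  # 8 of the 10 scores are below 10
```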
Frequency polygon (the curve) The frequency polygon (the curve) is a picture of the data
Types of distributions (Fig. 4.3 & 4.4, pp. 113-116) • Figure labels: peak, tails • A bell-shaped curve: a symmetric, unimodal distribution (one midpoint, one peak); the normal distribution
Pointy distribution (leptokurtic) • Flat distribution (platykurtic)
In a skewed distribution the tail is pulled in one direction • Positively skewed distribution: most scores are low, and the tail points towards the high (positive) scores • Negatively skewed distribution: most scores are high, and the tail points towards the low (negative) scores
Descriptive Statistics - Some definitions • Min (the lowest score) and Max (the highest score) • Range – the range of observed values. Range = Max-Min • But the range changes with the extreme scores (unstable but useful informal measure).
Mode – the most frequently obtained score • Mean (average) – the sum of a set of numbers divided by how many there are • Median – the middle score of a group (when the number of scores is odd) or the average of the two middle scores (when it is even) • In a bell-shaped (normal) distribution, the mode, mean and median are the same
Mode • Which grade is most frequent? • Highest in “frequency” column
Mean (average) • Compute a sum of all grades • Divide by number of grades
Median • Order all grades in a row according to value • The grade in “the middle” of the row is the median
Median • We have a row of 30 grades: 50,60,60,60,60,70… • Half of 30 is 15, so the middle falls between the 15th and 16th positions
Median • Slight complication: with an even number of grades there is no single middle grade; 15 grades lie on each side of the midpoint • Compute the mean of the grades in the 15th and 16th positions
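The range, mode, mean and median can be computed as follows. This is a minimal sketch with a short, made-up row of six grades (an even number, so the median is the mean of the two middle grades); it is not the row of 30 grades from the slides.

```python
import statistics

# Made-up grades for illustration (even count).
grades = [70, 50, 60, 90, 60, 80]

# Range = Max - Min.
grade_range = max(grades) - min(grades)

# Mode: the most frequent grade.
mode = statistics.mode(grades)

# Mean: sum of all grades divided by the number of grades.
mean = sum(grades) / len(grades)

# Median: order the grades, then take the middle one (odd count)
# or the mean of the two middle ones (even count).
ordered = sorted(grades)
n = len(ordered)
if n % 2 == 1:
    median = ordered[n // 2]
else:
    median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(grade_range, mode, round(mean, 2), median)
```

The hand-written median agrees with `statistics.median(grades)`, which applies the same odd/even rule.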
Variability (Figure from Hatch & Farhady 1982, p.56) Questions: Are both curves the same? How? Are they different? How? We need to measure the accuracy of the mean.
Coming attractions • How to draw valid statistical inferences? • We have to look at the relation between our sample and the population • Today we looked at where the ‘center’ of the data is – what is the big picture • Look at variance, how the data is distributed
Deviation • The distance between a score and the Mean (see Table 4.2, p. 125): how much a score deviates from the average • Sum of squared errors (SS): the sum of the squared deviations
Variance • Average error in the sample, average error in the population • Variance in the sample = SS/N: 33.7143/7 = 4.8163 • Estimate of the variance in the population (from the sample) = SS/(N-1): 33.7143/6 = 5.6191 • Why N-1? Degrees of freedom (read box 4.5, page 129)
Standard deviation (SD) • The average distance between a score and the Mean (square root of the Variance) SD= √5.6191 = 2.37 • What can SD tell us about the distribution (pointy distribution vs. flat distribution)?
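The chain from deviations to SS, variance and SD can be sketched in Python. The eight scores are made up (the data behind Table 4.2 are not reproduced here); the last few lines simply redo the slide arithmetic from SS = 33.7143 and N = 7.

```python
import math


def sum_of_squares(scores):
    """SS: the sum of squared deviations of each score from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


# Made-up scores for illustration (mean = 5).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
ss = sum_of_squares(scores)

sample_variance = ss / n                        # SS / N
estimated_population_variance = ss / (n - 1)    # SS / (N - 1)
sd = math.sqrt(estimated_population_variance)   # square root of the variance

print(ss, sample_variance, round(sd, 2))

# Redoing the arithmetic from the slides (SS = 33.7143, N = 7).
slide_ss, slide_n = 33.7143, 7
slide_variance_n = slide_ss / slide_n
slide_variance_n_minus_1 = slide_ss / (slide_n - 1)
slide_sd = math.sqrt(slide_variance_n_minus_1)

print(round(slide_variance_n, 4), round(slide_sd, 2))
```

Dividing by N-1 rather than N always gives the larger value, and the difference shrinks as the sample grows.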
Standard Error (SE) • How well does the sample represent the population? • Different samples of the population might yield different means. The SE is the standard deviation of the means of several samples. A large value means the sample means differ a lot; a small value means they differ little. • Estimated from a single sample: SE = SD/√N
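The formula SE = SD/√N can be sketched as a small function. It assumes the SD is the N-1 version (which is what `statistics.stdev` computes), and the scores are made up. The last lines apply the formula to the SD of 2.37 and N of 7 from the earlier slides.

```python
import math
import statistics


def standard_error(scores):
    """SE = SD / sqrt(N), with SD computed using N - 1 (statistics.stdev)."""
    return statistics.stdev(scores) / math.sqrt(len(scores))


# Made-up scores for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(scores)
print(round(se, 3))

# Applying the formula to the slide values (SD = 2.37, N = 7).
slide_se = 2.37 / math.sqrt(7)
print(round(slide_se, 3))
```

Because N is in the denominator, a larger sample with the same SD gives a smaller SE, that is, a mean that is expected to vary less from sample to sample.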