Reliability and Validity of Dependent Measures

Validity of Dependent Variables • Does it measure the concept? • Construct Validity: Does the DV really capture what you want to measure (is the operational definition good?) • Or is it contaminated by mood, cultural or gender bias, confusing wording, observer bias, etc.?

Indicators of Construct Validity • Face Validity: Does it appear to be a good measure (do experts think so?) • Predictive Validity: Does it predict later behavior? (Do GRE scores predict grad school success?) • Concurrent Validity: Do groups known to differ on the construct also differ in their scores? (Self Monitoring)

Indicators of Construct Validity • Convergent Validity: Do other kinds of ratings agree? Similar responses to similar scales • Divergent Validity: Is it distinct from other constructs? (measures intelligence, not SES or gender bias; shyness isn't loneliness) • Reactivity: Knowing you are being studied changes behavior

Reliability of DV • Are results repeatable? • All measurement contains a true score plus error of measurement • Not a question of replicating the study: the same subjects should get the same scores
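The true-score idea can be written compactly. The symbols below follow conventional classical test theory notation and are assumptions here, since the slides do not name them: X is the observed score, T the true score, and E the error of measurement.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2 \quad (\text{assuming } T \text{ and } E \text{ are uncorrelated}), \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2}
```

Read this way, a measure is more reliable when less of the variance in observed scores comes from error.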

Types of Reliability • Inter-rater reliability: calculate r between observers, or Cohen's kappa • Internal consistency: split-half reliability; Cronbach's alpha is roughly the average of all possible split-half correlations • Temporal consistency: test-retest reliability with the SAME people • Restaurant example
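The three kinds of reliability on this slide can each be computed from raw data. The following is a minimal sketch using only the Python standard library; the function names and the toy data are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from statistics import mean, variance


def cohens_kappa(rater_a, rater_b):
    """Inter-rater agreement on categorical codes, corrected for chance.

    Undefined (division by zero) if both raters use one category for everything.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


def cronbach_alpha(item_scores):
    """Internal consistency; item_scores is one list of k item scores per respondent."""
    k = len(item_scores[0])
    item_var_sum = sum(variance(column) for column in zip(*item_scores))
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def pearson_r(x, y):
    """Correlation, usable for test-retest scores from the SAME people."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Toy data (made up for illustration): two observers coding 8 behaviors as 1/0.
observer_1 = [1, 1, 1, 1, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 0, 0, 0, 0, 1]
print(cohens_kappa(observer_1, observer_2))  # 75% raw agreement, 50% expected by chance
```

Kappa is lower than raw percent agreement because two observers who guess at random would still agree on some cases.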

Can a variable be reliable and not valid? • Valid and not reliable? • How do you know you have a good DV? • Mental Measurements Yearbook

Validity of Experimental Designs

Survey Design

Internal validity • Does the design test the hypothesis we want it to test? Did IV manipulation cause change in DV? Can we infer causality? • What if internal validity is low?

External validity • Does your study represent a broad population? • Be cautious about generalizing in the Discussion section if external validity is weak • Random Sampling • Stratified Sampling • Block Randomization
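The three techniques listed on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the proportional allocation used for stratified sampling, and the toy population are all hypothetical choices, not taken from the slides.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member of the population has the same chance of selection."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, fraction, rng):
    """Sample the same fraction from each stratum (proportional allocation)."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(len(members) * fraction)))
    return sample


def block_randomize(participants, conditions, rng):
    """Assign participants in blocks; each block uses every condition once, in random order."""
    assignment = {}
    size = len(conditions)
    for start in range(0, len(participants), size):
        block = list(conditions)
        rng.shuffle(block)
        for person, condition in zip(participants[start:start + size], block):
            assignment[person] = condition
    return assignment


rng = random.Random(42)  # fixed seed so the illustration is repeatable
population = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
sample = stratified_sample(population, lambda p: p[0], 0.1, rng)
print(len(sample))  # 6 from the "F" stratum plus 4 from the "M" stratum
```

Stratifying guarantees the sample matches the population on the stratified variable, while block randomization keeps group sizes equal as participants arrive.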

Ecological validity • Does the study reflect the real world? Do people really behave this way? • Can you study anything without changing it?

Threats to Internal Validity: • In pre-post design: • Test participants • Administer IV • Post test for effect of IV • Compare pre vs. post results to look for effect of IV

History • World events may cause change in attitudes or behavior over time. • Tests of patriotism pre/post 9/11 • Views of President pre/post Katrina • Attitudes of adolescents pre/post Cobain suicide

Maturation • Individuals change over time as they mature. • An issue for studies of children, but there is also huge growth in the freshman year: attitudes and behavior change.

Testing • Taking part in the study may itself cause differences in behavior. • Similar to REACTIVITY, but for the entire study, not just the DV. Parenting study, for example

Instrumentation • Use of instrument may get better or worse with time • Observation studies • Testing skill/ interviewing

Regression toward the mean • Extreme scores tend not to repeat: those who score very high or very low on a test will tend to score closer to the average if tested again. • A big issue for any study where a pretest is used to select subjects for the post test.
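A small simulation makes the effect concrete. This is a sketch under assumed values: each observed score is a true score plus independent measurement error, and the means, standard deviations, and 10% selection cutoff are arbitrary choices for illustration.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n=10_000, seed=0):
    """Select the top 10% on test 1, then compare that group's mean on tests 1 and 2."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    # No treatment is given between tests: only the measurement error differs.
    test1 = [t + rng.gauss(0, 10) for t in true_scores]
    test2 = [t + rng.gauss(0, 10) for t in true_scores]
    cutoff = sorted(test1)[int(0.9 * n)]
    selected = [i for i in range(n) if test1[i] >= cutoff]
    return mean(test1[i] for i in selected), mean(test2[i] for i in selected)


pretest_mean, retest_mean = simulate_regression_to_mean()
print(f"selected group, test 1: {pretest_mean:.1f}")
print(f"selected group, test 2: {retest_mean:.1f}")
```

The selected group scores lower on the retest even though nothing was done to them, which is why a pre-post change in a group picked for extreme pretest scores cannot be credited to the IV without a control group.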

Mortality • Those who drop out of your study may differ from those who choose to continue.

Placebo effect • If given any treatment, behavior may change, even if the treatment was not meaningful (fake drugs get some results).

How can we improve internal validity? • History • Maturation • Testing • Instrumentation • Regression toward the mean • Mortality • Placebo effect

Improved Design • Pre-post design: • Test participants • Administer IV • Post test for effect of IV • Compare pre vs. post results to look for effect of IV • Two-group design: • Pretest (do you need to do this?) • RANDOMIZED assignment to levels of IV • Compare post test results of IV and Control groups

Extraneous Variables • Any variable that you have not measured or controlled (e.g., through random assignment) that may affect the results of your study

Demand Characteristics • Participants behave in ways demanded by the situation or experimental set-up. Behavior does not reflect actual beliefs or attitudes. • Issue of Ecological Validity

Subject Bias • Bias brought on by subjects' beliefs (Overhead of mood and menstrual cycle)

Social desirability • Subjects want to do the “right thing” and try to guess what the experimenter wants, and do not behave naturally. • How to reduce Subject biases?

Experimenter Bias • Experimenters’ behavior and expectations can sway results of test. • How to reduce these biases?

Floor & Ceiling Effects • If measures are too easy or too difficult you will not see differences between groups. • Pilot test with similar subjects!

Order effects • When using within-subjects designs, order of presentation can affect results in several ways. • Practice effects: Subjects get better at the task with successive trials • Fatigue effects: Subjects get tired and do worse or lose interest • Carryover effects: Subjects' experience in one condition affects results in another condition (subject bias or anchoring-and-adjustment issues)

How to reduce order effects • Counterbalancing • Does not get rid of order effects; it just spreads them equally across conditions. Can do complete counterbalancing if there is a small number of conditions. • Latin Square counterbalancing • First row: A, B, skip, C, skip, D, etc., then fill back in the skipped slots • That is, A, B, N, C, N-1, D, N-2, E, etc., where N is the last condition

A Latin Square for 6 conditions
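A square of this kind can be generated from the first-row rule A, B, N, C, N-1, D, and so on. The sketch below is one common construction (a balanced Latin square built by shifting the first row), offered as an assumption about what the slide showed; the function name is hypothetical.

```python
def balanced_latin_square(conditions):
    """Return one presentation order per row, built from the A, B, N, C, N-1, ... rule."""
    n = len(conditions)
    # First row as indices: 0, 1, n-1, 2, n-2, 3, ...
    first = [0]
    low, high, take_low = 1, n - 1, True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    # Each later row shifts every condition one step along (A->B, B->C, ..., last->A).
    rows = [[(idx + shift) % n for idx in first] for shift in range(n)]
    # With an odd number of conditions, the reversed rows are also needed for balance.
    if n % 2 == 1:
        rows += [list(reversed(row)) for row in rows]
    return [[conditions[i] for i in row] for row in rows]


for row in balanced_latin_square(list("ABCDEF")):
    print(" ".join(row))
# A B F C E D
# B C A D F E
# C D B E A F
# D E C F B A
# E F D A C B
# F A E B D C
```

Each condition appears once in every position, and with an even number of conditions each condition immediately follows every other condition exactly once, so six orders stand in for the 720 that complete counterbalancing would require.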

Pretest Vs. Pilot test • When do you use a pilot test? • When do you use a pre test?

Can a DV be reliable but not valid?

Experimental Validity • What to do if low Internal Validity? • What are impacts of low External Validity? • What if Ecological Validity is low?
