
Designs for Research: The Xs and Os Framework



  1. Designs for Research: The Xs and Os Framework Research Methods for Public Administrators Dr. Gail Johnson Dr. Johnson, www.ResearchDemysified.org

  2. Steps in the Research Process Planning 1. Determining Your Questions 2. Identifying Your Measures and Measurement Strategy 3. Selecting a Research Design 4. Developing Your Data Collection Strategy 5. Identifying Your Analysis Strategy 6. Reviewing and Testing Your Plan

  3. Narrow Definition of Design • While sometimes the overall research plan is called a “design,” this discussion focuses on the narrow definition • The narrow definition focuses on 3 design elements

  4. Three Design Elements • When measures are taken • After • Before and After • Multiple times before and/or after • Whether there are comparison groups • Whether there is random assignment to comparison groups

  5. Three Broad Categories for Research Design • Experimental • Quasi-Experimental • Non-Experimental

  6. Experimental Design • The best design to use for cause-effect questions because it rules out most other possible explanations for the results obtained. • Random assignment assures that the two groups are comparable.
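
The comparability claim above can be sketched in code (a minimal illustration with invented scores, not part of the original slides): shuffling the participant list and splitting it in half ignores every pre-existing characteristic, so the two groups come out similar on average.

```python
import random
import statistics

def randomly_assign(participants, seed=0):
    """Shuffle the participants, then split them into two equal-sized groups."""
    rng = random.Random(seed)
    pool = list(participants)
    rng.shuffle(pool)
    mid = len(pool) // 2
    return pool[:mid], pool[mid:]

# Hypothetical pre-existing scores for 100 participants.
scores = [random.Random(i).gauss(50, 10) for i in range(100)]
treatment, control = randomly_assign(scores)

# Because assignment ignored the scores, the two group means come out close.
print(round(statistics.mean(treatment), 1), round(statistics.mean(control), 1))
```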

  7. The Xs and Os Framework R indicates Random assignment to the treatment group or the comparison group O is the Observation, that is, the measure for the dependent variable • Examples: earnings, weight, test scores, stock market trading, reported crime rate, kilowatt hours, reported discrimination, poverty rate, number of people unemployed, GDP, etc. • The researchers are looking to see if these measures change because of the treatment

  8. The Xs and Os Framework X is the treatment, which may be: • A particular medication • A particular exercise regimen • A program (e.g., the Head Start Program or the Troubled Asset Relief Program) • An independent variable (e.g., economic news stories, sunspot activity, a change in daylight saving time, etc.)

  9. Example: Which approach works better for learning statistics: using computer software or calculating formulas by hand? Experimental Design: • Create Comparison Groups: • Group 1: X: computers to do formulas • Group 2: no computers • Randomly assign students into the 2 groups • Observe: test scores before and after

  10. Using the Xs and Os Framework R O1 X O2 R O1 O2 R indicates Random assignment O is the Observation (test scores). Testing statistical knowledge before and after X is the treatment (in this case the use of computers)
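
To make the notation concrete, here is a small simulation of the design above (all numbers are invented for illustration): both randomly formed groups are tested before and after, only one receives X, and the treatment effect is estimated as the difference in average gains.

```python
import random
import statistics

random.seed(42)

# Hypothetical pre/post test scores; the treatment (computer) group is
# simulated with a larger average gain, purely for illustration.
def simulate_group(n, gain):
    pre = [random.gauss(60, 8) for _ in range(n)]
    post = [p + random.gauss(gain, 4) for p in pre]
    return pre, post

pre_t, post_t = simulate_group(30, gain=10)   # R  O1  X  O2
pre_c, post_c = simulate_group(30, gain=4)    # R  O1      O2

# The effect estimate is the difference in average gain between the groups.
gain_t = statistics.mean(post_t) - statistics.mean(pre_t)
gain_c = statistics.mean(post_c) - statistics.mean(pre_c)
print(f"estimated treatment effect: {gain_t - gain_c:.1f} points")
```

Because both groups were formed by random assignment, whatever raises the control group's scores (maturation, practice with the test, outside events) is subtracted out of the estimate.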

  11. Experimental Design: Xs and Os Variation: No Pre-Measure • Sometimes it is not possible to have a pre-measure • For example: I am testing to see whether a welfare-to-work training program results in people getting jobs with above-poverty wages. I can randomly assign people to the program or the control group, but I will not have a good measure for wages before they entered the program, since they are all on welfare

  12. Experimental Design: Xs and Os Notation Variation: No Pre-Measure (note there are no observations before the treatment) R X O2 R O2

  13. Quasi-Experimental Designs • Non-Equivalent Comparison Design • Like experimental except no random assignment • Use when you cannot control the process for deciding who gets the treatment. • Weak because there may be selection bias • But this is often more practical in public sector research

  14. Quasi-Experimental Design: Xs and Os O1 X O2 Treatment Group O1 O2 Control Group Key elements: • Pre- and post-measurement • Treatment to test group • Control group (a comparison group without the treatment, but with no random assignment)

  15. Quasi-Experimental Designs • Does spanking make a difference? • Can we randomly assign children to spanking and non-spanking parents? • No: We have to deal with the world as it exists • At best we can compare the behavior of children whose parents spank with that of children whose parents don’t

  16. Types of Quasi-Experimental Designs • Statistical Controls (sometimes called Correlation with Statistical Controls) • Variations: Causal Comparative or Ex Post Facto design • Basically: statistical procedures are used to create comparison groups

  17. Ex-post Facto Design: Study of Child Abuse and Neglect A study funded by the Army Medical Research and Materiel Command reported, “During the 40 months covered by the study, 1,858 parents in 1,771 families of enlisted soldiers neglected or abused their children, in a total of 3,334 incidents involving 2,968 children. Of those, 942 incidents occurred during deployments.”[1] [1] Aaron Levin, “Children of U.S. Army soldiers face increased risk of maltreatment while a parent is deployed away from home,” Psychiatry News, September 7, 2007, Volume 42, Number 17, page 8; “Child Abuse, Neglect Rise Dramatically When Army Parents Deploy To Combat,” ScienceDaily, August 1, 2007, http://www.sciencedaily.com/releases/2007/07/070731175911.htm

  18. Ex-post Facto Design: Study of Child Abuse and Neglect • In this study, the researchers gathered data about children at a child care center serving military families and compared the characteristics of those who were reported to have been abused or neglected with those who were not. • They looked backwards to see if there were some differences that might explain why some children were abused and neglected. • They found that deployments were a factor • From a policy perspective, this suggests that families need more support to handle the stresses associated with deployments.

  19. Correlational Design with Statistical Controls We cannot randomly assign people, but we can create comparison groups using statistical software and then compare outcomes. • E.g., we can compare people from different income groups to see if income is related to the birth weights of their babies. • E.g., we can compare citizen policy preferences to see if there are differences based on age, race, or gender.

  20. Does Head Start Make a Difference? • Select all 8th graders from two inner-city schools • Obtain school records, which have information about whether students attended Head Start, as well as other information • Statistical software can divide all the 8th graders into two groups: those who attended Head Start and those who didn’t • The 8th-grade reading scores can then be compared
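
A sketch of the splitting step described above (the records and scores here are made up for illustration): the comparison groups are "created" after the fact by filtering on the Head Start variable, with no random assignment involved.

```python
import statistics

# Hypothetical school records (invented numbers, not real data): each row
# notes Head Start attendance and an 8th-grade reading score.
records = [
    {"head_start": True,  "reading": 72},
    {"head_start": True,  "reading": 68},
    {"head_start": True,  "reading": 75},
    {"head_start": False, "reading": 65},
    {"head_start": False, "reading": 70},
    {"head_start": False, "reading": 63},
]

# Split on the Head Start variable to form the two comparison groups.
attended = [r["reading"] for r in records if r["head_start"]]
did_not = [r["reading"] for r in records if not r["head_start"]]

# Compare the average reading scores of the two groups.
print(statistics.mean(attended), statistics.mean(did_not))
```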

  21. Does Head Start Make a Difference? • If Head Start made a difference, then: • The Head Start children’s scores will be higher than those of children who did not attend • Their scores will be similar to the scores of other 8th graders in the school district • It might be possible to look at other factors, assuming the data are in their permanent records: • Education of parents, family income, other pre-school experiences

  22. More Quasi-experimental Designs • Longitudinal and Time Series • Measures taken over time • Time series: many measures • Longitudinal: a few measures • No clear dividing point when longitudinal becomes a time series • Example: Federal budget deficit over time Notation: O O O O O O O O O O O O O

  23. More Quasi-experimental Designs • Interrupted Time Series • Measures taken before and after an event • Time series: at least 15 measures before and after • Example: Number of smog warnings before and after air pollution legislation was passed in the city • Notation: O O O O X O O O O O
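
The interrupted-series comparison can be sketched as a before/after contrast of averages (the counts are invented; note the slide calls for at least 15 measures per side, while this sketch uses 5 only for brevity):

```python
import statistics

# Hypothetical monthly smog-warning counts; the law (X) takes effect
# after month 5.
counts = [14, 15, 13, 16, 15,   # O O O O O   before
          9, 8, 10, 7, 9]       # O O O O O   after X
before, after = counts[:5], counts[5:]

# Compare the pre- and post-intervention averages.
drop = statistics.mean(before) - statistics.mean(after)
print(f"average warnings fell by {drop:.1f} per month")  # → fell by 6.0
```

With a long enough series, the researcher can also check whether the pre-intervention trend was already heading downward, which a simple two-point comparison would miss.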

  24. More Quasi-experimental Designs • Multiple Time Series: Comparison • Example: number of smog days after a city passes air pollution legislation as compared to a city of equal size and density that did not pass an air pollution law • Notation: O O O O O X O O O O O O O O O O O O O

  25. More Quasi-experimental Designs • Two ways to select: • Cross-sectional: slice of the population: a different group of people, roads, cities at each point in time • Drug survey of high school seniors • Panel: track the same people, roads, cities over time • National Longitudinal Survey of Youth: the same group of people has been surveyed since 1979

  26. Non-Experimental Designs • Sometimes researchers are just trying to take a picture at one point in time • They are not trying to answer a cause-effect/impact question • These designs are appropriate for answering descriptive and normative questions discussed earlier

  27. Non-Experimental Design One shot: X O Key elements: • No random assignment • No pre-measures • No comparison Weakest design for cause-effect questions!!

  28. Non-Experimental Design Variations: • Before and After Design: O X O • Static Group Comparison: X O (treatment group) O (comparison group)

  29. True Confessions • Immigration Reform and Control Act • Employers would be fined if they knowingly hired illegal workers • GAO was asked to determine whether this law caused a widespread pattern of discrimination against those who look or sound foreign. • Type of Question: Cause-Effect

  30. True Confessions • What design elements can be used? • Random Assignment? No. Congress does not randomly require some states to implement a law and some states not. • Comparison Groups: No. All states had to implement at the same time. • Before measure: No. The law was implemented before any measure could be taken.

  31. True Confessions • What design is left? • Implement the law (X) and measure discrimination (O) • A one-shot design • The weakest design to answer an impact question. • You play the hand you are dealt.

  32. Sometimes Experimental Designs are not Possible • Designs reflect the situation, and an experimental design is not always possible or practical. • You can’t assign children to parents who spank and those who do not • It might be more practical to conduct a reading program in a specific school rather than randomly assigning children across the school district to a reading program or not.

  33. Sometimes Experimental Designs are not Possible • In public administration, the use of experimental designs is also limited by ethical and legal considerations: • You cannot require anyone to participate. • You cannot deny services or benefits to which people are entitled. • You cannot deny life-saving treatments to people in need.

  34. Sometimes Experimental Designs Are Not Possible Politics may play a role: mayors may object to their city being in the “control group” while other cities get money to implement a program.

  35. Design and Internal Validity • You may see changes after a program has been implemented, but those changes might be caused by something other than the program. • The intention of design is to ensure that you are not tricked into believing an explanation that is not true. • Design helps ensure internal validity. • Design eliminates other possible (or rival) explanations.

  36. Threats to Internal Validity History: Due to a particular event that took place while data were being collected. • A drug-related death just before the post-test may explain a “no drug” attitude, not the program. • Using a comparison group in the same environment will reduce this threat. • If a comparison group is not possible, ask what happened during the study to determine whether some event might have affected the results.

  37. Threats to Internal Validity Maturation: Changes based on aging, growth, natural increases in skills • Improved study skills because of maturity, not the program • This matters in studies where the behavior or attitude is likely to be affected by getting older or becoming more experienced • Using a comparison group will reduce this threat

  38. Threats to Internal Validity Testing: changes due to learning how to take the test. A risk in pre/post designs, where participants “learned” how to do the test. • Using a comparison group would reduce this threat because both groups would have taken the pre- and post-tests. Any learning from the testing alone would be controlled.

  39. Threats to Internal Validity Instrumentation: Changes in data collection Pre/post and comparative designs are vulnerable • Example: Interviewer changes (race/gender) may get different results, especially on race/gender questions. • Example: Changing the wording of questions or changing measures is a problem because different things have been measured. The results are not truly comparable. • Ask: are the measures reliable?

  40. Threats to Internal Validity Regression to the Mean: Things tend to average out over time • A problem when a group is selected for treatment, or a program is enacted, because of an unusually high or low score. • The next set of scores is likely to change, to “regress to the mean,” regardless of treatment. • Taking measures over time, or using a comparison time series, helps identify trends and makes it easier to distinguish real change from the mere appearance of change caused by the regression-to-the-mean effect.
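
A short simulation (all numbers invented) makes this threat visible: if schools are selected because of an unusually low score, their next year's scores rise even when no treatment is given at all, because part of the low score was bad luck.

```python
import random
import statistics

random.seed(7)

# Each "school" has a stable true level plus year-to-year noise
# (hypothetical values, for illustration only).
true_level = [random.gauss(100, 5) for _ in range(1000)]
year1 = [t + random.gauss(0, 10) for t in true_level]
year2 = [t + random.gauss(0, 10) for t in true_level]

# Select the 100 schools with the worst year-1 scores, as a program
# targeting "failing" schools might.
worst = sorted(range(1000), key=lambda i: year1[i])[:100]

mean1 = statistics.mean(year1[i] for i in worst)
mean2 = statistics.mean(year2[i] for i in worst)
# With no treatment at all, the selected schools' year-2 scores drift
# back toward the overall average: regression to the mean.
print(round(mean1, 1), round(mean2, 1))
```

An evaluator who compared only these two numbers would credit a program for an improvement that is partly statistical artifact, which is why the slide recommends a longer series or a comparison series.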

  41. Threats to Internal Validity Selection: The group under study may be different in ways that affect the results. • The school selected for a program is different from the schools that were not selected • A low-income school may score differently than a high-income school • Volunteers may be different from those who chose not to participate. • “Did the program officials select the people most likely to succeed to make the program look successful?”

  42. Threats to Internal Validity Selection: The group under study may be different in ways that affect the results. • Random selection and assignment avoid this problem • But if random assignment is not possible, collect data that might help examine differences (demographic data usually work)

  43. Threats to Internal Validity Attrition: different rates of dropping out may affect results. • “Problem” people may drop out, so results may look better based on those left behind. • E.g., test scores may be higher because the failing students had dropped out. • Do what is possible to avoid attrition. If there is attrition, researchers should note it as a limitation on the conclusions that can be drawn.

  44. Did the Poverty Program Fail?

  45. How to Decide? • Measurement: • How do you define the “poverty program”? • What components of the poverty program were specifically designed to reduce poverty? • How was poverty operationalized? • Does food account for 1/3 of our living expenses? • Design: No control group, no random assignment • At best, an interrupted time series design • We do not know what percent of people would be below the poverty line if the “poverty program” had not been in place during any of the recessions between 1960 and 2000.

  46. External Validity • Is what happens in the lab under controlled settings likely to be the same as what happens outside of the lab? • Does what happens in this study reflect what occurs in other places where the program is also being conducted? • Programs may share the same name but be implemented differently.

  47. External Validity • Experimental designs are strong on internal validity • But are often weak on external validity • Experimental groups are relatively small and therefore rarely representative of the larger population • Much of what we know about social psychology comes from experiments involving college students, but they may or may not accurately reflect how other people behave.

  48. External Validity • It is easy for policymakers, program managers, and advocates to get excited about an innovative program or policy and decide to implement it in their community. • The tough question is one of external validity: will this program or policy work in their particular situation? • Public administrators have long known about the limits of “cookie cutter” or “one-size-fits-all” approaches.

  49. No Perfect Design • One-shot designs: • Useful for descriptive and normative questions • Very weak for cause/effect questions: many threats • However, it is often used in public administration. We implement a program and then see if it worked. • Multiple one-shot designs begin to build a case

  50. No Perfect Design • Pre/post designs: • Useful in giving context for measuring change. • Threats: testing, instrumentation, regression to the mean, attrition, history, and maturation. • Threats tend to be context related • For example, regression to the mean is only a threat if an unusually high or low score was used as the selection criterion. • For example, testing is only a threat if the researchers used a before-and-after test as part of their research design.
