Making Social Work Count Lecture 9 • An ESRC Curriculum Innovation and Researcher Development Initiative
Testing for statistical significance Is there a real difference?
Learning outcomes • At the end of this session, you should be able to: • Define research question, research hypothesis, null hypothesis and statistically significant; • Discuss the basic requirements for testing the difference between two means; • Define and describe the difference between the alpha value and P value, and Type I and Type II errors; • Advanced Study: • Calculate the difference between the means (t-ratio) using example data through advanced study
Taking decision-making a step further • theorise and hypothesise • test • decide and predict • theorise and hypothesise • collect sample data • test • determine the likelihood the hypothesis is true
What kinds of research questions might we want to ask in social work?
Research questions • Do children exposed to domestic violence experience more mental ill health than children who have not been exposed? • Is smacking an effective behavioural management tool? • Are young offenders on non custodial sentences less likely to offend than young offenders on custodial sentences? • Is parenting capacity reduced when parents misuse drugs or alcohol? • Do children in kinship care have better outcomes than children with unrelated foster carers?
Research hypotheses • Children exposed to domestic violence experience more mental ill health than children who have not been exposed • Smacking is an effective behavioural management tool • Young offenders on non custodial sentences are less likely to offend than young offenders on custodial sentences • Parenting capacity is reduced when parents misuse drugs or alcohol • Children in kinship care have better outcomes than children with unrelated foster carers
Making statements for testing • Research hypothesis: • A proposed explanation for a phenomenon that can be tested • There is a relationship between two measured variables • A particular intervention makes a difference/has an effect • Example: Smacking is an effective behavioural management tool • Null hypothesis: • The opposite position of the hypothesis • There is no relationship between two measured variables • The particular intervention does not make a difference/has no effect • Example: Smacking is not an effective behavioural management tool
Accepting the research hypothesis • Accepting the research hypothesis says that the difference between sample means is too large to be accounted for by sampling error, and, therefore, reflects differences within or between populations.
Research questions and hypotheses in Social Work • Does a particular intervention work? • Have negative symptoms reduced after the intervention? (e.g. depression; anxiety; number of hours of care) • Have positive attributes of service users increased after the intervention? (e.g. self-esteem; optimism; hope; fewer hours of care) • Are there differences in treatment outcomes based on gender, ethnicity, age?
Does singing have an effect on mental health? • Study looking at the effect of choral singing on mental health (N=218) • Research hypothesis – Participation in choral singing will result in changes in the participant’s emotional state. • Null hypothesis – Participation in choral singing will result in no changes in the participant’s emotional state. • Measured aspects of participants’ emotional state before and after the choral singing intervention • Found that positive emotions increased (excited, alert, hopeful, relaxed, harmonic, to be present) and negative emotions decreased (pain, headache). Also found that participants felt more together and less alone.
Does singing have an effect on mental health? • Are these findings true or did the differences in emotional states just happen by chance? • How confident are we that there was a statistically significant difference in emotional state pre and post the choral singing?
Statistically Significant: The basic requirements for testing the difference between two means • The researchers would have employed statistical tests to determine the extent to which they are confident that the results accurately reflect what occurs in the population (a real population difference) versus merely occurring by chance (or sampling error): • t-test • Based on the outcome of the test, the researchers can determine whether the results are statistically significant. That is, the outcome is unlikely to have occurred by chance
Basic requirements in hypothesis testing/answering research questions • Develop a research hypothesis to test or a research question to answer • Select or determine the main variable of interest to measure (data) • This could involve using a standardised measure (e.g. Strengths and Difficulties Questionnaire for children; parenting capacity measure) or any other numerical value (e.g. number of units of alcohol or drugs used; number of weeks in foster care) • Select your sample(s), gather the data, and calculate the descriptive statistics of the sample(s) • Conduct a statistical test of difference between the means (or other descriptive statistics, such as percentages) (t-test; chi-square; ANOVA; Fisher’s exact test) • This involves pre-determining the alpha value • Interpret whether the results are statistically significant
An example of determining differences between hope amongst social work students
A “hopeful” example of how to test for differences 1. Develop a research question to answer or a research hypothesis to test: • Research Question – Is there a difference in level of hope between first year and final year social work students? • Research Hypothesis – First year social work students will have higher levels of hope than final year social work students. • Null Hypothesis – There is no difference in level of hope between first year and final year social work students.
A “hopeful” example of how to test for differences 2. Select an instrument or measure of hope: • The Trait Hope Scale (Snyder et al., 1991)
The Trait Hope Scale • Sums the scores of items: 1, 2, 4, 6, 8, 9, 10, and 12 (the other items are fillers and do not contribute to the overall score). • Scores can range from 8 to 32, with higher scores indicating higher levels of hope.
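The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function name `trait_hope_score` and the dictionary input format are hypothetical, and the 1 to 4 rating per item is an assumption inferred from the stated 8 to 32 range across eight scored items.

```python
# Item numbers that count towards the score (taken from the slide);
# the remaining items are fillers.
SCORED_ITEMS = (1, 2, 4, 6, 8, 9, 10, 12)


def trait_hope_score(responses):
    """Sum the eight scored items.

    `responses` maps item number -> rating. A rating of 1-4 per item is
    an assumption inferred from the 8-32 score range.
    """
    for item in SCORED_ITEMS:
        rating = responses[item]
        if not 1 <= rating <= 4:
            raise ValueError(f"item {item}: rating {rating} is outside the assumed 1-4 range")
    return sum(responses[item] for item in SCORED_ITEMS)


# Hypothetical respondent who answers 3 on all 12 items.
example = {item: 3 for item in range(1, 13)}
print(trait_hope_score(example))  # 8 scored items x 3 = 24
```

Filler items are simply ignored, so a respondent's answers to items 3, 5, 7 and 11 never change the total.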
A “hopeful” example of how to test for differences 3. Select a random sample of students from the first year cohort and final year cohort (N = 30 for each cohort): • Administer the Trait Hope Scale to the sample of first and final year students. • Calculate the mean scores for first and final year students.
Is there a difference in level of hope between first and final year students? • Calculate the mean scores for the two groups of students: • First Years = 27.7 • Final Years = 22.2 • There is a mean difference of 5.5 • Can we conclude that first years are actually more hopeful than final years? • Could the difference be a result of sampling error and, therefore, due to chance and chance alone?
Output #1: The mean hope score for the first and final year students
The distribution of hope scores for first and final year students [Figure: distributions of hope scores for the two groups, with each group's mean marked]
Standard Deviation: A measure of variability • The standard deviation (sd) reflects the typical deviation from the mean • Describes the variability of the scores • How close together or how far apart are the scores? • The sds for the hope scores were as follows: • sd (first years) = 3.706 • sd (final years) = 4.688
Standard Deviation: An example of great variability and no variability • Great variability: N = 30; Min = 15; Max = 30; sd = 6.226 • No variability: N = 30; Min = 30; Max = 30; sd = .000
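As a rough illustration of the contrast above, the following Python sketch computes descriptive statistics for two made-up sets of 30 scores. The scores are hypothetical and are not the data behind the slide, so the sd of the varied set will not match 6.226. The sample standard deviation (dividing by N - 1) is assumed here, as that is what most statistics packages report.

```python
import statistics


def describe(scores):
    """Return basic descriptive statistics for a list of scores."""
    return {
        "n": len(scores),
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "sd": statistics.stdev(scores),  # sample sd (N - 1 in the denominator)
    }


# Hypothetical data: 30 scores spread between 15 and 30, and 30 identical scores.
varied = [15, 30] * 15
identical = [30] * 30

print(describe(varied))
print(describe(identical))  # every score equals the mean, so sd is 0
```

When every score is identical, no score deviates from the mean, which is why the standard deviation is exactly zero.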
A “hopeful” example of how to test for differences 4. Conduct a statistical test of significance (t-test) between the first year hope score mean and the final year hope score mean. • Determine the level of significance or α (alpha value)
What level of significance? • Statistically significant difference – The differences observed in a sample reflect a real population difference and are not a result of sampling error. • In order to determine if something is statistically significant, you must establish a level of significance (represented by the Greek letter α (alpha)).
Probability (P) and alpha value (α) • α = the level of probability where the null hypothesis can be rejected with confidence and the research hypothesis accepted with confidence • We symbolise the probability as P < .05 • It is convention to use the α = .05 level of significance, but significance levels can be set up for any degree of probability • α = .01 level of significance; P < .01 • We reject the null hypothesis if the P value is less than the alpha value and otherwise retain it.
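The decision rule in the last bullet can be written as a small Python function. This is a minimal sketch; the function name `decide` and its return strings are hypothetical, chosen only for illustration.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject the null hypothesis only if P < alpha."""
    if not 0 <= p_value <= 1:
        raise ValueError("p_value must be between 0 and 1")
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return "reject null hypothesis" if p_value < alpha else "retain null hypothesis"


# The same P value can lead to different decisions under different alpha values.
print(decide(0.03, alpha=0.05))  # reject null hypothesis
print(decide(0.03, alpha=0.01))  # retain null hypothesis
```

The example shows why the alpha value has to be fixed before the test is run: choosing it afterwards would let the researcher pick whichever decision they preferred.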
Output #2: The t-test for differences between two means (Independent samples test)
A “hopeful” example of how to test for differences 5. Interpret whether the results are statistically significant • Is there a statistically significant difference between the two means? • Can we reject the null hypothesis of no difference?
The results • ‘a t of 5.010 with 58 degrees of freedom indicates that the difference between first and final year students in their mean hope scores is statistically significant at the .001 level.’
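For readers who want to see where a t-ratio like this comes from, here is a minimal Python sketch of the independent samples t-test with equal variances assumed, using the rounded means and standard deviations reported earlier (N = 30 per group). The function name `pooled_t` is hypothetical. Because the inputs are rounded, the result comes out at roughly 5.04 rather than exactly the 5.010 in the output, which was presumably computed from the unrounded data.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent samples t-ratio with a pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, df


# Summary statistics from the slides (rounded values).
t, df = pooled_t(27.7, 3.706, 30, 22.2, 4.688, 30)
print(f"t = {t:.2f}, df = {df}")
```

The t-ratio is simply the difference between the means divided by the standard error of that difference, so a larger difference or less variable scores both push t upwards.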
99% Confident (1 chance out of 100 of a Type I error) [Figure: two distributions with their means marked]
Have we made the right decision? Type I and Type II errors
Type I and Type II errors • Setting a level of significance does not mean we will always be able to say with 100% confidence that we are correct in retaining or rejecting the null hypothesis. • With an α = .05 level of significance there is a 5 in 100 chance (P = .05) of wrongly rejecting a null hypothesis that is actually true, and with an α = .01 level of significance a 1 in 100 chance (P = .01).
Type I and Type II errors • Type I error (α) – Rejecting the null hypothesis (stating there IS a difference between means) when we should have retained it (because there IS NOT a difference between means). • The more stringent our level of significance (e.g. α = .01 rather than α = .05), the less likely we are to make a Type I error • Type II error (β) – Retaining the null hypothesis (stating there IS NOT a difference between means) when we should have rejected it (because there IS a difference between means) • Increasing the size of the samples makes a Type II error less likely
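Type I and Type II errors: decision versus reality • Null hypothesis is true (reality): retaining the null hypothesis is a correct decision; rejecting it is a Type I error • Null hypothesis is false (reality): retaining the null hypothesis is a Type II error; rejecting it is a correct decision • Source: Levin, J., & Fox, J.A. (2003). Elementary statistics in social research (9th ed.). Boston: Allyn and Bacon.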
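One way to get a feel for the Type I error rate is a small simulation. The Python sketch below repeatedly draws two samples of 30 from the same hypothetical population (so the null hypothesis is true by construction) and counts how often a t-test at α = .05 wrongly rejects it. The population mean and sd, the number of trials, and the function names are all assumptions for illustration; the critical value of about 2.002 is the approximate two-tailed value for 58 degrees of freedom from a t table.

```python
import math
import random
import statistics

# Approximate two-tailed critical t for alpha = .05 with df = 58 (from a t table).
CRITICAL_T = 2.002


def t_ratio(a, b):
    """Independent samples t-ratio with a pooled variance."""
    df = len(a) + len(b) - 2
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / df
    se_diff = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def type_i_error_rate(trials=2000, n=30, seed=1):
    """Share of trials in which a true null hypothesis is wrongly rejected."""
    rng = random.Random(seed)
    false_rejections = 0
    for _ in range(trials):
        # Both samples come from the SAME hypothetical population (mean 25, sd 4).
        a = [rng.gauss(25, 4) for _ in range(n)]
        b = [rng.gauss(25, 4) for _ in range(n)]
        if abs(t_ratio(a, b)) > CRITICAL_T:
            false_rejections += 1
    return false_rejections / trials


rate = type_i_error_rate()
print(f"Proportion of false rejections: {rate:.3f}")
```

The proportion should land close to .05: even when there is no real difference, about 5 in every 100 tests will appear significant through sampling error alone.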
Example 2: How do placements in kinship care compare with those in non-kin foster care? (Farmer, 2009)
1. Develop a research hypothesis (or question) to test (or answer) • Are there differences in parent-related adversities and child-related adversities of children placed with kin versus children placed with unrelated foster carers? (Farmer, 2009)
2. Select or determine the main variable of interest to measure (data) • Parent-related adversities (independent variable) – death of a parent; drug misuse; mental health problems • Child-related adversities (independent variable) – neglect; sexual abuse; domestic violence • Placement of child (either with kin or with non-related foster carers)
3. Select your sample, gather the data and calculate descriptive statistics of the sample • List of 2240 children • A sample of 270 children was selected, just over half of whom (53%) were placed with family or friends and just under half (47%) with unrelated foster carers. • Reviewed case files, interviews using a semi-structured interview format, standardised measures
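The sample description above can be turned into approximate group sizes with some simple arithmetic, sketched here in Python. The counts are estimates only, because they are reconstructed from the rounded percentages on the slide rather than taken from the study itself; the variable names are hypothetical.

```python
listed = 2240   # children on the original list
sampled = 270   # children selected for the sample

# Approximate group sizes reconstructed from the rounded percentages.
kin = round(0.53 * sampled)
unrelated = round(0.47 * sampled)
sampling_fraction = sampled / listed

print(f"Placed with family or friends: about {kin}")
print(f"Placed with unrelated foster carers: about {unrelated}")
print(f"Share of the list that was sampled: {sampling_fraction:.1%}")
```

Knowing the approximate size of each group matters for the later test of difference, since small or very unequal groups affect how much confidence can be placed in the result.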