Bread and butter statistics: RCGP Curriculum Statement 3.5: Evidence-Based Practice

kostya

Presentation Transcript


  1. Bread and butter statistics:RCGP Curriculum Statement 3.5: Evidence-Based Practice VTS 02/02/2011

  2. Possible topics for today • Audit - definition • Research – definition • Bias • Blinding • Confidence intervals • Forest plot • L’Abbé plot • Hypothesis • Null hypothesis • Incidence • Prevalence • Normal distribution • Parameter • Statistic • Variable • P-value • Number needed to treat • Number needed to harm • Statistical power • Sensitivity • Positive predictive value • Specificity • Reliability • Validity

  3. Useful Websites http://www.medicine.ox.ac.uk/bandolier http://www.cebm.net http://www.library.nhs.uk/Default.aspx http://cochrane.co.uk/en/clib.html http://dtb.bmj.com/ http://www.nice.org.uk/

  4. Topics for today - 1 Audit - definition Research – definition Bias Blinding Confidence intervals Forest Plot

  5. Topics for today - 2 L’Abbé plot Hypothesis Null hypothesis Incidence Prevalence Normal distribution

  6. Topics for today - 3 Parameter Statistic Variable P-value Number needed to treat Number needed to harm

  7. Topics for today - 4 Statistical power Sensitivity Positive predictive value Specificity Reliability Validity

  8. Audit – definition Clinical audit is a quality improvement process. It seeks to improve patient care and outcomes by systematic review of care against explicit criteria and the implementation of change. Aspects of the structure, processes, and outcomes of care are selected and systematically evaluated against explicit criteria. Where indicated, changes are implemented at an individual, team or service level and further monitoring is used to confirm improvement in healthcare delivery (from NICE)

  9. The Audit cycle Identify the need for change Problems may be identified in 3 areas: Structure, Process, Outcome Set Criteria and Standards - what should be happening Collect data on performance Assess performance against criteria and standards Identify changes needed

  10. The Audit cycle

  11. Research - definition Research is an ORGANISED and SYSTEMATIC way of FINDING ANSWERS to QUESTIONS. SYSTEMATIC - certain things are always done in research in order to get the most accurate results. ORGANISED - there is a structure or method in doing research. It is a planned procedure, focused and limited to a specific scope. FINDING ANSWERS is the aim of all research. Whether it is the answer to a hypothesis or a question, research is successful when answers are found, even if they are negative. QUESTIONS are central to research. Research is focused on relevant, useful, and important questions. Without a question, research has no purpose

  12. Bias Dictionary definition - 'a one-sided inclination of the mind'. In research it describes a systematic tendency of certain trial designs to produce results consistently better or worse than other designs. In studies of the effects of health care, bias can arise from: • systematic differences in the groups compared (selection bias) • the care that is provided, or exposure to other factors apart from the intervention of interest (performance bias) • withdrawals or exclusions of people entered into the study (attrition bias) • how outcomes are assessed (detection/observer bias). This use of bias does not necessarily imply any prejudice, such as the investigators' desire for particular results, and so differs from the conventional use of the word to mean a partisan point of view

  13. Blinding Participants, investigators and/or assessors do not know which treatments participants are receiving. Lack of blinding is a potent source of bias, and open studies or single-blind studies have potential problems for interpreting results. In a single-blind study, either the participants or the assessors (those making the measurements of interest) are blind to the allocations. In a double-blind study, both participants and assessors are blind to the allocations. To achieve a double-blind state, it is usual to use matching active and control treatments, e.g. active and placebo tablets can be made to look the same

  14. Blinding If treatments are radically different (e.g. tablets compared with injection) a double-dummy technique may be used where all patients receive both an injection and a tablet to maintain blinding Concealment of allocation - the process used to prevent foreknowledge of group assignment in a randomised controlled trial, which is distinct from blinding. This process should be impervious to any influence by the individual making the allocation. Adequate methods of allocation concealment include: centralized randomisation schemes; numbered or coded containers in which capsules from identical-looking, numbered bottles are administered sequentially; sequentially numbered opaque, sealed envelopes

  15. Confidence intervals Quantify the uncertainty in a measurement. Usually reported as a 95% CI: the range of values within which we can be 95% sure that the true value lies. For example, for an NNT of 10 with a 95% CI of 5 to 15, there is 95% confidence that the true NNT lies between 5 and 15

  16. Confidence intervals Confidence intervals are preferable to p-values, as they tell us the range of possible effect sizes compatible with the data A confidence interval that includes the value of no difference indicates that the treatment under investigation is not significantly different from the control Confidence intervals aid interpretation of clinical trial data by putting upper and lower limits on the likely size of any true effect

  17. Confidence intervals Bias must be assessed before confidence intervals can be interpreted. Even very large samples and very narrow confidence intervals can mislead if they come from biased studies. Non-significance does not mean no effect. Small studies may report non-significance even when there are important, real effects. Statistical significance does not necessarily mean that the effect is real: at the 5% level, about one in 20 comparisons where there is no true effect will appear significant by chance alone. Statistical significance does not necessarily mean clinically important - the size of the effect determines the importance
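
The points about the value of no difference and about small studies can be illustrated with a short sketch. It assumes the simple normal (Wald) approximation for the difference between two event rates, and the trial counts are hypothetical, chosen only for illustration; neither comes from the slides.

```python
import math

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def risk_difference_ci(control_events, control_n, treated_events, treated_n):
    """Return (lower, upper) 95% CI for CER - EER (normal approximation)."""
    cer = control_events / control_n  # control group event rate
    eer = treated_events / treated_n  # experimental group event rate
    arr = cer - eer                   # absolute risk reduction
    se = math.sqrt(cer * (1 - cer) / control_n + eer * (1 - eer) / treated_n)
    return arr - Z_95 * se, arr + Z_95 * se


def includes_no_difference(ci):
    """True if the interval contains zero, the value of no difference."""
    lower, upper = ci
    return lower <= 0 <= upper


# Hypothetical trials with the same event rates (20% vs 10%) but different sizes.
bigger_trial = risk_difference_ci(20, 100, 10, 100)
smaller_trial = risk_difference_ci(10, 50, 5, 50)

print(bigger_trial, includes_no_difference(bigger_trial))
print(smaller_trial, includes_no_difference(smaller_trial))
```

With identical event rates, the smaller hypothetical trial gives a wider interval that crosses zero, while the bigger one does not: non-significance here reflects sample size, not the absence of an effect.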

  18. Forest (meta-analysis) plot In a Forest plot, the results of component studies are shown as squares centred on the point estimate of the result of each study. A horizontal line runs through the square to show its confidence interval (usually, but not always, a 95% confidence interval). The overall estimate from the meta-analysis and its confidence interval are put at the bottom, represented as a diamond. The centre of the diamond represents the pooled point estimate, and its horizontal tips represent the confidence interval. Significance is achieved at the set level if the diamond is clear of the line of no effect

  19. Forest (meta-analysis) plot The plot allows readers to see the information from the individual studies which went into the meta-analysis at a glance It provides a simple visual representation of the amount of variation between the results of the studies, as well as an estimate of the overall result of all the studies together

  20. Meta-analysis of effect of beta blockers on mortality after myocardial infarction Lewis and Ellis, 1982

  21. In the modern format

  22. L'Abbé plot • A simple scatter plot which can yield a comprehensive qualitative view of the data • If the experimental treatment is better than the control, the point will lie in the upper left of the plot, between the y axis and the line of equality • If experimental is no better than control, the point will fall on the line of equality, and if control is better than experimental, the point will be in the lower right of the plot, between the x axis and the line of equality

  23. L'Abbé plot • Visual inspection gives a quick and easy indication of the level of agreement among trials • L'Abbé plots are becoming widely used • They have several benefits: the simple visual presentation is easy to assimilate. They make us think about the reasons why there can be such wide variation in responses. They explain the need for placebo controls. They keep us sceptical about overly good or bad results • Figure: Trazodone for erectile dysfunction in psychogenic erectile dysfunction (dark symbols) and with physiological or mixed aetiology (light symbols)

  24. Hypothesis “A tentative supposition with regard to an unknown state of affairs, the truth of which is … subject to investigation by any available method, either by logical deduction of consequences which may be checked against what is known, or by direct experimental investigation or discovery of facts not hitherto known and suggested by the hypothesis” “A proposition presented as a supposition rather than asserted. A hypothesis may be put forward for testing or for discussion, possibly as a prelude to acceptance or rejection” “Treating hypertension reduces myocardial infarction rate”. “Treating sore throat with penicillin reduces the rate of glomerulonephritis”. “Osteopathy is a good treatment for mechanical low back pain”. Etc.

  25. Null hypothesis “The statistical hypothesis that one variable (e.g. whether or not a study participant was allocated to receive an intervention) has no association with another variable or set of variables (e.g. whether or not a study participant died), or that two or more population distributions do not differ from one another”. In its simplest terms, the null hypothesis states that the results observed in a study are no different from what might have occurred as a result of chance

  26. Incidence The proportion of new cases of the target disorder in the population at risk during a specified time interval It is usual to define the disorder, and the population, and the time, and report the incidence as a rate Statement of the sort: Most developed countries with northern European age structure have an incidence of Parkinson’s disease of between 12 and 20 cases per 100 000 per year

  27. Prevalence This is a measure of the proportion of people in a population who have a disease at a point in time, or over some period of time. E.g. there was a study of incidence and prevalence of MS in the Lothian and Border region of Scotland in the mid-1990s, with a population of about 864,000. Annual incidence was 12 per 100,000. If probable cases were also included, the rate rose to 18 per 100,000. Prevalence was determined by defining a prevalent case as any person with a diagnosis of multiple sclerosis alive and normally resident in the area on 15 March 1995. Probable as well as definite cases were included. There were 1613 residents with a diagnosis of MS, giving a crude prevalence rate of 187/100,000. The sex ratio was 2.5 and the mean age was 49 years.
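
The crude prevalence figure quoted for the MS study can be reproduced from the case count and population on the slide. A minimal sketch; the function name is an illustrative choice, not something from the presentation.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Express a count of cases as a rate per 100,000 population."""
    return cases / population * 100_000


# Figures from the slide: 1613 prevalent cases in a population of about 864,000.
crude_prevalence = rate_per_100k(1613, 864_000)
print(round(crude_prevalence))  # 187
```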

  28. Normal distribution Normal distributions are a family of distributions that have the same general shape. They are symmetrical, with scores more concentrated in the middle than in the tails. Normal distributions are sometimes described as bell shaped. Also called the Gaussian distribution. Can be manipulated mathematically. Similar to the pattern of distribution of naturally occurring phenomena. All normal density curves satisfy the following property, often referred to as the Empirical Rule: • about 68% of the observations fall within 1 standard deviation of the mean • about 95% of the observations fall within 2 standard deviations of the mean • about 99.7% of the observations fall within 3 standard deviations of the mean. I.e. for a normal distribution, almost all values lie within 3 standard deviations of the mean
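
The Empirical Rule percentages can be checked directly from the normal distribution. A minimal sketch using the standard identity that the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2); the function name is illustrative.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```

This gives roughly 68.3%, 95.4% and 99.7% for 1, 2 and 3 standard deviations.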

  29. Parameter A parameter is a number computed from a population. This contrasts with the definition of a statistic A parameter is a constant, unchanging value. There is no random variation in a parameter. If the size of the population is large (as is typically the case), then a parameter may be difficult to compute An example of a parameter would be: the average length of stay in the birth hospital for all infants born in the UK

  30. Statistic A statistic is a number computed from a sample. This contrasts with the definition of a parameter. If a statistic is computed from a random sample (typically the case), then it has random variation or sampling error. An example of a statistic would be: the average length of stay in the birth hospital for a random sample of 387 infants born in BHRUT

  31. Variable A measurement that can vary within a study, e.g. the age of participants Variability is present when differences can be seen between different people or within the same person over time, with respect to any characteristic or feature that can be assessed or measured

  32. P-value The probability (ranging from zero to one) of obtaining results at least as extreme as those observed in a study if chance alone were operating (i.e. if the null hypothesis were true). Calculated using a statistical test. Convention is to accept a p-value of 0.05 (5%) or below as being statistically significant. This is equivalent to a 1 in 20 chance of such results arising at random, which is not very unlikely. The threshold has no solid mathematical basis; it was just chosen many years ago. When many comparisons are being made, statistical significance can occur just by chance. A more stringent rule is to use a p-value of 0.01 (1 in 100) or below as statistically significant
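
The warning about many comparisons can be put into numbers. The sketch below assumes the comparisons are independent and that no real effect exists in any of them; both are simplifying assumptions, and the function name is illustrative.

```python
def chance_of_false_positive(n_comparisons: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent comparisons is
    'significant' at level alpha when there is no real effect in any of them."""
    return 1 - (1 - alpha) ** n_comparisons


for n in (1, 5, 20):
    at_5_percent = chance_of_false_positive(n, alpha=0.05)
    at_1_percent = chance_of_false_positive(n, alpha=0.01)
    print(f"{n:>2} comparisons: {at_5_percent:.0%} at p<0.05, {at_1_percent:.0%} at p<0.01")
```

Under these assumptions, 20 comparisons at the 0.05 level give roughly a 64% chance of at least one spurious significant result, which is why the stricter 0.01 threshold is suggested.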

  33. Number needed to treat (NNT) The inverse of the absolute risk reduction, the number of patients that need to be treated for one to benefit compared with a control The ideal NNT is 1, where everyone improves with treatment and no-one does with control. The higher the NNT, the less effective is the treatment The value of an NNT is not just numeric - NNTs of 2-5 are indicative of effective therapies, like analgesics for acute pain NNTs of about 1 might be seen in treating sensitive bacterial infections with antibiotics, while an NNT of 40 or more might be useful e.g. when using aspirin after a heart attack

  34. Calculating NNT NNT = 1/ARR, where ARR = (CER – EER), CER = control group event rate and EER = experimental group event rate. Sample Calculation: The results of the Diabetes Control and Complications Trial into the effect of intensive diabetes therapy on the development and progression of neuropathy indicated that neuropathy occurred in 9.6% of patients randomised to usual care and 2.8% of patients randomised to intensive therapy. NNT with intensive diabetes therapy to prevent one additional occurrence of neuropathy can be determined by calculating the absolute risk reduction as follows: ARR = (CER – EER) = (9.6% - 2.8%) = 6.8%; NNT = 1/ARR = 1/0.068 = 14.7, rounded up to 15. Therefore 15 diabetic patients need to be treated with intensive therapy to prevent one from developing neuropathy
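
The sample calculation can be written as a short sketch. The function names are illustrative; the event rates are the ones quoted on the slide. Rounding up to a whole patient follows the slide's "14.7 or 15".

```python
import math


def absolute_risk_reduction(cer: float, eer: float) -> float:
    """ARR = control event rate minus experimental event rate (as proportions)."""
    return cer - eer


def number_needed_to_treat(cer: float, eer: float) -> float:
    """NNT = 1 / ARR; only meaningful when the treatment reduces the event rate."""
    arr = absolute_risk_reduction(cer, eer)
    if arr <= 0:
        raise ValueError("No risk reduction: NNT is not defined")
    return 1 / arr


# Neuropathy in 9.6% with usual care versus 2.8% with intensive therapy.
nnt = number_needed_to_treat(cer=0.096, eer=0.028)
print(round(nnt, 1), math.ceil(nnt))  # 14.7 15
```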

  35. Number needed to treat Response to antibiotics of women with symptoms of UTI but negative dipstick urine test results: double blind RCT. Richards et al, BMJ 2005;331:143-6. To reduce duration of symptoms by 2 days? 4 Antibiotic prescribing in GP and hospital admissions for peritonsillar abscess, mastoiditis and rheumatic fever in children: time trend analysis. Sharland et al, BMJ 2005, 331, 328-9. To prevent one case of mastoiditis? At least 2500 Trigeminal neuralgia Rx anticonvulsants. To obtain 50% pain relief? 2.5

  36. Number needed to treat Arthritis Rx glucosamine for 3-8 weeks cf. placebo. To improve symptoms? 5 So why not prescribe it? http://www.nice.org.uk/nicemedia/pdf/CG59NICEguideline.pdf MRC trial of treatment of mild hypertension: principal results. 17,354 individuals aged 36-64 years with diastolic 90-109 mmHg Rx bendrofluazide or propranolol for 5.5 years cf. placebo. BMJ 1985 291: 97-104. Primary prevention of one stroke at one year? 850 Benign prostatic hypertrophy Rx finasteride for 2 years vs placebo. To prevent one operation? 39

  37. Number needed to harm (NNH) This is calculated in the same way as for NNT, but is used to describe adverse events For NNH, large numbers are good, because they mean that adverse events are rare. Small values for NNH are bad, because they mean adverse events are common

  38. Number needed to harm An example of how NNH can be calculated alongside NNT is that of inhaled corticosteroids used for asthma (Powell & Gibson. Inhaled corticosteroid doses in asthma: an evidence-based approach. Medical Journal of Australia 2003 178: 223-225). At low daily doses of 100 or 200 μg/day, neither dysphonia nor oral candidiasis was much of a problem, affecting about an additional one person per hundred treated, with NNHs of about 100. At daily doses of 500 μg and above, numbers needed to harm (NNH) fell to levels of about 20 or below, indicating that for every 100 patients treated with these doses, about an additional 5 would experience dysphonia and 5 would experience oral candidiasis
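
The conversion between "additional people harmed per hundred treated" and NNH in the inhaled corticosteroid example is just a reciprocal. A minimal sketch with illustrative function names:

```python
def number_needed_to_harm(extra_events_per_100: float) -> float:
    """NNH = 1 / absolute risk increase, with the increase given per 100 treated."""
    return 100 / extra_events_per_100


def extra_events_per_100(nnh: float) -> float:
    """Additional adverse events expected per 100 patients treated."""
    return 100 / nnh


print(number_needed_to_harm(1))   # low dose: about 1 extra case per 100 treated
print(extra_events_per_100(20))   # high dose: NNH of about 20
```

One additional case per hundred corresponds to an NNH of 100, and an NNH of 20 corresponds to 5 additional cases per hundred, matching the figures on the slide.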

  39. Statistical power The ability of a study to demonstrate an association or causal relationship between two variables, given that an association exists. For example, 80% power in a clinical trial means that the study has an 80% chance of ending up with a p value of less than 5% in a statistical test (i.e. a statistically significant treatment effect) if there really was an important difference (e.g. 10% versus 5% mortality) between treatments. If the statistical power of a study is low, the study results will be questionable, e.g. the study might have been too small to detect any differences

  40. Statistical power Factors influencing power in a statistical test include: • What kind of statistical test is being performed. Some statistical tests are inherently more powerful than others • Sample size. In general, the larger the sample size, the larger the power. However, increasing sample size generally involves costs in time, money, and effort. It is therefore important to make sample size "large enough," but not wastefully large • The size of experimental effects • The level of error in experimental measurements. By convention, 80% is an acceptable level of power
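
The effect of sample size on power can be sketched for the 10% versus 5% mortality example. This uses a common normal approximation for comparing two proportions with equal group sizes and a two-sided 5% test; it is a rough illustration under those assumptions, not a substitute for a formal sample size calculation, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def approximate_power(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions
    (normal approximation, equal group sizes, unpooled standard error)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = math.sqrt(
        p_control * (1 - p_control) / n_per_group
        + p_treated * (1 - p_treated) / n_per_group
    )
    return std_normal.cdf(abs(p_control - p_treated) / se - z_crit)


# 10% versus 5% mortality, with increasing numbers of patients per group.
for n in (100, 200, 435, 800):
    print(n, f"{approximate_power(0.10, 0.05, n):.0%}")
```

Under this approximation, about 435 patients per group are needed to reach the conventional 80% power for this difference, while 100 per group gives under 30%.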

  41. Sensitivity Proportion of people with the target disorder who have a positive test/symptom/sign. Used to assist in assessing and selecting a diagnostic test/symptom/sign. A seNsitive test keeps false-Negatives down – 100% sensitive means everyone with the condition has a positive test. SnNout: When a sign/test/symptom has a high Sensitivity, a Negative result rules out the diagnosis. For example, the sensitivity of a history of ankle swelling for diagnosing ascites is 93%; if a person does not have a history of ankle swelling, it is therefore highly unlikely that they have ascites

  42. Specificity Proportion of people without the target disorder who have a negative test. It is used to assist in assessing and selecting a diagnostic test/symptom/sign. A sPecific test keeps false-Positives down – 100% specific means everyone without the condition has a negative test. SpPin: When a sign/test/symptom has a high Specificity, a Positive result rules in the diagnosis. For example, the specificity of a fluid wave for diagnosing ascites is 92%; therefore if a person does have a fluid wave, it rules in the diagnosis of ascites

  43. Specificity and Sensitivity are closely related to the measures of: Positive Predictive Value: The proportion of people with a positive test who have the target disorder; and Negative Predictive Value: The proportion of people with a negative test who do not have the target disorder. Calculations from the table (a = true positives, b = false positives, c = false negatives, d = true negatives): sensitivity = a/(a+c); specificity = d/(b+d); positive predictive value = a/(a+b); negative predictive value = d/(c+d)
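
The four formulas can be collected into one small function. In this sketch, a, b, c and d are the cells of the usual 2x2 table (true positives, false positives, false negatives, true negatives), as implied by the formulas on the slide; the counts in the example are hypothetical.

```python
def diagnostic_measures(a: int, b: int, c: int, d: int) -> dict:
    """a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive predictive value": a / (a + b),
        "negative predictive value": d / (c + d),
    }


# Hypothetical test results: 100 people with the disorder, 200 without.
measures = diagnostic_measures(a=90, b=20, c=10, d=180)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity depend only on the columns of the table (people with and without the disorder), whereas the predictive values also depend on how common the disorder is in the group tested.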

  44. Reliability Reproducibility Stability over time and place Ease of replication Minimisation of observer variation Confirmation of results

  45. Validity This term is a difficult concept in clinical trials, but refers to a trial being able to measure what it sets out to measure. A trial that set out to measure the analgesic effect of a procedure might be in trouble if patients had no pain. Or, in a condition where treatment is life-long, evaluating an intervention for 10 minutes is inappropriate
