Introduction to Clinical Research: Basic Statistical Analysis (Practical Approach). PhD course Introduction to Clinical Research, University of Copenhagen, Bispebjerg University Hospital, March 2013
Ingredients:
• Choice of method – what kind of data for the analysis
• Normally distributed continuous variable, 2 groups
  • Paired – t-test
  • Unpaired – F-test and t-test
• Normally distributed continuous variable, more groups
  • ANOVA
• Dichotomous variable
  • Chi-square test and fictive SD (see the sketch after this list)
• Spreadsheet, functions and macros
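As a minimal, hedged sketch of the dichotomous case mentioned in the list above (not part of the original course material): a chi-square test on a 2x2 table in Python with SciPy, using invented counts.

from scipy.stats import chi2_contingency

# Invented 2x2 table: responders / non-responders in two treatment groups
table = [[30, 20],
         [18, 32]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, dof = {dof}")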
Where was the mean value determined best? The precision is, also intuitively, dependent on the relation between "height and width" of the distribution. These relations are described by the t-, Chi2- and F-distributions. The t-distribution is used to evaluate the height-width relation when looking at the normal distribution.
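A small illustrative sketch of that height-width relation, using invented blood-pressure values and SciPy's t-distribution: the SEM measures the "width" of the estimate of the mean, and the t-distribution gives the 95% cut-off.

import numpy as np
from scipy import stats

data = np.array([118, 124, 131, 127, 122, 135, 129, 126])  # fictive blood pressures
mean, sd = data.mean(), data.std(ddof=1)
sem = sd / np.sqrt(len(data))                  # precision ("width") of the mean
t_crit = stats.t.ppf(0.975, df=len(data) - 1)  # 95% cut-off from the t-distribution
print(f"mean = {mean:.1f}, SEM = {sem:.2f}, "
      f"95% CI = ({mean - t_crit*sem:.1f}, {mean + t_crit*sem:.1f})")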
Paired data – evaluation of the medicine – NOT evaluation of the population blood pressure (which is just "noise")
Fictive, illustrative curves with the same SEM (no need for an F-test), "no overlap" (at the 95% cut-off), p-value <0.001
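A minimal sketch of a paired analysis (fictive before/after blood pressures, not the course data), using SciPy's paired t-test, which works on the within-person differences so the between-person "noise" drops out:

import numpy as np
from scipy import stats

before = np.array([152, 148, 160, 155, 149, 158, 151, 157])  # fictive values
after  = np.array([141, 140, 150, 146, 143, 149, 142, 148])

t, p = stats.ttest_rel(before, after)   # paired t-test on the differences
print(f"mean difference = {np.mean(before - after):.1f}, t = {t:.2f}, p = {p:.4f}")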
Unpaired data – it is not possible to subtract GL persons from KBH persons
Fictive curves with different SEM (F-test needed!), overlap (at the 95% cut-off), p-value >0.05
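A hedged sketch of the unpaired case with invented groups: first an F-test on the ratio of the sample variances, then the t-test variant chosen accordingly (Welch's t-test if the variances differ). All values are made up for illustration only.

import numpy as np
from scipy import stats

gl  = np.array([135, 142, 150, 128, 160, 138, 155, 147])  # fictive group 1
kbh = np.array([129, 131, 134, 127, 133, 130, 136, 128])  # fictive group 2

# F-test: ratio of the sample variances against the F-distribution (two-sided)
f_ratio = np.var(gl, ddof=1) / np.var(kbh, ddof=1)
p_f = 2 * min(stats.f.sf(f_ratio, len(gl) - 1, len(kbh) - 1),
              stats.f.cdf(f_ratio, len(gl) - 1, len(kbh) - 1))

# Unpaired t-test; equal-variance variant only if the F-test is non-significant
t, p = stats.ttest_ind(gl, kbh, equal_var=(p_f > 0.05))
print(f"F = {f_ratio:.2f} (p = {p_f:.3f}), t = {t:.2f}, p = {p:.4f}")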
ANOVA test (Analysis Of Variance)
• One variable and more than two groups (one-way ANOVA)
• Two or more varying parameters (two-way / multivariate ANOVA)
• A one-way ANOVA with just two groups is the same as a t-test, whereas several t-tests carry the risk of mass-significance problems
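A minimal sketch of a one-way ANOVA with three invented groups, using SciPy's f_oneway: one test instead of three pairwise t-tests, avoiding the mass-significance problem.

from scipy import stats

group_a = [5.1, 4.8, 5.6, 5.0, 4.9]   # fictive measurements
group_b = [5.9, 6.1, 5.7, 6.3, 6.0]
group_c = [5.2, 5.5, 5.0, 5.4, 5.3]

f, p = stats.f_oneway(group_a, group_b, group_c)  # single one-way ANOVA
print(f"F = {f:.2f}, p = {p:.4f}")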
So now we have seen that the different tests end up with a p-value. Let's make sure we have the definition straight: Definition: P-value: the probability of getting the observed results (or more extreme ones) even though the null hypothesis is true.
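A small simulation illustrating this definition (my own sketch, assuming a two-sample t-test on simulated data): when the null hypothesis is true, "significant" results at the 5% level still appear about 5% of the time.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
p_values = []
for _ in range(10_000):
    a = rng.normal(0, 1, 20)   # both groups drawn from the same distribution,
    b = rng.normal(0, 1, 20)   # i.e. the null hypothesis is true
    p_values.append(stats.ttest_ind(a, b).pvalue)

print(f"fraction of p < 0.05 under the null: {np.mean(np.array(p_values) < 0.05):.3f}")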
Often, with more than one result, those results are evaluated statistically even though any result other than the primary endpoint only carries hypothesis-generating potential. With more results there is a risk of mass-significance. To avoid that, a Bonferroni correction is needed: The Bonferroni correction is the simplest correction for mass-significance and it is very conservative. It is only valid for evaluating significance, i.e. preventing Type I errors. Power is lost when using Bonferroni. Bonferroni is only usable with a small number of results (<30).
Example why the Bonferroni correction is needed (20 independent tests at alpha = 0.05):
P(at least one significant result) = 1 − P(no significant result)
= 1 − (1 − 0.05)^20
≈ 0.64
How the Bonferroni correction is done: Simple: the overall significance level (alpha) is divided by the number of results evaluated, i.e. with i results, alpha_i = alpha / i.
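The arithmetic from the example above, written out as a short calculation (20 tests at alpha = 0.05):

alpha, n_tests = 0.05, 20
p_at_least_one = 1 - (1 - alpha) ** n_tests   # 1 - 0.95**20 ≈ 0.64
alpha_corrected = alpha / n_tests             # Bonferroni: 0.05 / 20 = 0.0025
print(f"P(at least one significant) = {p_at_least_one:.2f}, "
      f"corrected alpha = {alpha_corrected}")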
Going into more detail with the statistical analysis, you should consider: What is the origin of the noise, and how much does it contribute to the SD? What type of noise is it, and what is it indicating?
Now, what if one of the noise components is systematic? (Figure: random noise vs. bias (systematic noise) vs. ideal.)
Remember: We can (try to) design the experiment to avoid bias. We can (try to) use skill to reduce random noise.
Tips, tricks and drop-outs: If at all possible, always show your original data, e.g. in a graph or in a table. In that way you have been as honest as possible to your findings and readers, who can then, when viewing your data, decide for themselves whether they want to believe it or not. Many continuous data sets have a right-skewed distribution. This can often be "corrected" to a normal distribution by taking log(data points) – and then the t-, F- or ANOVA tests can be done (see the sketch below). Remember, all the statistical calculations are based on your own subjective assumptions – do you believe the results yourself?
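A hedged sketch of that log-transformation trick on simulated right-skewed data (not the course data):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
raw = rng.lognormal(mean=2.0, sigma=0.5, size=50)   # fictive right-skewed measurements

logged = np.log(raw)                                # often roughly normal after log
print(f"skewness raw: {stats.skew(raw):.2f}, after log: {stats.skew(logged):.2f}")
# the t-, F- or ANOVA tests can then be run on the log-transformed values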
http://en.wikipedia.org/wiki/List_of_statistical_packages http://office.microsoft.com/en-us/excel-help/load-the-analysis-toolpak-HP001127724.aspx