Data exploration and descriptive statistical analysis
Marek Majdan
Training in essential biostatistics for Public Health Professionals in BiH, Marek Majdan, PhD; marekmajdan@gmail.com
Data Types
• Categorical: a limited number of values; nominal (e.g. sex) or ordinal (e.g. education)
• Numerical (continuous): an unlimited number of values (e.g. age, height, weight)
Continuous data
Main descriptive statistical methods:
• Normality of distribution (can the data be considered normally distributed?)
• Measures of central tendency and variance (where is the center of our data, and how are the remaining values distributed around it?)
Normality of distribution: the 68-95-99.7% rule
• In a normal distribution, about 68% of values lie within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3
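The three percentages in the rule can be recovered from the cumulative distribution function of the standard normal distribution. The following is a minimal sketch using only Python's standard library; the helper name `share_within` is an assumption made for illustration and is not part of the original slides.

```python
from statistics import NormalDist


def share_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The rule is a rounded version of these values, so real data that is roughly normal should show similar shares around its own mean.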
Normality of distribution
Statistical tests:
• Shapiro-Wilk test
• Kurtosis and skewness
Graphical displays:
• Histogram with distribution density curve
• Q-Q plots
• Boxplots
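Skewness and kurtosis can be computed directly from the central moments of the data. The sketch below uses the simple moment-based definitions (one of several conventions; statistical packages may apply small-sample corrections and give slightly different numbers). The function names and the sample values are hypothetical. The Shapiro-Wilk test is not implemented here; in Python it is commonly run through `scipy.stats.shapiro`.

```python
from statistics import fmean


def _central_moment(values, order):
    """Average of (x - mean) ** order over all values."""
    mean = fmean(values)
    return sum((x - mean) ** order for x in values) / len(values)


def skewness(values):
    """Moment-based skewness m3 / m2**1.5; 0 for perfectly symmetric data.

    Raises ZeroDivisionError if all values are identical.
    """
    return _central_moment(values, 3) / _central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based excess kurtosis m4 / m2**2 - 3; 0 for a normal distribution."""
    return _central_moment(values, 4) / _central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]         # hypothetical values
right_skewed = [1, 1, 2, 2, 3, 10]  # hypothetical values with a long right tail

print(skewness(symmetric))          # 0.0
print(skewness(right_skewed) > 0)   # True
```

Values of skewness and excess kurtosis far from 0 suggest a departure from normality; how far is "too far" depends on the sample size and the convention used.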
Histogram with distribution density curve
Q-Q plot
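A normal Q-Q plot pairs each sorted observation with the quantile a normal distribution would be expected to produce at the same position; if the data is roughly normal, the points fall close to a straight line. The sketch below only computes the point pairs (plotting is left out). The function name `qq_points`, the sample values, and the plotting positions (i - 0.5) / n are assumptions; other position formulas are also in use.

```python
from statistics import NormalDist, fmean, stdev


def qq_points(values):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD.
    reference = NormalDist(fmean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    return [
        (reference.inv_cdf((i - 0.5) / n), observed)
        for i, observed in enumerate(ordered, start=1)
    ]


heights = [158.0, 171.0, 165.0, 180.0, 162.0, 175.0, 168.0]  # hypothetical values
for theoretical, observed in qq_points(heights):
    print(f"{theoretical:7.1f}  {observed:7.1f}")
```

Plotting the first column against the second gives the Q-Q plot; systematic curvature away from the diagonal indicates skewness or heavy tails.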
Boxplot
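The numbers behind a boxplot can be computed without drawing it. The sketch below assumes the widely used Tukey convention, in which whiskers extend to the most extreme values within 1.5 × IQR of the box and anything beyond is flagged as an outlier; the slides do not state which convention they use. The function name, the quartile method, and the sample values are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Quartiles, whisker ends, and outliers using the 1.5 * IQR rule (assumed)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in values if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),    # most extreme value still inside the fences
        "whisker_high": max(inside),
        "outliers": sorted(x for x in values if x < low_fence or x > high_fence),
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical values with one extreme point
summary = boxplot_summary(data)
print(summary["q1"], summary["median"], summary["q3"])  # 3.25 5.5 7.75
print(summary["outliers"])                              # [30]
```

A box that sits off-center between the whiskers, or several outliers on one side, is a visual hint that the data is skewed.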
Normality of distribution
• Parametric vs. non-parametric statistical procedures (parametric procedures assume approximately normally distributed data; non-parametric procedures do not)
• The choice of test depends on the distribution of the data
Measures of central tendency and variance (dispersion)
• Mean, with standard deviation
• Median, with interquartile range
• Mode
• Range
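All of the measures listed above are available in Python's standard `statistics` module. The following is a minimal sketch; the function name `describe`, the "inclusive" quartile method, and the sample ages are assumptions chosen for illustration.

```python
from statistics import fmean, median, multimode, quantiles, stdev


def describe(values):
    """Central tendency and dispersion measures for a numerical variable."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "mean": fmean(values),
        "sd": stdev(values),            # sample standard deviation (n - 1 denominator)
        "median": median(values),
        "iqr": q3 - q1,                 # interquartile range
        "modes": multimode(values),     # list, since data can have several modes
        "range": max(values) - min(values),
    }


ages = [23, 25, 25, 31, 38, 41, 90]  # hypothetical ages with one extreme value
for name, value in describe(ages).items():
    print(name, value)
```

In this example the single value of 90 pulls the mean up to 39.0 while the median stays at 31, which illustrates why the median and interquartile range are usually reported for skewed data and the mean and standard deviation for approximately normal data.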