Descriptive Statistics
Two Branches of Stats • Descriptive Statistics • describe the data collected • Inferential Statistics • draw inferences about the population from which the sample was drawn
Choosing a Statistic • Deciding on the appropriate statistical test requires understanding the level of measurement and the type of variable. • categorical (discrete) vs. continuous • nominal, ordinal, interval and ratio
Conventions: • I will try to use Latin letters to represent sample statistics and Greek letters to represent population parameters • Latin (a, b, c, d, etc.) • Greek (α, β, γ, δ, ε, etc.)
Descriptive Statistics • Describing the data you’ve collected • Univariate: a single variable
Descriptive Statistics • Frequency distributions (categorical) • count
Relative frequency (percentage) distributions • valid percent • total percent
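As a rough illustration of the count, total percent and valid percent columns above, here is a minimal Python sketch. The function name `frequency_table`, the sample responses, and the use of `None` to mark a missing value are all assumptions made for this example. Total percent divides by all cases, while valid percent divides only by the non-missing cases.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, total_percent, valid_percent)}.

    Assumption for this sketch: None marks a missing value.
    """
    total_n = len(values)                          # all cases, missing included
    valid = [v for v in values if v is not None]   # non-missing cases only
    valid_n = len(valid)
    counts = Counter(valid)
    table = {}
    for category, count in counts.items():
        total_pct = 100.0 * count / total_n
        valid_pct = 100.0 * count / valid_n
        table[category] = (count, total_pct, valid_pct)
    return table

# Hypothetical survey answers with two missing responses.
responses = ["yes", "no", "yes", None, "yes", "no", None, "yes", "no", "yes"]
table = frequency_table(responses)
for category, (count, total_pct, valid_pct) in sorted(table.items()):
    print(f"{category}: count={count} total%={total_pct:.1f} valid%={valid_pct:.1f}")
```

The valid percents sum to 100, while the total percents fall short of 100 by the share of missing cases.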
Other ways of describing the distribution • Measures of Central Tendency • 1. Mean: sometimes called the first moment • 2. Median: when the data are sorted, it is the middle number if there is an odd number of values, and the mean of the middle two if there is an even number. It is the 50th percentile. • 3. Mode: the most frequently occurring value
Measures of Dispersion • Range: highest value minus lowest value • Variance: sometimes called the second moment (more precisely, the second moment about the mean)
Skewness • A measure of the asymmetry of a distribution. The normal distribution is symmetric, and has a skewness value of zero. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a rough guide, a skewness value more than twice its standard error is taken to indicate a departure from symmetry.
Kurtosis • A measure of the extent to which observations cluster around a central point. For a normal distribution, the value of the kurtosis statistic (reported as excess kurtosis, that is, kurtosis minus 3) is 0. Positive kurtosis indicates that the observations cluster more and have longer tails than those in the normal distribution, and negative kurtosis indicates that the observations cluster less and have shorter tails.
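The three measures of central tendency can be checked on a small data set with Python's standard `statistics` module. The numbers below are made up for illustration.

```python
import statistics

# Hypothetical sample with an even number of observations.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of values divided by their count
median = statistics.median(data)  # even count: mean of the two middle values (3 and 5)
mode = statistics.mode(data)      # most frequently occurring value

# With an odd number of observations the median is the single middle value.
odd_median = statistics.median([9, 1, 3])

print(mean, median, mode, odd_median)
```

Note that `statistics.median` sorts the data internally, so the input does not need to be ordered first.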
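A short sketch of the range and variance, using made-up numbers. Following the Latin/Greek convention from earlier, the sample variance (dividing by n - 1) is named `s2` and the population variance (dividing by n) is named `sigma2`; both names are choices made for this example.

```python
import statistics

# Hypothetical data; the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # highest value minus lowest value

s2 = statistics.variance(data)       # sample variance: squared deviations / (n - 1)
sigma2 = statistics.pvariance(data)  # population variance: squared deviations / n

# The same sample variance written out by hand, to show what is computed.
mean = sum(data) / len(data)
s2_by_hand = sum((x - mean) ** 2 for x in data) / (len(data) - 1)

print(data_range, s2, sigma2)
```

The standard deviation is the square root of the variance (`statistics.stdev` and `statistics.pstdev`), which puts the measure back in the units of the original data.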
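Skewness and kurtosis can both be written in terms of central moments. The sketch below uses the simple moment-based definitions (third and fourth central moments scaled by powers of the second); statistical packages often report bias-adjusted versions, so their output may differ slightly on small samples. The function names and data are assumptions for this example, and the standard errors mentioned above are not computed here.

```python
def central_moment(data, k):
    """k-th moment about the mean, dividing by n."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def skewness(data):
    """Moment-based skewness: m3 / m2^(3/2). Zero for a symmetric sample."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric sample
right_tailed = [1, 1, 1, 2, 10]   # hypothetical sample with a long right tail

print(skewness(symmetric), skewness(right_tailed))
print(excess_kurtosis(symmetric))
```

The symmetric sample has zero skewness, the right-tailed sample has positive skewness, and the flat, evenly spaced sample has negative excess kurtosis.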
Graphical Representation of Single Variables • Categorical • Bar Chart • Pie Chart
Continuous • Histogram • Line Chart • Box and Whiskers
Data Visualization • Much can be done to display data.
Bivariate Descriptive statistics • 2 variables • 3 possible combinations • cat/cat; • cat/cont; • cont/cont • Independent vs dependent.
Categorical/Categorical • Crosstabulations (2 way frequency tables, Crosstabs, Bivariate distributions)
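A crosstabulation is a count of how often each pair of categories occurs together. The following is a minimal sketch in plain Python; the helper name `crosstab` and the two example variables are hypothetical.

```python
from collections import Counter

def crosstab(rows, cols):
    """Two-way frequency table as a nested dict: table[row_level][col_level] = count."""
    counts = Counter(zip(rows, cols))   # count each (row, column) pair
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    # Counter returns 0 for pairs that never occur, so empty cells are filled in.
    return {r: {c: counts[(r, c)] for c in col_levels} for r in row_levels}

# Hypothetical paired observations on two categorical variables.
gender = ["f", "m", "f", "f", "m", "m"]
voted = ["yes", "no", "yes", "no", "yes", "no"]

table = crosstab(gender, voted)
for level, row in table.items():
    print(level, row)
```

Row or column percentages can then be obtained by dividing each cell by its row or column total.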
Categorical/Continuous • Any statistic that applies to a continuous variable can be computed separately for each category • Mean, median, mode • Variance, std. dev., skewness, kurtosis
Continuous/Continuous • Covariance: measures how two variables vary together; its size depends on the units of measurement • Simple correlation coefficient (Pearson’s product-moment correlation coefficient): the covariance standardized by the two standard deviations • this ranges from +1 to -1
Graphical Representations • Bar charts, pie charts, etc. • Histograms, box plots • Scatter plots
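Describing a continuous variable within each level of a categorical variable amounts to splitting the data by category and computing the usual univariate statistics per group. A minimal sketch, with a hypothetical helper `describe_by_group` and made-up data:

```python
from collections import defaultdict
import statistics

def describe_by_group(categories, values):
    """Return {category: {"n", "mean", "median", "std_dev"}} for a continuous variable."""
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    summary = {}
    for category, vals in groups.items():
        summary[category] = {
            "n": len(vals),
            "mean": statistics.mean(vals),
            "median": statistics.median(vals),
            # The sample standard deviation needs at least two observations.
            "std_dev": statistics.stdev(vals) if len(vals) > 1 else float("nan"),
        }
    return summary

# Hypothetical data: a grouping variable and a continuous score.
group = ["a", "a", "a", "b", "b", "b"]
score = [1, 2, 3, 10, 20, 30]

summary = describe_by_group(group, score)
for category, stats in sorted(summary.items()):
    print(category, stats)
```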
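To make the link between covariance and Pearson's correlation concrete, here is a small sketch written out by hand. The function names and data are assumptions for this example. The covariance depends on the units of the two variables, while the correlation divides those units out and so always falls between -1 and +1.

```python
import math

def sample_covariance(x, y):
    """Sum of products of deviations from the means, divided by n - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def pearson_r(x, y):
    """Covariance scaled by both standard deviations; the (n - 1) terms cancel."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises exactly linearly with x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

cov_xy = sample_covariance(x, y)
r_up = pearson_r(x, y)                        # perfect positive linear relationship
r_down = pearson_r(x, list(reversed(y)))      # perfect negative linear relationship

print(cov_xy, r_up, r_down)
```

This sketch assumes neither variable is constant; a variable with zero variance would make the denominator zero.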