Statistics

palti
Presentation Transcript


  1. Statistics

  2. Statistics • Branch of Mathematics that deals with the collection and analysis of data • Descriptive Statistics: used to analyze and describe data • Inferential Statistics: used to make statements about the relationships between variables or about expectations of future events.

  3. Measures of Central Tendency

  4. Measures of Central Tendency • Arithmetic Mean • Median • Mode • Geometric Mean

  5. Arithmetic Mean • Other names • Average • Mean

  6. Arithmetic Mean • The calculation is identical, just the notation varies slightly
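
The formulas on this slide did not survive in the transcript. Assuming the slide contrasted population and sample notation (an assumption, the source does not confirm it), the two standard forms would be:

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\text{and}\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```

Here $\mu$ and $N$ refer to a population, $\bar{x}$ and $n$ to a sample; the arithmetic is the same in both.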

  7. Summation Notation • Notice that the first form uses less vertical space on the page • This makes accountants very happy • The first can also be easier to fit into a line of text

  8. Example • Ten second-year BBA students wrote the CSC exam last month • Their scores were: 71, 72, 88, 69, 77, 63, 91, 81, 83, 75

  9. Calculating the Mean • Arithmetic mean • sum the observations and divide by the number of observations • Example: 5%, 7%, -2%, 12%, 8%
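
As a minimal sketch, the calculation can be checked in Python with the exam scores and the five returns given on the slides. The function name `arithmetic_mean` is only illustrative.

```python
# CSC exam scores and percentage returns, as listed on the slides
scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]
returns_pct = [5, 7, -2, 12, 8]

def arithmetic_mean(values):
    """Sum the observations and divide by the number of observations."""
    return sum(values) / len(values)

mean_score = arithmetic_mean(scores)
mean_return = arithmetic_mean(returns_pct)

print(mean_score)   # 77.0
print(mean_return)  # 6.0
```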

  10. Problem with the Arithmetic Mean • Arithmetic mean is incorrect for variables that are related multiplicatively, like rates of growth, rates of return and rates of change • $1,000 at 6% for 5 years should be $1,338.23
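
A short sketch of the problem, assuming the five returns from the previous slide (5%, 7%, -2%, 12%, 8%) are five consecutive annual returns on $1,000. Growing at the 6% arithmetic mean overstates what compounding the actual returns produces.

```python
returns = [0.05, 0.07, -0.02, 0.12, 0.08]
start = 1000.0

# Grow at the arithmetic mean rate (6%) every year for 5 years
mean_rate = sum(returns) / len(returns)
value_at_mean = start * (1 + mean_rate) ** len(returns)

# Compound the five actual yearly returns one after another
actual_value = start
for r in returns:
    actual_value *= 1 + r

print(round(value_at_mean, 2))  # 1338.23
print(round(actual_value, 2))   # 1331.81
```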

  11. Geometric Mean • The Geometric Mean should be used for rates of change, like rates of return

  12. Geometric Mean • The Geometric Mean should be used for rates of change, like rates of return • The product symbol in the formula means: the product of these factors from 1 to N
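
The slide's formula is not in the transcript. A sketch using the usual definition for returns (the Nth root of the product of the growth factors 1 + r, minus 1), applied to the five returns from the earlier example:

```python
import math

returns = [0.05, 0.07, -0.02, 0.12, 0.08]

def geometric_mean_return(rates):
    """Nth root of the product of (1 + r) over all N periods, minus 1."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

g = geometric_mean_return(returns)
print(f"{g:.3%}")  # 5.898%
```

Compounding $1,000 at this rate for five years reproduces the value obtained from the actual returns.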

  13. Geometric vs. Arithmetic Mean • The more variable the underlying data, the greater the error using the Arithmetic mean • The Geometric Mean is often easier to calculate: • Stock prices: 1992: $20; 1999: $40, R = 10.41%
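
The stock price example needs only the start and end prices. A sketch, assuming seven annual periods between 1992 and 1999:

```python
start_price, end_price = 20.0, 40.0
years = 1999 - 1992  # 7 annual periods (assumed)

# Geometric mean annual return implied by the two prices
annual_rate = (end_price / start_price) ** (1 / years) - 1
print(f"{annual_rate:.2%}")  # 10.41%
```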

  14. Geometric vs. Arithmetic Mean • For analysis of past performance, use the Geometric mean • The past returns have averaged 5.898% • To use the past returns to estimate the future expected return, use the Arithmetic mean • The expected return is 6%

  15. Median and Mode • Median: Midpoint • If odd number of observations: Middle observation • If even number of observations: Average of middle 2 observations • Mode: Most frequent

  16. Example • Our CSC mark data was (sorted): 63, 69, 71, 72, 75, 77, 81, 83, 88, 91 • The median is 76 • There is no mode
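
A quick check of both results in Python. The rule that data with no repeated value has no mode is written out explicitly here, since libraries differ in how they report that case.

```python
from collections import Counter
from statistics import median

scores = [63, 69, 71, 72, 75, 77, 81, 83, 88, 91]

# Even number of observations: average of the two middle values
mid = median(scores)

counts = Counter(scores)
top = max(counts.values())
# Treat the data as having no mode when no value repeats
modes = [v for v, c in counts.items() if c == top] if top > 1 else []

print(mid)    # 76.0
print(modes)  # []
```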

  17. Example • The Deviation is the difference between each observation and the mean • The sign indicates whether the observation is above (+) or below (-) the mean

  18. Example • The average deviation is always zero • If it isn’t, you must have made a mistake!
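
The deviations for the exam scores can be computed directly, which also confirms that they sum (and therefore average) to zero:

```python
scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]
mean = sum(scores) / len(scores)

# Positive: above the mean; negative: below the mean
deviations = [x - mean for x in scores]

print(deviations)
# [-6.0, -5.0, 11.0, -8.0, 0.0, -14.0, 14.0, 4.0, 6.0, -2.0]
print(sum(deviations))  # 0.0
```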

  19. Measures of Dispersion

  20. Measures of Dispersion • So far, we have looked at measures of central tendency • What about measuring the tendency of the data to vary from these centres?

  21. Measures of Dispersion • Range • Highest - Lowest • Variance • Standard Deviation

  22. Example • The range is 91-63=28 • The range can be extremely sensitive to outlier observations • Suppose one of these students had a very bad day and scored 8. • The range would now be 91-8=83
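
A sketch of the outlier effect. The slide does not say which student scored 8, so replacing the first score is an arbitrary assumption.

```python
scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]

def value_range(values):
    """Highest observation minus lowest observation."""
    return max(values) - min(values)

original_range = value_range(scores)

# Assumption: the first student is the one who scored 8
with_outlier = [8] + scores[1:]
outlier_range = value_range(with_outlier)

print(original_range)  # 28
print(outlier_range)   # 83
```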

  23. Mean Absolute Deviation • The Mean Absolute Deviation is a measure of average dispersion that is not used very much • It has some undesirable mathematical properties beyond the level of this course
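
For the exam scores, the Mean Absolute Deviation can be sketched as the average of the absolute deviations from the mean (the usual definition, assumed here because the slide's formula is not in the transcript):

```python
scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]
mean = sum(scores) / len(scores)

# Drop the signs so that deviations above and below the mean do not cancel
mad = sum(abs(x - mean) for x in scores) / len(scores)
print(mad)  # 7.0
```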

  24. Mean Squared Deviation • The Mean Squared Deviation is very commonly used • The MSD in this example is 694/10=69.4 • The more common name of the MSD is the VARIANCE
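
The 694 in the example is the sum of the squared deviations of the ten exam scores from their mean of 77:

```python
scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]
mean = sum(scores) / len(scores)

# Square each deviation, then average over all observations
sum_squared = sum((x - mean) ** 2 for x in scores)
msd = sum_squared / len(scores)

print(sum_squared)  # 694.0
print(msd)          # 69.4
```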

  25. Variance • Variance measures the amount of dispersion from the mean. • For Populations: For Samples:

  26. Standard Deviation • Standard Deviation measures the amount of dispersion from the mean. • For Populations: For Samples:
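
The formulas for these two slides are missing from the transcript. The standard definitions, given here as an assumption about what the slides showed, are:

```latex
% Populations
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2,
\qquad \sigma = \sqrt{\sigma^2}

% Samples
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2,
\qquad s = \sqrt{s^2}
```

The only difference is the divisor: $N$ for a population, $n - 1$ for a sample.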

  27. Standard Deviation Example • Using the previous example • The data is sample data
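
Treating the ten exam scores as sample data, a sketch with Python's standard `statistics` module (which divides by n - 1 in `variance` and `stdev`, and by n in `pvariance` and `pstdev`):

```python
import statistics

scores = [71, 72, 88, 69, 77, 63, 91, 81, 83, 75]

sample_var = statistics.variance(scores)  # 694 / 9
sample_sd = statistics.stdev(scores)
population_sd = statistics.pstdev(scores)  # square root of 694 / 10

print(round(sample_var, 2))     # 77.11
print(round(sample_sd, 2))      # 8.78
print(round(population_sd, 2))  # 8.33
```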

  28. Interpreting the Std. Dev. • You have heard of the Bell Shaped or Normal Distribution • The properties of the Normal Distribution are well known and give us the EMPIRICAL RULE

  29. Normal Distribution

  30. Empirical Rule For approximately Normally Distributed data: • Within 1s (one standard deviation) of the mean: approx. 2/3 • Within 2s of the mean: approx. 95% (19/20) • Within 3s of the mean: virtually all
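
These proportions can be checked against the exact Normal distribution. For a Normal variable, the share within k standard deviations of the mean is erf(k / sqrt(2)); the helper name `share_within` is illustrative.

```python
import math

def share_within(k):
    """Share of a Normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```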

  31. Quartiles, Percentiles, etc. • The Median splits the data in half • Quartiles split the data into quarters • Deciles split the data into tenths • Percentiles split the data into one-hundredths
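
A sketch using `statistics.quantiles` on the exam scores. Several interpolation conventions exist for small data sets, so the outer cut points may differ slightly from those of Excel or other tools; the middle quartile always matches the median.

```python
import statistics

scores = [63, 69, 71, 72, 75, 77, 81, 83, 88, 91]

# n=4 returns the 3 cut points between quarters; n=10 the 9 between tenths
quartiles = statistics.quantiles(scores, n=4)
deciles = statistics.quantiles(scores, n=10)

print(quartiles)
print(deciles)
print(quartiles[1] == statistics.median(scores))  # True
```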

  32. Rank Measures • “That was a top-half performance” • “WTG Special fund has been a top quartile performer for the past 5 years” • “Our programme accepts only students proven to be top decile performers” • “I was in the 92nd percentile on the GMAT”

  33. Using Excel • Full Descriptive Statistics • Tools • Data Analysis • Descriptive Statistics

  34. Measures of Association

  35. Bivariate Statistics • So far, we have been dealing with statistics of individual variables • We also have statistics that relate pairs of variables

  36. Interactions Sometimes two variables appear related: • smoking and lung cancers • height and weight • years of education and income • engine size and gas mileage • GMAT scores and MBA GPA • house size and price

  37. Interactions • Some of these variables would appear to be positively related & others negatively • If these were related, we would expect to be able to derive a linear relationship: y = a + bx • where b is the slope and a is the intercept

  38. Linear Relationships • We will be deriving linear relationships from bivariate (two-variable) data • Our symbols will be:

  39. Example • Consider the following example comparing the returns of Consolidated Moose Pasture stock (CMP) and the TSE 300 Index • The next slide shows 25 monthly returns

  40. Example Data

  41. Example • From the data, it appears that a positive relationship may exist • Most of the time when the TSE is up, CMP is up • Likewise, when the TSE is down, CMP is down most of the time • Sometimes, they move in opposite directions • Let’s graph this data

  42. Graph Of Data

  43. Example Summary Statistics • The data do appear to be positively related • Let’s derive some summary statistics about these data:

  44. Observations • Both have means of zero and standard deviations just under 3 • However, each data point does not have simply one deviation from the mean; it deviates from both means • Consider Points A, B, C and D on the next graph

  45. Graph of Data

  46. Implications • When points in the upper right and lower left quadrants dominate, then the sums of the products of the deviations will be positive • When points in the lower right and upper left quadrants dominate, then the sums of the products of the deviations will be negative

  47. An Important Observation • The sums of the products of the deviations will give us the appropriate sign of the slope of our relationship
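
The quadrant argument can be sketched with a few made-up monthly returns. The numbers below are hypothetical and are not the CMP and TSE data from the slides, which are not in the transcript.

```python
# Hypothetical returns in percent; both series are chosen to have mean zero
tse = [2.0, -1.0, 3.0, -4.0, 0.0]
cmp_returns = [1.5, -2.0, 2.5, -3.0, 1.0]

mean_x = sum(tse) / len(tse)
mean_y = sum(cmp_returns) / len(cmp_returns)

# Same-sign deviations (upper right, lower left) give positive products;
# opposite-sign deviations (upper left, lower right) give negative products
products = [(x - mean_x) * (y - mean_y) for x, y in zip(tse, cmp_returns)]
sum_products = sum(products)

print(products)      # [3.0, 2.0, 7.5, 12.0, 0.0]
print(sum_products)  # 24.5, positive, so the slope is positive
```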

  48. Covariance (showing the formula only to demonstrate a concept)

  49. Covariance

  50. Covariance • In the same units as Variance (if both variables are in the same unit), i.e. units squared • Very important element of measuring portfolio risk in finance
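
The covariance formula itself is not in the transcript. A sketch under the usual definition (sum of the products of the deviations, divided by n - 1 for a sample or n for a population), using hypothetical returns rather than the slide data:

```python
# Hypothetical returns in percent (not the CMP/TSE data from the slides)
tse = [2.0, -1.0, 3.0, -4.0, 0.0]
cmp_returns = [1.5, -2.0, 2.5, -3.0, 1.0]

def covariance(xs, ys, sample=True):
    """Average product of paired deviations from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1 if sample else n)

print(covariance(tse, cmp_returns))                # 6.125
print(covariance(tse, cmp_returns, sample=False))  # 4.9
print(covariance(tse, tse))                        # 7.5
```

The last line shows why the units match those of variance: the covariance of a variable with itself is its variance, here in percent squared.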
