Random Variable • Qualitative (categorical): Nominal, Ordinal • Quantitative (numeric): Interval, Ratio; Discrete, Continuous
SUMMARIZING NUMERIC DATA • Simple Frequency Table • Grouped Frequency Table • Histogram • Frequency Polygon • Cumulative Frequency Distribution
Measures of Central Location • Arithmetic Mean • Median • Mode
Mean for grouped data:
Median for grouped data:
Mode for grouped data:
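The formulas on these three slides did not survive extraction. The sketch below uses the usual textbook interpolation formulas for grouped data, which is an assumption about what the slides showed. The frequency table and the names `LOWERS`, `FREQS`, `WIDTH` and the three functions are hypothetical and are not taken from the deck.

```python
from itertools import accumulate

# Hypothetical grouped data (NOT the slide's data): classes 120-<130, ..., 160-<170.
LOWERS = [120, 130, 140, 150, 160]   # lower class limits
FREQS = [2, 4, 7, 5, 2]              # class frequencies, n = 20
WIDTH = 10                           # common class width


def grouped_mean(lowers, freqs, width):
    """Mean = sum(f * class midpoint) / n."""
    n = sum(freqs)
    return sum(f * (lo + width / 2) for lo, f in zip(lowers, freqs)) / n


def grouped_median(lowers, freqs, width):
    """Median = L + (n/2 - F) / f * width, interpolating inside the median class."""
    n = sum(freqs)
    cum = list(accumulate(freqs))
    i = next(k for k, c in enumerate(cum) if c >= n / 2)  # median class index
    below = cum[i - 1] if i > 0 else 0                    # cumulative frequency before it
    return lowers[i] + (n / 2 - below) / freqs[i] * width


def grouped_mode(lowers, freqs, width):
    """Mode = L + (fm - f_prev) / (2*fm - f_prev - f_next) * width."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])    # modal class (first one if tied)
    fm = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0
    f_next = freqs[i + 1] if i + 1 < len(freqs) else 0
    return lowers[i] + (fm - f_prev) / (2 * fm - f_prev - f_next) * width


print(grouped_mean(LOWERS, FREQS, WIDTH))
print(grouped_median(LOWERS, FREQS, WIDTH))
print(grouped_mode(LOWERS, FREQS, WIDTH))
```

All three functions assume equal class widths. The mode formula is undefined when the modal class has the same frequency as both neighbours.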
Measures of Dispersion (Variability) • Range • Variance and Standard Deviation • Coefficient of Variation • Non-central Locations: Inter-fractile Ranges
Standard Deviation (ungrouped data) (grouped data)
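The two standard deviation formulas are missing from the extracted slide. As a minimal sketch, assuming the sample form with an n − 1 divisor (the slide may have used the population form instead), the grouped version weights each squared deviation by its class frequency. The data and the function name `grouped_stdev` are hypothetical.

```python
import math
import statistics


def grouped_stdev(midpoints, freqs):
    """Sample standard deviation from class midpoints and frequencies (n - 1 divisor)."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    sum_sq = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
    return math.sqrt(sum_sq / (n - 1))


# Ungrouped data: the standard library already implements the n - 1 form.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s_ungrouped = statistics.stdev(data)

# The same observations written as (value, frequency) pairs give the same result.
s_grouped = grouped_stdev([2, 4, 5, 7, 9], [1, 3, 2, 1, 1])

print(s_ungrouped, s_grouped)
```

With true class intervals the midpoints only approximate the raw values, so the grouped result is an estimate.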
Empirical Rule: for a bell-shaped distribution, about 68% of observations lie within μ ± 1σ, about 95% within μ ± 2σ, and about 99.7% within μ ± 3σ.
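The three percentages can be checked against the standard normal distribution. This is a small sketch using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
coverage = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```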
The Relative Positions of the Mean, Median, and Mode: Symmetric distribution (zero skewness) → Mean = Median = Mode
Positively skewed: Mean > Median > Mode
Negatively skewed: Mean < Median < Mode
Non-Central Location Measures (Fractiles or Quantiles) • Quartiles • Sextiles • Octiles • Deciles • Percentiles
Calculating Quartiles for Grouped Data The jth quartile for grouped data is given by: Qj = L + [(jn/4 − F)/fQj] × c, where n = sample size; L = lower limit of the jth quartile class; F = less-than cumulative frequency of the class immediately preceding the jth quartile class; fQj = frequency of the jth quartile class; c = class width.
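The interpolation can be sketched in code using the symbols defined on this slide (n, L, F, fQj) plus a class width. The function name `grouped_fractile`, its `parts` parameter (4 for quartiles, 10 for deciles, 100 for percentiles) and the frequency table are hypothetical, not taken from the deck.

```python
from itertools import accumulate


def grouped_fractile(lowers, freqs, width, j, parts=4):
    """j-th fractile of grouped data: L + (j*n/parts - F) / f * width."""
    n = sum(freqs)
    target = j * n / parts                                # position of the fractile
    cum = list(accumulate(freqs))                         # less-than cumulative frequencies
    i = next(k for k, c in enumerate(cum) if c >= target) # fractile class index
    below = cum[i - 1] if i > 0 else 0                    # F: cumulative frequency before it
    return lowers[i] + (target - below) / freqs[i] * width


# Hypothetical table: classes 120-<130, ..., 160-<170 with n = 20.
lowers = [120, 130, 140, 150, 160]
freqs = [2, 4, 7, 5, 2]

q1 = grouped_fractile(lowers, freqs, 10, j=1)
q3 = grouped_fractile(lowers, freqs, 10, j=3)
p80 = grouped_fractile(lowers, freqs, 10, j=80, parts=100)
print(q1, q3, q3 - q1, p80)   # quartiles, interquartile range, 80th percentile
```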
Example A sample of 20 randomly selected hospitals in the US revealed the following daily charges (in $) for a semiprivate room. 1.1 Using class intervals of width 10 units, construct a less-than cumulative frequency distribution of the above data. Let 120 units be the lower limit of the smallest class. 1.2 Draw a less-than ogive and use it to estimate the 80th percentile. 1.3 For the grouped data of question 1.1 above, calculate: 1.3.1 The mean, median and mode. 1.3.2 The interquartile range. 1.3.3 The coefficient of variation. Interpret the result obtained.
Solution 1.1
1.2 80th percentile = 158
1.3.3 CV = standard deviation/mean → CV = 10.8/148 ≈ 0.073 = 7.3% → the data are clustered closely around the mean.
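The coefficient of variation is a single division. The sketch below reuses the standard deviation (10.8) and mean (148) quoted in the solution; the function name is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the CV as a fraction of the mean (multiply by 100 for a percentage)."""
    return std_dev / mean


cv = coefficient_of_variation(10.8, 148)   # values quoted in the solution
print(f"CV = {cv:.3f} ({cv:.1%})")
```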
BASIC PROBABILITY CONCEPTS • Random Experiment • Sample Space • Event • Collectively Exhaustive Events • Dependent Events • Independent Events
Marginal Probability • Joint Probability: P(A∩B) = P(B∩A) • Conditional Probability: P(A|B) = P(A∩B)/P(B); P(B|A) = P(A∩B)/P(A)
Complement Rule: P(A’) = 1 – P(A) or P(A) = 1 – P(A’)
Special Multiplication Rule (independent events): P(A and B) = P(A)P(B) = P(B)P(A) General Multiplication Rule: P(A and B) = P(A∩B) = P(A)P(B|A) or P(A and B) = P(A∩B) = P(B)P(A|B)
Special Addition Rule (mutually exclusive events): P(A or B) = P(A) + P(B) General Addition Rule: P(A or B) = P(A) + P(B) – P(A and B)
Example A company manufactures a total of 8000 motorcycles a month in three plants A, B and C. Of these, plant A manufactures 4000, and plant B manufactures 3000. At plant A, 85 out of 100 motorcycles are of standard quality or better. At plant B, 65 out of 100 motorcycles are of standard quality or better and at plant C, 60 out of 100 motorcycles are of standard quality or better. The quality controller randomly selects a motorcycle and finds it to be of substandard quality. Calculate the probability that it has come from plant B.
Solution P(B|substd) = No. of substd items from B / Total no. of substd items
No. of substd items from A = 4000 × (100 – 85)/100 = 40 × 15 = 600
No. of substd items from B = 3000 × (100 – 65)/100 = 30 × 35 = 1050
No. of substd items from C = 1000 × (100 – 60)/100 = 10 × 40 = 400 (plant C makes 8000 – 4000 – 3000 = 1000)
Total number of substd items = 600 + 1050 + 400 = 2050
P(B|substd) = 1050/2050 = 0.512
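The counting argument in this solution can be reproduced in a few lines. The figures come from the example; the dictionary names are illustrative.

```python
# Monthly output per plant; plant C makes the remainder: 8000 - 4000 - 3000 = 1000.
production = {"A": 4000, "B": 3000, "C": 1000}
# Motorcycles of standard quality or better per 100 produced.
good_per_100 = {"A": 85, "B": 65, "C": 60}

# Expected number of substandard motorcycles per plant (integer arithmetic).
substandard = {p: production[p] * (100 - good_per_100[p]) // 100 for p in production}
total_substandard = sum(substandard.values())

# P(plant | substandard) = substandard from that plant / all substandard.
posterior = {p: substandard[p] / total_substandard for p in production}

print(substandard, total_substandard)
print(round(posterior["B"], 3))
```

The same calculation gives the probabilities for plants A and C, and the three values sum to 1.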
PROBABILITY DISTRIBUTIONS • Properties • Discrete distributions • Normal distributions
Example According to a leading newspaper, the largest cellular phone service in the US has about 36 million subscribers out of a total of 180 million cell phone users. If six cell phone users are randomly selected, what is the probability that at least two of them subscribe to this service?
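No worked solution for this example survives in the extracted slides. One way to approach it, assuming the number of subscribers among the six users follows a binomial distribution with n = 6 and p = 36/180 = 0.2 (reasonable because the population is very large), is sketched below. The helper `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 6
p = 36 / 180   # share of users on the largest service = 0.2

# "At least two" is the complement of "none or exactly one".
p_at_least_two = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)
print(round(p_at_least_two, 4))
```

Under these assumptions the probability works out to roughly 0.345.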
Example Customers arrive randomly and independently at a service point at an average rate of 30 per hour. 1. Calculate the probability that exactly 20 customers arrive at the service point during any given hour. 2. Calculate the probability that at least 3 customers arrive at the service point during any 5-minute period.
Solution 1. λ = 30/hr → P(x = 20) = e^(−30) × 30^20 / 20! ≈ 0.013
2. λ = 30/60 min = 2.5/5 min → P(x ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − 0.082 − 0.205 − 0.257 ≈ 0.456
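Both parts can be evaluated directly from the Poisson probability function P(x) = e^(−λ) λ^x / x!. This is a minimal sketch; the helper `poisson_pmf` is illustrative. With λ = 2.5 the second probability evaluates to about 0.456.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Poisson probability of exactly k arrivals when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)


# Part 1: exactly 20 arrivals in an hour, lambda = 30 per hour.
p_exactly_20 = poisson_pmf(20, 30)

# Part 2: at least 3 arrivals in 5 minutes, lambda = 30 * 5/60 = 2.5.
lam_5min = 30 * 5 / 60
p_at_least_3 = 1 - sum(poisson_pmf(k, lam_5min) for k in range(3))

print(round(p_exactly_20, 4), round(p_at_least_3, 3))
```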
Normal probability distribution Standard normal or z-distribution
Normal Distribution (figure: normal curve with μ = 0, σ² = 1) • Theoretically, the curve extends to infinity • The normal curve is symmetrical • Mean, median, and mode are equal
Example Six hundred candidates wrote an entrance test for admission to a management course. The marks obtained by the candidates were found to be normally distributed with a mean of 132 marks and a standard deviation of 18 marks. 1. How many candidates scored between 140 and 160 marks? 2. If the top 60 performers were given confirmed admission, calculate the minimum mark (to the nearest integer) above which a candidate would be guaranteed admission.
Solution 1. Z1 = (140 − 132)/18 = 0.4444 → P1 = P(0 < z < Z1) ≈ 0.172; Z2 = (160 − 132)/18 = 1.5556 → P2 = P(0 < z < Z2) ≈ 0.440 → P(140 < X < 160) ≈ 0.440 – 0.172 = 0.268 → 0.268 × 600 candidates ≈ 161 candidates
2. Let xc denote the minimum mark. 60/600 = 0.1 = 10% → P(0 < z < zc) = 0.50 − 0.10 = 0.40 → zc = 1.28 → xc = 132 + 1.28 × 18 = 155.04 ≈ 155 marks
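Both parts of this example can be cross-checked without z-tables. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

marks = NormalDist(mu=132, sigma=18)   # distribution of test marks
candidates = 600

# Part 1: expected number of candidates scoring between 140 and 160.
p_between = marks.cdf(160) - marks.cdf(140)
expected_count = candidates * p_between

# Part 2: the top 60 of 600 is the top 10%, so the cut-off is the 90th percentile.
cutoff = marks.inv_cdf(1 - 60 / candidates)

print(round(p_between, 3), round(expected_count), round(cutoff))
```

Small differences from table-based answers come from rounding z to two decimal places.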
HYPOTHESIS TESTING • What is a Hypothesis? • What is Hypothesis Testing?
Basic Terms • Null hypothesis • Alternative hypothesis • Level of significance • Type I error • Type II error • Critical value • Test statistic • Rejection area • Acceptance area • One-tailed test • Two-tailed Test
Five-Step Procedure for Hypothesis Testing
Step 1: State the null and alternative hypotheses
Step 2: Determine the critical value associated with the level of significance
Step 3: Identify and calculate the test statistic
Step 4: Formulate and apply the decision rule
Step 5: Draw a conclusion
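The five steps map naturally onto code. The sketch below is one possible illustration, assuming a two-tailed z-test of a single mean with a known population standard deviation. The function name `two_tailed_z_test` and every number in the usage line are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    # Step 1: hypotheses. H0: mu = mu0 versus H1: mu != mu0.
    # Step 2: critical value for the chosen level of significance.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 3: test statistic.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: decision rule. Reject H0 if |z| falls in the rejection area.
    reject = abs(z) > z_crit
    # Step 5: conclusion.
    conclusion = "reject H0" if reject else "do not reject H0"
    return z, z_crit, conclusion


# Hypothetical data: n = 36, sample mean 103, H0 mean 100, sigma 12, alpha 0.05.
z, z_crit, conclusion = two_tailed_z_test(sample_mean=103, mu0=100, sigma=12, n=36)
print(round(z, 2), round(z_crit, 2), conclusion)
```

For a one-tailed test the critical value would use `1 - alpha` instead of `1 - alpha / 2`, and the comparison would be made in one direction only.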
Testing a Single Population Mean Large sample (n > 30) Test statistic: Small sample (n < 30) Test statistic:
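The test statistics on this slide were lost in extraction. The standard textbook forms are given below as an assumption about what the slide showed: x̄ is the sample mean, μ₀ the hypothesised mean, σ the population standard deviation, s the sample standard deviation, and n the sample size.

```latex
% Large sample (n > 30): z statistic.
% When sigma is unknown, s is commonly substituted for it.
\[
  z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\]

% Small sample (n < 30): t statistic with n - 1 degrees of freedom.
\[
  t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
\]
```

The t form assumes the sample is drawn from an approximately normal population.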
Testing a Single Population Proportion Large sample (n > 30) Test statistic: Small sample (n < 30) Test statistic:
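The test statistics for the proportion are also missing. For the large-sample case the usual form is shown below, as an assumption about what the slide contained, with p the sample proportion, π₀ the hypothesised proportion, and n the sample size. The small-sample statistic cannot be recovered from the extracted text and is left unreconstructed.

```latex
% Large sample (n > 30): z statistic for a single proportion.
\[
  z = \frac{p - \pi_0}{\sqrt{\pi_0 (1 - \pi_0) / n}}
\]
```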