Univariate Description
Car mileage data (gasoline consumption)
MPG
30.8 31.7 30.1 31.6 32.1 33.3 31.3 31.0 32.0 32.4
30.9 30.4 32.5 30.3 31.3 32.1 32.5 31.8 30.4 30.5
32.0 31.4 30.8 32.8 32.0 31.5 32.4 31.0 29.8 31.1
32.3 32.7 31.2 30.6 31.7 31.4 32.2 31.5 31.7 30.6
32.6 31.4 31.8 31.9 32.8 31.5 31.6 30.6 32.2
> data=read.table("D:/Albert/COURSES/CursLlibreBowerman/Datasets - Text/GasMiles.txt", header=TRUE)
> names(data)
[1] "MPG"
> dim(data)
[1] 49  1
> attach(data)
> stem(MPG)

  The decimal point is at the |

  29 | 8
  30 | 1344
  30 | 5666889
  31 | 001233444
  31 | 55566777889
  32 | 0001122344
  32 | 556788
  33 | 3

Univariate Description
> summary(data)
      MPG
 Min.   :29.80
 1st Qu.:31.00
 Median :31.60
 Mean   :31.55
 3rd Qu.:32.10
 Max.   :33.30
> data
    MPG
1  30.8
2  31.7
3  30.1
4  31.6
5  32.1
6  33.3
....
47 31.6
48 30.6
49 32.2
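As a cross-check outside R, the same summary can be sketched with Python's standard library. The list below is the 49 MPG values from the slide; `method="inclusive"` is chosen on the assumption that it interpolates quartiles the same way as R's default, which the printed `summary()` values appear to confirm.

```python
import statistics

# The 49 MPG observations from the slide.
mpg = [
    30.8, 31.7, 30.1, 31.6, 32.1, 33.3, 31.3, 31.0, 32.0, 32.4,
    30.9, 30.4, 32.5, 30.3, 31.3, 32.1, 32.5, 31.8, 30.4, 30.5,
    32.0, 31.4, 30.8, 32.8, 32.0, 31.5, 32.4, 31.0, 29.8, 31.1,
    32.3, 32.7, 31.2, 30.6, 31.7, 31.4, 32.2, 31.5, 31.7, 30.6,
    32.6, 31.4, 31.8, 31.9, 32.8, 31.5, 31.6, 30.6, 32.2,
]

n = len(mpg)
mean = statistics.mean(mpg)
median = statistics.median(mpg)
# Quartiles by linear interpolation between order statistics.
q1, q2, q3 = statistics.quantiles(mpg, n=4, method="inclusive")

print("n      :", n)
print("Min    :", min(mpg))
print("1st Qu.:", q1)
print("Median :", median)
print("Mean   :", round(mean, 2))
print("3rd Qu.:", q3)
print("Max    :", max(mpg))
```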
With SPSS
Copy and paste the data into the SPSS data sheet (with decimal commas instead of points if you are using the Spanish version of SPSS).
DESCRIPTIVES VARIABLES=gas
  /STATISTICS=MEAN STDDEV MIN MAX .
Histogram
GRAPH
  /HISTOGRAM(NORMAL)=gas .
Box Plot
[Figure: box plot marking Min, Q1, Median, Q3, Max, the IQR and the fences]
IQR = Q3 - Q1
Inner fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR
Outer fences: Q1 - 3*IQR and Q3 + 3*IQR
Points beyond the inner fences but inside the outer fences are mild outliers.
Points beyond the outer fences are extreme outliers.
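The fence rules can be sketched in a few lines of Python. The quartiles, minimum and maximum below are the ones printed by `summary(data)` for the MPG variable; the function name `classify` and the test values 34.0 and 36.0 are hypothetical, chosen only to show each category.

```python
# Quartiles of MPG as printed by summary(data) on the slide.
q1, q3 = 31.0, 32.1
iqr = q3 - q1

inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr


def classify(x):
    """Label a value as 'inside', 'mild' or 'extreme' using the fences."""
    if x < outer_low or x > outer_high:
        return "extreme"
    if x < inner_low or x > inner_high:
        return "mild"
    return "inside"


print("inner fences:", round(inner_low, 2), round(inner_high, 2))
print("outer fences:", round(outer_low, 2), round(outer_high, 2))
# Observed extremes of MPG, plus two made-up values for illustration.
for value in (29.8, 33.3, 34.0, 36.0):
    print(value, classify(value))
```

Since the observed minimum (29.8) and maximum (33.3) both lie inside the inner fences, this sketch flags no outliers in the mileage data.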
Box-Plot
EXAMINE VARIABLES=gas
  /COMPARE VARIABLE
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL
  /MISSING=LISTWISE .
Cross-section data: bank data
> data=read.table("D:/Albert/COURSES/cursDAS/AS2003/DATA/BANK.TXT", header=TRUE)
> dim(data)
[1] 100   9
> names(data)
[1] "LSALNOW" "LSALBEG" "SEX"     "JOBCAT"  "RACE"    "EDLEVEL" "TIME"
[8] "AGE"     "WORK"
> data[sample(1:dim(data)[1],10),]
    LSALNOW LSALBEG SEX JOBCAT RACE EDLEVEL TIME   AGE  WORK
25   9.4125  8.7483   0      3    0      12   80 61.67 38.33
47   8.9227  8.3428   1      1    0      15   90 58.00  4.50
8   10.0078  9.5104   0      4    0      19   81 30.75  5.17
33   9.5324  8.4888   1      2    0      12   77 24.33  0.33
97   8.8217  8.3138   1      1    1      12   72 51.50 22.58
100  8.9065  8.3138   1      1    1      12   85 51.00 19.00
32   9.5104  8.6995   0      3    0      12   83 50.25 23.67
94   8.8479  8.3138   1      1    1      12   72 46.50  9.67
39   9.0711  8.5132   1      1    0       8   74 59.83 26.50
36   9.1695  8.5942   1      1    0      12   98 47.33 20.33
> data[runif(dim(data)[1])<.1,]
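The two R commands that preview rows behave differently: `sample(1:n, 10)` draws exactly 10 distinct rows, while `runif(n) < .1` keeps each row independently with probability 0.1, so the number of rows shown varies from run to run. A minimal Python sketch of the same two ideas, using row numbers 1 to 100 as a stand-in for the bank data (the seed is an assumption added only for repeatability):

```python
import random

random.seed(0)  # fixed seed only so the sketch is repeatable

n_rows = 100
row_ids = list(range(1, n_rows + 1))

# Like data[sample(1:n, 10), ]: exactly 10 distinct rows, in random order.
fixed_size = random.sample(row_ids, 10)

# Like data[runif(n) < .1, ]: each row is kept independently with
# probability 0.1, so the count is random (10 on average) and the
# rows stay in their original order.
bernoulli = [i for i in row_ids if random.random() < 0.1]

print(len(fixed_size), fixed_size)
print(len(bernoulli), bernoulli)
```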
SALNOW by sex (boxplot)
boxplot(SALNOW ~ SEX, col=c("blue", "green"))
Red is the kernel density; green is the normal distribution.
> summary(INCOME)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2.00   14.00   20.00   22.44   30.00  100.00
Log of income
linc=log(INCOME)
hist(linc,12, prob= TRUE, col='blue')
lines(density(linc,bw=0.4), col='red')
mu=mean(linc)
sd=sqrt(var(linc))
lines(sort(linc),dnorm(sort(linc),mu,sd), col='green')
Red is the kernel density; green is the normal distribution.
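What the red and green curves compute can be sketched without any plotting library. The `income` list below is a small made-up sample (the INCOME data themselves are not in the slides), and `kde` is a hypothetical helper; the bandwidth 0.4 and the use of the sample mean and standard deviation follow the R code.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical incomes for illustration only; not the course data.
income = [2, 8, 14, 15, 20, 20, 22, 30, 35, 100]
linc = [math.log(x) for x in income]


def kde(x, sample, bw):
    """Gaussian kernel density estimate at x with bandwidth bw."""
    kernel = NormalDist(0.0, 1.0)
    total = sum(kernel.pdf((x - xi) / bw) for xi in sample)
    return total / (len(sample) * bw)


# Normal density with the sample mean and standard deviation (green curve).
fitted = NormalDist(mean(linc), stdev(linc))

# Compare the two densities at a few points on the log scale.
for x in (2.0, 3.0, 4.0):
    print(x, round(kde(x, linc, 0.4), 3), round(fitted.pdf(x), 3))
```

Where the two columns differ noticeably, the log incomes depart from normality at that point.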
. summarize hsnotpau

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
    hsnotpau |      3609    5.616312    1.123641       1.44        9.6
Proportion of students between grades
The distribution of grades on a multiple-choice exam is approximately normal with mean 6 and standard deviation 1.7 (or mean 5.6 and standard deviation 1.1).
Find:
a) The quartiles of the distribution.
b) The approximate percentage of students with a score between 5 and 7.
c) The % of students who fail (a grade below 5 is a fail).
d) The % with a grade higher than 7.
e) The probability that, when 5 individuals are chosen from the population of students who took the test, at least 2 have a grade above 7.
f) The distribution of the mean grade of 10 students chosen at random from the population that took the test.
Solution:
Quartiles:
> qnorm(.25, 6, 1.7)
[1] 4.853367
> qnorm(.5, 6, 1.7)
[1] 6
> qnorm(.75, 6, 1.7)
[1] 7.146633
Probabilities:
> pnorm(5,6,1.7)
[1] 0.2781872
> pnorm(7,6,1.7)
[1] 0.7218128
> pnorm(7,6,1.7) - pnorm(5,6,1.7)
[1] 0.4436256
> 1- pnorm(7,6,1.7)
[1] 0.2781872
> pbinom(1, 5, 1- pnorm(7,6,1.7))
[1] 0.5735169
Note that pbinom(1, 5, p) is the probability that at most 1 of the 5 students scores above 7, so the probability of at least 2 is 1 - 0.5735169 = 0.4264831.
Sampling distribution:
The sample mean of a sample of 10 students is normally distributed, with mean 6 and standard deviation
> 1.7/sqrt(10)
[1] 0.5375872
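The whole solution can be reproduced with Python's standard library as a check on the R output. This is a sketch under the exercise's stated assumption of a normal distribution with mean 6 and standard deviation 1.7; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

grades = NormalDist(mu=6, sigma=1.7)

# a) quartiles
quartiles = [grades.inv_cdf(p) for p in (0.25, 0.5, 0.75)]
# b) P(5 < X < 7)
p_between = grades.cdf(7) - grades.cdf(5)
# c) P(X < 5), the proportion failing
p_fail = grades.cdf(5)
# d) P(X > 7)
p_above_7 = 1 - grades.cdf(7)

# e) Number of students above 7 among 5 is Binomial(5, p_above_7).
#    P(at least 2) = 1 - P(0) - P(1).
p_at_most_1 = sum(
    comb(5, k) * p_above_7**k * (1 - p_above_7) ** (5 - k) for k in (0, 1)
)
p_at_least_2 = 1 - p_at_most_1

# f) The mean of 10 grades is normal with mean 6 and this standard deviation.
sd_of_mean = 1.7 / sqrt(10)

print(quartiles, p_between, p_fail, p_above_7, p_at_least_2, sd_of_mean)
```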
. summarize hsnotpau, detail

                          hsnotpau
-------------------------------------------------------------
      Percentiles      Smallest
 1%            3           1.44
 5%         3.81           2.11
10%         4.21           2.27       Obs                3609
25%         4.85            2.3       Sum of Wgt.        3609

50%         5.58                      Mean           5.616312
                        Largest       Std. Dev.      1.123641
75%         6.38           8.94
90%         7.09           9.07       Variance       1.262569
95%         7.61           9.37       Skewness       .0930459
99%         8.25            9.6       Kurtosis       2.983357
Density function of the normal distribution
Applet for the normal distribution, at Statistical Applets:
http://bcs.whfreeman.com/ips4e/pages/bcs-main.asp?v=category&s=00010&n=99000&i=99010.01&o
Tables of the normal distribution, at Statistical Tables:
http://bcs.whfreeman.com/ips4e/pages/bcs-main.asp?v=category&s=00100&n=99000&i=99100.01&o
Tables of the normal distribution in R: pnorm() qnorm()
For example:
> pnorm(1.87)
[1] 0.969258
> pnorm(-1.2)
[1] 0.1150697
> qnorm(.975)
[1] 1.959964
> qnorm(.25)
[1] -0.6744898
% of students with a grade between 5 and 7? (mean = 5.616312, standard deviation = 1.123641)
> Z2=(7- 5.616312)/1.123641
> Z1=(5- 5.616312)/1.123641
> pnorm(Z2) - pnorm(Z1)
[1] 0.5992435
That is, approximately 60%.
More directly:
> pnorm(7, 5.616312,1.123641)- pnorm(5, 5.616312,1.123641)
[1] 0.5992435
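The equivalence between standardizing first and passing the mean and standard deviation directly can be sketched in Python, where `NormalDist.cdf` and `NormalDist.inv_cdf` play the roles of `pnorm()` and `qnorm()`. The mean and standard deviation are the ones reported by Stata for hsnotpau.

```python
from statistics import NormalDist

mu, sigma = 5.616312, 1.123641  # from the Stata summary of hsnotpau
std_normal = NormalDist()       # mean 0, standard deviation 1

# Route 1: convert the grades 5 and 7 to z-scores, then use the
# standard normal distribution.
z1 = (5 - mu) / sigma
z2 = (7 - mu) / sigma
via_z = std_normal.cdf(z2) - std_normal.cdf(z1)

# Route 2: work directly on the original scale.
grades = NormalDist(mu, sigma)
direct = grades.cdf(7) - grades.cdf(5)

print(round(z1, 4), round(z2, 4))
print(via_z, direct)
```

Both routes give the same probability because standardizing only changes the units in which the grade is measured.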
Family consumption data (family.dta): summary statistics
. summarize exp1_1, detail

-------------------------------------------------------------
      Percentiles      Smallest
 1%     .1520551       7.18e-06
 5%     .3881256       7.65e-06
10%     .5420735       .0000112       Obs                2640
25%     .8613541       .0000267       Sum of Wgt.        2640

50%     1.294648                      Mean           1.473449
                        Largest       Std. Dev.      .9169822
75%     1.901873       8.024636
90%     2.559126       8.826962       Variance       .8408563
95%      3.10731       9.368608       Skewness       2.150655
99%     4.331305       10.20112       Kurtosis       13.92168
. summarize newfood, detail

                       BC(exp1_1,.367)
-------------------------------------------------------------
      Percentiles      Smallest
 1%     -1.35978      -2.689332
 5%    -.7995382        -2.6885
10%    -.5484188       -2.68311       Obs                2640
25%    -.1452355      -2.667499       Sum of Wgt.        2640

50%     .2708729                      Mean           .2863131
                        Largest       Std. Dev.      .6866023
75%     .7250079       3.126675
90%     1.122054       3.334947       Variance       .4714228
95%     1.406071       3.468853       Skewness       .0912667
99%     1.941548        3.66543       Kurtosis       4.251757
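The label BC(exp1_1,.367) suggests that newfood is a Box-Cox transform of exp1_1 with parameter 0.367, which is what brings the skewness down from 2.15 to 0.09. Assuming the usual definition (x^λ - 1)/λ, the transform can be sketched as follows; the function name `box_cox` is illustrative, and the inputs are percentiles of exp1_1 printed in the first table.

```python
import math


def box_cox(x, lam):
    """Box-Cox transform of a positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limiting case as lam approaches 0
    return (x**lam - 1) / lam


# The transform is increasing, so percentiles of exp1_1 should map
# to the corresponding percentiles of newfood.
for label, value in (("25%", 0.8613541), ("50%", 1.294648), ("max", 10.20112)):
    print(label, round(box_cox(value, 0.367), 4))
```

Up to rounding of the printed parameter, the results agree with the 25%, 50% and largest values reported for newfood, which supports the assumed definition.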
[Figure: PISA 2003, mathematics performance and number of books at home]
Some exercises for practice on the Normal Distribution
Exercises
1. The heights of adult men are normally distributed with a mean of 69.5 inches and a variance of 7.025 square inches. Find the probabilities that a man chosen at random will be (a) at least 72 inches tall, (b) at most 72 inches tall.
2. Scores on standard IQ tests are usually designed to be normally distributed with a mean of 100 and a standard deviation of 15. On such a test, find the probability that a person chosen at random will score (a) below 90, (b) above 90.
3. On American roulette wheels, the probability of the ball landing on red is 18/38. Suppose 200 bets are placed on red. Use the Normal Approximation of the Binomial to approximate the probability of there being from 100 to 120 winners.
4. It is estimated that Americans average 200 deaths yearly (per 100,000 people) from heart attacks. Use the Normal Approximation of the Poisson to approximate the probability that 180 to 210 such deaths will occur in a random group of 100,000 Americans during a given year.
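For the approximation exercises, the general recipe is to replace the discrete distribution by a normal with the same mean and variance and to widen the interval by 0.5 on each side (continuity correction). A minimal sketch for the binomial case, with the exact sum alongside for comparison; the function names are hypothetical and the continuity correction is a common convention rather than something the slide prescribes.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_range_exact(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def binom_range_normal(n, p, lo, hi):
    """Normal approximation to P(lo <= X <= hi) with continuity correction."""
    approx = NormalDist(n * p, sqrt(n * p * (1 - p)))
    return approx.cdf(hi + 0.5) - approx.cdf(lo - 0.5)


# Setting of exercise 3: 200 bets on red, P(red) = 18/38.
n, p = 200, 18 / 38
exact = binom_range_exact(n, p, 100, 120)
approx = binom_range_normal(n, p, 100, 120)
print("exact :", exact)
print("approx:", approx)
```

The Poisson case in exercise 4 follows the same pattern, with mean and variance both equal to the Poisson rate.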
Mean Value (1) (Mean of a random variable)
When a random phenomenon is repeated many times, the proportion of trials on which an outcome occurs eventually approaches the probability of the outcome. If the outcomes are numerical, the average of the observed outcomes eventually approaches the expected value. Sometimes we express the random outcome as X, a random variable; then the expected value is also called the mean of X.
http://www.whfreeman.com/scc/con_index.htm?99spt
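This long-run behaviour is easy to see in a small simulation. The sketch below uses a fair six-sided die as a hypothetical example (it is not taken from the slides): the expected value is 3.5, and the average of the simulated rolls should drift toward it as the number of rolls grows. The seed is fixed only for repeatability.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed only so the sketch is repeatable

faces = [1, 2, 3, 4, 5, 6]
expected_value = mean(faces)  # each face has probability 1/6

# Simulate many rolls of a fair die.
rolls = [random.choice(faces) for _ in range(100_000)]

# The running average gets closer to the expected value as n grows.
for n in (10, 1_000, 100_000):
    print(n, mean(rolls[:n]))

# The proportion of sixes likewise approaches the probability 1/6.
print(rolls.count(6) / len(rolls))
```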