
Overview of some tests

Overview of some tests. Thomas INGICCO. Illustration: J.L.T. Géricault, The Raft of the Medusa (Le Radeau de La Méduse). Chi-square test. Aim: comparison of observed counts Oij with theoretical counts Eij.


Presentation Transcript


  1. Overview of some tests
     Thomas INGICCO
     Illustration: J.L.T. Géricault, The Raft of the Medusa (Le Radeau de La Méduse)


  6. Chi-square test
     Aim: Comparison of observed counts Oij with theoretical (expected) counts Eij. Are the rows and columns of a contingency table independent? Independence means that belonging to a class of the first variable has no influence on the class taken for the second variable.
     Measured variable: A qualitative variable with k classes.
     Conditions of use: The classes of each variable must be mutually exclusive. Cochran's rule must be respected: in each class Eij ≥ 5. Some classes with 1 ≤ Eij < 5 are tolerated, provided at least 80% of the classes have Eij ≥ 5.
     Test hypotheses:
     H0: πi = Ptheo i. The theoretical proportions Ptheo i are the real proportions in the observed population.
     H1 (two-sided): At least one of the theoretical proportions is not the real proportion in the observed population.
     The statistic is: $\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
     In R: sum((Oij - Eij)^2/Eij)
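As a minimal sketch of the quantities defined on this slide, the expected counts, Cochran's rule and the statistic can be computed on a small table. The counts below are made up for illustration (they are not the Ceramics data), and the names `obs`, `expd` and `cochran.ok` are hypothetical.

```r
# Hypothetical 2 x 3 contingency table (made-up counts, not the Ceramics data)
obs <- matrix(c(12, 18, 10,
                 8, 22, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(Var1 = c("a", "b"), Var2 = c("x", "y", "z")))

n <- sum(obs)

# Expected counts under independence: (row total * column total) / n
expd <- outer(rowSums(obs), colSums(obs)) / n

# Cochran's rule: no expected count below 1, at least 80% of cells at 5 or more
cochran.ok <- all(expd >= 1) && mean(expd >= 5) >= 0.8

# Chi-square statistic, degrees of freedom and p-value
chi2 <- sum((obs - expd)^2 / expd)
nu <- (nrow(obs) - 1) * (ncol(obs) - 1)
p.value <- pchisq(chi2, nu, lower.tail = FALSE)
```

With these counts the expected table is 8, 16, 16 in the first row and 12, 24, 24 in the second, so the rule is satisfied and the statistic can be compared with a chi-square distribution with 2 degrees of freedom.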


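  10. Chi-square test
      In details:
          # Read the data and build the contingency table from columns 10 and 11
          Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Data/Ceramics.txt", header=TRUE)
          obs1 <- data.frame(Ceram[, 10:11])
          obs2 <- na.omit(obs1)
          for(i in 1:length(obs2)){obs2[,i] <- factor(obs2[,i])}
          obs3 <- table(obs2)
          addmargins(obs3)
          # Mosaic plot of the observed counts
          graphics.off()
          par(cex.lab=1.5, xpd=NA, font=2)
          mosaicplot(t(obs3), main=NULL, cex.axis=1.1)
          # Expected counts under independence
          obs3theo <- suppressWarnings(chisq.test(obs3)$expected)
          addmargins(obs3theo)
          # Chi-square statistic computed by hand
          nij <- obs3
          tij <- obs3theo
          chi2.calc <- sum((nij-tij)^2/tij)
          chi2.calc
          # Degrees of freedom: (rows - 1) * (columns - 1)
          k <- dim(obs3)[1]
          nc <- dim(obs3)[2]   # renamed from c to avoid masking c()
          nu <- (k-1)*(nc-1)
          nu
          # p-value: upper tail of the chi-square distribution
          pchisq(chi2.calc, nu, lower.tail=FALSE)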

  11. Chi-square test
      In details:
          # Test in R
          chisq.test(obs3)
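One point worth checking when comparing the hand computation with `chisq.test()`: for 2 x 2 tables, `chisq.test()` applies Yates' continuity correction by default, so its statistic is smaller than the plain sum unless `correct = FALSE` is given. The sketch below uses made-up counts and hypothetical variable names to show the difference.

```r
# Hypothetical 2 x 2 table (made-up counts)
tab <- matrix(c(15,  5,
                10, 20), nrow = 2, byrow = TRUE)

# Statistic computed by hand, as in the slides
expd <- outer(rowSums(tab), colSums(tab)) / sum(tab)
chi2.manual <- sum((tab - expd)^2 / expd)

# Default call: Yates' continuity correction is applied to 2 x 2 tables
chi2.yates <- unname(chisq.test(tab)$statistic)

# Without the correction: same value as the manual sum
chi2.plain <- unname(chisq.test(tab, correct = FALSE)$statistic)
```

For tables larger than 2 x 2 no correction is applied, and the default call matches the manual computation.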


  16. Fisher test
      Aim: Comparison of the observed counts of G and F (independence of two qualitative variables), expressed as the observed proportions PG1/F1 and PG1/F2 (equality of the proportions).
      Measured variable: Two qualitative variables F and G with 2 classes each.
      Conditions of use: The classes of each variable must be mutually exclusive. The qualitative variables are nominal.
      Test hypotheses:
      H0: πG1/F1 = πG1/F2. The proportions are identical in the target population.
      H1 (two-sided): πG1/F1 ≠ πG1/F2. The proportions are different in the target population.
      H1 (one-sided, right): πG1/F1 > πG1/F2. Proportion πG1/F1 is strictly greater than πG1/F2 in the target population.
      H1 (one-sided, left): πG1/F1 < πG1/F2. Proportion πG1/F1 is strictly smaller than πG1/F2 in the target population.
      The statistic is: the observed count n11, whose probability with fixed margins is hypergeometric: $P(n_{11}) = \binom{n_{1.}}{n_{11}} \binom{n - n_{1.}}{n_{.1} - n_{11}} \Big/ \binom{n}{n_{.1}}$
      In R: dhyper(n11, n1., n-n1., n.1)
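The one-sided alternatives can be illustrated with a short sketch. The table below is made up for illustration; the notation n11, n1., n.1 and n follows the slides, while `tab`, `p.obs`, `p.right` and `p.left` are names chosen here.

```r
# Hypothetical 2 x 2 table (made-up counts): rows F1, F2; columns G1, G2
tab <- matrix(c(8, 2,
                1, 5), nrow = 2, byrow = TRUE,
              dimnames = list(F = c("F1", "F2"), G = c("G1", "G2")))

n11 <- tab[1, 1]        # test statistic
n1. <- sum(tab[1, ])    # row total of F1
n.1 <- sum(tab[, 1])    # column total of G1
n   <- sum(tab)         # grand total

# Probability of the observed table with the margins held fixed
p.obs <- dhyper(n11, n1., n - n1., n.1)

# One-sided p-values
p.right <- phyper(n11 - 1, n1., n - n1., n.1, lower.tail = FALSE)  # P(N11 >= n11)
p.left  <- phyper(n11,     n1., n - n1., n.1)                      # P(N11 <= n11)

# The same one-sided tests through fisher.test()
fisher.right <- fisher.test(tab, alternative = "greater")$p.value
fisher.left  <- fisher.test(tab, alternative = "less")$p.value
```

In `fisher.test()`, `alternative = "greater"` corresponds to the right-sided hypothesis (πG1/F1 > πG1/F2) and `alternative = "less"` to the left-sided one.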


  22. Fisher test
      In details:
          Ceram <- read.table("K:/Cours/Philippines/Statistics-210/Lecture-4/Ceramics.txt", header=TRUE)
          obs1 <- Ceram[, c(12,9)]
          obs2 <- na.omit(obs1)
          for(i in 1:length(obs2)){obs2[,i] <- factor(obs2[,i])}
          obs3 <- table(obs2)
          # Optional: transpose the table
          # obs3 <- t(obs3)
          # Optional: swap the two columns (and their names)
          # obs4 <- obs3 ; obs4[,1] <- obs3[,2] ; obs4[,2] <- obs3[,1] ; dimnames(obs4)[[2]][1] <- dimnames(obs3)[[2]][2] ; dimnames(obs4)[[2]][2] <- dimnames(obs3)[[2]][1] ; obs3 <- obs4
          addmargins(obs3)


  24. Fisher test
      In details:
          graphics.off()
          # Proportion of G1 within each class of F
          pG1.Fi <- obs3[, 1] / margin.table(obs3, 1)
          par(mar=c(5.1, 5.1, 4.1, 2.1))
          # Bar plot of the two proportions, with custom axis labels
          barplot(pG1.Fi, xlab = paste(labels(dimnames(obs3))[2], " / ", labels(dimnames(obs3))[1]), xaxt = "n", ylab="Proportion", ylim=range(0, max(pG1.Fi)+ 0.15), cex.lab=2, cex.axis=1.8)
          position.labels <- barplot(pG1.Fi, plot = FALSE)[]
          axis(side=1, at = position.labels, labels = c(paste(colnames(obs3)[1], " / ", rownames(obs3)[1]), paste(colnames(obs3)[1], " / ", rownames(obs3)[2])), cex.axis=1.8)
          # Mosaic plot in a new graphics window (dev.new() replaces the Windows-only windows())
          dev.new()
          par(cex.lab=2, xpd=NA, font=2)
          mosaicplot(t(obs3), main=NULL, cex.axis=1.5)


  27. Fisher test
      In details:
          # Proportions of G1 within F1 and within F2
          n11 <- obs3[1, 1]
          n1. <- margin.table(obs3, 1)[1]
          n21 <- obs3[2, 1]
          n2. <- margin.table(obs3, 1)[2]
          pG1.F1 <- n11/n1.
          pG1.F2 <- n21/n2.
          t(data.frame(pG1.F1, pG1.F2))
          # Remaining conditional proportions
          n12 <- obs3[1,2]
          n22 <- obs3[2,2]
          n.1 <- margin.table(obs3,2)[1]
          n.2 <- margin.table(obs3,2)[2]
          pG2.F1 <- n12/n1.
          pG2.F2 <- n22/n2.
          pF1.G1 <- n11/n.1
          pF1.G2 <- n12/n.2
          pF2.G1 <- n21/n.1
          pF2.G2 <- n22/n.2
          t(data.frame(pG2.F1, pG2.F2, pF1.G1, pF1.G2, pF2.G1, pF2.G2))
          # The test statistic is the observed count n11
          n11 <- obs3[1,1]
          NFE.calc <- n11
          NFE.calc


  31. Fisher test
      In details:
          n1. <- margin.table(obs3,1)[1]
          n <- margin.table(obs3)
          n.1 <- margin.table(obs3,2)[1]
          # One-sided p-values from the hypergeometric distribution
          p.right <- phyper(NFE.calc-1, n1., n-n1., n.1, lower.tail=FALSE)
          p.left <- phyper(NFE.calc, n1., n-n1., n.1)
          p.right
          p.left
          # Two-sided p-value: the smaller tail, plus the part of the opposite tail
          # whose outcomes are no more probable than the observed count
          if(p.right < p.left) {
            p.value1 <- p.right
            NFE.left <- NFE.calc
            d.NFE.calc <- round(dhyper(NFE.calc, n1., n-n1., n.1), 12)
            d.NFE.left <- Inf
            while(NFE.left >= 0 && d.NFE.left > d.NFE.calc) {
              NFE.left <- NFE.left - 1
              d.NFE.left <- round(dhyper(NFE.left, n1., n-n1., n.1), 12)
            }
            if(d.NFE.left > d.NFE.calc) {p.value2 <- 0} else {p.value2 <- phyper(NFE.left, n1., n-n1., n.1)}
          } else {
            p.value1 <- p.left
            NFE.right <- NFE.calc
            d.NFE.calc <- round(dhyper(NFE.calc, n1., n-n1., n.1), 12)
            d.NFE.right <- Inf
            while(d.NFE.right > d.NFE.calc) {
              NFE.right <- NFE.right + 1
              d.NFE.right <- round(dhyper(NFE.right, n1., n-n1., n.1), 12)
            }
            p.value2 <- phyper(NFE.right-1, n1., n-n1., n.1, lower.tail=FALSE)
          }
          p.value <- p.value1 + p.value2
          p.value


  33. Fisher test
      In details:
          # Probability of the observed table: explicit formula, then dhyper()
          n11 <- obs3[1,1]
          Pn11 <- choose(n1.,n11)*choose(n-n1.,n.1-n11)/choose(n,n.1)
          Pn11
          dhyper(n11, n1., n-n1., n.1)
          # Test in R
          fisher.test(obs3)
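The two-sided p-value can also be written without loops, by summing the probabilities of every possible table that is no more probable than the observed one. This is a sketch on made-up counts, with a small relative tolerance added here as an assumption to guard against floating-point ties; `support` and `probs` are names chosen for illustration.

```r
# Hypothetical 2 x 2 table (made-up counts)
tab <- matrix(c(8, 2,
                1, 5), nrow = 2, byrow = TRUE)

n11 <- tab[1, 1]
n1. <- sum(tab[1, ])
n.1 <- sum(tab[, 1])
n   <- sum(tab)

# All values n11 can take once the margins are fixed
support <- max(0, n.1 - (n - n1.)):min(n1., n.1)

# Hypergeometric probability of each possible value, and of the observed one
probs <- dhyper(support, n1., n - n1., n.1)
p.obs <- dhyper(n11, n1., n - n1., n.1)

# Two-sided p-value: total probability of outcomes no more likely than the observed one
p.two.sided <- sum(probs[probs <= p.obs * (1 + 1e-7)])

# Reference value from the built-in test
p.fisher <- fisher.test(tab)$p.value
```

Here the possible values of n11 run from 3 to 9, and the outcomes 3, 8 and 9 are the ones counted in the two-sided p-value.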


  38. Student t test
      Aim: Comparison of two observed means m1 and m2.
      Measured variable: A quantitative variable and a qualitative variable with two classes.
      Conditions of use: The quantitative variable must follow a normal distribution. The quantitative variable may be continuous or discrete.
      Test hypotheses:
      H0: μ1 = μ2. The means are identical in the target population.
      H1 (two-sided): μ1 ≠ μ2. The means are different in the target population.
      H1 (one-sided, right): μ1 > μ2. Mean μ1 is strictly greater than mean μ2 in the target population.
      H1 (one-sided, left): μ1 < μ2. Mean μ1 is strictly smaller than mean μ2 in the target population.
      The statistic is: $t = \frac{m_1 - m_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$ with: $s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
      In R: (m1-m2)/(s2*(1/n1+1/n2))^0.5
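The pooled variance and the statistic can be followed on a small numeric sketch. The two samples below are made up for illustration; the names n1, n2, m1, m2, s2 and t.calc follow the slides, while `x1` and `x2` are hypothetical.

```r
# Hypothetical measurements for the two classes (made-up values)
x1 <- c(10, 12, 14, 16)
x2 <- c(9, 10, 11)

n1 <- length(x1); n2 <- length(x2)
m1 <- mean(x1);   m2 <- mean(x2)

# Pooled variance: weighted average of the two sample variances
s2 <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)

# t statistic and degrees of freedom
t.calc <- (m1 - m2) / sqrt(s2 * (1 / n1 + 1 / n2))
nu <- n1 + n2 - 2

# Two-sided p-value
p.value <- 2 * pt(abs(t.calc), nu, lower.tail = FALSE)

# Built-in test with the equal-variance (pooled) formula
res <- t.test(x1, x2, var.equal = TRUE)
```

With these values the sample variances are 20/3 and 1, so the pooled variance is 22/5 = 4.4 on 5 degrees of freedom.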


  40. Student t test
      In details:
          # Assumes the ceramics table is available as the data frame Ceramics
          obs1 <- data.frame(Ceramics[which(Ceramics$Base=="Round" | Ceramics$Base=="Flat"), c(2,13)])
          obs2 <- na.omit(obs1)
          # Pad the shorter group with NA so that both groups fit in one data frame
          nc.max <- max(table(obs2[,2]))
          nb.na <- nc.max - table(obs2[,2])
          tempo <- split(obs2[,1], obs2[,2])
          for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
          obs3 <- data.frame(tempo)
          obs3
          # Size, mean and standard deviation of each group, ignoring the NA padding
          n1 <- length(na.omit(obs3[, 1]))
          n2 <- length(na.omit(obs3[, 2]))
          m1 <- mean(na.omit(obs3[, 1]))
          m2 <- mean(na.omit(obs3[, 2]))
          s.1 <- sd(na.omit(obs3[, 1]))
          s.2 <- sd(na.omit(obs3[, 2]))
          param <- data.frame(c(n1, n2), c(m1, m2), c(s.1, s.2))
          names(param) <- c("Sample size", "Mean", "Standard deviation")
          row.names(param) <- levels(factor(obs2[, 2]))
          param


  43. Student t test
      In details:
          # Pooled variance and t statistic
          s2 <- ((n1-1)*s.1^2+(n2-1)*s.2^2)/(n1+n2-2)
          t.calc <- (m1-m2)/(s2*(1/n1+1/n2))^0.5
          t.calc
          # Degrees of freedom and two-sided p-value
          nu <- n1+n2-2
          nu
          min(pt(t.calc, nu, lower.tail=FALSE), pt(t.calc, nu))*2
          # Test in R
          t.test(obs3[, 1], obs3[, 2], var.equal=TRUE)
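The one-sided hypotheses of the t test map onto the `alternative` argument of `t.test()`, and the test can also be run on data kept in long format, which avoids the NA padding. The sketch below uses made-up values; `x1`, `x2`, `dat` and the group labels "A" and "B" are hypothetical.

```r
# Hypothetical measurements for the two classes (made-up values)
x1 <- c(10, 12, 14, 16)
x2 <- c(9, 10, 11)

# Two-sided, right-sided (mu1 > mu2) and left-sided (mu1 < mu2) tests
two   <- t.test(x1, x2, var.equal = TRUE)
right <- t.test(x1, x2, var.equal = TRUE, alternative = "greater")
left  <- t.test(x1, x2, var.equal = TRUE, alternative = "less")

# The same one-sided p-values from the t distribution
t.calc  <- unname(two$statistic)
nu      <- unname(two$parameter)
p.right <- pt(t.calc, nu, lower.tail = FALSE)
p.left  <- pt(t.calc, nu)

# Long format: one row per observation, one column for the group
dat <- data.frame(value = c(x1, x2),
                  group = factor(rep(c("A", "B"), times = c(4, 3))))
long <- t.test(value ~ group, data = dat, var.equal = TRUE)
```

The formula call compares the first factor level with the second, so here it gives the same statistic as `t.test(x1, x2, var.equal = TRUE)`.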


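  48. Analysis of variance (ANOVA)
      Aim: Comparison of at least two observed means.
      Measured variable: A quantitative variable and a qualitative variable with k classes.
      Conditions of use: The quantitative variable must follow a normal distribution. The variances of the quantitative variable in each class of the qualitative variable must be equal (homoscedasticity). If these conditions are not fulfilled, see the Kruskal-Wallis test.
      Test hypotheses:
      H0: μ1 = μ2 = … = μk. The means are identical in the target population.
      H1: At least one of the means is different in the target population.
      The statistic is: $F = \frac{\sum_{i=1}^{k} n_i (m_i - m)^2 / (k - 1)}{\sum_{i=1}^{k} (n_i - 1) s_i^2 / (n - k)}$, where n_i, m_i and s_i are the size, mean and standard deviation of class i, m is the overall mean and n the total sample size. With k = 2 classes, F is the square of the Student statistic (m1-m2)/(s2*(1/n1+1/n2))^0.5.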
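As a sketch of how the F statistic of a one-way ANOVA is built from the class sizes, means and variances, the example below works on made-up values with three classes. All variable names (`y`, `g`, `ssb`, `ssw`, `F.calc`) are chosen here for illustration, and `kruskal.test()` is shown only as the fallback named in the conditions of use.

```r
# Hypothetical data: a quantitative variable y and a factor g with 3 classes
y <- c(4, 5, 6,   7, 8, 9,   10, 12, 14)
g <- factor(rep(c("A", "B", "C"), each = 3))

k <- nlevels(g)
n <- length(y)

# Size, mean and variance of each class
ni <- tapply(y, g, length)
mi <- tapply(y, g, mean)
vi <- tapply(y, g, var)

# Between-class and within-class sums of squares
ssb <- sum(ni * (mi - mean(y))^2)
ssw <- sum((ni - 1) * vi)

# F statistic and p-value
F.calc  <- (ssb / (k - 1)) / (ssw / (n - k))
p.value <- pf(F.calc, k - 1, n - k, lower.tail = FALSE)

# Built-in one-way ANOVA table
tab <- anova(lm(y ~ g))

# Rank-based alternative when the conditions of use are doubtful
kw <- kruskal.test(y ~ g)
```

With these values the between-class sum of squares is 74 on 2 degrees of freedom and the within-class sum of squares is 12 on 6 degrees of freedom.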

  49. Analysis of variance (ANOVA)
      In details:
          Ceram <- read.table("K:/Cours/Philippines/Statistics210/Data/Ceramics.txt", header=TRUE)
          # Column 7: quantitative variable; column 10: grouping factor
          obs1 <- Ceram[, c(7,10)]
          obs1[,2] <- factor(obs1[,2])
          obs2 <- na.omit(obs1)
          # Pad the shorter groups with NA so that all groups fit in one data frame
          nc.max <- max(table(obs2[,2]))
          nb.na <- nc.max - table(obs2[,2])
          tempo <- split(obs2[,1], obs2[,2])
          for(i in 1:length(tempo)) {tempo[[i]] <- append(tempo[[i]], rep(NA, nb.na[i]))}
          obs3 <- data.frame(tempo)
          obs3

  50. Analysis of variance (ANOVA)
      In details:
          graphics.off()
          k <- nlevels(obs2[, 2])
          # Strip chart of the observations, one row per class
          stripchart(obs2[, 1]~obs2[, 2], method="jitter", jitter=0.1, vertical=FALSE, ylim=range(0.5, k+0.5), group.names=levels(obs2[, 2]), xlab=names(obs2)[1], ylab=names(obs2)[2], pch=16, cex=1.2)
          # Mark the mean of each class with a grey segment
          mc <- sapply(split(obs2[, 1], obs2[, 2]), mean)
          for(i in 1:k){segments(mc[i], i-0.25, mc[i], i+0.25, lwd=3, col=gray(0.5))}
