Poids de grains de colza Présentation du problème : On souhaite étudier l effet des 2 facteurs rotation et fertilisation sur le poids des grains de colza. 1 Données observées > colza<-read.table('colza.txt',header=t) > names(colza) #nom des colonnes [1] "Fertilisation" "Rotation" "PdsGrains" > colza Fertilisation Rotation PdsGrains 1 1 A 27.6 2 1 A 16.3 3 1 A 11.4 4 1 A 38.2 5 1 A 38.1 6 1 A 24.7 7 1 A 22.7 8 1 A 21.7 9 1 A 20.6 10 1 A 19.8 11 1 B 32.1 12 1 B 28.5 13 1 B 19.4 14 1 B 39.2 15 1 B 21.8 16 1 B 23.6 17 1 B 14.0 18 1 B 22.3 19 1 B 18.3 20 1 B 20.8 21 1 C 27.8 22 1 C 33.4 23 1 C 33.0 24 1 C 28.9 25 1 C 24.6 26 1 C 28.3 27 1 C 40.6 28 1 C 20.3 29 1 C 26.7 30 1 C 22.8 31 2 A 13.4 32 2 A 19.0 33 2 A 27.8 34 2 A 17.4 35 2 A 8.3 36 2 A 16.2 37 2 A 3.4 38 2 A 9.6 39 2 A 24.8 1
40 2 A 18.2 41 2 B 15.5 42 2 B 16.7 43 2 B 29.7 44 2 B 26.8 45 2 B 8.7 46 2 B 22.6 47 2 B 34.6 48 2 B 14.4 49 2 B 12.5 50 2 B 16.9 51 2 C 23.5 52 2 C 31.2 53 2 C 26.1 54 2 C 41.2 55 2 C 35.3 56 2 C 23.4 57 2 C 44.1 58 2 C 35.3 59 2 C 31.6 60 2 C 25.8 > class(colza$rotation) #verifier les types des variables [1] "factor" > class(colza$pdsgrains) [1] "numeric" > class(colza$fertilisation) [1] "integer" > colza$fertilisa<-as.factor(colza$fertilisation) #changer le type de Fertilisation > class(colza$fertilisa) [1] "factor" > table(colza$rotation,colza$fertilisa) 1 2 A 10 10 B 10 10 C 10 10 2 Description des données > summary(colza$pdsgrains) Min. 1st Qu. Median Mean 3rd Qu. Max. 3.40 18.00 23.45 24.02 29.10 44.10 > sd(colza$pdsgrains) [1] 8.936966 > # par rotation > by(colza$pdsgrains,colza$rotation,mean) colza$rotation: A [1] 19.96 colza$rotation: B [1] 21.92 colza$rotation: C [1] 30.195 2
> # par fertilisation > by(colza$pdsgrains,colza$fertilisa,mean) colza$fertilisa: 1 [1] 25.58333 colza$fertilisa: 2 [1] 22.46667 > # par traitement > colza$ferrot=paste(colza$fertilisa,colza$rotation,sep="") > by(colza$pdsgrains,colza$ferrot,mean) colza$ferrot: 1A [1] 24.11 colza$ferrot: 1B [1] 24 colza$ferrot: 1C [1] 28.64 colza$ferrot: 2A [1] 15.81 colza$ferrot: 2B [1] 19.84 colza$ferrot: 2C [1] 31.75 > by(colza$pdsgrains,colza$ferrot,sd) colza$ferrot: 1A [1] 8.61529 colza$ferrot: 1B [1] 7.368703 colza$ferrot: 1C [1] 5.862726 colza$ferrot: 2A [1] 7.438108 colza$ferrot: 2B [1] 8.273411 colza$ferrot: 2C [1] 7.24542 > fertil<-rep(c(1,2),each=3) > rotat<-rep(c('a','b','c'),2) > NBy <- c(by(colza$pdsgrains,colza[,c(2,1)],length)) > MeanBy<- c(by(colza$pdsgrains,colza[,c(2,1)],mean)) > StdDevBy <- c(by(colza$pdsgrains,colza[,c(2,1)],sd)) > rbind(fertil,rotat,round(rbind(nby,meanby,stddevby),2)) [,1] [,2] [,3] [,4] [,5] [,6] fertil "1" "1" "1" "2" "2" "2" rotat "A" "B" "C" "A" "B" "C" NBy "10" "10" "10" "10" "10" "10" MeanBy "24.11" "24" "28.64" "15.81" "19.84" "31.75" StdDevBy "8.62" "7.37" "5.86" "7.44" "8.27" "7.25" 3
> boxplot( colza$pdsgrains ~ colza$fertilisa*colza$rotation,ylab="pdsgrains",xlab="traitement") PdsGrains 10 20 30 40 1.A 2.A 1.B 2.B 1.C 2.C traitement > #graphe d'interaction > interaction.plot(colza$rotation,colza$fertilisa,colza$pdsgrains,fixed=true,col=2:3,leg.bty="o") 4
colza$fertilisa mean of colza$pdsgrains 20 25 30 1 2 A B C colza$rotation 3 Modèle d analyse de la variance à 2 facteurs avec interaction > #si on pose un modele d'anova, vérifier que les variables explicatives sont bien > #consideres comme des facteurs par R!!! > class(colza$fertilisa) [1] "factor" > class(colza$rotation) [1] "factor" > moda2i<-lm(pdsgrains ~ Fertilisa * Rotation, + data=colza) > summary(moda2i) Call: lm(formula = PdsGrains ~ Fertilisa * Rotation, data = colza) Residuals: Min 1Q Median 3Q Max -12.710-5.492-1.125 3.752 15.200 Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 24.110 2.377 10.141 4.16e-14 *** Fertilisa2-8.300 3.362-2.469 0.0168 * RotationB -0.110 3.362-0.033 0.9740 5
RotationC 4.530 3.362 1.347 0.1835 Fertilisa2:RotationB 4.140 4.755 0.871 0.3878 Fertilisa2:RotationC 11.410 4.755 2.400 0.0199 * Residual standard error: 7.518 on 54 degrees of freedom Multiple R-squared: 0.3522, Adjusted R-squared: 0.2923 F-statistic: 5.873 on 5 and 54 DF, p-value: 0.0002106 > #somme des carres > #definition du modele nul > mod0=lm(pdsgrains ~ 1,data=colza) > anova(mod0,moda2i) Model 1: PdsGrains ~ 1 Model 2: PdsGrains ~ Fertilisa * Rotation Res.Df RSS Df Sum of Sq F Pr(>F) 1 59 4712.3 2 54 3052.5 5 1659.8 5.8726 0.0002106 *** SCM = 1659.8 SCR = 3052.5 SCT = 4712.3 > #analyse des effets > #type I > anova(moda2i) Response: PdsGrains Df Sum Sq Mean Sq F value Pr(>F) Fertilisa 1 145.70 145.70 2.5776 0.1142193 Rotation 2 1180.48 590.24 10.4417 0.0001466 *** Fertilisa:Rotation 2 333.63 166.82 2.9511 0.0607686. Residuals 54 3052.47 56.53 > #type II > library(car) #package a charger > Anova(modA2I) Anova Table (Type II tests) Response: PdsGrains Sum Sq Df F value Pr(>F) Fertilisa 145.70 1 2.5776 0.1142193 Rotation 1180.48 2 10.4417 0.0001466 *** Fertilisa:Rotation 333.63 2 2.9511 0.0607686. Residuals 3052.47 54 SCM = 1659.8 = 145.70 + 1180.48 + 333.63 6
> par(mfrow=c(2,2)) > plot(moda2i) Residuals vs Fitted Normal Q Q Residuals 15 5 5 15 14 47 4 Standardized residuals 1 0 1 2 14 47 4 20 25 30 2 1 0 1 2 Fitted values Theoretical Quantiles Standardized residuals 0.0 0.5 1.0 1.5 Scale Location 14 47 4 20 25 30 Standardized residuals 2 1 0 1 2 Constant Leverage: Residuals vs Factor Levels 47 4 14 Fertilisa : 2 1 Fitted values Factor Level Combinations > #Comparaison des moyennes de Rotation > library(lsmeans) #package a charger > lsmeans(moda2i,pairwise~rotation,adjust="bonferroni") $`Rotation lsmeans` Rotation lsmean SE df lower.cl upper.cl A 19.960 1.681179 54 16.58944 23.33056 B 21.920 1.681179 54 18.54944 25.29056 C 30.195 1.681179 54 26.82444 33.56556 $`Rotation pairwise differences` estimate SE df t.ratio p.value A - B -1.960 2.377546 54-0.82438 1.00000 A - C -10.235 2.377546 54-4.30486 0.00021 B - C -8.275 2.377546 54-3.48048 0.00300 p values are adjusted using the bonferroni method for 3 tests 4 Modèle d analyse de la variance à 2 facteurs sans interaction > moda2si<-lm(pdsgrains ~ Fertilisa + Rotation, + data=colza) > summary(moda2si) Call: lm(formula = PdsGrains ~ Fertilisa + Rotation, data = colza) 7
Residuals: Min 1Q Median 3Q Max -15.002-5.074-1.428 5.287 16.682 Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 21.518 2.008 10.718 3.49e-15 *** Fertilisa2-3.117 2.008-1.552 0.12622 RotationB 1.960 2.459 0.797 0.42877 RotationC 10.235 2.459 4.162 0.00011 *** Residual standard error: 7.776 on 56 degrees of freedom Multiple R-squared: 0.2814, Adjusted R-squared: 0.2429 F-statistic: 7.311 on 3 and 56 DF, p-value: 0.0003203 > #test global du modele > anova(mod0,moda2si) Model 1: PdsGrains ~ 1 Model 2: PdsGrains ~ Fertilisa + Rotation Res.Df RSS Df Sum of Sq F Pr(>F) 1 59 4712.3 2 56 3386.1 3 1326.2 7.3109 0.0003203 *** > #type I > anova(moda2si) Response: PdsGrains Df Sum Sq Mean Sq F value Pr(>F) Fertilisa 1 145.7 145.70 2.4097 0.1262207 Rotation 2 1180.5 590.24 9.7615 0.0002307 *** Residuals 56 3386.1 60.47 > #type II > Anova(modA2sI) Anova Table (Type II tests) Response: PdsGrains Sum Sq Df F value Pr(>F) Fertilisa 145.7 1 2.4097 0.1262207 Rotation 1180.5 2 9.7615 0.0002307 *** Residuals 3386.1 56 SCM = 145.7 + 1180.5 = 1326.2 SCR = 3386.1 SCT = 4712.3 8
> #moyennes ajustees > lsmeans(moda2si,pairwise~rotation,adjust="bonferroni") $`Rotation lsmeans` Rotation lsmean SE df lower.cl upper.cl A 19.960 1.738766 56 16.47683 23.44317 B 21.920 1.738766 56 18.43683 25.40317 C 30.195 1.738766 56 26.71183 33.67817 $`Rotation pairwise differences` estimate SE df t.ratio p.value A - B -1.960 2.458987 56-0.79708 1.00000 A - C -10.235 2.458987 56-4.16228 0.00033 B - C -8.275 2.458987 56-3.36521 0.00416 p values are adjusted using the bonferroni method for 3 tests > par(mfrow=c(2,2)) > plot(moda2si) Residuals 10 0 10 20 Residuals vs Fitted 45 14 Standardized residuals 2 1 0 1 2 Normal Q Q 14 54 18 20 22 24 26 28 30 32 2 1 0 1 2 Fitted values Theoretical Quantiles Standardized residuals 0.0 0.5 1.0 1.5 Scale Location 45 14 18 20 22 24 26 28 30 32 Standardized residuals 2 1 0 1 2 Constant Leverage: Residuals vs Factor Levels 45 14 Fertilisa : 2 1 Fitted values Factor Level Combinations 5 Modèles d analyse de la variance à 1 facteur > with(colza,boxplot( PdsGrains ~ Rotation)) 9
10 20 30 40 A B C > modar<-lm(pdsgrains ~ Rotation, + data=colza) > summary(modar) Call: lm(formula = PdsGrains ~ Rotation, data = colza) Residuals: Min 1Q Median 3Q Max -16.560-5.314-1.040 4.850 18.240 Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 19.960 1.760 11.340 3.07e-16 *** RotationB 1.960 2.489 0.787 0.434310 RotationC 10.235 2.489 4.112 0.000127 *** Residual standard error: 7.872 on 57 degrees of freedom Multiple R-squared: 0.2505, Adjusted R-squared: 0.2242 F-statistic: 9.526 on 2 and 57 DF, p-value: 0.0002697 > anova(mod0,modar) Model 1: PdsGrains ~ 1 10
Model 2: PdsGrains ~ Rotation Res.Df RSS Df Sum of Sq F Pr(>F) 1 59 4712.3 2 57 3531.8 2 1180.5 9.5259 0.0002697 *** > anova(modar) Response: PdsGrains Df Sum Sq Mean Sq F value Pr(>F) Rotation 2 1180.5 590.24 9.5259 0.0002697 *** Residuals 57 3531.8 61.96 > with(colza,boxplot(pdsgrains ~ Fertilisa)) 10 20 30 40 1 2 > modaf<-lm(pdsgrains ~ Fertilisa, + data=colza) > summary(modaf) Call: lm(formula = PdsGrains ~ Fertilisa, data = colza) Residuals: Min 1Q Median 3Q Max 11
-19.0667-5.7708-0.9333 5.6292 21.6333 Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 25.583 1.620 15.79 <2e-16 *** Fertilisa2-3.117 2.291-1.36 0.179 Residual standard error: 8.873 on 58 degrees of freedom Multiple R-squared: 0.03092, Adjusted R-squared: 0.01421 F-statistic: 1.851 on 1 and 58 DF, p-value: 0.179 > anova(mod0,modaf) Model 1: PdsGrains ~ 1 Model 2: PdsGrains ~ Fertilisa Res.Df RSS Df Sum of Sq F Pr(>F) 1 59 4712.3 2 58 4566.6 1 145.7 1.8506 0.179 > anova(modaf) Response: PdsGrains Df Sum Sq Mean Sq F value Pr(>F) Fertilisa 1 145.7 145.704 1.8506 0.179 Residuals 58 4566.6 78.734 12