Inferential Statistics Part III – ANOVA

07-Apr-18 . 9 mins read
R statistics



In the previous post, we looked at t-tests for comparing the means of one or two samples. T-tests can still be used for more than two samples, but there are two issues with doing so:

The ANOVA (analysis of variance) methods were developed by Ronald Fisher as an extension of the t and z tests. They compare the between-group variability with the within-group variability, and can be used to compare two or more groups and find out whether their means differ.

There are two types of ANOVA tests: one-way and two-way.

One-way ANOVA

  • ANOVA differs from the earlier tests in that we first compute several intermediate measures and then use them to calculate the F statistic.
  • The measures needed to fill in the ANOVA table are given below
  • Total Sum of Squares \(SS_{T}\) is the sum of the squared differences between each value y and the grand mean \(\bar{y}\) over all N observations \(SS_{T} = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{N}\)
  • Sum of Squares Between \(SS_{B}\), summed over the k groups, with \(n_{k}\) observations in group k and \(\bar y_{k}\) being the mean of group k, is given by \(SS_{B} = \sum n_{k} (\bar y_{k} - \bar {y})^2\)
  • Sum of Squares Within groups \(SS_{W} = SS_{T} - SS_{B}\)
  • The degrees of freedom are given by
    • \(df_{Total} = N - 1\)
    • \(df_{Between} = k - 1\)
    • \(df_{Within} = N - k\)
  • We then calculate the mean squares by dividing each sum of squares by its associated degrees of freedom.
    • \(MS_{B} = \frac {SS_{B}}{df_{Between}}\) measures between-group variability
    • \(MS_{W} = \frac {SS_{W}}{df_{Within}}\) measures variability within each of the groups
  • And finally the F statistic is the ratio \(\frac {MS_{B}}{MS_{W}}\)
  • When the null hypothesis is true, any differences among the sample means are due only to chance, and \(MS_{B}\) and \(MS_{W}\) should be approximately equal
  • F grows as \(MS_{B}\) becomes larger relative to \(MS_{W}\), and a large F is strong evidence against the null hypothesis. If there is no difference between the groups, F will be close to 1 (fail to reject the null hypothesis)
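The quantities above can be computed by hand and checked against aov(). The following is a minimal sketch, assuming the built-in ToothGrowth data with dose treated as the grouping factor; the variable names are illustrative only.

```r
# Manual one-way ANOVA for tooth length grouped by dose (built-in ToothGrowth data)
y <- ToothGrowth$len
g <- factor(ToothGrowth$dose)

N <- length(y)      # total number of observations
k <- nlevels(g)     # number of groups

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

ss_total   <- sum((y - grand_mean)^2)                           # SS_T
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)   # SS_B
ss_within  <- ss_total - ss_between                             # SS_W

ms_between <- ss_between / (k - 1)    # MS_B
ms_within  <- ss_within / (N - k)     # MS_W
f_stat     <- ms_between / ms_within  # F statistic
p_value    <- pf(f_stat, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

# Cross-check against the table produced by aov()
aov_table <- summary(aov(y ~ g))[[1]]
print(c(manual_F = f_stat, aov_F = aov_table[["F value"]][1]))
```

The hand-computed F should agree with the F value of 67.42 that aov() reports for dose in the example further down.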

Two-way ANOVA

  • In one-way ANOVA, we had one dependent variable and one independent variable.
  • In two-way ANOVA we have two independent variables, so we need F ratios that measure not only the effect of each independent variable on the dependent variable, but also the interaction between the two independent variables.
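One way to see what the interaction ratio measures is to compare a model with only the two main effects against a model that also includes their interaction. This is a sketch under the assumption that the built-in ToothGrowth data is used with dose converted to a factor; the object names are illustrative.

```r
# Sketch: test the interaction term by comparing two nested linear models
tg <- ToothGrowth
tg$dose <- factor(tg$dose)    # treat dose as a categorical factor

fit_additive    <- lm(len ~ dose + supp, data = tg)   # main effects only
fit_interaction <- lm(len ~ dose * supp, data = tg)   # main effects plus dose:supp

# F test for the extra variation explained by adding the interaction
interaction_test <- anova(fit_additive, fit_interaction)
print(interaction_test)

interaction_f <- interaction_test[["F"]][2]
interaction_p <- interaction_test[["Pr(>F)"]][2]
```

The F and p-value in the second row should match the dose:supp row of the two-way ANOVA table shown in the example below (F of 4.107).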

Post-hoc analysis

  • The ANOVA test tells us which variables have significantly different group means, but to find out which pairs of groups differ we need to run a post-hoc analysis
  • Two common post-hoc methods are Tukey's Honest Significant Differences (HSD) and the Bonferroni correction
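Both post-hoc methods are available in base R. The sketch below assumes the built-in ToothGrowth data with dose as the grouping factor; it uses pairwise.t.test() for the Bonferroni-adjusted comparisons, which is one common way to apply the correction rather than the only one.

```r
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Pairwise t-tests between dose levels (pooled SD), Bonferroni-adjusted p-values
bonf <- pairwise.t.test(tg$len, tg$dose, p.adjust.method = "bonferroni")
print(bonf)

# Tukey HSD on the same one-way model, for comparison
tukey <- TukeyHSD(aov(len ~ dose, data = tg))
print(tukey)
```

Bonferroni multiplies each raw p-value by the number of comparisons, so it tends to be the more conservative of the two when many pairs are tested.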

Example

  • We will use the ToothGrowth dataset in R.
    • The dataset has observations of tooth growth in 60 guinea pigs which were administered three doses of vitamin C (0.5, 1 and 2 mg/day).
    • It also has two supplement types through which these doses were administered: OJ (orange juice) and VC (ascorbic acid, which is a form of vitamin C)
  • A quick look at the boxplot shows a couple of outliers, which we will ignore. It also suggests that dose 2 resulted in longer tooth growth, as did dose 1 when given in orange juice
str(ToothGrowth)
## 'data.frame':    60 obs. of  3 variables:
##  $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
##  $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
##  $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

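library(ggplot2)

# varPlotCaption, theme_darklightmix() and ggplot_color_theme are not part of ggplot2
# and are not defined in this post (they appear to be the author's own helpers);
# remove the caption argument and the theme_darklightmix() call to run this standalone
ggplot(data = ToothGrowth) +
  geom_boxplot(aes(x = dose, y = len, fill = supp)) +
  labs(subtitle = "Box plot of tooth growth for various doses",
       caption = varPlotCaption) +
  theme_darklightmix(color_theme = ggplot_color_theme) +
  scale_fill_brewer(palette = "Set3")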

We will run one-way ANOVA on dosage and supplement separately and then run the two-way ANOVA with both of them together with their interaction

One-way ANOVA on dose

  • Null hypothesis :
    • Mean tooth growth for the 3 doses is same
  • Interpretation of the results
    • p-value < 0.001 :
      • Reject the null hypothesis, i.e.
      • Mean tooth growth is not the same, i.e. different doses have different effects on tooth growth
    • F statistic of 67.42 :
      • Much higher than the critical value of about 3.16 (2 and 57 degrees of freedom) at a significance level of 0.05
      • i.e. between-group variability (MSB) is much higher than within-group variability (MSW)
    • TukeyHSD :
      • Mean tooth growth is higher for larger doses: all three pairwise differences are positive and significant
summary(aov(len~dose,data=ToothGrowth))
##             Df Sum Sq Mean Sq F value               Pr(>F)    
## dose         2   2426    1213   67.42 0.000000000000000953 ***
## Residuals   57   1026      18                                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
paste("Critical value of F Statistic:",round(qf(df1=2,df2 = 57, p= 0.05,lower.tail=FALSE),2))
## [1] "Critical value of F Statistic: 3.16"
TukeyHSD(aov(len~factor(dose),data=ToothGrowth))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = len ~ factor(dose), data = ToothGrowth)
## 
## $`factor(dose)`
##         diff       lwr       upr     p adj
## 1-0.5  9.130  5.901805 12.358195 0.0000000
## 2-0.5 15.495 12.266805 18.723195 0.0000000
## 2-1    6.365  3.136805  9.593195 0.0000425

One-way ANOVA on supplement type

  • Null hypothesis :
    • Mean tooth growth for the 2 supplement types is same
  • Interpretation of results
    • p-value > 0.05 :
      • Fail to reject the null hypothesis, i.e.
      • There is not enough evidence at the 0.05 level that mean tooth growth differs between the supplement types
    • F statistic of 3.668 :
      • Lower than the critical value of about 4.007 at a significance level of 0.05
      • i.e. between-group variability (MSB) is about 3.7 times the within-group variability (MSW), which is not enough to reach significance
    • TukeyHSD :
      • No TukeyHSD, as the null hypothesis is not rejected and hence there is no need to quantify the effects
summary(aov(len~supp,data=ToothGrowth))
##             Df Sum Sq Mean Sq F value Pr(>F)  
## supp         1    205  205.35   3.668 0.0604 .
## Residuals   58   3247   55.98                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
paste("Critical value of F Statistic:",qf(df1=1,df2 = 58, p= 0.05,lower.tail=FALSE))
## [1] "Critical value of F Statistic: 4.00687288633273"
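With only two groups, one-way ANOVA is equivalent to the pooled-variance two-sample t-test from the previous post: the squared t statistic equals F and the p-values coincide. A short sketch of that check, assuming the built-in ToothGrowth data (object names are illustrative):

```r
# Pooled-variance two-sample t-test on supplement type
t_res <- t.test(len ~ supp, data = ToothGrowth, var.equal = TRUE)

# One-way ANOVA on the same grouping
aov_tab <- summary(aov(len ~ supp, data = ToothGrowth))[[1]]

t_squared <- unname(t_res$statistic)^2   # squared t statistic
f_value   <- aov_tab[["F value"]][1]     # F statistic from the ANOVA table

print(c(t_squared = t_squared, F = f_value))
print(c(t_test_p = t_res$p.value, anova_p = aov_tab[["Pr(>F)"]][1]))
```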

Two-way ANOVA on both

  • Null hypotheses :
    • Mean tooth growth for the 3 doses is the same
    • Mean tooth growth for the 2 supplement types is the same
    • The interaction between dose and supplement type has no effect on tooth growth
  • Interpretation of results
    • p-values :
      • Dose has p-value < 0.05
        • Reject the null hypothesis that the 3 doses have the same effect on tooth growth, i.e.
        • The 3 doses are significantly different, as seen in the one-way ANOVA
      • Supplement type now has p-value < 0.05
        • Reject the null hypothesis that the 2 supplement types have the same effect on tooth growth, i.e.
        • The supplement type now has a significant effect on tooth growth, after controlling for the level of dose and the dose * supp interaction
      • dose:supp has p-value < 0.05
        • Reject the null hypothesis that there is no interaction
        • The p-value is 0.0219, so at a stricter significance level such as 0.02 we would have failed to reject the null hypothesis, i.e. found no evidence that the interaction affects tooth growth
    • TukeyHSD :
      • dose :
        • The interpretation for dose is the same as in the one-way ANOVA
      • supp :
        • The mean tooth growth with VC is lower than with OJ (diff of -3.7)
        • Tukey HSD is not strictly needed in this case, as there are only two factor levels
      • dose*supp :
        • We will look at a couple of cases
        • 0.5:VC-2:OJ : The mean tooth growth for 0.5:VC is much lower than for 2:OJ (diff of -18.08), and the difference is significant with p adj < 0.05
        • 2:VC-1:OJ : The difference in mean tooth growth is 3.44 with p adj > 0.05, so these two combinations do not differ significantly
summary(aov(len~dose*supp,data=ToothGrowth))
##             Df Sum Sq Mean Sq F value               Pr(>F)    
## dose         2 2426.4  1213.2  92.000 < 0.0000000000000002 ***
## supp         1  205.4   205.4  15.572             0.000231 ***
## dose:supp    2  108.3    54.2   4.107             0.021860 *  
## Residuals   54  712.1    13.2                                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#paste("Critical value of F Statisic:",qf(df1=1,df2 = 58, p= 0.05,lower.tail=F))

TukeyHSD(aov(len~factor(dose)*supp,data=ToothGrowth))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = len ~ factor(dose) * supp, data = ToothGrowth)
## 
## $`factor(dose)`
##         diff       lwr       upr     p adj
## 1-0.5  9.130  6.362488 11.897512 0.0000000
## 2-0.5 15.495 12.727488 18.262512 0.0000000
## 2-1    6.365  3.597488  9.132512 0.0000027
## 
## $supp
##       diff       lwr       upr     p adj
## VC-OJ -3.7 -5.579828 -1.820172 0.0002312
## 
## $`factor(dose):supp`
##                 diff        lwr         upr     p adj
## 1:OJ-0.5:OJ     9.47   4.671876  14.2681238 0.0000046
## 2:OJ-0.5:OJ    12.83   8.031876  17.6281238 0.0000000
## 0.5:VC-0.5:OJ  -5.25 -10.048124  -0.4518762 0.0242521
## 1:VC-0.5:OJ     3.54  -1.258124   8.3381238 0.2640208
## 2:VC-0.5:OJ    12.91   8.111876  17.7081238 0.0000000
## 2:OJ-1:OJ       3.36  -1.438124   8.1581238 0.3187361
## 0.5:VC-1:OJ   -14.72 -19.518124  -9.9218762 0.0000000
## 1:VC-1:OJ      -5.93 -10.728124  -1.1318762 0.0073930
## 2:VC-1:OJ       3.44  -1.358124   8.2381238 0.2936430
## 0.5:VC-2:OJ   -18.08 -22.878124 -13.2818762 0.0000000
## 1:VC-2:OJ      -9.29 -14.088124  -4.4918762 0.0000069
## 2:VC-2:OJ       0.08  -4.718124   4.8781238 1.0000000
## 1:VC-0.5:VC     8.79   3.991876  13.5881238 0.0000210
## 2:VC-0.5:VC    18.16  13.361876  22.9581238 0.0000000
## 2:VC-1:VC       9.37   4.571876  14.1681238 0.0000058
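The interaction is easier to read from the cell means than from the fifteen pairwise rows. The sketch below assumes the built-in ToothGrowth data; it tabulates the mean for each dose and supplement combination and draws a base R interaction plot, where non-parallel lines point to an interaction.

```r
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Mean tooth length in each dose x supplement cell
cell_means <- with(tg, tapply(len, list(dose = dose, supp = supp), mean))
print(round(cell_means, 2))

# VC minus OJ at each dose; these match the same-dose rows of the Tukey table
vc_minus_oj <- cell_means[, "VC"] - cell_means[, "OJ"]
print(round(vc_minus_oj, 2))

# One line per supplement type across doses
with(tg, interaction.plot(x.factor = dose, trace.factor = supp, response = len,
                          xlab = "Dose (mg/day)", ylab = "Mean tooth length",
                          trace.label = "Supplement"))
```

The gap between the supplement types is roughly constant at doses 0.5 and 1 and then almost vanishes at dose 2, which is the pattern the significant dose:supp term reflects.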

Inference

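  • dose has the most significant effect on tooth growth
  • supplement type on its own is not significant at the 0.05 level (p = 0.06), but it becomes significant once dose and the dose:supp interaction are included in the model
  • Thus changing either the supplement type or the dose of vitamin C will significantly affect tooth growth, although at dose 2 the two supplement types give nearly identical growth (diff of 0.08)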

ANOVA Assumptions

  • ANOVA has three assumptions
    • All observations are independent of one another and randomly selected from the population which they represent
    • The variance across groups must be approximately the same (homoscedasticity)
    • The distribution should be approximately normal at each factor level

The homogeneity of variances can be checked using the residuals vs fitted plot, which here shows the residuals scattered evenly around the zero line with no relationship between residuals and fitted values. It can also be checked using car::leveneTest()

plot(aov(len~dose*supp,data=ToothGrowth),1)
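A formal test can complement the plot. The sketch below uses Bartlett's test from base R across the six dose and supplement cells, which is a different test from the Levene test mentioned above and is more sensitive to non-normality; the car::leveneTest() call runs only if the car package happens to be installed.

```r
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Bartlett's test (base R) across the six dose x supplement cells
bartlett_res <- bartlett.test(len ~ interaction(dose, supp), data = tg)
print(bartlett_res)

# Levene's test from the car package, run only if car is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(len ~ dose * supp, data = tg))
}
```

In both tests the null hypothesis is that the group variances are equal, so a large p-value is consistent with the homoscedasticity assumption.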

The normality assumption can be checked using the Q-Q plot, which shows that the residuals lie along the diagonal line. It can also be verified using the Shapiro-Wilk test; here the p-value of 0.67 gives no evidence against normality. If the check had failed, we would transform the dependent variable.

plot(aov(len~dose*supp,data=ToothGrowth),2)

shapiro.test(x = residuals(aov(len~dose*supp,data=ToothGrowth)))
## 
##  Shapiro-Wilk normality test
## 
## data:  residuals(aov(len ~ dose * supp, data = ToothGrowth))
## W = 0.98499, p-value = 0.6694
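If the residuals had failed the normality check, the transformation step could look like the sketch below. The log transform is only one hypothetical choice (it is not needed for this dataset), and the object names are illustrative.

```r
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

fit_raw <- aov(len ~ dose * supp, data = tg)        # model used in this post
fit_log <- aov(log(len) ~ dose * supp, data = tg)   # hypothetical remedy: log-transformed response

# Re-run the Shapiro-Wilk test on the residuals of each fit
p_raw <- shapiro.test(residuals(fit_raw))$p.value
p_log <- shapiro.test(residuals(fit_log))$p.value
print(c(raw = p_raw, log = p_log))
```

After transforming, the diagnostics are re-checked on the new model, and the results are then interpreted on the transformed scale.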

Unbalanced ANOVA

  • The ToothGrowth dataset is balanced: the number of observations in each group is equal
  • If it were not balanced, we would use car::Anova(), because the sequential sums of squares reported by aov() then depend on the order of the terms
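library(knitr)       # kable()
library(kableExtra)  # kable_styling() and the %>% pipe

# Count of observations for each supplement type and dose
table(ToothGrowth$supp,ToothGrowth$dose) %>%
kable(format.args = list(decimal.mark = '.', big.mark = ",")) %>%
kable_styling(bootstrap_options = "condensed"
            ,full_width = FALSE
            ,position = "center"
            ,font_size = 10)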
      0.5    1    2
OJ     10   10   10
VC     10   10   10
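To see why an unbalanced design needs different handling, the sketch below deliberately drops five rows from one cell of ToothGrowth (an artificial change, for illustration only) and shows that the sequential sum of squares for supp then depends on where it appears in the formula. drop1() gives order-independent marginal tests in base R, and car::Anova() is called only if the car package is installed.

```r
tg <- ToothGrowth
tg$dose <- factor(tg$dose)

# Make the design unbalanced by dropping five rows from one cell (illustration only)
drop_rows <- which(tg$supp == "VC" & tg$dose == "0.5")[1:5]
tg_unbal  <- tg[-drop_rows, ]
print(table(tg_unbal$supp, tg_unbal$dose))

# Sequential sums of squares for supp depend on the order of the terms
ss_supp_first <- summary(aov(len ~ supp + dose, data = tg_unbal))[[1]][["Sum Sq"]][1]
ss_supp_last  <- summary(aov(len ~ dose + supp, data = tg_unbal))[[1]][["Sum Sq"]][2]
print(c(supp_entered_first = ss_supp_first, supp_entered_last = ss_supp_last))

# Marginal F tests for each term in the additive model (base R)
print(drop1(aov(len ~ dose + supp, data = tg_unbal), test = "F"))

# Type II tests from the car package, run only if car is available
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(aov(len ~ dose * supp, data = tg_unbal), type = "II"))
}
```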

Summary

  • The null hypothesis for ANOVA is that the mean of the dependent variable is the same for all groups
  • Further learning
