- Statistical tests for comparing variances
- Statistical hypotheses
- Import and check your data into R
- Compute Bartlett’s test in R
- Compute Levene’s test in R
- Compute Fligner-Killeen test in R
- Infos

This article describes **statistical tests** for comparing the **variances** of two or more samples. Equality of variances across samples is called **homogeneity of variances**.

Some statistical tests, such as the two independent samples t-test and ANOVA, assume that variances are equal across groups. **Bartlett’s test**, **Levene’s test** or the **Fligner-Killeen test** can be used to verify that assumption.

There are several ways to test for the equality (**homogeneity**) of variance across groups, including:

- **F-test**: Compares the variances of two samples. The data must be normally distributed.
- **Bartlett’s test**: Compares the variances of k samples, where k can be more than two. The data must be normally distributed.
- **Levene’s test**: Compares the variances of k samples, where k can be more than two. It’s an alternative to Bartlett’s test that is less sensitive to departures from normality.
- **Fligner-Killeen test**: A non-parametric test that is very robust against departures from normality.

The **F-test** has been described in our previous article: F-test to compare equality of two variances. In the present article, we’ll describe the tests for comparing more than two variances.
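As a brief reminder of the two-sample case, here is a minimal sketch using the base R function var.test(). The choice of the ToothGrowth data and of supp as the grouping variable is an assumption made for illustration; it is not taken from the previous article.

```r
# Two-sample F-test: compare the variance of tooth length between the
# two supplement types (OJ and VC) in the built-in ToothGrowth data.
data(ToothGrowth)
f_res <- var.test(len ~ supp, data = ToothGrowth)

f_res$estimate   # ratio of the two sample variances (OJ over VC)
f_res$p.value    # p-value for H0: the ratio of variances equals 1
```

With more than two groups this function no longer applies, which is why the k-sample tests described below are needed.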

For all these tests (**Bartlett’s test**, **Levene’s test** or **Fligner-Killeen’s test**),

- the null hypothesis is that all population variances are equal;
- the alternative hypothesis is that at least two of them differ.
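Written in symbols, with $\sigma_i^2$ denoting the variance of population $i$ and $k$ the number of groups (notation introduced here for illustration only), the two hypotheses read:

$$
H_0:\ \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2
\qquad \text{versus} \qquad
H_a:\ \sigma_i^2 \neq \sigma_j^2 \ \text{for at least one pair } (i, j)
$$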

To import your data, use the following R code:

```r
# If .txt tab file, use this
my_data <- read.delim(file.choose())
# Or, if .csv file, use this
my_data <- read.csv(file.choose())
```
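file.choose() opens an interactive dialog, which is not available in a non-interactive script. A possible alternative, sketched below, is to pass the file path directly. The temporary file stands in for your own data file, and the name csv_path is hypothetical.

```r
# Write a built-in data set to a temporary .csv file, then read it back
# by passing the path directly instead of calling file.choose().
csv_path <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(PlantGrowth, csv_path, row.names = FALSE)

# stringsAsFactors = TRUE turns text columns such as "group" into factors
my_data <- read.csv(csv_path, stringsAsFactors = TRUE)
str(my_data)
```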

Here, we’ll use ToothGrowth and PlantGrowth data sets:

```r
# Load the data
data(ToothGrowth)
data(PlantGrowth)
```

To get an idea of what the data look like, we start by displaying a random sample of 10 rows using the function **sample_n**() [in the **dplyr** package]. First, install the dplyr package if you don’t have it: **install.packages("dplyr")**.

Show 10 random rows:

```r
set.seed(123)
# Show PlantGrowth
dplyr::sample_n(PlantGrowth, 10)
```

```
   weight group
24   5.50  trt2
12   4.17  trt1
25   5.37  trt2
26   5.29  trt2
2    5.58  ctrl
14   3.59  trt1
22   5.12  trt2
13   4.41  trt1
11   4.81  trt1
21   6.31  trt2
```

```r
# PlantGrowth data structure
str(PlantGrowth)
```

```
'data.frame':   30 obs. of  2 variables:
 $ weight: num  4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
 $ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
```
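If dplyr is not installed, a similar random preview can be obtained with base R alone. This is a minimal sketch; the object names rows and plant_sample are illustrative, and the rows drawn need not match those shown above.

```r
# Base R alternative to dplyr::sample_n(): draw 10 row indices at random
# without replacement, then subset the data frame with them.
set.seed(123)
rows <- sample(nrow(PlantGrowth), 10)
plant_sample <- PlantGrowth[rows, ]
plant_sample
```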

```r
# Show ToothGrowth
dplyr::sample_n(ToothGrowth, 10)
```

```
    len supp dose
28 21.5   VC  2.0
40  9.7   OJ  0.5
34  9.7   OJ  0.5
6  10.0   VC  0.5
51 25.5   OJ  2.0
14 17.3   VC  1.0
3   7.3   VC  0.5
18 14.5   VC  1.0
50 27.3   OJ  1.0
46 25.2   OJ  1.0
```

```r
# ToothGrowth data structure
str(ToothGrowth)
```

```
'data.frame':   60 obs. of  3 variables:
 $ len : num  4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
 $ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
 $ dose: num  0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
```

Note that R treats the column “dose” [in the ToothGrowth data set] as a numeric vector. We want to convert it to a grouping variable (factor).

`ToothGrowth$dose <- as.factor(ToothGrowth$dose)`
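A quick check that the conversion worked can look like the sketch below. The object name cell_counts is illustrative.

```r
data(ToothGrowth)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)

is.factor(ToothGrowth$dose)   # TRUE once the conversion has been applied
levels(ToothGrowth$dose)      # the three dose levels: "0.5" "1" "2"

# Number of observations in each supp-by-dose cell
cell_counts <- table(ToothGrowth$supp, ToothGrowth$dose)
cell_counts
```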

We want to test the equality of variances between groups.
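Before running any formal test, it can help to look at the sample variances themselves. This is a minimal sketch in base R; the name group_var is illustrative.

```r
data(PlantGrowth)

# Sample variance of plant weight within each group
group_var <- tapply(PlantGrowth$weight, PlantGrowth$group, var)
group_var

# Ratio of the largest to the smallest group variance
max(group_var) / min(group_var)
```

The tests below assess whether differences of this size are larger than would be expected by chance.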

**Bartlett’s test** is used for testing homogeneity of variances in k samples, where k can be more than two. It is suited to normally distributed data. **Levene’s test**, described in the next section, is a more robust alternative to Bartlett’s test when the data are not normally distributed.
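For intuition, Bartlett’s statistic compares the logarithm of the pooled variance with the logarithms of the individual group variances. The sketch below computes it by hand for the PlantGrowth data, assuming the standard textbook formula with its usual correction factor. All variable names are illustrative, and in practice you would use the built-in function described next.

```r
data(PlantGrowth)
y <- PlantGrowth$weight
g <- PlantGrowth$group

n_i <- tapply(y, g, length)   # group sizes
v_i <- tapply(y, g, var)      # group sample variances
k   <- length(n_i)            # number of groups
N   <- sum(n_i)               # total sample size

# Pooled variance: group variances weighted by their degrees of freedom
v_pooled <- sum((n_i - 1) * v_i) / (N - k)

# Bartlett's K-squared, with the correction factor in the denominator
numerator   <- (N - k) * log(v_pooled) - sum((n_i - 1) * log(v_i))
denominator <- 1 + (sum(1 / (n_i - 1)) - 1 / (N - k)) / (3 * (k - 1))
k_squared   <- numerator / denominator

# Upper-tail p-value from a chi-squared distribution with k - 1 df
p_value <- pchisq(k_squared, df = k - 1, lower.tail = FALSE)
c(K_squared = k_squared, df = k - 1, p_value = p_value)
```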

The R function **bartlett.test**() can be used to compute Bartlett’s test. The simplified format is as follows:

`bartlett.test(formula, data)`

- **formula**: a formula of the form values ~ groups
- **data**: a matrix or data frame

The function returns a list containing the following components:

- **statistic**: Bartlett’s K-squared test statistic
- **parameter**: the degrees of freedom of the approximate chi-squared distribution of the test statistic
- **p.value**: the p-value of the test

To perform the test, we’ll use the *PlantGrowth* data set, which contains the weight of plants obtained under a control and two treatment conditions (three groups in total).

**Bartlett’s test with one independent variable**:

```r
res <- bartlett.test(weight ~ group, data = PlantGrowth)
res
```

```
Bartlett test of homogeneity of variances

data:  weight by group
Bartlett's K-squared = 2.8786, df = 2, p-value = 0.2371
```

From the output, the p-value (0.2371) is greater than the significance level of 0.05. There is therefore no evidence that the variance in plant weight differs significantly among the three groups.
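The same decision can be made programmatically by pulling the components out of the returned object. This is a minimal sketch; the name alpha for the significance level is an assumption introduced here.

```r
res <- bartlett.test(weight ~ group, data = PlantGrowth)
alpha <- 0.05     # significance level used in the text

res$statistic     # Bartlett's K-squared
res$parameter     # degrees of freedom
res$p.value       # p-value of the test

# TRUE means the null hypothesis of equal variances is not rejected
res$p.value >= alpha
```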

**Bartlett’s test with multiple independent variables**: the **interaction**() function must be used to collapse multiple factors into a single variable containing all combinations of the factors.

`bartlett.test(len ~ interaction(supp,dose), data=ToothGrowth)`

```
Bartlett test of homogeneity of variances

data:  len by interaction(supp, dose)
Bartlett's K-squared = 6.9273, df = 5, p-value = 0.2261
```
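To see what interaction() produces, the combined grouping variable can be inspected on its own. This is an illustrative sketch; the name combo is hypothetical.

```r
data(ToothGrowth)

# Combine the two grouping variables into a single factor
combo <- interaction(ToothGrowth$supp, ToothGrowth$dose)

levels(combo)    # one level per supp-by-dose combination
nlevels(combo)   # 2 supplements x 3 doses = 6 groups
table(combo)     # number of observations in each combined group
```

With six combined groups, the test has 6 - 1 = 5 degrees of freedom, as reported in the output.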

As mentioned above, Levene’s test is an alternative to Bartlett’s test when the data is not normally distributed.

The function **leveneTest**() [in **car** package] can be used.

```r
library(car)
# Levene's test with one independent variable
leveneTest(weight ~ group, data = PlantGrowth)
```

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  2  1.1192 0.3412
      27               
```

```r
# Levene's test with multiple independent variables
leveneTest(len ~ supp*dose, data = ToothGrowth)
```

```
Levene's Test for Homogeneity of Variance (center = median)
      Df F value Pr(>F)
group  5  1.7086 0.1484
      54               
```
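With center = median, Levene’s test amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The sketch below reproduces the one-factor PlantGrowth result with base R only. The names group_median, dev_data and lev are illustrative; this is meant as an explanation of the idea, not a replacement for leveneTest().

```r
data(PlantGrowth)

# Median of each group, repeated for every observation in that group
group_median <- ave(PlantGrowth$weight, PlantGrowth$group, FUN = median)

# Absolute deviation of each observation from its own group median
dev_data <- data.frame(
  abs_dev = abs(PlantGrowth$weight - group_median),
  group   = PlantGrowth$group
)

# One-way ANOVA on the absolute deviations
lev <- anova(lm(abs_dev ~ group, data = dev_data))
lev
```

The F value and p-value in the first row of this table should agree with the leveneTest() output for PlantGrowth shown above.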

The **Fligner-Killeen test** is one of the tests for homogeneity of variances that is most robust against departures from normality.

The R function **fligner.test**() can be used to compute the test:

`fligner.test(weight ~ group, data = PlantGrowth)`

```
Fligner-Killeen test of homogeneity of variances

data:  weight by group
Fligner-Killeen:med chi-squared = 2.3499, df = 2, p-value = 0.3088
```
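As with Bartlett’s test, several grouping variables can be combined with interaction(). The sketch below applies this to the ToothGrowth data; the example is an illustration added here (no output is quoted for it), and the name fk is hypothetical.

```r
data(ToothGrowth)

# Fligner-Killeen test across all supp-by-dose combinations
fk <- fligner.test(len ~ interaction(supp, dose), data = ToothGrowth)

fk$statistic   # Fligner-Killeen median chi-squared statistic
fk$parameter   # degrees of freedom (number of groups minus one)
fk$p.value     # p-value of the test
```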

This analysis has been performed using **R software** (ver. 3.2.4).