The F-test is used in hypothesis testing of variances rather than means directly; ANOVA, for example, uses a ratio of variances to test for a difference in means.

The F-test, like Bartlett's Test, is a parametric test that assumes a normal distribution. The samples within each set of experiments (or trials) should be normally distributed. The F-test is very sensitive to non-normality; therefore, the data from each measurement set must actually be normal, not merely consist of more than 30 samples.

Levene's Test and the Brown-Forsythe Test are non-parametric tests for comparing variances when the data are not from (or cannot be assumed to be from) a normal distribution. In other words, they are used when the data do not meet the assumptions for the F (or t) test. These tests can be used to compare variances for any continuous distribution.

The value of F represents the ratio of two variances, and comparing the F-test value to the F-critical value is used to make a decision on the null hypothesis.

It is used to compare:

  • 1 Sample Variance to a Target
  • 2 Sample Variances
  • >2 Variances using ANOVA

In ANOVA, the value of F is the ratio of treatment variance to the error variance.
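As a rough illustration of that ratio, the sketch below computes the treatment (between-group) and error (within-group) mean squares and divides one by the other. The function name and the data are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_treatment, df_error) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Treatment (between-group) sum of squares
    ss_treatment = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within-group) sum of squares
    ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_treatment = len(groups) - 1
    df_error = len(all_values) - len(groups)
    ms_treatment = ss_treatment / df_treatment  # treatment variance
    ms_error = ss_error / df_error              # error variance
    return ms_treatment / ms_error, df_treatment, df_error

# Hypothetical data: three groups of three observations (illustrative only)
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value, df_treatment, df_error = one_way_anova_f(groups)
print(f"F = {f_value:.2f} with dF = ({df_treatment}, {df_error})")
```

A large F means the variation between the groups is large relative to the variation within them; a small F means the opposite.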

2 Sample Variance Comparison

Remember that it is not acceptable to make a decision about a statistical difference simply by looking at the data in numerical format (whether testing for a difference in means, medians, or variances). Nor should a statistical conclusion be based on a graph or visual model of the data, such as a box plot.

These tools can provide a very good idea of the final result, but a Six Sigma project manager must base conclusions on statistical results and give the team members the practical results in terms they can best understand.

Create a visual representation of the test, start with the practical study, and work your way through the statistical study.

The two sets of data must be statistically independent. The F-test is used to test whether the variances of both populations are equal. The null hypothesis (H0) is that the two population variances are equal. The alternative hypothesis (HA) shown here is two-tailed (the variances are not equal), but HA could be one-tailed (such as the variance of one population being > or < the variance of the other population).
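The F value for two samples can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name and the BEFORE/AFTER numbers are hypothetical, and placing the larger sample variance in the numerator is a common convention assumed here rather than something stated in this article.

```python
from statistics import variance

def f_ratio(sample_a, sample_b):
    """Return (F, dF numerator, dF denominator), larger sample variance on top."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical measurements (illustrative only, not the data behind the charts)
before = [10.2, 9.8, 10.5, 10.1, 9.6, 10.4]
after = [10.0, 10.1, 9.9, 10.2, 10.0, 9.8]

f_value, df_num, df_den = f_ratio(before, after)
print(f"F-observed = {f_value:.2f} with dF = ({df_num}, {df_den})")
```

The resulting F-observed value is then compared with the F-critical value for the same degrees of freedom, or converted to a p-value.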

There are a couple of methods to get a statistical conclusion:

1) Compare the F-observed value from the two samples to the F-critical value.


2)  Use the p-value. Reject the null and infer the alternative if the p-value < alpha risk. 

For the first option, the test is significant if the F-observed (calculated) value is greater than the F-critical value. The F-critical values can also be found in tables that list the most common values of alpha risk and degrees of freedom (dF).
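The two decision rules can be written as a pair of small functions. This is only a sketch: the function names are hypothetical, and the inputs are values quoted elsewhere in this article (an F-observed of 0.27 against an F-critical of 2.81, and a p-value of 0.202 at an alpha risk of 0.05).

```python
def decide_by_critical_value(f_observed, f_critical):
    """Method 1: significant when F-observed exceeds F-critical."""
    return "reject H0" if f_observed > f_critical else "fail to reject H0"

def decide_by_p_value(p_value, alpha=0.05):
    """Method 2: significant when the p-value is below the alpha risk."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide_by_critical_value(f_observed=0.27, f_critical=2.81))
print(decide_by_p_value(p_value=0.202, alpha=0.05))
```

For the same data, alpha risk, and type of test (one-tailed or two-tailed), both methods lead to the same decision.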

There is an example below of how to use the F-table.


The "F-observed" value is also referred to as the "F-calculated" value. 

Shown below is a Moving Range chart of a normally distributed data set, BEFORE and AFTER. The visual indicators suggest that a change in the variation is unlikely, but this must be statistically verified.

Example of visual indicators

Another visual way to compare variances is to look at the overlap in the charts below for the BEFORE and AFTER data. Usually, if the dots are within each other's confidence intervals (at the chosen alpha risk), there is likely not a statistical difference.

Test for Equal Variances, F-Test

When comparing two samples (such as above), the F-test is more dependent than Levene's Test on the assumption of normality and is the more accurate test when the data are actually normal.

F-test results are used because both samples (subgroups) are normal.

From the results above, the p-value of 0.202 is not less than 0.05 (the alpha risk at a 95% confidence level), so we fail to reject the null hypothesis: there is not sufficient evidence of a difference in the variation between BEFORE and AFTER.

We cannot conclude with 95% confidence that the variance changed between BEFORE and AFTER. The same test can be used to compare the variance between two machines, two operators, two plants, etc. (assuming the data are normal).

Keep in mind this is only testing the variances. This does not indicate whether there was a statistical change in the mean from BEFORE and AFTER. Use the paired t-test to test a change in means of the group BEFORE and AFTER.

When comparing >2 samples, Bartlett's Test is more dependent on the assumption of normality and is a more accurate test when the data is actually normal.

Using the F-table

Assume the alpha risk chosen is 0.05. The dF for the numerator is 15 and the denominator is 10. Therefore, the F-critical value = 2.85.
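When a printed table is not at hand, the same lookup can be approximated numerically. The following is a minimal sketch using only the Python standard library. The helper names (`f_cdf`, `f_critical`) are illustrative, and the incomplete beta function is evaluated with a textbook continued-fraction method that is an assumption of this sketch, not something this article prescribes. In practice a statistics package would normally be used instead.

```python
import math

def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even-numbered term of the continued fraction
        num = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd-numbered term of the continued fraction
        num = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        d = 1.0 + num * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + num / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def _reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_cdf(f, df_num, df_den):
    """P(F <= f) for an F distribution with (df_num, df_den) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    x = df_num * f / (df_num * f + df_den)
    return _reg_inc_beta(df_num / 2.0, df_den / 2.0, x)

def f_critical(alpha, df_num, df_den):
    """Upper-tail F-critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while 1.0 - f_cdf(hi, df_num, df_den) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - f_cdf(mid, df_num, df_den) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Alpha risk 0.05, numerator dF = 15, denominator dF = 10 (values from the text)
print(round(f_critical(0.05, 15, 10), 2))
```

A calculation like this also covers combinations of degrees of freedom that a printed table does not list.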


ANOVA example showing F values

The example below illustrates some uses and practical meaning of the F values within an ANOVA test. 

In this mock study, four appraisers were timed making an inspection decision on each of 13 widgets.

The goal is to determine whether there is a significant difference between the means of two or more appraisers.

All other criteria are equal.

Since Appraiser is the only factor (TIME is the measured response), this is a One-Factor or One-Way ANOVA. The factor has four levels in the experiment, one for each appraiser.

The first step is to create the test. In general, if the p-value is lower than the alpha-risk then the alternate hypothesis is inferred (reject the null).

Hypothesis Test:

Null Hypothesis: Population means of the different appraisers are equal.
Alternate Hypothesis: At least one of the means is not the same.

There are a total of 51 Degrees of Freedom computed from (13 * 4) - 1.
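The degrees of freedom for this study can be checked with a few lines of arithmetic. The function name below is hypothetical; the inputs (4 appraisers, 13 widgets each) come from the study description above, and the split into factor and error degrees of freedom follows the usual One-Way ANOVA breakdown.

```python
def anova_degrees_of_freedom(n_groups, n_per_group):
    """Total, factor (numerator), and error (denominator) degrees of freedom."""
    total_obs = n_groups * n_per_group
    return {
        "total": total_obs - 1,          # (13 * 4) - 1
        "factor": n_groups - 1,          # number of appraisers - 1
        "error": total_obs - n_groups,   # total observations - number of appraisers
    }

df = anova_degrees_of_freedom(n_groups=4, n_per_group=13)
print(df)
```

The factor and error values are the numerator and denominator degrees of freedom used to look up the F-critical value.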

Using a One-Way test with an alpha-risk of 0.05, the p-value is well above 0.05 at 0.847 (see results table below).

The low F-statistic and the heavily overlapping confidence intervals are also evidence that there is no difference among any pairs or combinations of the appraisers.

It is concluded that there is not a statistical difference between any of the appraisers.

What if?

If the p-value were < 0.05, then at least one group of data would be different from at least one other group. The test doesn't identify which one; it only states that at least one of the four is different from the others.

One-Way ANOVA in Minitab

The low F-statistic of 0.27 says the variation within the appraisers is greater than the variation between them. The F-critical value is 2.81. 

You can use an F-table to get a close estimate of the F-critical value. One drawback of tables is that they may not provide a precise number, since not every combination of degrees of freedom is shown. However, the table can provide a fairly good estimate, which is often enough to make a conclusive decision.

The numerator has 3 degrees of freedom and the denominator has 48 degrees of freedom. The table shows that the F-critical value is going to be between 2.76 and 2.84 (the tabulated values for 60 and 40 denominator degrees of freedom). In this case, both values are much higher than the F-calculated value of 0.27, so the conclusion is the same.

As a Six Sigma project manager, it may be worth re-running the trial (depending on cost and time) with a larger sample size and additional appraiser training to reduce the variation within each appraiser.

The variation is fairly consistent among the appraisers, so it appears there is a systemic issue causing similar amounts of variation within each appraiser.

It is possible that one or a few of the widgets are creating the similar spread in the timing for each appraiser. You may examine the timing performance of each widget and run an ANOVA among the 13 widgets and see if one or more stands out. 

Epsilon-squared is the % of variation related to the Factor, which is the Appraiser. This is 4.84 / 291.69 = 0.01659 = 1.7%. This is a low value, so it is possible that other Factors exist that are creating the variation. 
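The figures above can be tied together with a short calculation. This sketch assumes the two sums of squares quoted in the text (4.84 for the Appraiser factor and 291.69 total) and the degrees of freedom given earlier (3 and 48); the variable names are illustrative.

```python
ss_factor = 4.84    # sum of squares for the Appraiser factor (from the text)
ss_total = 291.69   # total sum of squares (from the text)
df_factor = 3       # numerator degrees of freedom
df_error = 48       # denominator degrees of freedom

# Whatever variation the factor does not explain is attributed to error
ss_error = ss_total - ss_factor

ms_factor = ss_factor / df_factor   # treatment variance
ms_error = ss_error / df_error      # error variance
f_statistic = ms_factor / ms_error

# Share of the total variation related to the factor
epsilon_squared = ss_factor / ss_total

print(f"F = {f_statistic:.2f}")
print(f"Epsilon-squared = {epsilon_squared:.1%}")
```

Both results agree with the values reported in the text: an F-statistic of 0.27 and roughly 1.7% of the variation related to the Appraiser.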
