
The F-test is used in the hypothesis testing of variances, rather than means as in ANOVA.
The F-test assumes normally distributed data, as does Bartlett's Test. The samples should be normally distributed within each set of experiments (or trials).
Levene's Test is similar, but is used when the data is not from (or cannot be assumed to be from) a normal distribution. It can be used to compare variances for any continuous distribution.
The value of F represents the ratio of two variances. Comparing the F-test value to the F-critical value is how a decision is made on the null hypothesis.
It is used to compare:
In ANOVA, the value of F is the ratio of treatment variance to the error variance.
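The two-sample ratio described above can be sketched in a few lines of Python. This is a minimal illustration, not a full test: the data are invented, the function name is hypothetical, and placing the larger variance in the numerator is a common convention that is assumed here rather than taken from the text.

```python
from statistics import variance


def f_observed(sample_a, sample_b):
    """Return (F, dF numerator, dF denominator) for two independent samples.

    Assumption: the larger sample variance goes in the numerator so F >= 1.
    """
    var_a = variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical measurements, invented purely for illustration.
before = [10.0, 12.0, 14.0, 16.0, 18.0]
after = [11.0, 12.0, 13.0, 14.0, 15.0]

f_value, df_num, df_den = f_observed(before, after)
print(f_value, df_num, df_den)
```

The resulting F value and its two degrees of freedom are what get compared against an F-critical value.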
Remember that it is not acceptable to make a decision simply by looking at the data in numerical format to determine whether there is a statistical difference (whether testing for a difference in means, medians, or variances). Nor should a statistical decision be based on a graph or visual model of the data, such as a box plot.
These tools can give a very good idea of the final result, but a Six Sigma project manager must base conclusions on statistical results and give team members the practical results in terms they can best understand.
Create a visual representation of the test, start with the practical study, and work your way through the statistical study.
The two sets of data must be statistically independent.
There are a couple of methods to get a statistical conclusion:
1) Compare the F-observed value from the two samples to the F-critical value.
OR
2) Use the p-value. Reject the null and infer the alternative if the p-value < alpha risk.
For the first option:
In other words, the test is significant if the F-observed (calculated) value is greater than the F-critical value. The F-critical values can also be found in tables that list the most common values for alpha risk and degrees of freedom (dF).
There is an example below of how to use the F-table.
NOTE:
The "F-observed" value is also referred to as the "F-calculated" value.
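The two decision methods can be written as a pair of small checks. This is only a sketch: the function names are hypothetical and the numbers passed in are illustrative placeholders, not results from a real study.

```python
def reject_by_critical_value(f_observed, f_critical):
    """Method 1: reject the null if F-observed is greater than F-critical."""
    return f_observed > f_critical


def reject_by_p_value(p_value, alpha=0.05):
    """Method 2: reject the null if the p-value is less than the alpha risk."""
    return p_value < alpha


# Illustrative inputs only (hypothetical values).
print(reject_by_critical_value(f_observed=3.10, f_critical=2.85))
print(reject_by_critical_value(f_observed=1.40, f_critical=2.85))
print(reject_by_p_value(p_value=0.20, alpha=0.05))
```

For the same data, alpha risk, and degrees of freedom, both methods lead to the same decision, since the F-critical value is simply the F value whose p-value equals alpha.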
Shown below is a set of BEFORE and AFTER data on a Moving Range chart of a normally distributed data set. The visual indicators suggest that the variation has likely not changed, but this must be statistically verified.
Another visual check for comparing variances is the overlap in the charts below for the BEFORE and AFTER data. Usually, if each dot falls within the other's alpha-value confidence interval, there is likely not a statistical difference.
When comparing two samples (such as above) with Levene's Test and the F-test, the F-test is more dependent on the assumption of normality and is the more accurate test when the data is actually normal.
The F-test results are used here because both samples (subgroups) are normal.
Notice from above that the p-value of 0.202 is not less than the alpha risk of 0.05 (a 95% confidence level), so the test fails to reject the null hypothesis. There is not sufficient evidence of a difference in the variation BEFORE and AFTER.
We cannot conclude with 95% confidence that the variance changed from BEFORE to AFTER. The same test can be used to compare the variance between two machines, two operators, two plants, etc. (assuming the data is normal).
Keep in mind this only tests the variances. It does not indicate whether there was a statistical change in the mean from BEFORE to AFTER. Use the paired t-test to test for a change in the means of the group BEFORE and AFTER.
When comparing more than two samples, Bartlett's Test is more dependent on the assumption of normality and is the more accurate test when the data is actually normal.
Assume the alpha risk chosen is 0.05. The dF for the numerator is 15 and for the denominator is 10. Therefore, the F-critical value = 2.85.
The example below illustrates some uses and practical meaning of the F values within an ANOVA test.
The results are from a mock study in which four appraisers were timed making an inspection decision on 13 widgets.
Determine if there is a significant difference in means between two or more appraisers.
All other criteria are equal.
Since the Appraiser is the only factor (TIME is the response being measured), this is a One-Factor or One-Way ANOVA. There are four levels controlled in the experiment, one for each appraiser.
The first step is to create the test. In general, if the p-value is lower than the alpha risk, then the alternate hypothesis is inferred (reject the null).
Hypothesis Test:
Null Hypothesis: Population means of the different appraisers are equal.
Alternate Hypothesis: At least one of the means is not the same.
There are a total of 51 Degrees of Freedom, computed from (13*4) - 1.
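The degrees-of-freedom arithmetic can be checked with a short sketch. The split into between-group and within-group degrees of freedom is the standard one-way ANOVA partition; the function name is hypothetical and equal group sizes are assumed.

```python
def one_way_anova_df(n_groups, n_per_group):
    """Degrees of freedom for a balanced one-way ANOVA (equal group sizes)."""
    total_obs = n_groups * n_per_group
    return {
        "total": total_obs - 1,           # all observations minus one
        "between": n_groups - 1,          # numerator dF (the factor)
        "within": total_obs - n_groups,   # denominator dF (the error)
    }


# Four appraisers, each timed on 13 widgets.
df = one_way_anova_df(n_groups=4, n_per_group=13)
print(df)  # {'total': 51, 'between': 3, 'within': 48}
```

The between and within values always add up to the total.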
Using a One-Way test with an alpha risk of 0.05, the p-value is well above 0.05 at 0.847 (see results table below).
The low F-statistic and the heavily overlapping confidence intervals also indicate that there is no detectable difference among any pairs or combinations of the appraisers.
It is concluded that there is not a statistical difference between any of the appraisers.
What if?
If the p-value were <0.05, then at least one group of data would be different from at least one other group. The test doesn't identify which one; it only states that at least one of the four is different from the others.
The low F-statistic of 0.27 says the variation within the appraisers is greater than the variation between them. The F-critical value is 2.81.
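To make the "between versus within" comparison concrete, here is a minimal sketch of how a one-way ANOVA F-statistic is built from raw data. The three small groups are invented for illustration (they are not the appraiser timings), and the function name is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F-statistic: between-group mean square / within-group mean square."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]
    k = len(groups)       # number of levels
    n = len(all_values)   # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (gm - grand_mean) ** 2
        for group, gm in zip(groups, group_means)
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (x - gm) ** 2
        for group, gm in zip(groups, group_means)
        for x in group
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical data: three groups of three observations each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_anova_f(groups)
print(f_stat)
```

An F-statistic below 1 means the within-group mean square is larger than the between-group mean square.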
You can use the F-table above to get a close estimate of the F-critical value. One drawback of tables is that you may not get a precise number, since not every combination is shown. However, the table can provide a fairly good estimate, which is often enough to reach a clear decision.
The numerator has 3 degrees of freedom and the denominator has 48 degrees of freedom. The table shows that the F-critical value is going to be between 2.76 and 2.84. In this case, both values are much higher than the F-calculated value of 0.27, so the conclusion is the same.
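When a table does not list the exact degrees of freedom, the F-critical value and p-value can be approximated numerically instead. The sketch below is a swapped-in technique rather than the table lookup described in the text: it evaluates the F distribution through the regularized incomplete beta function (a standard continued-fraction method) and finds the critical value by bisection. All function names are hypothetical, and only the Python standard library is used.

```python
import math


def _beta_cf(a, b, x, max_iter=500, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), the CDF of a Beta(a, b) distribution at x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(x, df_num, df_den):
    """P(F <= x) for an F distribution with the given degrees of freedom."""
    if x <= 0.0:
        return 0.0
    t = df_num * x / (df_num * x + df_den)
    return regularized_incomplete_beta(df_num / 2.0, df_den / 2.0, t)


def f_p_value(f_observed, df_num, df_den):
    """Upper-tail p-value for an observed F."""
    return 1.0 - f_cdf(f_observed, df_num, df_den)


def f_critical(alpha, df_num, df_den):
    """F value with upper-tail area alpha, found by bisection."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while f_cdf(hi, df_num, df_den) < target:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, df_num, df_den) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(f_critical(0.05, 15, 10), 2))
print(round(f_critical(0.05, 3, 48), 2))
print(round(f_p_value(0.27, 3, 48), 3))
```

The first call corresponds to the table example with 15 and 10 degrees of freedom, and the other two to the appraiser study with 3 and 48 degrees of freedom. Small differences from printed tables or software output can come from rounding of the inputs.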
As a Six Sigma project manager, it may be worth rerunning the trial (depending on cost and time) with a larger sample size and additional appraiser training to reduce the variation within each appraiser.
The variation is fairly consistent among each of them so it appears there is a systemic issue that is causing nearly similar amounts of variation within each appraiser.
It is possible that one or a few of the widgets are creating the similar spread in the timing for each appraiser. You may examine the timing performance of each widget and run an ANOVA among the 13 widgets and see if one or more stands out.
Epsilon-squared is the % of variation related to the Factor, which is the Appraiser. This is 4.84 / 291.69 = 0.01659 = 1.7%. This is a low value, so it is possible that other Factors exist that are creating the variation.
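The arithmetic above can be reproduced directly. As a cross-check, the sketch also assumes that 4.84 and 291.69 are the factor and total sums of squares from the ANOVA table (the text does not say so explicitly) and uses 3 and 48 degrees of freedom to recover the F-statistic.

```python
ss_factor = 4.84    # variation related to the Appraiser factor (from the text)
ss_total = 291.69   # total variation (from the text)

# Share of the total variation related to the factor.
share_of_variation = ss_factor / ss_total

# Assumption: both figures are sums of squares, so the remainder is error.
df_between, df_within = 3, 48
ss_error = ss_total - ss_factor
f_statistic = (ss_factor / df_between) / (ss_error / df_within)

print(round(share_of_variation, 5))  # 0.01659
print(round(f_statistic, 2))         # 0.27
```

Under that assumption the recovered value matches the F-statistic of 0.27 reported for the study.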