ANOVA is used to determine whether the means of groups of continuous data differ. It answers the question: is the mean of at least one group different from the means of the other groups?
The test is used in the ANALYZE phase of a DMAIC project. A GB/BB should be very comfortable with the mechanics behind this test. It is likely to be one of the most common tests you will use as a Six Sigma project manager.
ANOVA is commonly used as a hypothesis test for means (not medians or modes) and is usually applied when testing more than two means (use a 1-sample t-test or 2-sample t-test for testing one or two means, respectively).
ANOVA uses two components of variance and the F test to test the two components:
BETWEEN sample variance is a study of the variation among all the samples usually due to process difference or factor changes.
WITHIN sample variance explains the variation within each sample itself (look at a Box Plot of one data set to graphically comprehend this - the tip of one whisker to another).
ANOVA answers the question of whether the means of several populations are statistically different or equal. It also produces other valuable information that can help steer a GB/BB in a clearer direction. A statistical difference is found when the difference BETWEEN samples is large enough relative to the difference WITHIN the samples.
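As a rough illustration of the BETWEEN and WITHIN components, the sketch below computes both sums of squares and the F ratio by hand. The function name `one_way_anova` and both data sets are hypothetical and are not taken from the study in this article.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ss_between, ss_within, f_stat) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups (factor levels)
    n_total = len(all_values)  # total number of observations

    # BETWEEN: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # WITHIN: spread of each observation around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ss_between, ss_within, ms_between / ms_within

# Hypothetical data: three groups with nearly identical means ...
similar = [[19, 21, 20, 22], [20, 22, 19, 23], [21, 20, 22, 19]]
# ... and the same data with the third group shifted upward
shifted = [[19, 21, 20, 22], [20, 22, 19, 23], [29, 28, 30, 27]]

f_similar = one_way_anova(similar)[2]
f_shifted = one_way_anova(shifted)[2]
print(f"F (similar means): {f_similar:.2f}")
print(f"F (one mean shifted): {f_shifted:.2f}")
```

The within-group spread is the same in both data sets, so the jump in F for the shifted data comes entirely from the larger BETWEEN component.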
The t-tests are limited to comparing at most two groups, whereas ANOVA can compare 3 groups, 15 groups, 25 groups, and more. Using ANOVA to compare two sample means is equivalent to using a t-test to compare the means of independent samples.
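The two-group equivalence can be checked numerically: for two independent samples, the one-way ANOVA F statistic equals the square of the pooled (equal-variance) two-sample t statistic. The sketch below assumes the pooled form of the t-test; the helper names and the data are hypothetical.

```python
from math import isclose, sqrt
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample t statistic using the pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

def anova_f(groups):
    """One-way ANOVA F statistic (mean square between / mean square within)."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(values) - len(groups)))

# Hypothetical samples of unequal size
before = [20.1, 19.4, 21.3, 20.8, 19.9]
after = [18.7, 19.2, 18.1, 19.5]

t_stat = pooled_t(before, after)
f_stat = anova_f([before, after])
print(f"t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

Because F = t² in this case, both tests lead to the same p-value and the same decision.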
Factor (Process Input Variable - PIV, x): A controlled or uncontrolled variable (independent variable) whose influence is being studied.
Factor Level (+1,-1, Hi, Low, + , - , A, B): Factor setting.
Response (Process Output Variable - POV, y): The output of the process.
Inference Space: Range of the factors being evaluated.
Fit: Predicted value of the POV (y) with a specified setting of factors.
Residual: Difference between the fit and the actual experimental output.
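To make the Fit and Residual definitions concrete: in a one-factor study, the fit at each factor level is the mean response at that level, and each residual is the observed value minus that fit. The factor levels and response values below are hypothetical.

```python
from statistics import mean

# Hypothetical response values (y) recorded at two factor levels
data = {"Low": [10.0, 12.0, 11.0], "High": [15.0, 14.0, 16.0]}

# Fit: predicted response at each factor level (the level mean)
fits = {level: mean(ys) for level, ys in data.items()}

# Residual: actual output minus the fit for that level
residuals = {level: [y - fits[level] for y in ys] for level, ys in data.items()}

for level in data:
    print(level, "fit =", fits[level], "residuals =", residuals[level])
```

The residuals within each level sum to zero, which is a quick check that the fits were computed correctly.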
The following illustrates how the hypothesis test is written along with comments:
Ho: Mean 1 = Mean 2 = Mean 3 = ... = Mean n
where n = number of samples or levels
HA: at least 1 Mean is different from the other Means
(Read that carefully: it is possible that only one sample mean is different from the other 3, 50, or 100 sample means. Removing that one sample could completely change the result of the test. That is why a visual depiction, such as Box Plots, can help find the drivers of the test result or samples that are flawed.)
If the Null Hypothesis, Ho, is found to be true, then we would not expect
to see a lot of variation Between Samples. All the population means are considered equal.
If Ho is not true, expect to see significant variation between the samples. This would imply that the difference between samples is large relative to the variation within samples.
Reminder: Statistical significance does not always imply practical significance. Every numerical result needs to be scrutinized to determine if it makes sense in reality.
FOLLOW THESE STEPS for One-Way ANOVA:
* If the p-value is less than alpha, reject Ho and infer HA. If the p-value is greater than alpha, fail to reject Ho.
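The decision rule can be written as a small helper. This is only a sketch: the function name `decide` is hypothetical, and treating a p-value exactly equal to alpha as "fail to reject" is an assumption, since the rule above does not cover that case.

```python
def decide(p_value, alpha=0.05):
    """Apply the p-value decision rule for a hypothesis test."""
    if p_value < alpha:
        return "reject Ho"
    # Assumption: p_value == alpha is treated as fail to reject
    return "fail to reject Ho"

# The appraiser example in this article reports p = 0.847 at alpha = 0.05
print(decide(0.847, alpha=0.05))
```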
In a completely randomized design (One-way ANOVA) there is only one independent variable (factor or "x") with >2 treatment levels (you could also use this for two levels) also called classifications. The sample sizes do not have to be equal.
if there is a significant difference of means in two or more appraisers.
This example uses a mock study in which four appraisers were timed making an inspection decision on each of 13 widgets.
All other criteria are equal.
Since APPRAISER is the only factor (TIME is the response), this is a One-Factor or One-Way ANOVA. The factor has four levels in the experiment, one for each appraiser.
The first step is to create the test. In general, if the p-value is lower than the alpha-risk then the alternate hypothesis is inferred (reject the null).
Null Hypothesis: Population means of the different appraisers are equal.
Alternate Hypothesis: At least one of the means is not the same
There are 51 Degrees of Freedom computed from (13*4) - 1.
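The 51 is the total degrees of freedom. It splits into a part for the factor (between appraisers) and a part for error (within appraisers), which are the numerator and denominator degrees of freedom of the F test. A short check, with variable names chosen only for illustration:

```python
appraisers = 4        # factor levels (k)
widgets_each = 13     # observations per appraiser
n_total = appraisers * widgets_each   # 52 timed appraisals

df_total = n_total - 1             # (13 * 4) - 1
df_between = appraisers - 1        # numerator df for the F test
df_within = n_total - appraisers   # denominator df for the F test

print(df_total, df_between, df_within)
```

The between and within degrees of freedom always add up to the total.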
Using a One-Way test with an alpha-risk of 0.05, the p-value is well above 0.05 at 0.847 (see results table below).
The low F-statistic and the heavily overlapping confidence intervals are also evidence that there is no difference among any pairs or combinations of the appraisers.
It is concluded that there is not a statistical difference between any of the appraisers.
If the p-value was <0.05, then at least one group of data is different than at least one other group. It doesn't conclude which one...only states that at least one of the four is different than the others.
Paul has the lowest average time per appraisal, but Jim has the lowest variation and the most consistent time for each appraisal. What this result doesn’t say is whether the appraisals are correct!
With these results a Six Sigma Project Manager would likely be very pleased that all are performing the same in terms of time spent making an appraisal and the variation from appraisal to appraisal is similar among each person (hopefully the correct appraisal too).
This is likely a result of consistent training and adherence to the SOPs. However, the next question from the Six Sigma Project Manager is: can this be improved, or is it acceptable?
Caution: It may still be possible that 19-20 seconds per appraisal is not acceptable to the company or the customer, and still needs to be reduced. This One-Way ANOVA only indicates that there is not a statistical difference among the appraisers' times. This test is not comparing the appraisers to a target value.
The low F-statistic of 0.27 says the variation within the appraisers is greater than the variation between them.
The F-critical value is 2.81 according to the statistical software (not shown above).
You can use the F-table above to get a close estimate of the F-critical value. One drawback of tables is that they may not give a precise number, since not every combination of degrees of freedom is shown. However, the table can provide a fairly good estimate, and it is often enough to make the decision conclusive.
The numerator has 3 degrees of freedom and the denominator has 48 degrees of freedom. Using the table below shows that the F-critical value is going to be between 2.76 and 2.84. And in this case, both values are much higher than the F-calculated value of 0.27 so the conclusion is the same.
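The F-calculated value can be reproduced from the sums of squares quoted in the Epsilon-squared calculation in this article (4.84 for the Appraiser factor and 291.69 in total). The sketch below assumes 291.69 is the total sum of squares, so the error sum of squares is the remainder; the variable names are illustrative.

```python
ss_factor = 4.84      # sum of squares for the Appraiser factor (between)
ss_total = 291.69     # assumed to be the total sum of squares
ss_error = ss_total - ss_factor   # within-appraiser sum of squares

df_factor, df_error = 3, 48       # numerator and denominator df

ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_calc = ms_factor / ms_error

# Table bounds quoted in the text for the F-critical value
f_crit_low, f_crit_high = 2.76, 2.84

print(f"F-calculated = {f_calc:.2f}")
print("Reject Ho" if f_calc > f_crit_high else "Fail to reject Ho")
```

Because F-calculated is below even the lower table bound, the decision does not depend on the exact F-critical value.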
As a Six Sigma project manager it may be worth re-running (depending on cost and time) the trial with a larger sample size and additional appraiser training to reduce the variation within each one.
The variation is fairly consistent among each of them so it appears there is a systemic issue that is causing nearly similar amounts of variation within each appraiser.
It is possible that one or a few of the widgets are creating the similar spread in the timing for each appraiser. You may examine the timing performance of each widget and run an ANOVA among the 13 widgets and see if one or more stands out.
Epsilon-squared is the % of variation related to the Factor, which is the Appraiser. This is 4.84 / 291.69 = 0.01659 = 1.7%. This is a low value so it is possible that other Factors exist that are creating the variation.
Depending on the version of Excel there is an “Analysis ToolPak” add-in module that may be needed. In this version depicted below it is called "XLMiner Analysis ToolPak". Go to the INSERT tab in this case (or could be under TOOLS).
Type in ANOVA and click on the magnifying glass, then the XLMiner option will appear. Select ADD and the menu will pop up as shown on the right of the picture below.
As you can see, there are several statistical tools to choose from. In this case, select ANOVA: Single Factor.
The following data was recorded across five machines. The team recorded the pieces per minute produced of the same PN 123XYZ under similar operating conditions, counting only acceptable pieces. They wanted to examine several things, one of them being whether the mean performance of any machine differed from the others. Assume a 95% Confidence Level.
For example, the first time that Machine 1 ran a batch of PN 123XYZ it averaged 210 acceptable pieces/minute. Recall that sample sizes do not have to be the same.
Interpreting the results
Other factors can be added to this type of test and get more complicated
but most statistical software programs can run Two-Way and Three-Way
ANOVA. Use Two-Way ANOVA when there are two factors.
Two-Way Hypothesis Tests:
Null Hypothesis: There is no difference in the means of the 1st factor
Null Hypothesis: There is no difference in means of the 2nd factor
Null Hypothesis: There is no interaction between the two factors
Alternate Hypothesis: Means are not equal among the levels of the 1st factor
Alternate Hypothesis: Means are not equal among the levels of the 2nd factor
Alternate Hypothesis: There is an interaction between the two factors
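Each of the three null hypotheses gets its own F ratio. The sketch below computes them for a balanced design (the same number of replicates in every cell), which is an assumption of this example; the function name and the data are hypothetical. Comparing each F ratio to its F-critical value, or converting it to a p-value, is left to statistical software or an F-table.

```python
from statistics import mean

def two_way_anova(cells):
    """cells[i][j] holds the replicate responses at level i of factor A
    and level j of factor B. Assumes a balanced design with replicates."""
    a, b = len(cells), len(cells[0])
    n = len(cells[0][0])   # replicates per cell
    grand = mean(y for row in cells for cell in row for y in cell)
    a_means = [mean(y for cell in row for y in cell) for row in cells]
    b_means = [mean(y for row in cells for y in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_err = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for y in cells[i][j]
    )
    ms_err = ss_err / (a * b * (n - 1))
    return {
        "F_A": (ss_a / (a - 1)) / ms_err,                # 1st factor
        "F_B": (ss_b / (b - 1)) / ms_err,                # 2nd factor
        "F_AB": (ss_ab / ((a - 1) * (b - 1))) / ms_err,  # interaction
    }

# Hypothetical 2x2 study with 3 replicates per cell
cells = [
    [[10, 11, 12], [14, 15, 16]],   # factor A low:  B low, B high
    [[20, 21, 22], [24, 25, 26]],   # factor A high: B low, B high
]
result = two_way_anova(cells)
print(result)
```

In this made-up data both factors shift the mean but their effects simply add, so the interaction F ratio is zero while the two main-effect ratios are large.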
When there are 3 or more factors use ANOVA General Linear Model.
This module provides lessons and more detail about One-Way
ANOVA. Understanding the basic meaning and applications for this
commonly used test is necessary for any level of a Six Sigma Project
Keep in mind that One-Way ANOVA (and the t-tests) are comparing 1 FACTOR across multiple groups. The t-test compares 1 FACTOR across one or two groups (such as before/after, or two machines, or two operators, or now/past)
A multivariate analysis is a tool that evaluates differences among 2 or MORE FACTORS and between multiple groups simultaneously. There are Two-Way and Three-Way ANOVA tools as well but again those are limited to 2 & 3 factors respectively.
Factors are differences in things such as, but not limited to, parts produced (it's probably not a good idea to compare the production of pencils to the production of nails even if they run on similar machines), services delivered, time, different operating conditions, and customer requirements.
Before jumping into a multivariate analysis, use ANOVA to focus on one factor at a time and learn from that analysis first, then use multivariate if something significant is found.
Once the data is collected the ANOVA takes very little time and evaluating the factors in various ways only provides more and more insight as to their relationship. It is always better to have more than enough information, within reason, than not enough, especially when the analysis only takes a few minutes.