Power and Sample Size
Description
Show the relationship between Power and Sample Size. The Power of a comparison test is the probability of concluding that there is a significant difference when one actually exists.
Objective
The Power of a test indicates whether the test is sensitive enough to detect actual (true) differences. Detecting smaller differences at the same Power requires a larger sample size, and for a given sample size the Power quantifies the smallest difference the comparison test is capable of detecting.
Power = 1 - Beta Risk (Type II Error)
Confidence Level = 1 - Alpha Risk (Type I Error)

How to read the table shown above:
The first row of the table indicates that as the probability of a Type I error (Alpha Risk) increases, with the sample size and the difference to detect held fixed, the Power increases and the probability of a Type II error (Beta Risk) decreases.
In other words, as the Producer becomes willing to reject more non-defective parts to ensure that defective parts are rejected, the probability of Consumers receiving defects is reduced. The test becomes more "powerful" in protecting the Consumers, which is perhaps the most important risk to guard against.
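The trade-off between Alpha Risk and Power can be sketched numerically. The example below is a minimal illustration, assuming a one-sided, one-sample Z test with a known standard deviation; the function name `z_test_power` and the values for `delta`, `sigma`, and `n` are hypothetical and chosen only for demonstration.

```python
from statistics import NormalDist


def z_test_power(alpha, delta, sigma, n):
    """Power of a one-sided, one-sample Z test to detect a true upward shift `delta`.

    Assumes normally distributed data with known standard deviation `sigma`.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold when there is no difference
    shift = delta * n ** 0.5 / sigma         # true difference in standard-error units
    beta = std_normal.cdf(z_crit - shift)    # chance of missing the real difference (Type II)
    return 1 - beta                          # Power = 1 - Beta Risk


# Same sample size and same true difference; only the Alpha Risk changes.
for alpha in (0.01, 0.05, 0.10):
    power = z_test_power(alpha, delta=0.5, sigma=1.0, n=20)
    print(f"alpha={alpha:.2f}  power={power:.3f}  beta={1 - power:.3f}")
```

With everything else held fixed, each increase in Alpha Risk lowers the rejection threshold, so the printed Power rises and the Beta Risk falls.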
Sample Size
Collecting data consumes time and resources; there is a tangible cost. It is important to collect enough data to detect the difference required but not create waste by collecting excess data.
The level of Power needed should be determined by the GB, BB, MBB, or a combination of them, with input from the team. The required value is normally higher as the application becomes more critical. Life-dependent, regulatory, and safety applications require higher levels of Power.
Beta should be no higher than 5% in critical applications, which gives a minimum Power level of 95%. For example, choosing a Power of 99% means that you are willing to accept a 1% Beta risk: when parts really are defective, there is a 1% chance that the decision is made that they are not defective, and the consumer will suffer.
Determining the level of Power is the starting point in determining the number of samples to collect, and collecting that quantity of samples adds confidence to the test results and to inferences about the population. There is almost always an argument for near-perfect Power (99.999...%) because everyone may feel their test or application is of premium importance, but this would require an impractical amount of data and resources. The more samples collected and analyzed, the stronger the Power will be to detect smaller differences.
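To make the link between Power and sample size concrete, the sketch below uses the standard normal-approximation formula for a one-sided, one-sample Z test, n = ((z_alpha + z_beta) * sigma / delta)^2, rounded up. It is an illustration under those assumptions rather than a general procedure; the function name `z_test_sample_size` and all input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def z_test_sample_size(alpha, power, delta, sigma):
    """Smallest n giving at least `power` for a one-sided, one-sample Z test.

    Assumes normally distributed data with known standard deviation `sigma`
    and a true difference of `delta` to be detected.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # quantile tied to the Alpha Risk
    z_beta = std_normal.inv_cdf(power)       # quantile tied to Power = 1 - Beta Risk
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Higher Power for the same difference needs more samples.
for power in (0.80, 0.95, 0.99, 0.99999):
    n = z_test_sample_size(alpha=0.05, power=power, delta=0.5, sigma=1.0)
    print(f"power={power}  n={n}")

# A smaller difference at the same Power also needs more samples.
for delta in (1.0, 0.5, 0.1):
    n = z_test_sample_size(alpha=0.05, power=0.95, delta=delta, sigma=1.0)
    print(f"delta={delta}  n={n}")
```

Because the required sample size grows with the inverse square of the difference, halving the difference to detect roughly quadruples the data needed, which is why the Power target should be chosen with the cost of collection in mind.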
Applications
Comparison Tests:
One sample Z test
One sample t test
Two sample t test
One sample proportion test
Two sample proportion test
One way ANOVA
Factorial/Fractional Design of Experiments (DOE)
Recall:
Type I Error = Alpha Risk = Significance Level = Producer's Risk = False Positive
This is when the decision is made that there is a difference when the truth is there is not. In other words, parts have been determined defective (possibly scrapped) and they were not defective. The Producer suffered by losing stock and needing to make up the lost inventory.
Type II Error = Beta Risk = Consumer's Risk = False Negative
This is when the decision is made that there is not a difference when the truth is there is a difference. In other words, parts have been determined not defective and sent to the customer (or downstream operation) and they were defective. The Consumer suffered by receiving defects.
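The two error types can also be observed by simulation. The sketch below repeatedly runs a one-sided, one-sample Z test on simulated data, once when there is truly no difference and once when a real difference exists. The function name `rejection_rate`, the seed, and every parameter value are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=4000, seed=1):
    """Fraction of simulated one-sided Z tests (null hypothesis: mean = 0) that declare a difference."""
    rng = random.Random(seed)                 # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha)  # rejection threshold for the chosen Alpha Risk
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = fmean(sample) * n ** 0.5 / sigma  # test statistic with known sigma
        if z > z_crit:
            rejections += 1
    return rejections / trials


# No real difference: every declared difference is a Type I error.
type_1_rate = rejection_rate(true_mean=0.0)

# A real difference of 0.5: every missed detection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"Estimated Alpha Risk (Type I):  {type_1_rate:.3f}")
print(f"Estimated Beta Risk (Type II):  {type_2_rate:.3f}")
print(f"Estimated Power:                {1 - type_2_rate:.3f}")
```

The estimated Type I rate should land near the chosen Alpha Risk, while the Type II rate depends on the sample size and on how large the real difference is.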