A Data Collection Plan is a documented procedure for standardized and efficient data collection from the process. The data collected will be used to describe the Voice of the Process (VOP).
Ensure data collection is complete, realistic, and practical. Data collection can often be costly and resisted by those involved. Strive to minimize the cost and impact on those involved while obtaining as much accurate data as possible in a reasonable amount of time.
STEP 1: When setting up the collection system, it is important to collect data only once and to minimize the burden on the operators, team, and the GB/BB. Be sure to capture all the relevant families of variation (FOV) that should be analyzed.
A sample Families of Variation diagram is shown below.
The entire amount of variation found in a set of data can be broken down
to the variation from the Process + the variation from the Measurement
System, which should be calculated from the MSA.
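If the MSA found that the measurement system contributes a fraction X of the total variance (%Contribution), then the Process Variation = 1 - X. This split holds for variances; %Study Variation is based on standard deviations, which do not sum to 100%.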
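The split between process and measurement variation can be illustrated with a short Python sketch. The variance values and the function name `process_fraction` are hypothetical, chosen only to show that the shares add to 1 on the variance scale but not on the standard-deviation scale.

```python
import math


def process_fraction(total_variance: float, measurement_variance: float) -> float:
    """Fraction of the total variance attributed to the process."""
    if total_variance <= 0 or not 0 <= measurement_variance <= total_variance:
        raise ValueError("need total_variance > 0 and 0 <= measurement_variance <= total_variance")
    return 1 - measurement_variance / total_variance


# Hypothetical variances, for illustration only (not taken from a real MSA).
total_variance = 4.0
measurement_variance = 1.0

# Measurement system share of total variance (the X in "1 - X").
x = measurement_variance / total_variance
process_share = process_fraction(total_variance, measurement_variance)

# The same split expressed as standard-deviation ratios does not sum to 1.
sd_ratio_measurement = math.sqrt(x)
sd_ratio_process = math.sqrt(process_share)

print(f"variance shares: measurement={x:.2f}, process={process_share:.2f}")
print(f"std-dev ratios:  measurement={sd_ratio_measurement:.2f}, process={sd_ratio_process:.2f}")
```

With these assumed numbers the variance shares are 0.25 and 0.75, while the standard-deviation ratios are 0.50 and roughly 0.87.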
An FOV diagram starts drilling into the sources of Process Variation. These sources include:
Using statistical software, it is helpful to use Multi-Vari charts to test and
eliminate FOVs. The goal is to determine which FOV(s) are contributing
the most variation to the process.
Another possibility is shown below. Each one of the families will contribute to the overall process performance. The combination (not necessarily the sum) of all their variances represents the overall process variance. These are all "short" term sources that make up the "long" term variation.
STEP 2: Complete a simple but comprehensive form that elaborates on the data and the collection plan. This form may later be used as an attachment to the Control Plan when handing off the project to the Process Owner.
Other items in addition to the FOV tree when completing the Data Collection Plan are:
Items to include:
the types of data and necessary sample sizes (observations) needed to
create control charts and run hypothesis tests coming up later in the
MEASURE and ANALYZE phases. Meeting or exceeding the minimum (without being
too costly) can lead to better analysis and stronger decision making.
Data collected through automated methods often have the most accuracy
and the least bias. Even though automated collection seems simple, it is important to
clearly identify the source, system, menus, files, folders, etc. that
the information lies within. Any macros, adjustments, and the like should
be written out with clear step-by-step instructions.
If these systems are not already in place, they can be time-consuming and expensive to install. However, improving a data collection system can be a very successful part of the project improvement process in itself.
Manual methods require more instructions and training. Minimize the number of people involved to reduce the risk of introducing variation. A higher level of instruction and detail provided to the data collectors and appraisers (those collecting the measurements) will reduce the variation component contributed by the measurement system. This amount of error will be quantified and examined in the Gage R&R.
Manual collection is usually inexpensive to put in place and can be tailored to exactly what is needed. However, it can be costly because it is labor intensive, prone to recording errors, and slow when suspicious data must be investigated.
Adding videotape and other recordings is an excellent way to capture data. Recordings have the advantage of replay, which removes uncertainty about what actually occurred.
Attribute gauges are fixed gauges that provide limited information but can be cheaper and quicker devices for reaching a decision that meets the Voice of the Customer. They are used to make decisions such as:
They don't tell how good or how bad the measurement is relative to
specification limits. Each decision is given the same weight, but some GO
decisions may actually be better than others. That is where more
discrimination (or resolution) in the measurement system can help, and
that comes with variable data.
Types of attribute measurement devices are:
Variable measurement devices provide more
information than attribute gauges and should be used, at a minimum, to measure critical
characteristics. They provide a measured dimension.
Types of variable measurement devices are:
Why is rational subgrouping important?
Rational subgroups are small samples within the population that are obtained at similar settings (inputs or conditions) over a short period of time. In other words, instead of getting one data point at a short-term setting, obtain 4-5 points as a subgroup at that same setting and then move on to the next. This helps estimate the natural, common cause variation within the process.
Individual data (I-MR) is acceptable for assessing control; however, it usually means that more data points (collected over a longer period of time) are necessary to ensure that all the true process variation is captured. The subgroup size = 1 when using I-MR charts.
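As a rough illustration of how I-MR limits are obtained when the subgroup size is 1, the sketch below uses the standard chart constants 2.66 (individuals chart) and 3.267 (moving range chart). The readings and the function name `imr_limits` are hypothetical.

```python
from statistics import mean


def imr_limits(values):
    """Center line and control limits for an Individuals / Moving Range chart."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed for a moving range")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        "ucl": x_bar + 2.66 * mr_bar,   # 2.66 = 3 / d2, with d2 = 1.128 for n = 2
        "lcl": x_bar - 2.66 * mr_bar,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,       # D4 for n = 2
    }


# Hypothetical individual readings, for illustration only.
readings = [10.0, 12.0, 11.0, 13.0, 12.0]
limits = imr_limits(readings)
print(limits)
```

Because the limits rest on the average moving range between consecutive points, a short series gives a weak estimate of process variation, which is why more data points are usually needed with individual data.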
Sometimes subgrouping can be purposely controlled, and other times you may have to recognize it within the data. Often, a Six Sigma Project Manager will be given data with no information about how it was collected.
The data table below shows 50 samples collected and measured in 10 subgroups of 5 measurements each. The appraiser took five measurements at a particular moment (same tool, same operator, same machine, same short-term time frame) and recorded the diameter of each part. This extra data (yes, it is more work and time) allows a much stronger capability analysis than collecting just 10 data points, one reading from each part, where the subgroup size = 1.
The x-bar value is the average reading within each subgroup, and the range is the difference between the maximum and minimum values within that subgroup.
From here you can use the data to manually create x-bar control charts and calculate control limits. The average of the averages is 27.9083.
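A minimal sketch of that manual calculation follows, using the standard constants for subgroups of five (A2 = 0.577, D3 = 0, D4 = 2.114). The three subgroups are hypothetical values, not the article's data table, and the function name `xbar_r_limits` is an assumption made for illustration.

```python
from statistics import mean

# Standard control chart constants for a subgroup size of n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114


def xbar_r_limits(subgroups):
    """X-bar and R chart center lines and limits for subgroups of size 5."""
    if not subgroups or any(len(s) != 5 for s in subgroups):
        raise ValueError("constants used here assume every subgroup has 5 readings")
    xbars = [mean(s) for s in subgroups]            # average of each subgroup
    ranges = [max(s) - min(s) for s in subgroups]   # max minus min within each subgroup
    grand_mean = mean(xbars)                        # average of the averages
    r_bar = mean(ranges)
    return {
        "grand_mean": grand_mean,
        "r_bar": r_bar,
        "xbar_ucl": grand_mean + A2 * r_bar,
        "xbar_lcl": grand_mean - A2 * r_bar,
        "r_ucl": D4 * r_bar,
        "r_lcl": D3 * r_bar,
    }


# Hypothetical diameter readings, three subgroups of five, for illustration only.
subgroups = [
    [27.90, 27.92, 27.91, 27.93, 27.89],
    [27.91, 27.93, 27.92, 27.94, 27.90],
    [27.89, 27.90, 27.91, 27.90, 27.90],
]
chart = xbar_r_limits(subgroups)
print(chart)
```

The same function applies to 10 subgroups of 5; only the input list grows.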
If there is a known standard value for the diameter, then the Gauge Bias can also be determined by taking 27.9083 minus the Standard Value. Assume the Standard (Reference) is 27.9050; then:
Gauge Bias = 27.9083 - 27.9050 = 0.0033 (this value is used when assessing gauge bias during a Stability review as part of MSA)
Another Example: Here we discover subgroups within the data (a good thing).
A Black Belt (BB) is provided data (different data than used above) from the team and begins to assess control. Without understanding the data and how it was collected, the BB generates the following Individuals chart indicating the Miles Per Gallon (MPG) of a vehicle from 23 observations.
The visual representation makes it clearer that there are likely subgroups within the data. The control chart appears to be out of control with a lot of special cause variation but there is likely a good explanation.
The BB talks to the team and learns that the MPG readings were gathered on different slopes of terrain. The higher MPG readings were achieved on downhill slopes and the lower readings on uphill slopes. The data is more appropriately shown below.
This shows each subgroup being in control. There were short term shifts in the inputs or conditions. Assessing normality or capability on the entire group of data is not meaningful since the inputs were purposely changed to gather data on different conditions. Therefore, this is not a "naturally" occurring process. There is going to be an appearance of "special cause" variation when in fact it is not.
Try to break down the data into the subgroups and analyze the data for normality and capability of each subgroup.
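One way to sketch this breakdown in Python is to group the observations by condition and summarize each subgroup separately. The terrain labels and MPG values below are hypothetical, not the 23 observations from the example, and the function name `summarize_by_subgroup` is an assumption. The sketch only computes means and standard deviations; a normality or capability check would be applied to each subgroup in the same way.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical (terrain, MPG) observations, for illustration only.
observations = [
    ("uphill", 18.1), ("uphill", 18.6), ("uphill", 17.9), ("uphill", 18.4),
    ("flat", 27.8), ("flat", 28.3), ("flat", 28.0), ("flat", 27.6),
    ("downhill", 36.2), ("downhill", 35.7), ("downhill", 36.5), ("downhill", 36.0),
]


def summarize_by_subgroup(rows):
    """Group values by label and return count, mean, and sample std dev per group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {
        label: {"n": len(values), "mean": mean(values), "stdev": stdev(values)}
        for label, values in groups.items()
    }


summary = summarize_by_subgroup(observations)
overall_stdev = stdev([value for _, value in observations])

for label, stats in summary.items():
    print(f"{label:9s} n={stats['n']} mean={stats['mean']:.2f} stdev={stats['stdev']:.2f}")
print(f"all data  stdev={overall_stdev:.2f}")
```

In this assumed data set the pooled standard deviation is far larger than any within-subgroup value, which is the pattern that makes the combined chart look out of control.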
The next important measurement for someone looking at this data could be to understand those incline and decline measurements for each subgroup and determine the correlation between MPG and angle of incline or decline.
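The correlation mentioned above could be estimated with a Pearson coefficient. The sketch below computes it directly from its definition; the angle and MPG pairs are hypothetical, and the function name `pearson_r` is an assumption.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    x_mean, y_mean = mean(xs), mean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical slope angles in degrees (negative = downhill) and MPG readings.
angles = [-4, -2, 0, 2, 4]
mpg = [34.0, 30.5, 28.0, 25.5, 22.0]

r = pearson_r(angles, mpg)
print(f"correlation between slope angle and MPG: r = {r:.3f}")
```

A value near -1 in data like this would indicate that MPG falls steadily as the incline increases.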
The point is to look for subgroups within the data, which may provide a plausible explanation for what initially appears to be special cause variation.
Measurements are the basis for everything in quality systems for two primary reasons:
Without measurement there cannot be objective proof or statistical evidence of
process control, shift, or improvement. It is critical to
get the most informative data that is practical and time
This data is required to understand the measurement system variation (done via MSA) and the process variation.
Once the MSA is concluded (and hopefully passed), the remaining variation is due to the process. The Six Sigma team should focus on reducing and controlling the process variation. However, sometimes fixing the measurement system itself can be a Six Sigma project if the potential is significant and the current system is so poor that it prohibits process capability analysis.
Gauges require care and regular calibration to ensure that tolerances are maintained and that the amount of variation they contribute to the total variation stays constant. If that contribution changes, it could affect decisions about the process variation, because the apparent changes would stem from the measurement devices rather than from the process.
Some gauges require special storage or have their own standard operating procedures. Any time there is suspected damage, or a unique event that will use a device to a higher than usual degree, a calibration should be done. For example, before a physical inventory event, all weigh scales should be calibrated. All personnel should be trained on the same procedure and understand how the scales work.
Spend the time up front to remove the sources of variation and capture the most meaningful data, with variation coming only from the process and not from the measurement system. Try to capture the data using rational subgrouping across the entire spectrum of sources of process variability.