The stem and leaf plot displays categories or variable data in a format similar to a bar chart. The stems are groups of data by class interval. The leaves are the smaller increments of each data point, built onto the stems.
These plots were most popular in the 1980s, when they were often drawn by hand for manageable amounts of data. Now that most software programs can handle large amounts of data, more informative graphical methods are generally used, and the stem and leaf plot is rarely used today.
The plot can be drawn with the stems in the left column and the leaves in the right column, which shows the distribution in a "horizontal" manner. It can also be drawn with the stems along the bottom and the leaves built on top, which shows the distribution in a "vertical" manner. Both methods give the same result, just in a different orientation.
Plot the following 41 data points and make observations on the distribution. The data were collected but are not necessarily listed in sequential order.
56, 37, 52, 43, 131, 48, 84, 78, 134, 88, 137, 64, 123, 141, 148, 56, 79, 80, 97, 67, 60, 61, 86, 67, 52, 71, 70, 82, 52, 70, 64, 128, 139, 142, 72, 127, 131, 137, 77, 150, 79.
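As a rough illustration of the horizontal layout, the 41 values above can be grouped by stem with a few lines of Python. This is only a sketch: the function name `stem_and_leaf` and the `leaf_unit` parameter are made up for this example, and it assumes the stem is everything but the last digit and the leaf is the last digit. Stems with no data are still printed so the gap between the groups stays visible, and every leaf takes the same width.

```python
from collections import defaultdict

# The 41 data points from the example, in the order they were collected.
data = [56, 37, 52, 43, 131, 48, 84, 78, 134, 88, 137, 64, 123, 141, 148,
        56, 79, 80, 97, 67, 60, 61, 86, 67, 52, 71, 70, 82, 52, 70, 64,
        128, 139, 142, 72, 127, 131, 137, 77, 150, 79]

def stem_and_leaf(values, leaf_unit=10):
    """Return the rows of a horizontal stem and leaf plot (hypothetical helper)."""
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, leaf_unit)  # e.g. 137 -> stem 13, leaf 7
        groups[stem].append(leaf)
    rows = []
    # Walk every stem from smallest to largest, including empty ones.
    for stem in range(min(groups), max(groups) + 1):
        leaves = " ".join(str(leaf) for leaf in sorted(groups.get(stem, [])))
        rows.append(f"{stem:>2} | {leaves}".rstrip())
    return rows

for row in stem_and_leaf(data):
    print(row)
```

The leaves are sorted within each stem here for readability; plotting them in collection order would give the same overall shape.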
There appear to be two modes, one cluster centered in the 70s and another in the 130s, each roughly taking the shape of a normal distribution. Such a distribution is referred to as bimodal. Given that two modes are present, there do not appear to be any outliers.
The team should investigate how the data is being recorded and which operators, machines, parts, measuring devices, and other inputs are leading to this. The pattern is not obvious from the numerical data alone, but once graphed it is simple to spot and address early in the process.
The data points do not need to be plotted in sequence; any order gives the same overall shape in the end. What is required for proper scaling is that every leaf takes up the same amount of space. For example, if the spacing between the 3 and 8 (to the right of stem 4) were stretched by careless recording on a board, the shape could give the wrong impression.
3,8 looks different than 3, 8.
Since these plots are typically done by hand, they can become time-consuming and messy with large sets of data. Remember that each data point is represented numerically.
A histogram can lose the individual values of the data, whereas this plot retains most (often all) of the raw numerical data. These plots can provide a quick snapshot of the distribution of the data and expose outliers. Another advantage is that the stem and leaf plot shows at least two significant digits.
The other method of creating the stem and leaf plot is shown with the same data.
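Assuming the other method refers to the vertical orientation, with stems along the bottom and leaves stacked on top, a minimal Python sketch might look like the following. The function name `vertical_stem_and_leaf` and the fixed column `width` are illustrative choices, not part of the original example. Each leaf is given the same column width so the stacks keep their proper scale.

```python
from collections import defaultdict

# Same 41 data points as in the example above.
data = [56, 37, 52, 43, 131, 48, 84, 78, 134, 88, 137, 64, 123, 141, 148,
        56, 79, 80, 97, 67, 60, 61, 86, 67, 52, 71, 70, 82, 52, 70, 64,
        128, 139, 142, 72, 127, 131, 137, 77, 150, 79]

def vertical_stem_and_leaf(values, leaf_unit=10, width=3):
    """Return the rows of a vertical stem and leaf plot (hypothetical helper)."""
    groups = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, leaf_unit)  # e.g. 64 -> stem 6, leaf 4
        groups[stem].append(leaf)
    stems = list(range(min(groups), max(groups) + 1))
    columns = [sorted(groups.get(s, [])) for s in stems]
    height = max(len(col) for col in columns)
    rows = []
    # Build from the tallest level down so the stacks grow upward.
    for level in range(height - 1, -1, -1):
        cells = [str(col[level]) if level < len(col) else "" for col in columns]
        rows.append("".join(f"{cell:>{width}}" for cell in cells))
    rows.append("-" * (width * len(stems)))                      # baseline
    rows.append("".join(f"{s:>{width}}" for s in stems))         # stem labels
    return rows

for row in vertical_stem_and_leaf(data):
    print(row)
```

The two humps and the empty stems 10 and 11 between them appear the same way as in the horizontal layout, only rotated.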