Skewness
Shape of the Distribution

Skewness, together with another calculated value called kurtosis, describes the shape of a distribution. Skewness measures asymmetry, whereas kurtosis measures peakedness or flatness. Both are judged relative to the shape of a normal distribution.

A histogram gives a quick visual indication of skewness and kurtosis. A normal distribution has a skewness of 0 and an excess kurtosis of 0, where excess kurtosis is the form most software reports (Mesokurtic is the term for an excess kurtosis of 0).

If:

Sk > 0, then skewed right distribution (longer tail on the right)
Sk = 0, then symmetric distribution (the normal distribution is one example)
Sk < 0, then skewed left distribution (longer tail on the left)
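
The sign rule above can be checked on a few small data sets. The sketch below uses the simple moment coefficient (third central moment divided by the 1.5 power of the second), which is not necessarily the formula a given software package reports; the function names and the three data sets are made up for illustration.

```python
def moment_skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5


def describe_shape(sk):
    """Translate the sign of Sk into the labels used in the list above."""
    if sk > 0:
        return "skewed right"
    if sk < 0:
        return "skewed left"
    return "symmetric"


# Hypothetical data sets, chosen only to show each sign.
right_tail = [1, 2, 3, 4, 10]
symmetric = [1, 2, 3, 4, 5]
left_tail = [1, 7, 8, 9, 10]

for name, values in [("right_tail", right_tail),
                     ("symmetric", symmetric),
                     ("left_tail", left_tail)]:
    print(name, describe_shape(moment_skewness(values)))
```

With real measurements Sk is almost never exactly 0, so in practice a value close to 0 is read as approximately symmetric rather than tested for equality.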

Purpose

Most Six Sigma projects involve numerous statistical tests that depend on correctly characterizing the behavior (shape and location) of the data, which in many cases means determining whether the data meets normality assumptions.

Visualizing and calculating kurtosis and skewness helps make that decision. They are among many tools (others include the p-value from a normality test and the fat-pencil test) for judging whether a distribution is normal.

Most software programs perform these calculations, but each uses a specific formula that may be important to understand if you are relying heavily on the result to make a decision.

If your data doesn't meet the assumptions for normality, transformations such as Box-Cox, log, or square root can make a data set more nearly normal. Then apply the normality tests to the transformed data.
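
As a rough illustration of how a transformation can pull in a long right tail, the sketch below compares the skewness of a sample before and after square-root and log transforms. The data set is hypothetical, the helper name `adjusted_skewness` is an assumption of this example, and Box-Cox is not shown because it needs a fitted lambda parameter.

```python
import math
from statistics import mean, stdev


def adjusted_skewness(data):
    """Adjusted Fisher-Pearson skewness (sample standard deviation, n >= 3)."""
    n = len(data)
    x_bar, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in data)


# Hypothetical right-skewed sample; values are strictly positive so that
# both the square root and the natural log are defined.
raw = [1, 2, 2, 3, 3, 3, 4, 5, 8, 13, 21, 40]
sqrt_data = [math.sqrt(x) for x in raw]
log_data = [math.log(x) for x in raw]

for label, values in [("raw", raw), ("square root", sqrt_data), ("log", log_data)]:
    print(f"{label:>12}: skewness = {adjusted_skewness(values):.3f}")
```

For this sample the log transform reduces the skewness more than the square root does, but neither is guaranteed to produce normal data, so the normality checks still need to be rerun on the transformed values.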

Coefficient of Skewness

There are a few equations to determine the skewness of a distribution. A couple of the most common formulas are explained below.

The following examples apply the data shown below

Excel and Minitab (Equation #1)
Fisher - Pearson Skewness
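
The adjusted Fisher-Pearson coefficient, in the form documented for Excel's SKEW function, is usually written as follows. The label G_1 and the index i are added here for clarity and are assumptions of this sketch rather than the article's own notation.

```latex
G_1 = \frac{n}{(n-1)(n-2)} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{s} \right)^{3}
```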

where:

n = number of samples

s = standard deviation

x = each observation or data point

x-bar = average
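
Using the symbols defined above, the adjusted Fisher-Pearson calculation can be sketched in a few lines. This assumes s is the sample standard deviation (n - 1 denominator); the function name and the example values are hypothetical and are not the article's Data Set 1 or 2.

```python
from statistics import mean, stdev


def adjusted_fisher_pearson_skewness(data):
    """n / ((n-1)(n-2)) * sum(((x - x_bar) / s) ** 3) over all observations."""
    n = len(data)
    if n < 3:
        raise ValueError("at least three observations are required")
    x_bar = mean(data)   # x-bar = average
    s = stdev(data)      # s = sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - x_bar) / s) ** 3 for x in data)


# Hypothetical observations with one large value on the right.
example = [1, 2, 3, 4, 10]
print(round(adjusted_fisher_pearson_skewness(example), 3))
```

The n / ((n-1)(n-2)) factor is a small-sample adjustment, which is one reason two packages can report slightly different skewness values for the same data.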

Data Set 1 in Minitab

Summary:

The skewness is positive at 1.003, which indicates a right-skewed shape. The histogram and fitted curve show that the data is highly unlikely to meet the assumptions of normality.

The p-value is further evidence: it is very low and most likely much less than the chosen alpha-risk (most often 0.05).

The importance of graphing is illustrated in this example. Looking at the data alone as numbers does not easily depict the potential bi-modal behavior that the histogram is showing. This requires further investigation from the Six Sigma project manager.

On another note:

The kurtosis is negative, which is another characterization of the shape of the data distribution. When the value is negative the shape is referred to as Platykurtic, which indicates a flatter distribution than the normal bell curve.

Data Set 2 in Minitab

Summary:

The skewness is negative at -1.246, which indicates a left-skewed shape. The histogram and fitted curve show that the data is highly unlikely to meet the assumptions of normality.

The p-value is further evidence: it is very low and most likely much less than the chosen alpha-risk (most often 0.05).

On another note:

The kurtosis is positive, which is another characterization of the shape of the data distribution. When the value is positive the shape is called Leptokurtic, which indicates a more peaked distribution than the normal bell curve.
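
For reference, the kurtosis values discussed in the two summaries can be reproduced with a sample excess kurtosis calculation. The sketch below uses the formula documented for Excel's KURT function, which is assumed here to be comparable to what the screenshots report; the function names and both data sets are hypothetical.

```python
from statistics import mean, stdev


def sample_excess_kurtosis(data):
    """Sample excess kurtosis; a normal distribution scores about 0."""
    n = len(data)
    if n < 4:
        raise ValueError("at least four observations are required")
    x_bar, s = mean(data), stdev(data)
    fourth = sum(((x - x_bar) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


def kurtosis_label(k):
    """Map the sign of the excess kurtosis to the terms used in this article."""
    if k > 0:
        return "Leptokurtic"
    if k < 0:
        return "Platykurtic"
    return "Mesokurtic"


flat = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # evenly spread, no peak
peaked = [0, 5, 5, 5, 5, 5, 5, 5, 5, 10]    # tall center, two far outliers

for name, values in [("flat", flat), ("peaked", peaked)]:
    print(name, kurtosis_label(sample_excess_kurtosis(values)))
```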

Pearson 2

Karl Pearson is credited with developing the formula below to measure the Coefficient of Skewness. The formula compares the difference between the sample mean and the sample median with the standard deviation of the same distribution.
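
Pearson's second coefficient is commonly written as Sk2 = 3 * (mean - median) / s. A minimal sketch of that calculation follows, assuming s is the sample standard deviation; the function name and the example data are hypothetical.

```python
from statistics import mean, median, stdev


def pearson_second_skewness(data):
    """Pearson's second coefficient of skewness: 3 * (mean - median) / s."""
    s = stdev(data)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean(data) - median(data)) / s


# Hypothetical observations: the large value pulls the mean above the median,
# so the coefficient is positive (right skew).
example = [1, 2, 3, 4, 10]
print(round(pearson_second_skewness(example), 3))
```

Because this coefficient depends only on the mean, median, and standard deviation, it is easy to compute by hand, but it will generally not match the Fisher-Pearson value for the same data.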