The following are example questions that will test your knowledge of the concepts within the DEFINE phase of a DMAIC and DFSS Six Sigma project.

**1) The customer specifications for a process, referred to as the Voice of the Customer, will provide which of the following. There may be more than one correct answer**.

A) Upper Control Limit - UCL

B) Target value within the specification limits

C) Upper Specification Limit - USL

D) Lower Specification Limit - LSL

E) Lower Control Limit - LCL

Answers: B,C,D

Choices A and E are functions of the process. The goal is to have the Voice of the Process (VOP) perform within the Voice of the Customer (VOC) specs. You may get one, two, or all three of the customer specs and they may also provide attribute information, and more.

**2) Which of the following are characteristics of the inputs to a process**.

A) these are referred to as the x variables

B) these are the "causes" that create the effect

C) these are the variables the team seeks to monitor

D) A process output is a function of its inputs

Answer: A,B,D

C is a characteristic of the effect, or y-variable. The team wants to CONTROL the inputs and MONITOR the outputs.

**3) Inferential statistics is**

A) statistics derived from analyzing population

B) inferring population parameters from samples of data

C) inferring sample statistics from population parameters

Answer: B

**4) The Weighted Average Cost of Capital is commonly used to evaluate capital projects. The WACC is also known by which other terms? There may be more than one answer**.

A) Cost of Capital

B) Hurdle Rate

C) Discount Rate

D) Interest Rate

Answer: A, B, C are all commonly used and interchangeable terms with WACC.

Projects should exceed the WACC which is the cost of raising funds for a capital funding project according to your firm's targeted capital structure.

See the Financial Savings module for more information on capturing savings for your Six Sigma project.

**5) Pareto charts are very commonly used in Six Sigma projects, often in many phases and throughout all facets of a business. Which best describes the Pareto Chart**.

A) A timeline indicating the progression of the project

B) A risk analysis of input variables

C) A chart used to plot occurrences of categorical data

D) A trend analysis chart to predict future response

Answer: C

A Pareto chart is usually a vertical bar chart with occurrences on the y-axis and the categories along the x-axis. The bars are usually sorted with the highest or most frequent category on the left side, going right in descending order. At a high level, the Pareto Chart is looking to quickly identify the 20% of the causes that are responsible for about 80% of the defects.

**6) The tool used to ensure that all affected people are involved and engaged is the**

A) Stakeholder Analysis

B) Gantt Chart

C) Project Contract

D) Prioritization Matrix

Answer: A

A Stakeholder Analysis is similar to a "Communications Plan" that is used to measure the level of resistance of stakeholders.

**7) The output of the Six Sigma methodology is ultimately intended to:**

A) increase profits

B) increase sales

C) reduce defects

D) reduce set-up times and wastes

Answer: A.

Cash flow could also be an intention but the others are leading indicators to the ultimate goal of making money. Increasing sales isn't always good, only if it adds more profit.

**8) The tool used to organize information about complex or unfamiliar topics for which the team may have little knowledge is**

A) Prioritization Matrix

B) Gantt Chart

C) Affinity Diagram

D) CTQ Diagram

Answer: C

The Affinity Diagram is the simplest brainstorming exercise that visually organizes ideas and like concepts. It is easy to use and can get the team moving forward together by bringing out each person's input, thoughts, and questions.

See the Affinity Diagram module for more information.

**9) In which type of distribution of data are all three measures of central tendency approximately the same**.

A) Uniform

B) Normal

C) Binomial

D) Poisson

Answer: B

**10) Using the following data set find the mean**.

**{1, 3, 8, 3, 7, 11, 8, 3, 9, 10}**

A) 6

B) 7.5

C) 3

D) 6.3

Answer: D

**11) Using the following data set find the median**.

**{1, 3, 8, 3, 7, 11, 8, 3, 9, 10}**

A) 6

B) 7.5

C) 3

D) 6.3

Answer: B

**12) Using the following data set find the mode**.

**{1, 3, 8, 3, 7, 11, 8, 3, 9, 10}**

A) 6

B) 7.5

C) 3

D) 6.3

Answer: C
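Questions 10 through 12 can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
import statistics

# Data set shared by questions 10, 11, and 12
data = [1, 3, 8, 3, 7, 11, 8, 3, 9, 10]

mean = statistics.mean(data)      # sum of values divided by the count
median = statistics.median(data)  # average of the two middle values when the count is even
mode = statistics.mode(data)      # most frequently occurring value

print(mean, median, mode)
```

Sorting the data first ({1, 3, 3, 3, 7, 8, 8, 9, 10, 11}) makes the median and mode easy to confirm by hand.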

**13) A tool used to analyze the Voice of the Customer (customer requirements) is called:**

A) SPC Charts

B) Gantt Chart

C) Loading and Takt Time Modeling

D) Kano Model

Answer: D

See the Kano Model module for more information.

**14) Given the following findings on the number of defects that were produced among 10 machines, which is the proper category listing from left to right when creating a Pareto Chart**.

Machine 4: 98 defects

Machine 9: 65 defects

Machine 3: 76 defects

Machine 7: 42 defects

Others : 45 defects

A) 4,3,9,Others,7

B) Others,4,9,3,7

C) 9,4,7,Others,3

D) 4,3,9,7,Others

Answer: D

The number of categories shown can often be adjusted, but as a general rule show enough categories on the Pareto Chart to represent at least 80% of the defects, from highest to lowest going left to right, with the "Others" amount shown on the far right.

**15) Sometimes the Problem Statement and scope of a Project Contract will get refined upon early findings of the Six Sigma team in the Define phase. Which of these tools can help create a more accurate Project Contract:**

A) SPC

B) MSA

C) SIPOC

D) Scrap and Rework information (pcs, cost by operation, etc)

Answer: C and D

This information can help assess whether the scope is correct, the baseline is accurate and target is achievable. The sooner the better to understand any changes to the Project Contract.

**16) What is the future value given that the projected interest rate is 6% annually for 10 yrs? At the end of each month a payment of $100 will be deposited. There was an initial deposit (or present value) of $2,000.**

A) $4,900

B) $126,663

C) $20,027

D) $200,294

Answer: C

Ensure the units are consistent. Convert the 6%/yr to a monthly rate, which is 0.5%. There are 120 periods over 10 years. The payment made at the END of each month is $100. Click here to learn more about this topic.
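As a rough check of question 16, the sketch below combines the compound growth of the initial deposit with the future value of an ordinary annuity (payments at the end of each period). The function name `future_value` and its parameters are hypothetical and chosen only for illustration.

```python
def future_value(present_value, payment, annual_rate, years, periods_per_year=12):
    """Future value of a lump sum plus end-of-period payments (ordinary annuity)."""
    i = annual_rate / periods_per_year   # periodic rate: 0.06 / 12 = 0.005
    n = years * periods_per_year         # number of periods: 10 * 12 = 120
    growth = (1 + i) ** n                # compound growth factor over all periods
    lump_sum_part = present_value * growth
    annuity_part = payment * (growth - 1) / i
    return lump_sum_part + annuity_part


fv = future_value(present_value=2000, payment=100, annual_rate=0.06, years=10)
print(f"${fv:,.0f}")
```

The result is within a dollar of choice C, $20,027.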

**17) Statistics that summarize a population are**

A) inferential statistics

B) nominal statistics

C) ordinal statistics

D) descriptive statistics

Answer: D

**18) Which type of diagram is used to show the complexity of a problem by showing all combinations or outcomes**

A) Tree

B) Pareto

C) Scatter

D) Fishbone

Answer: A

**19) Which can be used to find and quantify a source of a problem**

A) Pareto Chart

B) Stem-Leaf Plot

C) Kano Model

D) Fishbone Diagram

Answer: A

The Pareto Chart is probably one of the most commonly used tools to provide quick insight into a problem.

**20) As part of a Six Sigma project, a heat treatment facility learned that they could reduce the temperature to heat treat their customers' parts by 15 °F and still maintain the necessary hardness and tensile strength required. For every 1 °F of reduced temperature, the company estimates a savings of $2,000/month in reduced natural gas. What are the annual savings for this improvement?**

A) $30,000/year

B) $15,000/year

C) $360,000/year

D) $36,000/year

Answer: C

The savings are $24,000/year ($2,000/month * 12 months) for every 1 °F. Multiply $24,000 by 15 °F = $360,000/yr. The other benefits are a reduced carbon footprint, and this can help a company's ISO 14001 program as well.

The following are example questions that will test your knowledge of the concepts within the MEASURE phase of a DMAIC Six Sigma project.

**1) If more than 25% of the points on an R-chart are equal to zero, then this is most likely a problem with what part of the measurement system?**

A) Stability

B) Repeatability

C) Reproducibility

D) Resolution

Answer: D

The measurement system should record data at a resolution ten times finer than the resolution used in the project metric. If you are using feet as the unit of measurement for your metric, then take all your measurements in inches (this is 1/12 resolution), which satisfies the 10-bucket rule.

If you are using a metric measured in X.XX (hundredths), then record all your measurements in X.XXX (thousandths)

What else may indicate poor gage resolution? Plateaus (flat spots) of the data on the r-chart and if r-bar is too small

**2) Repeatability is variation "within", and reproducibility is the variation "among" or "between all"**.

A) True

B) False

Answer: True

Repeatability describes the variation within one appraiser's own assessments across multiple assessments. It answers the question of how repeatable his or her answers are each time the same thing is assessed.

**3) Sampling taken of the first 100 patients in a hospital directory listed by alphabetical order is known as**

A) systematic random sampling

B) random sampling

C) stratified random sampling

D) convenience sampling

Answer: D

**4) A sampling strategy that employs a system such as using every 20th part that comes from a machine is an example of**

A) systematic random sampling

B) random sampling

C) stratified random sampling

D) convenience sampling

Answer: A

**5) A measurement of lumens, decibels, temperature, and ohms, are examples of**

A) continuous data

B) attribute data

C) locational data

D) discrete data

Answer: A

**6) Which subjective analysis tool is used to visualize the relationships between many inputs, x's, and an output, Y. This tool focuses on the causes, not the symptoms**.

A) Cause and Effect (Prioritization) Matrix

B) Fishbone (Ishikawa) Diagram

C) FMEA

D) Run chart

Answer: B

The Fishbone Diagram is usually the first in a line of subjective tools to help reduce the numerous trivial inputs to the vital few inputs that cause the effect (Y). The next tool can be the C/E (Prioritization) Matrix or the FMEA.

**7) Which of the following yield metrics does not account for rework or hidden factory loss**.

A) Rolled Throughput Yield

B) RTY

C) Normalized Yield

D) TPY

E) Final Yield

Answer: E

Final Yield (FY) does not account for the hidden factory or rework. It is simply the final number of acceptable pieces at the end divided by the quantity started in a process. It does not tell you how many were reworked at each step of the process.

**8) If there are 11 orders, and each order has 2 items, and there are 3 opportunities for a defect (such as incorrect item, wrong price, and damaged), then how many total opportunities are there for a defect?**

A) 22

B) 11

C) 66

D) 33

Answer: C

11 orders * 2 items per order * 3 opportunities per item = 66 total opportunities.

**9) Which of the following are types of errors in conducting a survey or other methods of gathering the Voice of the Customer (VOC)?**

A) Coverage error

B) Nonrespondent error

C) Sampling Error

D) Measurement Error

E) All of the above

Answer: All of the above

**10) What is the probability that Z is greater than or equal to -1.96 and smaller than or equal to -1.4. Z is a standard normal random variable?**

A) 0.9821

B) 0.0555

C) 0.0558

D) 0.98

Answer: C

Reference the Z-table to help solve.
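If a Z-table is not at hand, the same area can be approximated with `statistics.NormalDist` from the Python standard library. This is only a sketch of the table lookup: the probability of an interval is the difference of the two cumulative areas.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # defaults to mean 0 and standard deviation 1

# P(-1.96 <= Z <= -1.4) = area to the left of -1.4 minus area to the left of -1.96
p = standard_normal.cdf(-1.4) - standard_normal.cdf(-1.96)

print(round(p, 4))
```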

**11) Given a normal bell curve (the empirical rule) which interval contains 68% of the measurements?**

A) +/- 1 sigma

B) +/- 2 sigma

C) +/- 3 sigma

D) None of the above

Answer: A

Reference the normal distribution page.

**12) Which measurement scale measures by rank only and does not contain any precise measurements?**

A) Nominal

B) Interval

C) Ratio

D) Ordinal

Answer: D

**13) A person flips a coin six times and has landed heads each time. What is the probability that heads is the result on the next flip? Assume the coin is fair**.

A) 0.5

B) 50%

C) 87%

D) A and B

Answer: D

The probability is the same each flip. Each flip has a 50/50 chance for each result. Both A and B are the same answer just expressed differently on paper.

**14) Which of the following is not a component of measurement system error**.

A) Linearity

B) Resolution

C) Calibration

D) Reproducibility

Answer: C

Reference the Measurement Systems Analysis page.

**15) Which of the following measurement scale applies if measuring the circumference of a cylinder in millimeters**.

A) Interval

B) Nominal

C) Ordinal

D) Ratio

Answer: D

There is a defined absolute zero starting point in this case and it is continuous data that can have an infinite number of values. Reference the Data Classification page.

**16) Which of the following devices are examples of attribute measurement devices**.

A) Plug gages

B) Calipers

C) Dial indicators

D) Limit length gages

E) Micrometers

Answer: A&D

The other three choices are variable gages. Variable gages provide more information than attribute gages and are preferred for Six Sigma projects.

**17) Which of the following are considered graphical methods. There may be more than one correct answer**.

B) Histogram

F) Trend Analysis

G) Line Chart

H) Pie Chart

I) Radar Chart

J) Bubble Chart

K) Radar Chart

Answer: All of the above

(and there are more)

**18) What does the line that divides the box in a box plot represent**.

A) Median of the data set

B) Mode of the data set

C) Mean of the data set

D) Lower quartile of the data set

E) An outlier

Answer: A

Reference the Box Plot page.

**19) The probability density function (PDF) of a continuous random variable is known to be 0.08x, where x is valid from 0 to 5. Find the probability (cumulative distribution function) of x being less or equal to 2.3**.

A) 0.08

B) 0.184

C) 0.092

D) 0.2116

Answer: D

Reference the CDF page.
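The answer to question 19 follows from integrating the PDF from the lower bound of 0 up to 2.3. A worked version of that step:

```latex
F(2.3) = \int_{0}^{2.3} 0.08\,x \, dx
       = \left[ 0.04\,x^{2} \right]_{0}^{2.3}
       = 0.04 \times 5.29
       = 0.2116
```

As a sanity check, the same integral taken over the full range from 0 to 5 gives 0.04 * 25 = 1, as required of a valid PDF.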

**20) As a graphical method, Stem and Leaf Plots have grown in popularity since they easily handle large amounts of data and show each data point**.

TRUE

FALSE

Answer: False

Reference the Stem and Leaf Plot page.

**21) The following information is provided. Find the Gage Error (GRR%) for the variable gage study. The average range is 1.6 and the value for the distribution of the average range is 1.19. The tolerance is 20%.**

A) 4.33%

B) 20.00%

C) 1.600%

D) 34.62%

Answer: D

The GRR = (5.15 * 1.6) / 1.19 = 6.924

The GRR% = (6.924 * 100) / 20 = 34.62%, which exceeds the limits; thus the gage failed and must be replaced.
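The two-step calculation for question 21 can be sketched as follows. The variable names are hypothetical; the 5.15 multiplier and the 1.19 divisor are taken directly from the worked answer.

```python
SPREAD_MULTIPLIER = 5.15  # multiplier used in the worked answer
average_range = 1.6       # average range from the gage study
range_constant = 1.19     # "value for the distribution of the average range"
tolerance = 20            # tolerance used as the divisor in the worked answer

grr = SPREAD_MULTIPLIER * average_range / range_constant
grr_percent = grr * 100 / tolerance

print(round(grr, 3), round(grr_percent, 2))
```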

**22) The following information is provided for a particular failure mode on a FMEA. Find the Risk Priority Number (RPN)**

**Severity: 4, Occurrence: 5, Detection: 3**

A) 600

B) 20

C) 60

D) Not enough information

Answer: C

The RPN is the product of the three values (4 * 5 * 3 = 60). The higher the RPN, the higher the risk that the failure mode presents to the customer, and the higher its priority for mistake-proofing. See the FMEA for more details.

**23) Match the following shapes with their representation in a flowchart**

Diamond

Rectangle

Oval

Parallelogram

A) Decision

B) Start or Termination point

C) Process

D) Input / Output

Answer: A diamond represents a Decision. A rectangle represents a Process. An oval represents a Start or End point. A parallelogram represents an Input or Output.

**24) When the tables are used correctly, the Z-score is an indication of:**

A) the area under the curve outside of the control limits

B) mean of the data

C) proportion of the data between X and the mean

D) the area under the curve outside of the specification limits

Answer: C

**25) Which tool is best suited to find rework loops and non-value added steps.**

A) Process Map

B) SIPOC

C) Affinity Diagram

D) Ishikawa Diagram

Answer: A

**26) Given a perfectly normal distribution, what is the probability of z>0.**

A) 0.25

B) 0.50

C) 50%

D) 0.00

Answer: B & C which are the same

**27) Which data classification is ranking customer satisfaction on a scale of 1-5.**

A) Ratio

B) Interval

C) Binary

D) Nominal

Answer: B

**28) Which data classification is a measurement of Centigrade or Fahrenheit?**

A) Ratio

B) Interval

C) Binary

D) Nominal

Answer: B

Ratio is not correct since zero degrees does not mean the absence of temperature.

**29) Which data classification is for measuring lengths or weight?**

A) Ratio

B) Interval

C) Binary

D) Nominal

Answer: A

See the Data Classification page for more information on four measurement scales: Nominal, Ordinal, Interval, and Ratio

**30) Which measurement scales are categorical?**

A) Ratio

B) Interval

C) Ordinal

D) Nominal

Answer: C & D

See the Data Classification page for more information on four measurement scales: Nominal, Ordinal, Interval, and Ratio

**31) Which measurement scales are numerical?**

A) Ratio

B) Interval

C) Ordinal

D) Nominal

Answer: A & B

See the Data Classification page for more information on four measurement scales: Nominal, Ordinal, Interval, and Ratio

**32) The amount of time needed to manufacture each car on an assembly line is best described as _____ type of data.**

A) Ratio

B) Interval

C) Ordinal

D) Nominal

Answer: A

**33) A given set of data has a mode of 15.2, median 15.7, and mean of 17.1, which best describes the distribution:**

A) Right skewed

B) Bi modal

C) Left Skewed

D) Normal

Answer: A

Skewness is a measurement of symmetry and kurtosis is a measurement of "peakedness or flatness" of a distribution.

In this example, the Sk value is >0 (for a right-skewed distribution)

**34) Multi-vari charts are used most often to analyze which type(s) of variation?**

A) positional

B) cyclical

C) temporal

D) All of the above

Answer: All of the above

Click here to learn more about multi-vari charts.

**35) Which statement(s) correctly describe the statistical term Degrees of Freedom**

A) the number of independent pieces of information available to estimate a statistic

B) number of values in the final calculation of a statistic that are free to vary

C) is used to calculate the number of measurements needed to make an unbiased estimate of a statistic.

D) dF = n-1 when using Paired t-test or One Sample t-test

Answer: All of the above

Click here to learn more about Degrees of Freedom

**36) If a normal distribution has a mean of 35 and a standard deviation of 10, 95% of the distribution can be found between which two values?**

A) 0, 70

B) 15, 55

C) 25, 45

D) 45, 105

Answer: B. About 95% of the distribution (area under the curve) lies within 1.96 standard deviations of the mean, which can be estimated as 2. Therefore 35-20 = 15 is the lower value and 35+20 = 55 is the upper value.

**37) Which of the following has the most significant impact on the shape of a normal distribution**

A) Standard deviation or variance

B) Median

C) Mode

D) Mean

Answer: A

**38) If a normal distribution has a mean of 35 and a variance of 25, 68% of the distribution can be found between which two values?**

A) 30, 40

B) 25, 45

C) 0, 70

D) 20, 50

Answer: A. 68% of the distribution (area under the curve) is about +/- 1 standard deviation from the mean. The standard deviation is the square root of the variance and therefore = 5. Therefore 35-5 = 30 is the lower value and 35+5 = 40 is the upper value.

**39) A distribution of measurements for the length of widgets was found to have a mean of 50.00mm and a standard deviation of 1.50mm. Approximately what percent of measurements are between 47.00mm and 53.00mm?**

A) 100%

B) 68%

C) 95%

D) 99%

Answer: C. The measurements of 47.00mm and 53.00mm are two standard deviations away from the mean of 50.00mm. Therefore about 95% of the values recorded are between 47.00mm and 53.00mm.
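The empirical-rule estimate in question 39 can be compared against the exact normal area. The sketch below assumes the widget lengths are normally distributed, which the question implies but does not state outright.

```python
from statistics import NormalDist

widget_length = NormalDist(mu=50.00, sigma=1.50)  # millimeters

# Area under the curve between 47.00 mm and 53.00 mm (two standard deviations each side)
p_within = widget_length.cdf(53.00) - widget_length.cdf(47.00)

print(f"{p_within:.1%}")
```

The exact area is slightly above 95% because the empirical rule rounds 1.96 standard deviations up to 2.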

The following are example questions that will test your knowledge of the concepts within the ANALYZE phase of a DMAIC Six Sigma project.

**1) Type I error and Type II error are respectively known as**

A) P-value risk

B) Confidence Interval and Power risk

C) Alpha and Beta risk

D) Consumers Risk

Answer: C

Alpha risk = Type I Error = Significance Level = Producers Risk = False Positive. This is when an EFFECT is detected that does not exist.

Beta risk = Type II Error = Consumers Risk = False Negative. This is when an EFFECT is not detected when it does exist.

**2) While ANOVA is "Analysis of Variance", it is used for which reason?**

A) Testing variances of two populations

B) Testing variance of two or more populations

C) Testing capability of process

D) Testing means of a population

Answer: D

ANOVA assumes homogeneity of variance, which means the variances within each factor level are equal. It is usually used to test more than two means.

**3) What other assumption(s) are required to use ANOVA:**

A) Measurement system is linear and reproducible

B) Data sets are normally distributed

C) Data sets have equal means

D) Data sets have equal variances

Answer: B,D

Choice A is already confirmed, you should have an adequate measurement system as verified in the MEASURE phase.

Choice C is the reason for using ANOVA, which is a test of the equality of the means (can be used like the student t-test to test the equality of two means).

**4) Confidence Intervals are not always symmetrical, but what is true about confidence intervals when the sample size changes**.

A) As sample size, n, decreases the confidence interval spread increases.

B) As sample size, n, increases the confidence interval spread decreases.

C) The sample size, n, does not affect the confidence interval.

Answer: A,B.

The larger the width, or spread, of the confidence interval the weaker the estimate of the mean or variance. Logically, the more data and samples taken within a population, the more confidence you will have in estimating the population mean or variance. As n approaches infinity, the sample average approaches the population mean.

**5) The standard deviation of a standard normal distribution is**

A) always equal to one

B) is always equal to zero

C) can be any positive value

D) can be any negative value

Answer: A

The standard deviation of a __standard__ normal distribution is always equal to one. The keyword is *standard*.

**6) Which of the following are characteristics of the normal probability distribution?**

A) 89% of the time the random variable assumes a value within plus and minus 2 standard deviation of its mean

B) Symmetry

C) The total area under the curve is always equal to 1

D) 99.72% of the time the random variable assumes a value within +/- 3 standard deviation of its mean

Answers: B,C,D

**7) Which term measures only a linear relationship between two variables, is denoted by an "r" value, and always ranges from -1.0 to +1.0? As the value approaches 0 there is less linear correlation (or dependence) between the variables**.

A) Coefficient of Determination

B) Covariance

C) Standard Deviation

D) Coefficient of Correlation

Answer: D

The most common incorrect answer is A. The Coefficient of Determination ranges from 0 to 1 (0%-100%) and is represented by r-squared. It is the proportion of variability of the dependent variable (Y) accounted for or explained by the independent variable (x), and it equals the coefficient of correlation value squared.

Click here to learn more about Correlation.

**8) A sample of 5 widgets from a population of 30 widgets is taken without replacement for review. Determine the probability of finding 2 defective widgets when the population is known to have 14 defective widgets and 16 non-defective widgets**.

A) 20.5%

B) 34.8%

C) 48.2%

D) 35.8%

Answer: D

Click here to learn more about the hypergeometric distribution.
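For sampling without replacement, the hypergeometric probability is a ratio of combinations. The sketch below uses `math.comb`; the helper name `hypergeom_pmf` and its argument names are illustrative assumptions, not part of any referenced module.

```python
from math import comb


def hypergeom_pmf(k, population, defectives, sample_size):
    """P(exactly k defectives in a sample drawn without replacement)."""
    good = population - defectives
    ways_to_pick_defectives = comb(defectives, k)
    ways_to_pick_good = comb(good, sample_size - k)
    all_possible_samples = comb(population, sample_size)
    return ways_to_pick_defectives * ways_to_pick_good / all_possible_samples


p = hypergeom_pmf(k=2, population=30, defectives=14, sample_size=5)
print(f"{p:.1%}")
```

Here the counts are C(14,2) = 91, C(16,3) = 560, and C(30,5) = 142,506, so the probability is 50,960 / 142,506.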

**9) What is the Confidence Interval for a population mean given the following information and using a 95% confidence level**.

Alpha risk = Level of Significance = 1 - Confidence Level = 0.05

Population Standard Deviation = 6.48

Sample Size = 27

Sample Mean (x-bar) = 50

A) 50 to 52.44

B) 47.56 to 52.44

C) 47.56 to 50

D) 47.65 to 52.44

Answer: B

The interval is 50 +/- 2.44, or 47.56 to 52.44. The other point of this question is to draw attention to how close B and D appear. Carefully select the correct response. It is easy to make quick mental assumptions without realizing it and selecting the wrong answer when you actually calculated it correctly.
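Because the population standard deviation is given, the interval uses a z-critical value. A minimal sketch of the calculation, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

confidence_level = 0.95
population_sd = 6.48
sample_size = 27
sample_mean = 50

alpha = 1 - confidence_level
z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value, about 1.96
standard_error = population_sd / sqrt(sample_size)
margin_of_error = z_critical * standard_error

lower = sample_mean - margin_of_error
upper = sample_mean + margin_of_error
print(round(lower, 2), round(upper, 2))
```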

**10) What is the standard deviation for 20 flips of a fair coin? Assume the binomial distribution**.

A) 0.5

B) 5

C) 2.236

D) 10

Answer: C

The mean is 10. The answer is not 5 but rather the square root of 5.
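For a binomial distribution the mean is n * p and the variance is n * p * (1 - p). The standard deviation is the square root of the variance, which is the step that is easy to forget:

```python
from math import sqrt

n = 20   # number of coin flips
p = 0.5  # probability of heads for a fair coin

mean = n * p
variance = n * p * (1 - p)
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 3))
```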

**11) Scores are collected for 10 students that took the same competency test before and after training. Which t-test should be applied to compare before and after average scores?**

A) 1-sample t test

B) 2-sample t test

C) Paired sample t test

Answer: C

**12) For the question above what is the correct null hypothesis?**

A) Mean_{BEFORE} = Mean_{AFTER}

B) Mean_{BEFORE} + Mean_{AFTER} = 0

Answer: A

**13) A comparison on the average braking response time between two drivers when a light turns red is being conducted. You have recorded the times and determined the data is normal. Which test would you use?**

A) 1-sample T test

B) 2-sample T test

C) Paired sample T test

Answer: B

**14) Which of these are one-tailed tests and which are two-tailed tests?**

A) Ho: Mean A = Mean B

B) Ho: Mean A < Mean B

C) Ho: Mean A > Mean B

Answer: A is two tailed. B & C are one tailed.

**15) Assuming a normal distribution, the mean starting salary of personal trainers is found to be $40,000 with a standard deviation of $5,000. Determine the probability that a randomly selected personal trainer will get a starting salary of at least $47,500.**

A) 0.05

B) 0.0668

C) 0.668

D) 0.4332

Answer: B

The Z-score is (47,500 - 40,000) / 5,000 = 1.5, which corresponds to an area of 0.4332 between the mean and Z. Since we are looking at only one tail of the curve, the answer is 0.5 - 0.4332 = 0.0668.
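The same upper-tail area can be found without a table. This sketch uses the mean and standard deviation given in the question and takes one minus the cumulative area below $47,500.

```python
from statistics import NormalDist

salaries = NormalDist(mu=40_000, sigma=5_000)

z_score = (47_500 - salaries.mean) / salaries.stdev
p_at_least = 1 - salaries.cdf(47_500)  # upper-tail area beyond the threshold

print(z_score, round(p_at_least, 4))
```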

**16) Which is the most applicable t-test when testing two sample means when their respective population standard deviations are unknown but considered equal. The data is recorded in pairs and each pair has a difference, d. The degrees of freedom are n-1 where n = number of samples**.

The following are assumptions:

Normal Distribution

Two dependent samples

Always two-tailed test

Sd = standard deviation of the differences of all samples

A) One sample t-test

B) Paired t-test

C) Two sample t-test

D) Anova Paired t-test

Answer: B

**17) Which test is the best choice to use when comparing medians of non-parametric data?**

A) Kruskal-Wallis

B) Mood's Median

C) Mann-Whitney

D) Wilcoxon

Answer: D.

Click here to review hypothesis test flowcharts.

**18) A hemming operation has an OEE of 86.4% and Takt Time is 4s/pc and a known defective rate of 2.0%. If a sample of five parts are selected, what is the probability that the sample contains exactly two defective parts?**

A) 0.02

B) 0.985

C) 0.022

D) 0.0038

Answer: D

The OEE value and Takt Time are meaningless for this problem. It is important for Six Sigma project managers to sort through all the available information and use the correct inputs as well as selecting the correct formulas.

Use:

n= 5

P = 0.02

Solve for probability that x = 2 using the binomial equation.
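The binomial equation referred to above is P(x) = C(n, x) * p^x * (1 - p)^(n - x). A short sketch with the values from this problem; the helper name `binomial_pmf` is an illustrative assumption.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x defectives in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


prob_two_defective = binomial_pmf(x=2, n=5, p=0.02)
print(round(prob_two_defective, 4))
```

Note that neither the OEE nor the Takt Time appears anywhere in the calculation.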

**19) A statistical test used to test a hypothesis of independence of two attribute variables is:**

A) Paired t-test

B) Mood's Median test

C) F test

D) Chi-square test

Answer: D

**20) A parametric test used to test a hypothesis of two variances?**

A) Paired t test

B) Mood's Median test

C) F test

D) Chi-square test

Answer: C

**21) What is the Coefficient of Correlation between these two sets of data. Each data point is plotted for a monthly period for 12 months?**

Marketing costs: $6,421, $6,328, $7,856, $5,467, $4,265, $5,012, $8,023, $7,625, $4,569, $2,548, $6,985, $6,849

Cereal Sales: $55,462, $53,211, $68,021, $45,698, $40,120, $43,296, $70,231, $66,849, $42,065, $31,456, $61,365, $58,312

A) r = 0.9851 with strong correlation

B) r = 0.0149 with weak correlation

C) Not enough information to determine

D) r = -0.0149 with strong negative correlation

Answer: A
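The coefficient can be computed directly from the twelve monthly pairs. The sketch below implements the Pearson formula by hand (sum of cross-deviations divided by the square root of the product of the squared deviations) so that it does not depend on any particular library version; the function name `pearson_r` is illustrative.

```python
from math import sqrt

marketing = [6421, 6328, 7856, 5467, 4265, 5012, 8023, 7625, 4569, 2548, 6985, 6849]
sales = [55462, 53211, 68021, 45698, 40120, 43296, 70231, 66849, 42065, 31456, 61365, 58312]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(marketing, sales)
print(round(r, 4))
```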

**22) When calculating the Chi-Square value for attribute data there are two variables needed to compare proportions. The observed values and expected values are entered into contingency tables. How are the expected values calculated from a contingency table?**

A) Grand total / (Row total * column total)

B) Row total - column total

C) (Row total * Column total) / Grand total

D) None

Answer: C
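To illustrate the formula in choice C, the sketch below computes expected counts for a small contingency table. The observed counts are made up purely for illustration and do not come from any question on this page.

```python
# Hypothetical 2x2 table of observed counts (rows and columns are arbitrary categories)
observed = [
    [10, 20],
    [30, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total
expected = [
    [r * c / grand_total for c in column_totals]
    for r in row_totals
]

print(expected)
```

The expected counts keep the same row and column totals as the observed table, which is a quick way to check the arithmetic.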

**23) Which test is preferred when comparing more than two means?**

A) ANOVA

B) t-test

C) Paired t-test

D) Chi-Square

Answer: A

The t-test is generally used to compare two means; ANOVA is used when comparing more than two.

**24) Which of these is a non-parametric test that uses the MEDIAN as the measure of central tendency to describe the data, where the data is ranked to conduct a one-way ANOVA?**

A) Friedman Test

B) Levene's Test

C) t-test

D) Kruskal-Wallis

Answer: D

The t-test is a parametric test. The other three are non-parametric tests.

**25) Which of these non-parametric tests is not as robust as the Mood's Median test when analyzing outliers?**

A) Friedman Test

B) Levene's Test

C) Kruskal-Wallis

D) Mann-Whitney

Answer: C

**26) Which of these non-parametric tests is used to determine if two independent sets of data are from the same population, and is the option to use when the t-test cannot be used since the data does not meet normality assumptions?**

A) Friedman Test

B) Levene's Test

C) Kruskal-Wallis

D) Mann-Whitney

Answer: D

**27) Multi-Vari charts analyze which types of variation?**

A) Cyclical

B) Temporal

C) Positional

D) All of the above

Answer: D

**28) What is the appropriate parametric test for means assuming you have two samples of <30 data points with unknown variances that are assumed to be equal?**

A) Pooled T-test

B) F Test

C) Non-pooled T-test

D) ANOVA

Answer: A

**29) What is the appropriate parametric test for variances assuming you have one sample?**

A) Chi-square

B) F Test

C) Non-pooled T-test

D) ANOVA

Answer: A

**30) A sample of 5 parts is drawn without replacement from a total population of 30 parts. Determine the probability of getting exactly 2 defective parts. The population is known to have 14 defective parts.**

A) 35.76%

B) 30.65%

C) 43.30%

D) None of the above

Answer: A

The hypergeometric distribution PMF is applied.

**31) Determine the Degrees of Freedom for a two sample t-test when one sample size is 20 and the other sample size is 25. The alpha-risk is 5%.**

A) 24

B) 45

C) 43

D) None of the above

Answer: C

dF = n_{1}+n_{2}-2 = 20+25-2 = 43. The alpha-risk is not relevant to this question. See the t-distribution page for more information.

**32) Which statement(s) are correct about Correlation and Covariance.**

A) Covariance values range from +1.0 to -1.0

B) A positive value for Covariance indicates an inverse relationship between x and y

C) A positive value for Correlation indicates that Covariance could be negative

D) A value of 0 for Covariance means there is a perfect relationship between x and y

Answer: None are correct

See the Covariance and Correlation pages for more information.

**33) Based on the Scatter Diagram below what conclusions can be drawn about the linear Correlation. Assume the values increase on the x axis to the right and the y-axis to the top of the chart.**

A) Weak negative correlation

B) Correlation value is likely -1.0 or lower

C) Weak positive correlation

D) Strong positive correlation

Answer: A

B is not correct since correlation values must be between -1.0 and 1.0. There is not strong evidence to suggest one "x" variable causes a direct impact on the "y" variable. Correlation, no matter how strong (positive or negative) it may appear, never implies causation. There could be other variables behind the one charted that could be a factor.

**34) A correlation analysis provides a value (correlation coefficient) for what type of relationship between two variables?**

A) Linear

B) Bimodal

C) Exponential

Answer: A

This analysis provides a linear relationship value but there may be a non-linear association of the two variables.

**35) Which statements are true about defining the shape of a distribution?**

A) Kurtosis and skewness are used to describe the shape of a distribution.

B) If the skewness coefficient is <0 the distribution is left-skewed.

C) As the skewness coefficient approaches 0 the distribution takes on the shape of a normal distribution.

Answer: A, B, C

And if Sk > 0 then this represents a right-skewed distribution. There are natural distributions that will be right or left skewed and will not meet the assumptions of normality.

**36) Statistical power is**

A) the likelihood of detecting an effect when an effect is present

B) 1 – β (beta-risk)

C) 1 - Level of Significance

D) the chance of rejecting the null hypothesis when the null hypothesis is false

Answer: A,B,D. Visit Power & Sample Size for more information.

**37) Which statement(s) are correct about linear correlation.**

A) If the p-value is less than the chosen alpha risk then there __is__ a statistically significant correlation.

B) If the r-critical value is less than the r-calculated value then reject the null and infer the alternative that there __is__ a statistically significant correlation.

C) A strong linear correlation implies that one "x" variable is the only variable affecting the response.

D) The Pearson Correlation Coefficient describes the nature of the relation with an equation.

Answer: A, B

Regardless of how strong a relationship appears there could be other variables involved.

Regression is used to develop an equation such as y=mx+b, which is a simple regression equation to describe the nature of the relationship.

A) The Margin of Error (ME) is the Standard Error (SE) multiplied by an adjustment value, which is a *z-critical* or *t-critical* value:

- ME = critical value * Standard Error (SE)

B) The ME is the entire width of a symmetric confidence interval.

C) The ME can be reduced by increasing the sample size.

D) Increase the chosen alpha-risk

Answer: A,C,D

Choice B is not correct since the ME is half the width of a symmetric confidence interval. Click here to learn more about Margin of Error.

Click here to get a ME calculator along with several other Templates and Calculators.
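The relationships in choices A, C, and D can be sketched numerically. The example below assumes a known population standard deviation (so a z-critical value applies) and uses hypothetical values for sigma and the sample sizes; `margin_of_error` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, alpha=0.05):
    """ME = z-critical * standard error for a two-sided interval (sigma known)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z_crit * sigma / sqrt(n)

me_base = margin_of_error(sigma=10, n=25)                  # hypothetical baseline
me_more_samples = margin_of_error(sigma=10, n=100)         # larger sample size
me_more_alpha = margin_of_error(sigma=10, n=25, alpha=0.10)  # higher alpha-risk

print(round(me_base, 2), round(me_more_samples, 2), round(me_more_alpha, 2))
```

Both the larger sample and the larger alpha-risk give a smaller ME than the baseline, and the full confidence interval is twice the ME wide.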

**39) The sampling distribution equals the relative frequency distribution when**

A) there are an infinite number of samples

B) >30 samples

C) <30 samples

Answer: A

**40) A Green Belt ran a statistical test and found that the p-value was >0.05. What statement below fits that result.**

A) There is not a difference or relationship with at least 95% confidence.

B) Left-Skewed Distribution

C) Reject the H_{O} and infer the H_{A}

D) There is a difference or relationship with at least 95% confidence

Answer: A. See this section on p-value to learn more.

**41) A Black Belt is comparing the expected test results from 23 students in the same class. Which of the following statistical test is most applicable?**

A) ANOVA

B) t-test

C) paired-t test

D) z-test

Answer: A

Analysis of Variance (ANOVA) evaluates the differences among means of >2 samples. ANOVA analysis determines whether at least two of the students have significantly different scores.

**42) True or False: The standard deviation of the sampling distribution of the mean is called the Standard Error of the Mean.**

Answer: True. It measures how much the sample means differ from each other.

**43) The Central Limit Theorem states that regardless of the population from which data is drawn, the sampling distribution of the mean when sufficiently large is:**

A) Unimodal Distribution

B) Left-Skewed Distribution

C) Right-Skewed Distribution

D) Normally Distributed

Answer: D. Click here to learn more about the Central Limit Theorem.

The following are example questions that will test your knowledge of the concepts within the IMPROVE phase of a DMAIC Six Sigma project.

**1) A four level, three factor, DOE is being conducted but due to time and budget constraints a half factorial is used. How many treatment combinations are there?**

A) 64

B) 16

C) 41

D) 32

Answer: D

Recall that the number of factors is the power to which the number of levels is raised. Four cubed is 64, and a half factorial means that half of the full factorial combinations will be used, which is 32.

**2) Which of the following are true of the Evolutionary Operations (EVOP):**

A) Used when process is not in control

B) Limited to two or less input variables

C) high experimental risk

D) Uses large sample sizes to detect small experimental differences

Answer: D

EVOP is evolutionary in that it learns from existing behaviors to predict future treatments to improve the response. It is normally used when a process is in statistical control.

**3) Waste Elimination is a key component of the IMPROVE phase and will involve cultural transformation and high assurance that team members and affected stakeholders are ready to change. What tools are commonly used (not all depending on the project) at this time**.

A) Stakeholder Analysis

B) Takt Time / Operator Balance Charts

C) SPC

D) Kanban / Pull Principles

E) Standard Work

F) Visual Management

G) Implementation of TPM

H) ANOVA

I) SMED

Answer: All choices EXCEPT C and H

**4) The goal of IMPROVE is to make a fundamental change or prove through trials that a fundamental change is possible by eliminating waste and determining the relationship of the key input variables that affect the outputs of the process.**

**When a process is in statistical control what are possible steps to improve it to a better desired performance level**.

A) Target and resolve the special causes

B) Minimize the common cause variation

C) Influence the customer to open up their specification limits

D) Open up the process control limits

Answer: B

The special causes must be eliminated to have a statistically controlled process. A customer may have unrealistic specifications, or specifications so tight that meeting them is not practical or financially viable. It may be possible to work with the customer and prove through testing and validation that the specifications can be opened or changed.

It is **not possible** to select the control limits as you would like them. They are set by a formula through the process itself and represent the Voice of the Process.

**5) Which of the following are characteristics of a Latin Square Design DOE**.

A) Permit analysis of main effects only

B) Number of columns, rows, and treatments must be equal

C) Interactions should not occur between the row and column factors

D) Generally require less experimentation to obtain main treatment results

E) All of the above

Answer: E

**6) Which of the following are assumptions about the errors for a mixture and a factorial experiment?**

A) Have common variance

B) Independent

C) Distributed with zero mean

D) All of the above

Answer: D

**7) In a Full Factorial experiment with 3 levels and 5 factors how many possible combinations are there**.

A) 243

B) 15

C) 30

D) None of the above

Answer: A

The number of possible combinations = levels^{factors} (3^{5}) which is 243.

**8) True or False: A Fractional Factorial experiment is a subset of combinations from a Full Factorial experiment**.

Answer: TRUE

**9) If a half fractional factorial experiment is determined to be most practical and economical where there are two levels and five factors, how many runs will be performed in the DOE**.

A) 16

B) 32

C) 5

D) None of the above

Answer: A

The number of possible combinations = levels^{factors}, so there are 2^{5} = 32 total combinations. Half of those combinations will be tested, which is equal to 16.
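The run-count arithmetic can be written as a small helper. This is only a sketch; `treatment_combinations` and its `fraction` parameter are illustrative names, and the function simply scales levels^{factors} by the fraction of the full factorial that is run.

```python
from fractions import Fraction

def treatment_combinations(levels, factors, fraction=Fraction(1)):
    """Runs in a factorial design: levels ** factors, scaled by the fraction run."""
    return int(levels ** factors * fraction)

full = treatment_combinations(2, 5)                    # full factorial
half = treatment_combinations(2, 5, Fraction(1, 2))    # half fractional factorial
print(full, half)  # 32 16
```

The same helper gives 243 for a full factorial with 3 levels and 5 factors, and 32 for a half factorial with 4 levels and 3 factors.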

**10) Which would be used to evaluate the significance of factors in a fractional or full factorial experiment?**

A) FMEA

B) Multi-vari charts

C) ANOVA

D) SPC

Answer: C

**11) What are some advantages of fractional factorial experiments?**

A) Less time involved than OFAT (one factor at a time)

B) Less time involved than Full Factorial

C) Additional precision due to hidden replication

D) All of the above

Answer: D

**12) Which of these statements is incorrect about factorial experiments?**

A) Response = output = dependent variable

B) Response = sum of process mean + variation about the mean

C) Factors = dependent variables

D) Variation about the mean is sum of factors + interactions + unexplained residuals (or experimental error)

Answer: C

Factors are independent variables and inputs that are part of the total variation about the mean.

**13) How many runs will there be given the information below for a Full Factorial DOE?**

2 Levels

1 Block

0 Center Points

3 Replicates

3 Factors

A) 6

B) 24

C) 30

D) 14

Answer: B
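One way to see why the answer is 24 is to enumerate the design. The sketch below assumes coded low/high settings of -1 and +1 and treats each replicate as a full repeat of every treatment combination; with one block and zero center points nothing else is added to the run count.

```python
from itertools import product

levels = (-1, +1)     # coded low/high settings (an assumed coding)
factors = 3
replicates = 3

# Every combination of settings across the three factors: 2**3 = 8
combos = list(product(levels, repeat=factors))

# Each treatment combination is run once per replicate
runs = [combo for combo in combos for _ in range(replicates)]
print(len(combos), len(runs))  # 8 24
```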

**14) Which is used to create a model of the effect on an output by the variation in two or more of the inputs?**

A) Linear Regression

B) Correlation Coefficient

C) Multiple Regression

D) Coefficient of Determination

Answer: C

**15) A set of data is analyzed and the following regression equation was generated:**

**Pieces produced = 8.0 + 0.122 (hours of human work)**

**Which statement(s) are true:**

A) pieces produced is the dependent variable, y, (aka. the output)

B) the hours of human work is the dependent variable

C) the independent variable, x, is the hours of human work

D) with 0 hours of human work, there are still 8 pieces produced

Answer: A,C, D

The amount of hours of human work is the 'x' or the input. This is the independent variable. The dependent variable is 'y' which means the amount of pieces produced depends on the amount of hours of human work.

Recall that y=mx+b. The slope of this formula is 0.122 and the y-intercept is 8.0.
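The fitted equation can be expressed as a function to make the roles of x and y concrete. This is a minimal sketch; the function name and the 100-hour input are hypothetical.

```python
def pieces_produced(hours_of_human_work):
    """Fitted line from the question: y-intercept 8.0, slope 0.122."""
    return 8.0 + 0.122 * hours_of_human_work

at_zero = pieces_produced(0)       # the y-intercept: output with no human work
at_100 = pieces_produced(100)      # hypothetical input of 100 hours
per_hour = pieces_produced(11) - pieces_produced(10)   # the slope
print(at_zero, round(at_100, 1), round(per_hour, 3))   # 8.0 20.2 0.122
```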

**16) Which term represents the approach to significantly reduce the non-value added steps within setups?**

A) SMED

B) JIT

C) TPM

D) QFD

Answer: A

SMED = single minute exchange of dies

**17) Which term best describes continuous improvement.**

A) TPM

B) Muda

C) Gemba

D) Kaizen

Answer: D

The following are example questions that will test your knowledge of the concepts within the CONTROL phase of a DMAIC Six Sigma project.

**1) Which of the following tools is handed off from the Six Sigma Black Belt (or other Belt) to the Process Owner after control of the effect has been established and statistically proven?**

A) Pareto

B) Prioritization Matrix

C) DOE

D) Control Plan

E) Gantt Chart

Answer: D

The process owner uses a Control Plan to help monitor and react to Y and its behavior after the project is formally closed.

**2) Which of the following are characteristics of the RPN? There may be more than one answer**.

A) The lower the value the lower the risk to the project output, Y.

B) it equals the SEV * DET * OCC

C) it is found on the FMEA as a subjective analysis tool

D) it is compared to the P-value to make a decision on null hypothesis.

Answer: A,B,C

It is also the EFFECTS * CAUSES * CONTROLS, which is another way of saying the SEV * OCC * DET. Visit the FMEA module for more information.

**3) With 50 samples (n=50), a process has a p-bar of 0.25. What are the upper and lower three sigma limits (UCL/LCL)?**

A) 0.25 and -0.25

B) 12.5 and -12.5

C) 21.69 and 3.31

D) 15.56 and 9.44

Answer: C

Must use the formula to calculate the NP Chart control limits.
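The NP chart limits are the center line n * p-bar plus or minus three standard deviations of a binomial count. The sketch below applies that formula to the values in the question; `np_chart_limits` is an illustrative name, and flooring the LCL at zero is a common convention assumed here.

```python
from math import sqrt

def np_chart_limits(n, p_bar):
    """Three-sigma limits for an NP chart: n*p_bar +/- 3*sqrt(n*p_bar*(1 - p_bar))."""
    center = n * p_bar
    spread = 3 * sqrt(center * (1 - p_bar))
    lcl = max(0.0, center - spread)   # a count of defectives cannot be negative
    return center + spread, lcl

ucl, lcl = np_chart_limits(n=50, p_bar=0.25)
print(f"UCL = {ucl:.2f}, LCL = {lcl:.2f}")  # UCL = 21.69, LCL = 3.31
```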

**4) Which of the following are characteristics of the C-Chart, there may be more than one choice?**

A) Plotting number of Defects

B) Plotting continuous data

C) Poisson assumptions satisfied

D) Fixed sample size (constant)

Answer: A,C,D

Visit C Chart for more information.

**5) Which type of chart (of the choices below) is typically used when plotting continuous data (it can also apply to attributes data) to detect small changes over a short period of time? The moving average smooths the variation over time and therefore should not be used when looking for a point that is outside of the process control limits**.

A) I-MR

B) X-bar, R

C) EWMA

D) U-Chart

Answer: C

Visit Exponentially Weighted Moving Average Chart for more information. The most recent data point is given the most weight and as time progresses the weight of the older points decreases.

The term *exponentially* means that the weights of the older points decrease exponentially with time. CUSUM charts use equal weights for previous data points.
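The weighting can be illustrated with the usual EWMA recursion, where each plotted value blends the newest observation with the previous plotted value. This is a sketch under assumptions: the smoothing constant `lam = 0.2` and the function names are hypothetical choices, and starting the series at the first observation is one common convention.

```python
def ewma(values, lam=0.2, start=None):
    """z_t = lam * x_t + (1 - lam) * z_(t-1); start defaults to the first value."""
    z = values[0] if start is None else start
    smoothed = []
    for x in values:
        z = lam * x + (1 - lam) * z
        smoothed.append(z)
    return smoothed

def point_weights(lam, k):
    """Weight carried by the observation that is j steps old, for j = 0..k-1."""
    return [lam * (1 - lam) ** j for j in range(k)]

w = point_weights(0.2, 4)
print([round(v, 4) for v in w])  # [0.2, 0.16, 0.128, 0.1024]
```

Each older point carries a constant fraction (1 - lam) of the weight of the point after it, which is the exponential decay described above.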

**6) Which of the following are characteristics of the U-Chart, there may be more than one choice?**

A) Plotting number of Defects

B) Plotting continuous data

C) Poisson assumptions satisfied

D) Variable sample size

Answer: A,C,D

Visit U Chart for more information.

**7) Which of the following are characteristics of the NP-Chart, there may be more than one choice?**

A) Plotting number of Defects

B) Plotting attributes data

C) Poisson assumptions satisfied

D) Constant sample size

Answer: B,D

Visit NP Chart for more information.

**8) If the Rolled Throughput Yield is known to be 47.5% and there are three processes what is the Normalized Yield**.

A) 47.5%

B) 10.7%

C) 78.0%

D) Not enough information to determine

Answer: C

There are three processes so K = 3

therefore NY = 0.475^{1/3}

NY = 0.780 = 78.0%

There is a 78% chance of a unit passing through one process step without rework.

**9) If the Normalized Yield (NY) is 78.0% what is the normalized defects per unit?**

A) 0.248

B) 0

C) 78.0

D) 0.475

Answer: A

The normalized defects per unit equals the negative natural log (-ln) of the NY.
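Questions 8 and 9 chain together, so both steps can be checked in a few lines. This is a minimal sketch using the values from the questions; the variable names are illustrative.

```python
from math import log

rty = 0.475    # Rolled Throughput Yield from question 8
k = 3          # number of process steps

ny = rty ** (1 / k)    # Normalized Yield: the k-th root of the RTY
ndpu = -log(ny)        # normalized defects per unit: -ln(NY)
print(f"NY = {ny:.3f}, normalized DPU = {ndpu:.3f}")  # NY = 0.780, normalized DPU = 0.248
```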

**10) The following data will be applied to the questions 10-15. This type of knowledge may be used in the Measure and Control phase**.

- There are 1,000 widgets sampled.
- Each widget has 15 characteristics that are measured. If any one of the characteristics is out of customer specification then the entire widget is considered defective.
- A total of 110 widgets were found defective with a total of 228 defects found among those 110 defective widgets.

**What are the Total Opportunities for a Defect (TOP)?**

A) 15,000

B) 0

C) 550

D) 228

Answer: A

The total opportunities is the number of widgets multiplied by the number of potential defects in each widget which is 1,000 * 15 = 15,000.

**11) How many units were found to have zero defects?**

A) 110

B) 890

C) 772

D) 1000

Answer: B

**12) What is the Defects Per Unit (DPU) rate?**

A) 0.110

B) 0.772

C) 0.228

D) 228

Answer: C

Defects per unit is the number of defects found among all units sampled. Divide 228/1000 = 0.228.

**13) What is the Defects Per Million Opportunities (DPMO) rate?**

A) 12,000

B) 2280

C) 15,200

D) 228

Answer: C

DPMO = 228/15000 * 1,000,000 = 15,200

There were 228 defects found among 15,000 total opportunities therefore, applying the same ratio there would be 15,200 defects expected over the long term if there were one million opportunities.

**14) What is the Process Yield?**

A) 98.48%

B) 22.80%

C) 96.00

D) None of the above

Answer: A

The Process Yield is the percentage of time a defect is not created when an opportunity exists. The % defects is 1.52% (228/15,000), and the Process Yield is 100% minus the % defects, which is 98.48%.

Do not confuse % Defects with % Defective. The % Defective in this example is 110/1000 = 11.00%
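The metrics from questions 10 through 14 all come from the same four numbers, so they can be laid out together. This is a sketch with illustrative variable names; the input values are those given in the widget data.

```python
units = 1_000
opportunities_per_unit = 15
defective_units = 110
defects = 228

top = units * opportunities_per_unit           # Total Opportunities
zero_defect_units = units - defective_units    # units with no defects
dpu = defects / units                          # Defects Per Unit
dpmo = defects / top * 1_000_000               # Defects Per Million Opportunities
process_yield = 1 - defects / top              # based on defects
pct_defective = defective_units / units        # based on defective units

print(top, zero_defect_units, dpu, round(dpmo), f"{process_yield:.2%}", f"{pct_defective:.2%}")
```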

**15) What is the long term process sigma estimate (assume zero sigma shift)?**

A) 1.13

B) -2.165

C) 6.0

D) 2.165

Answer: D

Get the DPMO and Sigma Calculator Template to further explain all of these calculations and run your own scenarios.
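With no sigma shift assumed, the long term sigma estimate is the z-score that leaves the defect rate in the upper tail of a standard normal distribution. The sketch below uses the DPMO of 15,200 from question 13 and Python's `statistics.NormalDist`; treating the defect rate as a single tail area is the assumption being made.

```python
from statistics import NormalDist

dpmo = 15_200
defect_rate = dpmo / 1_000_000

# z-score with (1 - defect_rate) of the area below it; no 1.5 sigma shift added
z_long_term = NormalDist().inv_cdf(1 - defect_rate)
print(round(z_long_term, 2))
```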

**16) To monitor a project after it is closed you decide to gather 17 samples of a widget each day for two weeks and examine the diameter specification measured in micrometers. Which is the most applicable control chart?**

A) x-bar, r chart

B) x-bar, s chart

C) u chart

D) I-MR chart

Answer: B

When gathering >8-10 samples the x-bar, s chart (which uses the standard deviation) becomes a better estimate for the process variation than the range.

**17) The Rolled Throughput Yield of 6 process steps is 65%. The Throughput Yield (TPY) of processes 1,2,4,5, and 6 are 98%, 87%, 89%, 92%, and 99% respectively. What is the TPY of Process 3?**

A) 65%

B) 94%

C) 99%

D) Not enough information

Answer: B

The RTY is the product of all process TPY's. So set up the following and solve for x:

0.65 = (0.98 * 0.87 * 0.89 * 0.92 * 0.99) * x

x = 0.9404918 = 94%

Get the RTY Template to further explain all of these calculations and run your own scenarios.

See the RTY for more information.
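Solving for the missing step amounts to dividing the RTY by the product of the known yields. A minimal sketch with the values from question 17 (variable names are illustrative):

```python
from math import prod

rty = 0.65
known_tpy = [0.98, 0.87, 0.89, 0.92, 0.99]   # processes 1, 2, 4, 5, and 6

tpy_process_3 = rty / prod(known_tpy)
print(f"{tpy_process_3:.1%}")  # 94.0%
```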

**18) Which of the quality gurus is credited with creating control charts while working at Bell Laboratories?**

A) Ishikawa

B) Deming

C) Edwards

D) Shewhart

Answer: D

**19) Given a normally distributed set of data we know the following:**

**The mean length is known to be 510.7 mm**

**The standard deviation of the lengths is 1.49 mm**

**The customer specifies that the LSL is 507 mm and the USL is 513 mm.**

**Determine the total % of parts that can be expected to be defective?**

A) 0.66%

B) 7.46%

C) 6.8%

D) 92.54%

Answer: C

__This is a two-part problem__.

FIRST: Determine the % of parts >513 mm.

z = (513 mm - 510.7 mm) / 1.49 mm = 1.54

Referring to a z-table, the area above that score is **6.18%**

SECOND: Determine the % of parts <507 mm.

z = (507 mm - 510.7 mm) / 1.49 mm = -2.48

Referring to a z-table, the area below that score is **0.66%**

The total expected defective parts related to the length are 6.18% + 0.66% = 6.84%, or about **6.8%**
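The two tail areas can also be computed directly from the normal CDF rather than read from a printed z-table. This is a sketch using Python's `statistics.NormalDist` with the mean, standard deviation, and specification limits given in the question; results will differ slightly from table lookups because the z-scores are not rounded to two decimals.

```python
from statistics import NormalDist

lengths = NormalDist(mu=510.7, sigma=1.49)
lsl, usl = 507, 513

below_lsl = lengths.cdf(lsl)         # area to the left of the LSL
above_usl = 1 - lengths.cdf(usl)     # area to the right of the USL
total_defective = below_lsl + above_usl

print(f"{below_lsl:.2%} below LSL, {above_usl:.2%} above USL, {total_defective:.2%} total")
```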

**20) If a carbon monoxide alarm goes off indicating a high level alert but there is actually not a high level then what response(s) below describe this situation:**

A) Beta Risk

B) Type I error

C) Type II error

D) False Negative

Answer: B

**21) If a carbon monoxide alarm does not go off indicating a high level alert but there is actually a high level then what response(s) below describe this situation:**

A) Beta Risk

B) Type I error

C) Type II error

D) False Negative

Answer: A,C,D

**22) Conducting a 2-sample t-test and concluding that the means are different when they actually are not would represent:**

A) Beta Risk

B) Type I error

C) Type II error

D) False Positive

Answer: B,D

**23) Conducting an F-test and concluding that the variances are the same when they actually are not would represent:**

A) Beta Risk

B) Type I error

C) Type II error

D) False Positive

Answer: A,C

**24) Incorrectly selecting the null hypothesis (that there is not a difference) instead of the alternative hypothesis (that there is a difference) is an example of:**

A) Beta Risk

B) Consumers Risk

C) Type II error

D) False Positive

Answer: A,B,C. The EFFECT was not detected when in fact there is an EFFECT.

**25) In this phase a Six Sigma Project Manager uses ___________ to monitor and analyze key metrics to determine if the process is moving or staying within control limits.**

A) SPC

B) MSA

C) VOC

D) common sense

Answer: A. SPC = Statistical Process Control.

**26) Which of the following statements are true:**

1. Cp is a measure of variation only.
2. Cpk is a measure of location (process average) and variation.
3. Cp will never be a negative number.
4. Cpk can be a negative number.

A) 1,2

B) All of the above

C) 3,4

D) 1,2,3

Answer: B

See Cpk and Cp for more information.
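The four statements follow from the standard definitions Cp = (USL - LSL) / 6σ and Cpk = min(USL - mean, mean - LSL) / 3σ. The sketch below uses those formulas with hypothetical process values chosen so that the mean falls outside the specification limits.

```python
def cp(usl, lsl, sigma):
    """Cp compares the specification width with the 6-sigma process spread."""
    return (usl - lsl) / (6 * sigma)

def cpk(usl, lsl, mean, sigma):
    """Cpk uses the distance from the mean to the nearer specification limit."""
    return min(usl - mean, mean - lsl) / (3 * sigma)

# Hypothetical process whose mean (514) sits above the USL (513)
cp_value = cp(usl=513, lsl=507, sigma=1.0)
cpk_value = cpk(usl=513, lsl=507, mean=514, sigma=1.0)
print(cp_value, round(cpk_value, 3))  # 1.0 -0.333
```

Cp ignores the process mean entirely and stays positive as long as USL > LSL, while Cpk turns negative once the mean moves beyond either limit.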

**27) Fill in the blank with the most common response:**

**For subgroups <10 of continuous data, use the range to estimate process variation and use a _____ type of SPC chart. For example, if appraisers sample and measure 6 consecutive parts every 30 minutes, then the subgroup size is 6 and the range should be used to estimate the process variation.**

A) X-bar, R

B) I-MR

C) NP-Chart

D) X-bar, S

Answer: A

The NP-chart is used for attribute data. The Xbar-S chart is used when plotting subgroups of size >=10. See SPC Charts for more information.

**28) An FMEA is an important tool created in the Measure Phase and revised (updated) in the Control Phase. It is also commonly used in manufacturing environments of all types. Select the benefits of an FMEA below (there may be more than one answer).**

A) Define the value-added process steps of a value stream map

B) Predict how, when, and where failures might happen

C) Identify ways a process can fail to meet customer requirements

D) Estimate the severity, occurrence, and detection with an RPN score for the defect types

Answer: B,C,D

See the FMEA module for more information.

**29) Which of these are Predictive Maintenance techniques to help control a process:**

A) SPC

B) DOE

C) Thermal Imaging

D) Vibration Analysis

Answer: C,D

There are other Predictive Maintenance techniques in addition to these.
