Control charts: when, how and why – Musings by FlyingSalmon

Control charts are essential tools for statistical quality control, such as monitoring performance and defect measurements. Also known as Shewhart charts or process-behavior charts, they are used to monitor and analyze the behavior of a process over time. They help identify variations, patterns, and anomalies in the process and determine whether those variations are within acceptable limits or call for corrective action. In this post, I’ll break down the types of control charts, when to use which ones, and how to create them.

In a typical control chart, you’ll find these components:
The values plotted on the line chart determine the type of control chart (mean, min-max difference, etc.), and the limit lines are typically at +/-3 sigma (that is, the mean plus 3 standard deviations of the data for the upper limit line, and the mean minus 3 standard deviations for the lower limit line).
The center horizontal line is typically the overall mean value line.
The x-axis is days or observation periods or cases.
The y-axis is the measurement (and Mean, depending on the type of the chart)

Why Control Charts Are Useful

Process Monitoring: They help track the performance of a process and ensure it remains stable and predictable.
Identifying Trends: Control charts can highlight trends or patterns in the process that may indicate potential problems.
Distinguishing Between Common and Special Causes: They help differentiate between common causes (normal process variations) and special causes (unexpected variations that may require investigation).
Quality Improvement: By identifying and addressing variations, control charts aid in continuous quality improvement efforts.

Types of Control Charts

There are different types of control charts that we can use. Which type of chart to use depends on the type of data and sample size. For example:
np chart: Used with Discrete data. I will demonstrate it further below.
p chart: Used with Discrete data. I will demonstrate it further below.
c chart: Used with Discrete data. Ideal for processes where the sample size remains the same; it helps track the total number of defects per unit.
Note: The np charts and c charts both have fixed sample sizes and track defects…but there is a difference. In an np chart dataset, we use the number of units with defects, whereas in a c chart, we use the total number of defects per unit.
u chart: Used with Discrete data. Useful when the sample sizes differ from one inspection period to another; it helps track the average number of defects per unit.
Note: The p charts and u charts are almost identical except for the data. A p chart uses the proportion (or percentage) of defective units in a sample, and a u chart uses the average number of defects per unit.
xbar-r chart: Used with Continuous data. I will demonstrate it further below.
I-MR chart: Used with Continuous data. I will demonstrate it further below.

A decision tree in Figure 1 shows how to decide when to use which chart.
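If you prefer code to a diagram, the rules in the list above can be sketched as a small Python function. This is only an illustration of the list, not a reproduction of Figure 1: the function name choose_chart and its parameters are hypothetical names made up for this example.

```python
def choose_chart(data_type, *, subgroup_size=1,
                 counts_defects_per_unit=False,
                 constant_sample_size=True):
    """Suggest a control chart type from the rules listed above.

    All names here are hypothetical and exist only for this sketch.
    data_type: "continuous" or "discrete".
    subgroup_size: measurements taken per period (continuous data only).
    counts_defects_per_unit: True when counting defects per unit,
        False when counting defective units.
    constant_sample_size: True when every period inspects the same
        number of units.
    """
    if data_type == "continuous":
        # Several measurements per period -> xbar-R, one per period -> I-MR.
        return "xbar-R" if subgroup_size > 1 else "I-MR"
    if data_type == "discrete":
        if counts_defects_per_unit:
            return "c" if constant_sample_size else "u"
        return "np" if constant_sample_size else "p"
    raise ValueError("data_type must be 'continuous' or 'discrete'")


# Example: 5 orders timed per day, as in a subgrouped time study.
suggested = choose_chart("continuous", subgroup_size=5)
```

The keyword-only arguments keep the call readable, since each one maps to a single question: what kind of data, how many measurements per period, and whether the sample size is fixed.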

Continuous Data vs Discrete Data

Continuous data can take any value within a given range, including fractions and decimals, so there are infinitely many possible values within that range. Discrete data can only take specific, distinct values that can be counted.
Examples of continuous data: height, weight, temperature, time.
Examples of discrete data: number of students in a class, number of cars in a parking lot, number of goals scored in a soccer match.

How to Create Control Charts

We can create control charts in Excel! Here are the general steps:

Collect Data: Gather the data to monitor over time.
Organize Data: Enter the data into an Excel worksheet with time periods (or subgroups) in one column and the corresponding data points in another column.
Calculate Mean and Control Limits: Compute the mean and standard deviation of the data. The upper control limit (UCL) and lower control limit (LCL) are typically set at ±3 standard deviations from the mean.
Create a Line Chart: Add the mean, UCL, and LCL as additional data series on the chart.
Format Chart: Customize the chart to clearly display the data, mean, UCL, and LCL.
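Outside Excel, the calculation step above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it uses the sample standard deviation (the equivalent of Excel's STDEV.S), which the steps above do not specify, and the function names and data are hypothetical.

```python
from statistics import mean, stdev


def control_limits(values, sigmas=3.0):
    """Return (LCL, center line, UCL) as mean -/+ `sigmas` standard deviations.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    center = mean(values)
    spread = stdev(values)
    return center - sigmas * spread, center, center + sigmas * spread


def points_outside_limits(values, sigmas=3.0):
    """Return the 1-based positions of points that fall outside the limits."""
    lcl, _, ucl = control_limits(values, sigmas)
    return [i for i, v in enumerate(values, start=1) if v < lcl or v > ucl]


# Hypothetical measurements, one per time period.
measurements = [10, 12, 11, 13, 12, 11, 10, 12]
lcl, center, ucl = control_limits(measurements)
flagged = points_outside_limits(measurements)
```

The three returned numbers correspond to the extra series added to the line chart: LCL, mean, and UCL are each repeated for every period so that they plot as horizontal lines.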

Let’s take a look at some of these charts using sample data in different scenarios.

Scenario 1: Say, Amazon.com wants to track the average time it takes to process an order: from customer placement of an order -> packaging -> delivery
So, they monitor the average processing time to ensure it stays within their acceptable limits. They track the time to out-for-delivery (say, in minutes) for multiple orders over 20 days, with 5 random orders selected for measurement each day.

Objective: Improve the processing time for a fulfillment center.

Method: xbar-R charts

Here’s the sample (hypothetical) data they collected for each day:

For the xbar chart, we’ll need to calculate the daily mean processing time, the overall mean across all days in the sample, and the standard deviation. The upper limit is then the overall mean + 3 sigmas, and the lower limit is the overall mean – 3 sigmas. For the R chart, we’ll calculate the daily range (daily maximum time – daily minimum time), then get the overall mean R for the sample, and calculate the standard deviation of the R values across all days. Once we have these, we calculate the upper and lower limit R values the same way (mean +/-3 sigmas). The calculated values and columns may be arranged like this:

Using the ranges marked with colored rectangles in the table, we can next create the xbar and R charts independently. A line chart with 4 series for each does the trick. The charts are shown below.

Typically, to complement the xbar chart, we also create an R chart (R for range).
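For readers who want to check the spreadsheet columns in code, here is a minimal Python sketch of the xbar and R calculations described in this scenario. It follows the post's mean +/-3 sigma approach; textbook xbar-R charts usually derive the limits from the average range with tabulated constants instead. The function name, the sample standard deviation, the zero floor on the R chart's lower limit, and the three days of data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def xbar_r_limits(subgroups, sigmas=3.0):
    """Compute daily means, daily ranges, and mean -/+ sigma limits for both.

    subgroups: one list of measurements per day.
    Assumption: sample standard deviation (like Excel's STDEV.S).
    """
    xbars = [mean(day) for day in subgroups]
    ranges = [max(day) - min(day) for day in subgroups]

    def limits(series):
        center = mean(series)
        spread = stdev(series)
        return {"lcl": center - sigmas * spread,
                "center": center,
                "ucl": center + sigmas * spread}

    range_limits = limits(ranges)
    # My addition: a range cannot be negative, so floor its lower limit at 0.
    range_limits["lcl"] = max(0.0, range_limits["lcl"])
    return {"xbar": xbars, "range": ranges,
            "xbar_limits": limits(xbars), "range_limits": range_limits}


# Hypothetical processing times in minutes: 3 days, 5 orders per day.
times = [[30, 32, 31, 29, 33],
         [28, 35, 30, 31, 31],
         [34, 33, 32, 30, 31]]
result = xbar_r_limits(times)
```

Each list in the result maps to one chart series: the daily values plus three constant lines for the center, UCL, and LCL.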

Let’s continue with the same data and a similar objective, but now the data has no subgroups: there is one data point for processing time per day over 20 days. Unlike the xbar chart data, we have only 1 sample/record per day instead of multiple samples per day.

Scenario 2: Say, Amazon.com wants to track the average time it takes to process an order: from customer placement of an order -> packaging -> delivery. So, they monitor the daily average processing time to ensure it stays within their acceptable limits.

Objective: Improve the daily processing time for a fulfillment center.

Method: I-MR chart
The UCL and LCL are calculated the same way as with xbar. The data table is slightly different this time. MR stands for Moving Range, which is calculated as the absolute difference between a day’s sample and the previous day’s sample. The data looks like this:

and with the calculated columns, we should have a new range that looks like this:

We can create the charts with UCL, LCL for processing time as follows using the data range in the columns: ‘Overall Mean’, ‘UCL’, ‘LCL’. The chart is shown below:

Then create the MR chart using the data range from columns: ‘Moving Range’, ‘Mean MR’, ‘UCL-MR’, ‘LCL-MR’. We’re still using the overall Mean MR (3.157…37 in this example) +/- 3 times sigma (2.291…57 in this example) for the UCL and LCL values. The chart is shown below:
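The moving range column and both sets of limits can also be sketched in Python. This is a minimal sketch of the calculation described in this scenario, assuming the sample standard deviation. The function name and the five days of data are hypothetical, and the numbers will not match the 20-day example in the post.

```python
from statistics import mean, stdev


def imr_limits(values, sigmas=3.0):
    """Compute moving ranges plus mean -/+ sigma limits for I and MR charts.

    values: one measurement per day, in time order.
    Assumption: sample standard deviation (like Excel's STDEV.S).
    """
    # Moving range: absolute difference between each day and the previous
    # day, so the series is one item shorter than the input.
    moving_ranges = [abs(today - prev) for prev, today in zip(values, values[1:])]

    def limits(series):
        center = mean(series)
        spread = stdev(series)
        return {"lcl": center - sigmas * spread,
                "center": center,
                "ucl": center + sigmas * spread}

    return {"moving_range": moving_ranges,
            "individual_limits": limits(values),
            "mr_limits": limits(moving_ranges)}


# Hypothetical daily processing times in minutes.
daily_times = [30, 33, 31, 36, 34]
imr = imr_limits(daily_times)
```

Note that the first day has no moving range, which is why the ‘Moving Range’ series starts on day 2 when plotted.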

It’s easy to see that the delayed processing times, especially on days 6, 7, 8, 12, and 13, may need to be inspected for improvement, even though they are within the acceptable limits.

Let’s look at another scenario, this one involving discrete data with a constant sample size.

Scenario 3: A factory wants to track the number of defects in its watch production line. So, they test for defects daily and record the number of defects found.

Objective: Lower the daily number of defects.

Method: np charts. np stands for the number of nonconforming (defective) units: the sample size n times the proportion nonconforming p.

Defective watches found per day shown in maroon

The data collected and the calculated columns from them are:

To create the np chart, we will use the ‘Day’ column as the x-axis, and all the other columns (‘Defective Units’, ‘Mean’, ‘UCL’, ‘LCL’) as data series of their own (as before, UCL and LCL are Mean +/-3 sigma).
However, since this is discrete data and the number of defects cannot be negative, set the LCL to zero (the lowest possible number of defects in our scenario).
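As a cross-check on the spreadsheet, the np chart columns can be sketched in Python as follows. This is a minimal sketch of the approach described here (mean +/-3 sigma with the lower limit floored at zero), assuming the sample standard deviation; the function name and the daily counts are hypothetical. Textbook np charts usually take sigma from the binomial formula sqrt(n * p * (1 - p)) rather than from the spread of the daily counts.

```python
from statistics import mean, stdev


def np_chart_limits(defective_units, sigmas=3.0):
    """Return the center line, UCL, and LCL for daily defective-unit counts.

    Assumption: sigma is the sample standard deviation of the daily counts.
    """
    center = mean(defective_units)
    spread = stdev(defective_units)
    return {"center": center,
            "ucl": center + sigmas * spread,
            # A count of defective units cannot be negative, so floor at 0.
            "lcl": max(0.0, center - sigmas * spread)}


# Hypothetical defective watches found per day, same sample size every day.
defects = [3, 5, 4, 6, 2]
np_limits = np_chart_limits(defects)
```

Using max(0.0, ...) gives zero whenever the computed lower limit would be negative, but it keeps a positive lower limit if the data ever produces one.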

The np chart is shown below:

But what if, instead of taking the fixed sample size daily, which was assumed in the previous scenario, we took a variable number of samples each day?

Scenario 4: A factory wants to track the number of defects in its watch production line. So, they test for defects daily and record the number of defects found. The number of samples taken varies from day to day.

Objective: Lower the daily number of defects.

Method: p charts. p is for proportions.

This will require an additional step over the np chart, as we’ll now need a new measure: the proportion of defective units out of the sample size for each day. To do this, create a new column called ‘Defect Proportions’ and enter a formula that divides ‘Defective Units’ by ‘Sample Size’. Those values will be the main line data points. As before, UCL = Mean + 3 sigma and LCL = Mean – 3 sigma. Also as before, since the number of defects cannot be less than zero, and therefore neither can the proportions, we set the LCL to all zeros. Let’s say we examine 80 units on day 1, 110 units on day 2, and 90 units on day 20. The data and calculated range look like this:

And the p chart for this looks like this:

It’s easy to see which days exposed more defects and whether the overall defects fall within the acceptable limits.
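The ‘Defect Proportions’ step from this scenario can be sketched in Python too. Note that this sketch deliberately swaps in the textbook binomial limits, which change with each day's sample size, instead of the flat Mean +/-3 sigma lines used in the chart above. The function name is hypothetical, and the defect counts are made up; only the sample sizes 80, 110, and 90 come from the scenario.

```python
from math import sqrt


def p_chart(defective_units, sample_sizes, sigmas=3.0):
    """Compute daily defect proportions and per-day binomial control limits.

    Swapped-in technique: limits are p_bar -/+ sigmas * sqrt(p_bar * (1 - p_bar) / n)
    for each day's sample size n, not a single flat line.
    """
    if len(defective_units) != len(sample_sizes):
        raise ValueError("need one sample size per day")
    proportions = [d / n for d, n in zip(defective_units, sample_sizes)]
    # Center line: total defective units over total units inspected.
    p_bar = sum(defective_units) / sum(sample_sizes)
    ucl, lcl = [], []
    for n in sample_sizes:
        width = sigmas * sqrt(p_bar * (1 - p_bar) / n)
        ucl.append(min(1.0, p_bar + width))  # a proportion cannot exceed 1
        lcl.append(max(0.0, p_bar - width))  # or drop below 0
    return {"proportions": proportions, "p_bar": p_bar, "ucl": ucl, "lcl": lcl}


# Sample sizes from the scenario; defect counts are hypothetical.
chart = p_chart(defective_units=[4, 6, 5], sample_sizes=[80, 110, 90])
```

With this variant, days with larger samples get tighter limits, so the UCL appears as a stepped line rather than a straight one.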

I hope you found this post helpful and interesting. Thanks for visiting!