Use a boxplot to provide a static picture of the location and spread of the Y variable (the process output). The plot shows the minimum and maximum values, the first quartile (25% of points are less than this value), the third quartile (75% of points are less than this value), the median (or mean), and potential outliers. If you also include a categorical X variable, you can compare the location and spread of Y at each level (for example, each factor setting) of the X variable.
Answers the questions:
- What is the general location of the Y data?
- How wide is the spread of the Y data?
- Does the sample contain any unusual data points (outliers)?
- Does changing the level of an input variable (X) affect the location or the spread of the output Y?
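The values a boxplot displays can also be computed directly, which can help when checking what a plot shows. The following is a minimal Python sketch, not the tool's own implementation. It assumes the common 1.5 × IQR rule for flagging potential outliers and the standard library's default quartile method; the quartile interpolation used by Minitab may differ. The function name `boxplot_summary` and the sample data are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Return the summary values a boxplot typically displays for one sample.

    Assumption: points beyond 1.5 * IQR from the quartiles are treated as
    potential outliers (a common convention, not confirmed by the source).
    """
    data = sorted(values)
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme points that are still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Made-up sample: eleven evenly spaced values plus one unusually large point.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 40])
print(summary)
```

In this sample, the value 40 falls above the upper fence, so it is reported as a potential outlier and the upper whisker ends at 11 rather than at the maximum.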
When to Use
- The first rule in data analysis is to always plot your data before running any statistical tests. The boxplot is a logical choice for comparison tests where you are looking at what happens to the process output under various conditions, such as changes to a process input.
- Assess if an input (X) has an impact on the process mean or process variation and help eliminate noncritical X's from consideration.
- Identify levels (settings) of the process input that have the desired impact on the output mean or variation.
- Communicate the effects of process inputs on the process output to project stakeholders.
The Y data must be numeric. The X variable is optional and, if included, must be discrete (categories for comparison).
- The boxplot is very prone to misinterpretation when the sample size is small. When the sample size is less than 20, use a dotplot or individual value plot.
- The boxplot provides a good visual comparison even when the number of levels of an X variable is high. If an X variable has more than five levels, the boxplot provides a better visual comparison than the dotplot (assuming more than 20 points per category).
- Choose one of two common data layouts that you can use with boxplots:
  - Choose a boxplot with groups (stacked data) when you enter one column for the Y variable and, optionally, one column for the categorical X variable. Note: You can have up to four categorical variables. Minitab draws a separate box for each combination of levels of the categorical variables; however, the boxes all appear in the same graph window. This display is handy for making comparisons across levels of X variables.
  - Choose a boxplot with multiple Ys (unstacked data) when you enter the Y data into a separate column for each level of the X variable. Minitab draws a separate box for each Y. The boxes can be plotted either in separate graph windows or in the same graph window with a common scale.
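The difference between the stacked and unstacked layouts may be easier to see with a small example. The following Python sketch is an illustration only and does not use any Minitab functionality. It shows the same made-up data in both layouts and how one converts to the other. The helper names `unstack` and `stack`, the category labels, and the values are all hypothetical.

```python
from collections import defaultdict

# Stacked layout: one column for Y and one column for the categorical X.
# Row i of stacked_y belongs to the level in row i of stacked_x.
stacked_y = [10.2, 9.8, 11.1, 10.5, 12.0, 11.7]
stacked_x = ["A", "A", "B", "B", "C", "C"]


def unstack(y_values, x_levels):
    """Convert stacked data into one Y column per level of X."""
    if len(y_values) != len(x_levels):
        raise ValueError("Y and X columns must have the same length")
    columns = defaultdict(list)
    for y, level in zip(y_values, x_levels):
        columns[level].append(y)
    return dict(columns)


def stack(columns):
    """Convert one-column-per-level data back into a Y column and an X column."""
    y_values, x_levels = [], []
    for level, ys in columns.items():
        for y in ys:
            y_values.append(y)
            x_levels.append(level)
    return y_values, x_levels


# Unstacked layout: a separate Y column for each level of X.
unstacked = unstack(stacked_y, stacked_x)
print(unstacked)
```

Either layout carries the same information. The choice mainly depends on how the data was recorded and on whether you want the boxes grouped by a categorical column or drawn from separate Y columns.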
For more information, go to Insert an analysis capture tool.