Interpret the key results for Stem-and-Leaf Plot

Complete the following steps to interpret a stem-and-leaf plot.

Step 1: Assess the key characteristics

Examine the center and spread of the distribution. Assess how the sample size may affect the appearance of the stem-and-leaf plot.

Center and spread

Examine the following elements to learn more about your sample data.
Counts and median
The counts are in the first (leftmost) column. The count for the row that contains the median is enclosed in parentheses and gives the number of values in that row only. The counts for all other rows are cumulative. The count for a row above the median is the total count for that row and all the rows above it. The count for a row below the median is the total count for that row and all the rows below it.
Data values

For each row, the number in the "stem" (the middle column) represents the first digit (or digits) of the sample values. Each digit in the "leaf" (the right column) represents the next digit of one sample value. The "leaf unit" at the top of the plot indicates which decimal place the leaf values represent.

Spread
The spread shows how much your data vary.
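The relationship between stem, leaf, and leaf unit can be sketched in a few lines of Python. This is a minimal illustration, not the software's own routine: the function name row_values and its string arguments are hypothetical, and it assumes that each leaf is a single digit and that a stem written as "-0" marks negative values.

```python
from decimal import Decimal

def row_values(stem: str, leaves: str, leaf_unit: str) -> list[Decimal]:
    """Approximate sample values encoded by one row of a stem-and-leaf plot.

    Assumption: every leaf is one digit, so value = (stem * 10 + leaf) * leaf unit.
    Decimal avoids floating-point noise for leaf units such as 0.1.
    """
    sign = -1 if stem.strip().startswith("-") else 1
    stem_digits = int(stem.strip().lstrip("+-"))
    unit = Decimal(leaf_unit)
    return [sign * (stem_digits * 10 + int(leaf)) * unit for leaf in leaves]

# A row with stem 8, leaves 0, 2, 3 and leaf unit 1
print(row_values("8", "023", "1"))
```

With a leaf unit of 1, the row with stem 8 and leaves 0, 2, and 3 decodes to 80, 82, and 83. With a leaf unit of 0.1, a stem of -0 with a leaf of 4 decodes to -0.4.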

This stem-and-leaf plot shows customer wait times for an online customer service chat with a representative. The first row has a stem value of 8 and contains the leaf values 0, 2, and 3. The leaf unit is 1. Thus, the first row of the plot represents sample values of approximately 80, 82, and 83. The values range from 80 seconds to 119 seconds. The median is in the row that has values between 95 seconds and 99 seconds.

Stem-and-leaf of C1   N = 50

  3    8  023
  8    8  56688
 21    9  0111111222444
 (6)   9  555799
 23   10  0000111233
 13   10  55667789
  5   11  14
  3   11  579
Leaf Unit = 1
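The count column can be reproduced from the number of leaves in each row. The following Python sketch is an illustration under stated assumptions: the function name count_column is hypothetical, and the rule for choosing the median row (fewer than half of the values lie above the row and fewer than half lie below it) is inferred from the example plots rather than taken from the software's documentation.

```python
def count_column(leaf_counts: list[int]) -> list[str]:
    """Return the count label for each row, given the number of leaves per row.

    Assumed rule: the median row shows its own count in parentheses;
    rows before it show a running total from the top, and rows after it
    show a running total from the bottom.
    """
    n = sum(leaf_counts)
    half = n / 2
    labels = []
    above = 0  # values in all rows above the current row
    for k in leaf_counts:
        below = n - above - k  # values in all rows below the current row
        if above < half and below < half:
            labels.append(f"({k})")        # median row: own count only
        elif above + k <= half:
            labels.append(str(above + k))  # cumulative from the top
        else:
            labels.append(str(below + k))  # cumulative from the bottom
        above += k
    return labels

# Leaves per row in the wait-time plot (N = 50)
print(count_column([3, 5, 13, 6, 10, 8, 2, 3]))
```

For the wait-time plot, the leaf counts 3, 5, 13, 6, 10, 8, 2, and 3 give the labels 3, 8, 21, (6), 23, 13, 5, and 3, which match the first column of the plot.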

Investigate any surprising or undesirable characteristics. For example, the stem-and-leaf plot of customer wait times showed higher values and a larger spread than expected. An investigation revealed that unusually heavy web traffic caused instability and delays.

Sample size (n)

The sample size can affect the appearance of the graph.

The sample size is displayed at the top of the stem-and-leaf plot. In the previous example, the sample size is 50 (N = 50).

Because a stem-and-leaf plot represents each data value, it works best when the sample size is less than approximately 50. If the sample size is greater than 50, the rows of leaves may extend too far, and the distribution may be difficult to assess. If you have more than 50 data points, consider using a boxplot or a histogram.

Step 2: Look for indicators of nonnormal or unusual data

Skewed data and multi-modal data indicate that data may be nonnormal. Outliers may indicate other conditions in your data.

Skewed data

Determine whether your data are skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Skewness indicates that the data may not be normally distributed. Often, skewness is easiest to detect with a histogram or a boxplot.

These stem-and-leaf plots illustrate skewed data. The stem-and-leaf plot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a few wait times are long. The stem-and-leaf plot with left-skewed data shows failure time data. A few items fail immediately and many more items fail later.

Stem-and-leaf of C1   N = 50

   1   -0  4
   6   -0  33222
  16   -0  1111111111
 (16)   0  0000000011111111
  18    0  22222333333
   7    0  4555
   3    0  6
   2    0
   2    1
   2    1  2
   1    1  4
Leaf Unit = 0.1

Right-skewed

Stem-and-leaf of C1   N = 52

   3   -1  333
   3   -1
   5   -0  99
   6   -0  6
   8   -0  44
  24   -0  3333333322222222
  (7)  -0  1111111
  21    0  000001111111
   9    0  22233
   4    0  445
   1    0  6
Leaf Unit = 0.1

Left-skewed

If you know that your data are not naturally skewed, investigate possible causes. If you want to analyze severely skewed data, read the data considerations topic for the analysis to make sure that you can use data that are not normal.
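As a rough numeric companion to the visual check, you can compare how far each extreme lies from the median. The Python sketch below decodes the rows of the right-skewed example plot and applies that comparison. The tail-length comparison is a simple heuristic chosen for illustration, not a formal skewness statistic and not part of the software; the function names are hypothetical, and a single outlier can dominate the result.

```python
from statistics import median

# Rows of the right-skewed example plot as (stem, leaves); leaf unit = 0.1.
# Rows with no leaves are omitted because they contribute no values.
ROWS = [
    ("-0", "4"),
    ("-0", "33222"),
    ("-0", "1" * 10),
    ("0", "0" * 8 + "1" * 8),
    ("0", "2" * 5 + "3" * 6),
    ("0", "4555"),
    ("0", "6"),
    ("1", "2"),
    ("1", "4"),
]

def values_in_leaf_units(rows):
    """Decode rows into integers measured in leaf units (here, tenths)."""
    values = []
    for stem, leaves in rows:
        sign = -1 if stem.startswith("-") else 1
        base = int(stem.lstrip("-")) * 10
        values.extend(sign * (base + int(leaf)) for leaf in leaves)
    return values

def tail_direction(values):
    """Report which tail reaches farther from the median (rough heuristic)."""
    mid = median(values)
    upper, lower = max(values) - mid, mid - min(values)
    if upper > lower:
        return "right"
    if lower > upper:
        return "left"
    return "balanced"

print(tail_direction(values_in_leaf_units(ROWS)))  # prints "right"
```

Treat the result as a prompt to look at the plot again, not as a test of normality.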

Outliers

Outliers, which are data values that are far away from other data values, can strongly affect your results.

On a stem-and-leaf plot, isolated values at the ends identify possible outliers. For example, the last value at the bottom of this plot could be an outlier.

Stem-and-leaf of C1   N = 31

   2   -2  20
   4   -1  52
 (13)  -0  8886555433300
  14    0  00334688
   6    1  0046
   2    2  5
   1    3
   1    4
   1    5
   1    6
   1    7
   1    8  0
Leaf Unit = 0.1

Try to identify the cause of any outliers. Correct any data entry errors. Consider removing data values that are associated with abnormal, one-time events (special causes). Then, repeat the analysis.
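One way to make "isolated values at the ends" concrete is to look for a run of empty rows that separates a few values from the rest of the plot. The Python sketch below works from the number of leaves in each row. The function names and the threshold of three empty rows are assumptions made for this illustration; the software does not flag outliers on a stem-and-leaf plot this way, and a different threshold may suit your data.

```python
def _isolated_from_start(counts, min_empty_rows):
    """Count values at the start that are cut off by a long run of empty rows."""
    half = sum(counts) / 2
    seen = 0   # values passed so far, starting from this end
    empty = 0  # length of the current run of empty rows
    for k in counts:
        if k == 0:
            empty += 1
            continue
        if seen > 0 and empty >= min_empty_rows:
            return seen  # everything before the gap is isolated
        seen += k
        empty = 0
        if seen > half:
            break        # reached the middle of the data without a gap
    return 0

def isolated_end_counts(leaf_counts, min_empty_rows=3):
    """Return (values isolated at the top, values isolated at the bottom)."""
    top = _isolated_from_start(leaf_counts, min_empty_rows)
    bottom = _isolated_from_start(leaf_counts[::-1], min_empty_rows)
    return top, bottom

# Leaves per row in the example plot with a possible outlier (N = 31)
print(isolated_end_counts([2, 2, 13, 8, 4, 1, 0, 0, 0, 0, 0, 1]))
```

For the example plot, the sketch reports no isolated values at the top and one isolated value at the bottom, which corresponds to the single leaf in the last row.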

Multi-modal data

Multi-modal data have more than one peak. (A peak represents the mode of a set of data.) Multi-modal data usually occur when the data are collected from more than one process or condition, such as at more than one temperature.

For example, these stem-and-leaf plots are graphs of the same data. The simple stem-and-leaf plot has two clusters of points, but it's not clear what the clusters mean. The stem-and-leaf plot with groups shows that the clusters correspond to two groups.

Stem-and-leaf of C1   N = 100

   2    7  18
   5    8  589
  21    9  0122235555677889
  37   10  0122233334556778
 (14)  11  13334455667789
  49   12  2599
  45   13  0012334667778888888
  26   14  000011122236777888
   8   15  0245779
   1   16  1
Leaf Unit = 0.1

Simple

Stem-and-leaf of C1    C2 = 1    N = 50

   2   11  59
   5   12  259
  24   13  0012334667778888888
 (18)  14  000011122236777888
   8   15  0245779
   1   16  1
Leaf Unit = 0.1

Stem-and-leaf of C1    C2 = 2    N = 50

   2    7  18
   5    8  589
  21    9  0122235555677889
 (16)  10  0122233334556778
  13   11  133344566778
   1   12  9
Leaf Unit = 0.1

With groups

If you have additional information that allows you to classify the observations into groups, you can create a group variable with this information. Then, you can create the graph with groups to determine whether the group variable accounts for the peaks in the data.
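Outside the software, the same idea can be sketched by splitting the values on the group variable before building the stems and leaves. The Python example below is a minimal illustration, not the software's implementation: the function name stem_and_leaf_by_group is hypothetical, and it assumes non-negative values that are already recorded to the precision of the leaf unit. The six values are a small subset read from the example plots (leaf unit 0.1), not the full data set.

```python
from collections import defaultdict

def stem_and_leaf_by_group(values, groups, leaf_unit=0.1):
    """Build {group: {stem: leaves}} with one leaf digit per value.

    Assumption: values are non-negative and exact multiples of leaf_unit,
    so rounding to the nearest leaf unit recovers the intended digit.
    """
    rows = defaultdict(lambda: defaultdict(list))
    for value, group in zip(values, groups):
        stem, leaf = divmod(round(value / leaf_unit), 10)
        rows[group][stem].append(leaf)
    return {
        group: {
            stem: "".join(str(digit) for digit in sorted(leaves))
            for stem, leaves in sorted(stems.items())
        }
        for group, stems in sorted(rows.items())
    }

values = [7.1, 7.8, 8.5, 11.5, 11.9, 12.2]  # subset read from the example plots
groups = [2, 2, 2, 1, 1, 1]                 # matching values of the group variable
for group, stems in stem_and_leaf_by_group(values, groups).items():
    print(f"Group {group}")
    for stem, leaves in stems.items():
        print(f"{stem:>4}  {leaves}")
```

If the rows for each group form a single cluster, the group variable is a plausible explanation for the separate peaks in the combined plot.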