# Interpret the key results for Dotplot

Complete the following steps to interpret a dotplot.

## Step 1: Assess the key characteristics

Examine the peaks and spread of the distribution. Assess how the sample size may affect the appearance of the dotplot.

Identify the peaks, which are the bins that have the most dots. The peaks represent the most common values in the sample. Assess the spread of your sample to understand how much your data varies.

For example, in the following dotplot of customer wait times, the peak of the data occurs at about 6 minutes. The data spread is from about 3.5 minutes to 8.5 minutes.
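The peak and spread can also be read off numerically. The following is a minimal sketch, assuming the dotplot places each value in the nearest bin of a fixed width. The `peak_and_spread` helper, the bin width, and the wait times are hypothetical and chosen only to mirror the description above.

```python
from collections import Counter

def peak_and_spread(values, bin_width):
    """Return (peak bin center, minimum, maximum) for a sample."""
    # Assign each value to its nearest bin center, the way dots stack in a dotplot.
    bins = Counter(round(v / bin_width) * bin_width for v in values)
    # The peak is the bin that holds the most dots.
    peak_center, _ = bins.most_common(1)[0]
    return peak_center, min(values), max(values)

# Hypothetical wait times in minutes (illustrative only).
wait_times = [3.5, 4.5, 5.0, 5.5, 6.0, 6.0, 6.0, 6.5, 7.0, 8.5]
peak, low, high = peak_and_spread(wait_times, bin_width=0.5)
print(f"peak at about {peak} minutes, spread from {low} to {high} minutes")
```

With these made-up values, the peak falls at 6 minutes and the data range from 3.5 to 8.5 minutes.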

Investigate any surprising or undesirable characteristics on the dotplot. For example, the dotplot of customer wait times showed a spread that was wider than expected. An investigation revealed that a software update caused the computers to become unstable, which delayed some customers.

### Sample size (N)

The sample size can affect the appearance of the graph. For example, although the following dotplots seem quite different, both of them were created using randomly selected samples of data from the same population. On the first dotplot, each symbol represents one observation. On the second dotplot, each symbol represents up to three observations.

A dotplot is best when the sample size is less than approximately 50. If the sample size is 50 or greater, a dot may represent more than one observation. In that case, consider using Boxplot or Histogram in addition to the dotplot so that you can more easily identify the primary characteristics of the distribution.
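To see why a symbol may stand for several observations in a larger sample, consider the sketch below. It assumes a simple scaling rule: the tallest stack must fit within a fixed number of symbols. This rule, the function names, and the bin counts are assumptions for illustration and are not necessarily how any particular software scales its dots.

```python
import math

def observations_per_symbol(bin_counts, max_symbols_per_stack):
    """Smallest number of observations per symbol so the tallest stack fits."""
    tallest = max(bin_counts)
    return max(1, math.ceil(tallest / max_symbols_per_stack))

def symbols_in_stack(count, per_symbol):
    """Number of symbols drawn for a bin; the last symbol may be partly filled."""
    return math.ceil(count / per_symbol)

# Hypothetical bin counts for a small and a large sample from the same population.
small_sample_bins = [1, 3, 5, 2]
large_sample_bins = [10, 45, 60, 20]

print(observations_per_symbol(small_sample_bins, max_symbols_per_stack=20))
print(observations_per_symbol(large_sample_bins, max_symbols_per_stack=20))
```

Under this assumed rule, each symbol in the small sample is one observation, while each symbol in the large sample represents up to three observations.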

## Step 2: Look for indicators of nonnormal or unusual data

Skewed data and multi-modal data indicate that the data may be nonnormal. Outliers may indicate other conditions in your data.

### Skewed data

When data are skewed, the majority of the data are located on the high or low side of the graph. Skewness indicates that the data may not be normally distributed. Often, skewness is easiest to detect with a histogram or a boxplot.

The following dotplots are skewed. The dotplot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a few wait times are long. The dotplot with left-skewed data shows failure time data. A few items fail immediately and many more items fail later.

Some analyses assume that your data come from a normal distribution. If your data are skewed (nonnormal), read the data considerations topic for the analysis to make sure that you can use data that are not normal.
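A numeric check can complement the visual impression of skewness. The sketch below computes the moment-based sample skewness coefficient (third central moment divided by the second central moment raised to the power 1.5), which is one common definition among several. The `skewness` function and both data sets are hypothetical.

```python
def skewness(values):
    """Moment-based skewness: positive for right skew, negative for left skew."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical wait times: mostly short, a few long (right-skewed).
wait_times = [1, 1, 2, 2, 2, 3, 3, 4, 6, 10]
# Hypothetical failure times: a few early failures, most later (left-skewed).
failure_times = [11 - t for t in wait_times]

print(skewness(wait_times) > 0)     # long tail on the high side
print(skewness(failure_times) < 0)  # long tail on the low side
```

A value near zero suggests a roughly symmetric sample. The coefficient alone does not establish normality, so treat it as a supplement to the graph.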

### Outliers

Outliers, which are data values that are far away from other data values, can strongly affect your results. On a dotplot, unusually low or high data values identify possible outliers.

Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values that are associated with abnormal, one-time events (special causes). Then, repeat the analysis.
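The text above identifies outliers visually. As a supplement, the sketch below flags candidate outliers with the 1.5 × IQR rule that boxplots commonly use. That rule is an assumption brought in for illustration, not something the dotplot applies, and the function name and data are hypothetical.

```python
import statistics

def flag_possible_outliers(values, k=1.5):
    """Return values beyond k * IQR from the quartiles (a screening rule only)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical wait times in minutes; 19.0 sits far from the rest.
wait_times = [5.0, 5.5, 6.0, 6.0, 6.5, 7.0, 7.5, 19.0]
suspects = flag_possible_outliers(wait_times)
print(suspects)

# Remove a value only after confirming that it reflects an error or a special cause,
# and then repeat the analysis on the remaining data.
cleaned = [v for v in wait_times if v not in suspects]
print(flag_possible_outliers(cleaned))
```

A flagged value is only a prompt to investigate. Whether to correct or remove it depends on the cause you find.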

### Multi-modal data

Multi-modal data have multiple peaks, also called modes. Multi-modal data often indicate that important variables are not yet accounted for.

If you have additional information that allows you to classify the observations into groups, you can create a group variable with this information. Then, you can create the graph with groups to determine whether the group variable accounts for the peaks in the data.

For example, a manager at a bank collects wait time data and creates a simple dotplot. The dotplot appears to have two peaks. Upon further investigation, the manager determines that the wait times for customers who are cashing checks are shorter than the wait times for customers who are applying for home equity loans. The manager adds a group variable for customer task, and then creates a dotplot with groups. The dotplot with groups shows that the peaks correspond to the two groups.
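The bank example can be sketched in code: split the observations by the group variable and locate the peak within each group. The records, task labels, bin width, and helper names below are hypothetical and stand in for the manager's data.

```python
from collections import Counter, defaultdict

def peak_bin(values, bin_width):
    """Center of the bin that holds the most values."""
    bins = Counter(round(v / bin_width) * bin_width for v in values)
    return bins.most_common(1)[0][0]

def peaks_by_group(records, bin_width):
    """Map each group label to the peak bin of its values."""
    groups = defaultdict(list)
    for label, value in records:
        groups[label].append(value)
    return {label: peak_bin(vals, bin_width) for label, vals in groups.items()}

# Hypothetical (customer task, wait time in minutes) records.
records = [
    ("cash check", 2.0), ("cash check", 3.0), ("cash check", 3.0), ("cash check", 4.0),
    ("home equity loan", 11.0), ("home equity loan", 12.0),
    ("home equity loan", 12.0), ("home equity loan", 13.0),
]
peaks = peaks_by_group(records, bin_width=1.0)
print(peaks)
```

If each group shows a single peak at a different location, the group variable likely accounts for the multiple peaks in the combined data.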

## Step 3: Assess and compare groups

If your dotplot has groups, assess and compare the center and spread of groups.
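Summary statistics per group can support the visual comparison. The sketch below reports the mean as a measure of center and the standard deviation as a measure of spread for each group. The version labels and completion times are hypothetical.

```python
import statistics

def summarize_groups(groups):
    """Return {label: (mean, sample standard deviation)} for each group."""
    return {
        label: (statistics.mean(values), statistics.stdev(values))
        for label, values in groups.items()
    }

# Hypothetical completion times in minutes for two application versions.
completion_times = {
    "Version A": [10, 12, 14],
    "Version B": [15, 17, 19],
}
summary = summarize_groups(completion_times)
for label, (center, spread) in summary.items():
    print(f"{label}: mean = {center}, standard deviation = {spread:.1f}")
```

These summaries describe the samples only. They do not show whether a difference between groups is statistically significant.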

### Centers

Look for differences between the centers of the groups. For example, the following dotplot shows the completion time for four versions of a credit card application. The mean completion times for some versions seem to be different.

To determine whether a difference in means is statistically significant, do one of the following: