Examine the center and spread of the distribution. Assess how the sample size may affect the appearance of the individual value plot.
Center and spread
Identify the densest clusters of symbols. The densest clusters represent the most common values. Assess the spread of each group to understand how much your data varies. Hold the pointer over any data point for a tooltip that describes the observation. For example, the following individual value plot shows the diameters of plastic pipes. The spread increases each week.
Investigate any surprising or undesirable characteristics on the individual value plot. For example, an individual value plot of hardness measurements from a shipment of ball bearings shows a wider-than-normal spread of values. An investigation reveals that a change in the ball bearing manufacturing process caused the increase in variability.
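To back up what you see on the plot, you can summarize the center and spread of each group numerically. The following is a minimal sketch that uses only the Python standard library. The pipe diameters are hypothetical values chosen for illustration, and `center_and_spread` is an illustrative helper name, not part of any statistical package.

```python
import statistics

# Hypothetical pipe diameters by week (illustrative values only).
diameters = {
    "Week 1": [5.00, 5.01, 4.99, 5.00, 5.02],
    "Week 2": [5.03, 4.97, 5.00, 5.05, 4.95],
    "Week 3": [5.08, 4.92, 5.00, 5.10, 4.90],
}


def center_and_spread(values):
    """Return (mean, sample standard deviation, range) for one group."""
    return (
        statistics.mean(values),
        statistics.stdev(values),
        max(values) - min(values),
    )


summary = {week: center_and_spread(vals) for week, vals in diameters.items()}

for week, (mean, sd, value_range) in summary.items():
    print(f"{week}: mean={mean:.3f}  stdev={sd:.3f}  range={value_range:.3f}")
```

In this made-up data set the means stay close to 5.00 while the standard deviation and range grow from week to week, which is the numeric counterpart of a spread that increases on the plot.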
Sample size (N)
The sample size can affect the appearance of the graph. For example, although the following individual value plots seem quite different, both of them were created using randomly selected samples of data from the same population.
An individual value plot works best when the sample size is less than approximately 50. If the sample is too large, the data points on the plot may be too densely packed together and the distribution may be difficult to assess. If the sample size is greater than 50, consider using Boxplot or Histogram instead.
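The effect of sample size can be explored with a small simulation. The sketch below draws a small and a large random sample from the same hypothetical population (normal, mean 100, standard deviation 10) and applies the guideline of approximately 50 observations. The population, the seed, and the helper names `draw_sample` and `suggest_graph` are assumptions made for illustration. Because the guideline is approximate, treating exactly 50 as still suitable for an individual value plot is an arbitrary choice.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def draw_sample(n):
    """Draw n values from a hypothetical normal population (mean 100, sd 10)."""
    return [rng.gauss(100, 10) for _ in range(n)]


def suggest_graph(n, threshold=50):
    """Suggest a graph type from the sample size, using the ~50 guideline."""
    # The guideline is approximate; n == threshold is treated as small here.
    return "individual value plot" if n <= threshold else "boxplot or histogram"


small = draw_sample(15)
large = draw_sample(200)

for name, sample in (("small", small), ("large", large)):
    print(
        f"{name}: N={len(sample)}"
        f"  mean={statistics.mean(sample):.1f}"
        f"  stdev={statistics.stdev(sample):.1f}"
        f"  suggested graph: {suggest_graph(len(sample))}"
    )
```

Both samples come from the same population, yet their means and standard deviations usually differ, and the smaller sample tends to stray further from the population values.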
Step 2: Look for indicators of nonnormal or unusual data
Skewed data and multi-modal data indicate that data may be nonnormal. Outliers may indicate other conditions in your data.
Skewed data
When data are skewed, the majority of the data are located on the high or low side of the graph. Skewness indicates that the data may not be normally distributed. Often, skewness is easiest to detect with a histogram or a boxplot.
The following individual value plots show skewed data. The individual value plot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a few wait times are long. The individual value plot with left-skewed data shows failure time data. A few items fail immediately and many more items fail later.
Some analyses assume that your data come from a normal distribution. If your data are skewed (nonnormal), read the data considerations topic for the analysis to make sure that you can use data that are not normal.
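If you want a number to go with the visual impression of skewness, one common choice is the moment-based skewness coefficient, which is positive for right-skewed data and negative for left-skewed data. The sketch below is one way to compute it with the Python standard library. The wait times and failure times are hypothetical, and `skewness` is an illustrative helper, not a library function.

```python
import statistics


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data: mostly short waits with a few long ones (right-skewed).
wait_times = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12, 20]
# Hypothetical data: a few early failures, most items fail later (left-skewed).
failure_times = [2, 10, 45, 50, 52, 55, 56, 58, 60, 60, 61]

for name, data in (("wait times", wait_times), ("failure times", failure_times)):
    print(
        f"{name}: mean={statistics.fmean(data):.1f}"
        f"  median={statistics.median(data)}"
        f"  skewness={skewness(data):.2f}"
    )
```

A mean that is noticeably larger than the median also points to right skew, and a mean that is noticeably smaller than the median points to left skew.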
Outliers
Outliers, which are data values that are far away from other data values, can strongly affect your results. On an individual value plot, unusually low or high data values indicate possible outliers.
Hold the pointer over the outlier to identify the data point.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values that are associated with abnormal, one-time events (special causes). Then, repeat the analysis.
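An individual value plot does not mark outliers for you, so a simple numeric screen can help you decide which points to investigate. The sketch below uses the 1.5 × IQR rule that boxplots commonly use. This rule is a substitute technique chosen for illustration and is not something the individual value plot itself applies. The hardness values and the helper name `flag_outliers` are hypothetical.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return (index, value) pairs that fall outside Q1 - k*IQR or Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, x) for i, x in enumerate(values) if x < low or x > high]


# Hypothetical hardness measurements; one value is suspiciously high.
hardness = [62.1, 61.8, 62.4, 62.0, 61.9, 62.3, 62.2, 68.5, 62.0, 61.7]

flagged = flag_outliers(hardness)
print("Possible outliers (index, value):", flagged)

# Repeat the summary without the flagged points to see how much they matter.
flagged_indexes = {i for i, _ in flagged}
cleaned = [x for i, x in enumerate(hardness) if i not in flagged_indexes]
print(f"mean with all data:     {statistics.fmean(hardness):.2f}")
print(f"mean without outliers:  {statistics.fmean(cleaned):.2f}")
```

A flagged value is only a candidate. Remove it only after you have identified a cause, such as a data-entry error or a special cause.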
Multi-modal data
Multi-modal data have multiple clusters, also called modes. Multi-modal data often indicate that important variables are not yet accounted for.
If you have additional information that allows you to classify the observations into groups, you can create a group variable with this information. Then, you can create the graph with groups to determine whether the group variable accounts for the peaks in the data.
For example, a manager at a bank collects wait time data and creates a simple individual value plot. The individual value plot appears to have two clusters. The manager investigates the wait times and discovers that customers who cash a check wait less time than customers who apply for a loan. The manager adds a group variable for customer task, and then creates an individual value plot with groups. The individual value plot with groups shows that the clusters correspond to the two groups.
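The same check can be made numerically by splitting the observations on the group variable and summarizing each group. The sketch below assumes each observation is recorded as a (task, wait time) pair. The neutral labels "A" and "B" stand in for the two customer tasks, and all values are hypothetical.

```python
from collections import defaultdict
import statistics

# Hypothetical wait times in minutes, each tagged with a customer task label.
observations = [
    ("A", 2.0), ("B", 9.5), ("A", 2.5), ("A", 3.1), ("B", 10.2),
    ("B", 8.7), ("A", 1.8), ("B", 11.0), ("A", 2.9), ("B", 9.1),
]

# Build the group variable: one list of wait times per task.
groups = defaultdict(list)
for task, minutes in observations:
    groups[task].append(minutes)

for task in sorted(groups):
    values = groups[task]
    print(
        f"task {task}: n={len(values)}"
        f"  min={min(values)}  max={max(values)}"
        f"  mean={statistics.fmean(values):.2f}"
    )
```

When the value ranges of the groups barely overlap, as they do in this made-up data, the group variable is a plausible explanation for the separate clusters.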
Step 3: Assess and compare groups
If your individual value plot has groups, assess and compare the center and spread of the groups.
Look for differences between the centers of the groups. For example, the following individual value plot shows the thickness of wire from three suppliers. The mean symbols and the mean connect lines show the center of the distribution for each group and the differences between groups. The centers for wire thicknesses from each supplier seem to be different.
To determine whether a difference in means is statistically significant, use a 2-sample t-test if you have only two groups, or one-way ANOVA if you have three or more groups.
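For three or more groups, such as wire from three suppliers, one common test of the means is one-way ANOVA. The sketch below computes only the F statistic and its degrees of freedom by hand, using the Python standard library. The thickness values and the helper name `one_way_f` are hypothetical. Converting F to a p-value requires the F distribution, which the standard library does not provide, so in practice you would rely on statistical software for that step.

```python
import statistics


def one_way_f(groups):
    """Return (F statistic, df between, df within) for one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical wire thickness measurements from three suppliers.
wire = [
    [1.52, 1.49, 1.51, 1.50, 1.53],
    [1.58, 1.61, 1.57, 1.60, 1.59],
    [1.45, 1.47, 1.44, 1.46, 1.48],
]

f_stat, df_b, df_w = one_way_f(wire)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F statistic means that the group means differ by more than the variation within the groups would suggest, but only the p-value tells you whether the difference is statistically significant.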
Look for differences between the spreads of the groups. For example, the following individual value plot shows the fill weights of cereal boxes from three production lines. Although the distributions of weight have almost the same center, the weights of some groups are more variable than others.
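To put numbers on a difference in spread, you can compare the sample variances of the groups. The sketch below uses hypothetical fill weights for three production lines that share the same center but differ in variability. The ratio of the largest to the smallest variance is only a descriptive indicator, not a significance test. A formal test for equal variances is needed to judge statistical significance.

```python
import statistics

# Hypothetical cereal box fill weights (grams) from three production lines.
fill_weights = {
    "Line 1": [499.0, 500.0, 501.0, 500.0, 500.0],
    "Line 2": [497.0, 500.0, 503.0, 500.0, 500.0],
    "Line 3": [494.0, 500.0, 506.0, 500.0, 500.0],
}

means = {line: statistics.fmean(v) for line, v in fill_weights.items()}
variances = {line: statistics.variance(v) for line, v in fill_weights.items()}

for line in fill_weights:
    print(f"{line}: mean={means[line]:.1f}  variance={variances[line]:.2f}")

# Descriptive indicator only: how much larger is the largest variance?
variance_ratio = max(variances.values()) / min(variances.values())
print(f"largest / smallest variance = {variance_ratio:.1f}")
```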