Mean

Use the mean to describe an entire set of observations with a single value representing the center of the data. Many statistical analyses use the mean as a standard reference point. The mean is the sum of all observations divided by the number of observations.

For example, the waiting times (in minutes) of five customers in a bank are: 3, 2, 4, 1, and 2. The mean waiting time is:

(3 + 2 + 4 + 1 + 2) / 5 = 12 / 5 = 2.4 minutes

On average, a customer waits 2.4 minutes for service at the bank.
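
The same calculation can be sketched in a few lines of Python. This is only an illustration of the definition (sum divided by count), not how Minitab computes it; the function name `mean` and the variable names are chosen here for the example.

```python
# Waiting times (in minutes) of the five bank customers from the example.
waiting_times = [3, 2, 4, 1, 2]


def mean(values):
    """Return the sum of all observations divided by the number of observations."""
    if not values:
        raise ValueError("mean requires at least one observation")
    return sum(values) / len(values)


mean_wait = mean(waiting_times)
print(f"Mean waiting time: {mean_wait} minutes")
```

Python's standard library offers `statistics.mean`, which gives the same result for this data.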

Median

Use the median to describe an entire set of observations with a single value representing the center of the data. Half of the observations are above the median, and half are below it. It is determined by ranking the data and finding observation number [N + 1] / 2. If there is an even number of observations, the median is interpolated as the value midway between observation numbers N / 2 and [N / 2] + 1.

For this ordered data, the median is 13. That is, 50% of the values are less than or equal to 13, and 50% of the values are greater than or equal to 13.
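
As a rough sketch, the ranking rule can be written out in Python as follows. The function name `median_by_rank` is made up for this illustration. The values are the bank waiting times from the mean example plus one invented sixth value to show the even case; they are not the ordered data referred to above.

```python
def median_by_rank(values):
    """Median by the rank rule: observation (N + 1) / 2 for odd N,
    or midway between observations N / 2 and N / 2 + 1 for even N."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one observation")
    if n % 2 == 1:
        # Ranks are 1-based in the rule; Python lists are 0-based.
        return ordered[(n + 1) // 2 - 1]
    lower = ordered[n // 2 - 1]   # observation number N / 2
    upper = ordered[n // 2]       # observation number N / 2 + 1
    return (lower + upper) / 2


odd_median = median_by_rank([3, 2, 4, 1, 2])       # ranked: 1, 2, 2, 3, 4
even_median = median_by_rank([3, 2, 4, 1, 2, 5])   # ranked: 1, 2, 2, 3, 4, 5
print(odd_median, even_median)
```

With five values the median is the third ranked value; with six values it is midway between the third and fourth.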

Mode

The mode is the value that occurs most frequently in a set of observations. Minitab also displays how many data points equal the mode. The mode may be used with the mean and median to give an overall characterization of your data distribution. While the mean and median require a calculation, the mode is found simply by counting the number of times each value occurs in a data set.

Identifying the mode can help you understand your distribution. A distribution with more than one mode may indicate that you have sampled from a mixed population. For example, you may have collected wait time data on customers who are cashing checks together with customers who are applying for home equity loans. To better understand your data, collect and analyze these two groups separately. A distribution with two modes is bimodal; if you have more than two modes, the distribution is multimodal.

Unimodal

There is only one mode, 8, that occurs most frequently.

Bimodal

There are two modes, 4 and 16. The data seem to represent two different populations.
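
Finding the mode by counting can be sketched in Python as below. The function name `modes_with_count` and both data sets are invented for illustration (the second is chosen so that its modes are 4 and 16); the sketch returns every value tied for the highest count, which is an assumption about how ties should be reported rather than a description of Minitab's output.

```python
from collections import Counter


def modes_with_count(values):
    """Return (list of modes, number of times each mode occurs)."""
    if not values:
        raise ValueError("mode requires at least one observation")
    counts = Counter(values)              # value -> number of occurrences
    highest = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == highest)
    return modes, highest


# Hypothetical unimodal data: a single most frequent value.
unimodal = modes_with_count([3, 2, 4, 1, 2])

# Hypothetical bimodal data: two values tie for the highest count.
bimodal = modes_with_count([4, 4, 4, 8, 16, 16, 16])

print(unimodal)
print(bimodal)
```

When the returned list contains more than one value, that is a hint to check whether the data mix observations from different populations.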

Trimmed Mean

The trimmed mean is the mean of the data, without the highest 5% and lowest 5% of the values. Use the trimmed mean to eliminate the impact of very large or very small values on the mean. When the data contain outliers, the trimmed mean may be a better measure of central tendency than the mean.

The blue line represents the original mean, which is strongly influenced by the extreme values on the far right. The red line represents the trimmed mean, which shifts to the left because Minitab excludes the extreme values in the highest 5% of the data.
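
A minimal sketch of a 5% trimmed mean in Python follows. The function name `trimmed_mean`, the sample data, and in particular the rule used to turn 5% of the sample size into a whole number of values (round to the nearest integer, halves rounded up) are assumptions made for this example; check the Minitab documentation for the exact rule it applies.

```python
def trimmed_mean(values, percent=5):
    """Mean after dropping the lowest and highest `percent` of the values.

    Assumption: the number of values removed from each end is
    len(values) * percent / 100, rounded to the nearest integer.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Integer arithmetic avoids floating-point surprises when rounding.
    k = (n * percent + 50) // 100
    if 2 * k >= n:
        raise ValueError("not enough observations left after trimming")
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)


# Hypothetical data: the values 1 to 19 plus one extreme value on the right.
data = list(range(1, 20)) + [100]

plain = sum(data) / len(data)
trimmed = trimmed_mean(data)
print(plain, trimmed)
```

With 20 values, one value is removed from each end, so the extreme value 100 no longer pulls the result to the right.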

Using measures of central tendency to describe skewed distributions

The center of the data is the area where most values in a data set cluster. Central tendency can be described by a number of different statistics, like the mean, trimmed mean, median, or mode. Knowing the central tendency of your data is an important first step in understanding it.

Graphs like histograms, boxplots, and dotplots are useful for visualizing the central tendency of your data and can help you decide which central tendency statistic is most appropriate for a given data set.

In a very large normally distributed data set, different measures of center are all essentially the same.

However, as distributions stray from normal, these statistics begin to separate. In this example, the reference lines (from left to right) represent the median, trimmed mean, and mean. In this case, the median is most appropriate but it may not always be.

Likewise, as distributions stray from normal and become more skewed, the standard deviation becomes a less accurate description of the distance between the mean and a typical data value.

The interquartile range is a better measure of spread for highly skewed data than the standard deviation is because the interquartile range is not affected by extreme values.
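
The difference in sensitivity can be illustrated with Python's standard `statistics` module. The data sets and the helper name `iqr` are made up for this sketch, and the quartiles come from `statistics.quantiles` with its default method; quartile definitions vary between packages, so the numbers may not match Minitab exactly.

```python
import statistics


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


# Hypothetical data: identical except that the largest value becomes extreme.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
with_extreme = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100]

sd_base = statistics.stdev(base)
sd_extreme = statistics.stdev(with_extreme)
iqr_base = iqr(base)
iqr_extreme = iqr(with_extreme)

print(f"Standard deviation: {sd_base:.2f} -> {sd_extreme:.2f}")
print(f"Interquartile range: {iqr_base} -> {iqr_extreme}")
```

The standard deviation grows sharply when one value becomes extreme, while the interquartile range stays the same because the quartiles depend only on the middle of the ranked data.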

Comparing the mean and the median

If your data are symmetric, the measures of central tendency (mean and median) will be roughly the same. If the data are asymmetric, the measures may be pulled toward the more extreme observations. Of the measures, the mean is more influenced by extreme values and the median is less influenced.

For example, this distribution is positively skewed. Notice that the mean (X) is pulled to the right in the direction of the skew. The median (Y) is farther left, closer to the majority of the observations. In this case, the median may be a better way to describe the center of the data than the mean is.
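
A small numeric illustration of the same effect, using made-up positively skewed data and Python's standard `statistics` module:

```python
import statistics

# Hypothetical positively skewed data: most values are small, one is large.
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]

skewed_mean = statistics.mean(skewed)
skewed_median = statistics.median(skewed)

print(f"Mean: {skewed_mean}, median: {skewed_median}")
```

The single large value pulls the mean above nine of the ten observations, while the median stays among the majority of the data.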

How can I display these statistics?

You can use Display Descriptive Statistics to display the mean, median, mode, and trimmed mean. For example, suppose you want to display the mode for the values in C1.
  1. Choose Stat > Basic Statistics > Display Descriptive Statistics.
  2. In Variables enter C1.
  3. Click Statistics. Check Mode (and any other statistic you may want).
  4. Click OK in each dialog box.