Interpret all statistics and graphs for Kruskal-Wallis Test

Find definitions and interpretation guidance for every statistic and graph that is provided with Kruskal-Wallis.

Average rank (Ave Rank)

The average rank is the average of the ranks for all observations within each sample. Minitab uses the average rank to calculate the H statistic, which is the test statistic for the Kruskal-Wallis test.

To calculate the average rank, Minitab ranks the combined samples. Minitab assigns the smallest observation a rank of 1, the second smallest observation a rank of 2, and so on. If two or more observations are tied, Minitab assigns each of them the average of the ranks that the tied observations would otherwise occupy. Minitab then calculates the average rank for each sample.
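The ranking procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not Minitab's implementation; the function name `average_ranks` and the dictionary input format are assumptions made for the example.

```python
from collections import defaultdict


def average_ranks(samples):
    """Return {group: average rank} after ranking the combined samples.

    `samples` maps a group label to a list of observations (hypothetical
    input format). Tied observations share the mean of the ranks they occupy.
    """
    # Pool every observation with its group label and sort by value only.
    pooled = sorted(
        ((value, group) for group, values in samples.items() for value in values),
        key=lambda pair: pair[0],
    )

    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j to the end of the run of values tied with pooled[i].
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        tied_rank = ((i + 1) + (j + 1)) / 2  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = tied_rank
        i = j + 1

    # Sum the ranks per group, then divide by the group size.
    rank_sums = defaultdict(float)
    for (_, group), rank in zip(pooled, ranks):
        rank_sums[group] += rank
    return {group: rank_sums[group] / len(values) for group, values in samples.items()}


result = average_ranks({"A": [1, 3, 5], "B": [2, 3, 8]})
```

In this example the two observations equal to 3 would occupy ranks 3 and 4, so each receives a rank of 3.5.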

Interpretation

When a group's average rank is higher than the overall average rank, the observation values in that group tend to be higher than those of the other groups.

Boxplot

A boxplot provides a graphical summary of the distribution of each sample. The boxplot makes it easy to compare the shape, the central tendency, and the variability of the samples.

Interpretation

Use a boxplot to examine the spread of the data and to identify any potential outliers. Boxplots are best when the sample size is greater than 20.

Skewed data

Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Skewed data indicate that the data might not be normally distributed. Often, skewness is easiest to detect with an individual value plot, a histogram, or a boxplot.

Right-skewed
Left-skewed

The boxplot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a few of the wait times are longer. The boxplot with left-skewed data shows failure time data. A few items fail immediately, and many more items fail later.

Data that are severely skewed can affect the validity of the p-value if your sample is small (< 20 values). If your data are severely skewed and you have a small sample, consider increasing your sample size.

Outliers

Outliers, which are data values that are far away from other data values, can strongly affect your results. Often, outliers are easiest to identify on a boxplot.

On a boxplot, asterisks (*) denote outliers.

Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (special causes). Then, repeat the analysis.

DF

The degrees of freedom (DF) equals the number of groups in your data minus 1. Under the null hypothesis, the chi-square distribution with the specified degrees of freedom approximates the distribution of the test statistic. Minitab uses the chi-square distribution to estimate the p-value for this test.
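As a rough illustration of how a p-value follows from a test statistic and DF, the sketch below evaluates the upper tail of the chi-square distribution using only the Python standard library. The function name `chi_square_p_value` is hypothetical, and the approach (a recurrence on the regularized incomplete gamma function for integer DF) is an assumption chosen for the example, not a description of Minitab's internal method.

```python
import math


def chi_square_p_value(h, df):
    """Upper-tail probability P(X >= h) for a chi-square variable with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1), starting from
    Q(1, y) = exp(-y) for even df or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    if df < 1 or h < 0:
        raise ValueError("df must be a positive integer and h must be non-negative")
    y = h / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q


# Three groups give DF = 3 - 1 = 2.
p = chi_square_p_value(5.991464547, df=2)
```

With DF = 2, a statistic of about 5.99 corresponds to a p-value of about 0.05.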

H

H is the test statistic for the Kruskal-Wallis test. Under the null hypothesis, the chi-square distribution approximates the distribution of H. The approximation is reasonably accurate when no group has fewer than five observations.
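The following sketch shows one common textbook form of the statistic, H = 12 / (N(N + 1)) × Σ n_j (R̄_j − (N + 1) / 2)², where n_j and R̄_j are the size and average rank of group j. The function name `kruskal_wallis_h` and the input format are assumptions, and the sketch applies no correction for ties, so it may differ from a tie-adjusted value.

```python
def kruskal_wallis_h(samples):
    """H for a dict {group label: list of observations}; no tie correction."""
    pooled = sorted(v for values in samples.values() for v in values)
    n_total = len(pooled)

    # Map each distinct value to the mean of the 1-based ranks it occupies.
    positions = {}
    for position, value in enumerate(pooled, start=1):
        positions.setdefault(value, []).append(position)
    rank_of = {value: sum(p) / len(p) for value, p in positions.items()}

    # Accumulate n_j * (group average rank - overall average rank)^2.
    overall_avg_rank = (n_total + 1) / 2
    weighted_sq_dev = 0.0
    for values in samples.values():
        avg_rank = sum(rank_of[v] for v in values) / len(values)
        weighted_sq_dev += len(values) * (avg_rank - overall_avg_rank) ** 2

    return 12.0 / (n_total * (n_total + 1)) * weighted_sq_dev


h = kruskal_wallis_h({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
```

In this example the groups do not overlap at all, so the group average ranks (2, 5, and 8) are far from the overall average rank of 5 and H is large relative to a chi-square distribution with 2 degrees of freedom.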

Interpretation

Minitab uses the test statistic to calculate the p-value, which you use to decide whether the differences between the medians are statistically significant. The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

A sufficiently high test statistic indicates that at least one difference between the medians is statistically significant.

You can use the test statistic to determine whether to reject the null hypothesis. However, using the p-value of the test to make the same determination is usually more practical and convenient.

Individual value plot

An individual value plot displays the individual values in each sample. The individual value plot makes it easy to compare the samples. Each circle represents one observation. An individual value plot is especially useful when your sample size is small.

Interpretation

Use an individual value plot to examine the spread of the data and to identify any potential outliers. Individual value plots are best when the sample size is less than 50.

Skewed data

Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Skewed data indicate that the data might not be normally distributed. Often, skewness is easiest to detect with an individual value plot, a histogram, or a boxplot.

Right-skewed
Left-skewed

The individual value plot with right-skewed data shows wait times. Most of the wait times are relatively short, and only a few wait times are longer. The individual value plot with left-skewed data shows failure time data. A few items fail immediately, and many more items fail later.

Outliers

Outliers, which are data values that are far away from other data values, can strongly affect your results. Often, outliers are easy to identify on an individual value plot.

On an individual value plot, unusually low or high data values indicate potential outliers.

Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (special causes). Then, repeat the analysis.

Median

The median is the midpoint of the data set. This midpoint value is the point at which half of the observations are above the value and half of the observations are below the value. The median is determined by ranking the observations. If your data contain an odd number of observations, the median is the observation at position (N + 1) / 2 in the ranked order. If your data contain an even number of observations, the median is the average of the observations that are ranked at positions N / 2 and (N / 2) + 1.
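The position rule above translates directly into code. This is a minimal sketch; the function name `sample_median` is chosen for the example.

```python
def sample_median(observations):
    """Median by the position rule: the middle value for odd N,
    the average of the two middle values for even N."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd N: 1-based position (N + 1) / 2 is 0-based index n // 2.
        return ordered[middle]
    # Even N: 1-based positions N / 2 and N / 2 + 1 are 0-based
    # indexes middle - 1 and middle.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_case = sample_median([3, 1, 2])
even_case = sample_median([4, 1, 3, 2])
```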

Interpretation

The sample median is an estimate of the population median of each group. The overall median is the median of all observations.

N

The sample size (N) is the total number of observations in each group.

Interpretation

The sample size affects the confidence interval and the power of the test.

Usually, a larger sample yields a narrower confidence interval. A larger sample size also gives the test more power to detect a difference. For more information, go to What is power?

P-value

The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.

Interpretation

Use the p-value to determine whether any of the differences between the medians are statistically significant.

To determine whether any of the differences between the medians are statistically significant, compare the p-value to your significance level to assess the null hypothesis. The null hypothesis states that the population medians are all equal. Usually, a significance level (denoted as α or alpha) of 0.05 works well. A significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.
P-value ≤ α: The differences between some of the medians are statistically significant
If the p-value is less than or equal to the significance level, you reject the null hypothesis and conclude that not all the group medians are equal. Use your specialized knowledge to determine whether the differences are practically significant. For more information, go to Statistical and practical significance.
P-value > α: The differences between the medians are not statistically significant
If the p-value is greater than the significance level, you do not have enough evidence to reject the null hypothesis that the group medians are all equal. Verify that your test has enough power to detect a difference that is practically significant. For more information, go to Increase the power of a hypothesis test.

Z

The z-value indicates how the average rank for each group compares to the average rank of all observations.

Interpretation

Interpret the z-values for each group as follows:
  • The higher the absolute value, the further a group's average rank is from the overall average rank.
  • A negative z-value indicates that a group's average rank is less than the overall average rank.
  • A positive z-value indicates that a group's average rank is greater than the overall average rank.
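To make the comparison concrete, the sketch below standardizes each group's average rank against the overall average rank (N + 1) / 2. The standard error used here, sqrt((N + 1)(N / n_j − 1) / 12), is the standard deviation of a group's average rank under the null hypothesis when there are no ties; treating it as the formula behind the reported z-values is an assumption, as is the function name `rank_z_values`.

```python
import math


def rank_z_values(average_ranks, group_sizes):
    """Return {group: z} given each group's average rank and size.

    Assumes at least two groups and no correction for ties.
    """
    n_total = sum(group_sizes.values())
    overall_avg_rank = (n_total + 1) / 2
    z_values = {}
    for group, avg_rank in average_ranks.items():
        n_j = group_sizes[group]
        # Standard deviation of the mean of n_j ranks drawn without
        # replacement from 1..N.
        std_error = math.sqrt((n_total + 1) * (n_total / n_j - 1) / 12)
        z_values[group] = (avg_rank - overall_avg_rank) / std_error
    return z_values


z = rank_z_values({"A": 2.0, "B": 5.0, "C": 8.0}, {"A": 3, "B": 3, "C": 3})
```

Here group A has a negative z-value because its average rank is below the overall average rank of 5, group B has a z-value of 0, and group C has a positive z-value of the same magnitude as group A.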