A boxplot provides a graphical summary of the distribution of each sample. The boxplot makes it easy to compare the shape, the central tendency, and the variability of the samples.
Use a boxplot to examine the spread of the data and to identify any potential outliers. Boxplots are best when the sample size is greater than 20.
Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Skewed data indicate that the data might not be normally distributed. Often, skewness is easiest to detect with an individual value plot, a histogram, or a boxplot.
Data that are severely skewed can affect the validity of the p-value if your sample is small (fewer than 20 values). If your data are severely skewed and you have a small sample, consider increasing your sample size.
Outliers, which are data values that are far away from other data values, can strongly affect your results. Often, outliers are easiest to identify on a boxplot.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (special causes). Then, repeat the analysis.
The chi-square statistic is calculated from a table composed of cells that are based on the groups in your data and the groups' corresponding N≤ values and N> values. Minitab calculates each cell's value as the square of the difference between the observed and expected values for a cell, divided by the expected value for that cell. The chi-square statistic is the sum of these values.
A higher chi-square value indicates that the differences between the observed and expected values are larger. A sufficiently large chi-square value indicates that at least one difference between the medians is statistically significant. Minitab uses the chi-square statistic, in conjunction with the chi-square distribution, to calculate the p-value.
You can use the chi-square statistic to determine whether to reject the null hypothesis. However, using the p-value of the test to make the same determination is usually more practical and convenient.
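As a minimal sketch of the calculation described above, the following Python computes a chi-square statistic from a table of N≤ and N> counts. The function name `chi_square_statistic` and the example counts are hypothetical, not Minitab output. The sketch assumes that the expected count for a cell is (row total × column total) / grand total, the usual form for a chi-square test of association.

```python
def chi_square_statistic(n_le, n_gt):
    """Chi-square statistic for a two-column table of counts per group.

    n_le[i]: observations in group i that are <= the overall median
    n_gt[i]: observations in group i that are >  the overall median
    """
    total_le = sum(n_le)
    total_gt = sum(n_gt)
    grand_total = total_le + total_gt

    chi_square = 0.0
    for le, gt in zip(n_le, n_gt):
        group_total = le + gt
        # Assumed expected count: row total * column total / grand total
        expected_le = group_total * total_le / grand_total
        expected_gt = group_total * total_gt / grand_total
        # Each cell contributes (observed - expected)^2 / expected
        chi_square += (le - expected_le) ** 2 / expected_le
        chi_square += (gt - expected_gt) ** 2 / expected_gt
    return chi_square


# Hypothetical counts for three groups (not taken from a real data set)
example_n_le = [8, 5, 2]
example_n_gt = [2, 5, 8]
example_chi_square = chi_square_statistic(example_n_le, example_n_gt)
```

In this example, every expected count is 5, so the first and third groups each contribute 3.6 to the statistic and the second group contributes 0.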
The degrees of freedom (DF) equals the number of groups in your data minus 1. Under the null hypothesis, the chi-square distribution approximates the distribution of the test statistic, with the specified degrees of freedom. Minitab uses the chi-square distribution to estimate the p-value for this test.
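The relationship between the statistic, the degrees of freedom, and the p-value can be sketched in Python as follows. This is an illustration under stated assumptions, not Minitab's implementation: the function name `chi_square_sf`, the number of groups, and the statistic value 7.2 are hypothetical, and the upper-tail probability uses the closed-form finite sums that hold for integer degrees of freedom.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with
    integer degrees of freedom df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even DF: exp(-x/2) * sum over i = 0 .. df/2 - 1 of (x/2)^i / i!
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd DF: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum over i = 1 .. (df-1)/2 of (x/2)^(i - 1/2) / Gamma(i + 1/2)
    terms = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms


number_of_groups = 3               # hypothetical number of groups
df = number_of_groups - 1          # DF = number of groups - 1
p_value = chi_square_sf(7.2, df)   # 7.2 is an illustrative statistic
```

With 2 degrees of freedom, the upper-tail probability reduces to exp(-x/2), so a larger statistic always yields a smaller p-value.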
An individual value plot displays the individual values in each sample. The individual value plot makes it easy to compare the samples. Each circle represents one observation. An individual value plot is especially useful when your sample size is small.
Use an individual value plot to examine the spread of the data and to identify any potential outliers. Individual value plots are best when the sample size is less than 50.
Examine the spread of your data to determine whether your data appear to be skewed. When data are skewed, the majority of the data are located on the high or low side of the graph. Skewed data indicate that the data might not be normally distributed. Often, skewness is easiest to detect with an individual value plot, a histogram, or a boxplot.
Outliers, which are data values that are far away from other data values, can strongly affect your results. Often, outliers are easy to identify on an individual value plot.
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (special causes). Then, repeat the analysis.
The interquartile range (Q3 – Q1) measures the spread of the data in each group. The interquartile range is the distance between the 75th percentile (Q3) and the 25th percentile (Q1), so it covers the middle 50% of the data.
Interquartile ranges that differ substantially indicate that the groups do not have the same spread. This condition suggests that the data may not satisfy the assumption for Mood's median test that the groups have the same shape and spread.
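To compare the spread of groups numerically, the interquartile range of each group can be computed as in the following Python sketch. The function name and the two groups are hypothetical. The sketch assumes that quartiles are located at position p × (N + 1) in the ranked data, which is what `method="exclusive"` does in Python's `statistics.quantiles`; other quartile conventions give slightly different values, so results may not match other software exactly.

```python
import statistics


def interquartile_range(values):
    """Return Q3 - Q1 for one group of observations."""
    # method="exclusive" places quartiles at position p * (N + 1) in the
    # ranked data and interpolates between neighboring observations.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return q3 - q1


# Hypothetical groups, for illustration only
group_a = [1, 2, 3, 4, 5, 6, 7]
group_b = [1, 1, 2, 4, 6, 9, 12]
iqr_a = interquartile_range(group_a)
iqr_b = interquartile_range(group_b)
```

Here the interquartile range of `group_b` is twice that of `group_a`, which is the kind of difference that suggests the groups do not have the same spread.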
An interval plot displays confidence intervals for the groups in your data. These confidence intervals (CI) are ranges of values that are likely to contain the true median of each population.
Because samples are random, two samples from a population are unlikely to yield identical confidence intervals. However, if you repeat the sampling many times, a certain percentage of the resulting confidence intervals contain the unknown population parameter. The percentage of these confidence intervals that contain the parameter is the confidence level of the interval.
Use the confidence interval to assess the estimate of the population median for each group.
For example, with a 95% confidence level, you can be 95% confident that the confidence interval contains the population median for the group. The confidence interval helps you assess the practical significance of your results. Use your specialized knowledge to determine whether the confidence interval includes values that have practical significance for your situation. If the interval is too wide to be useful, consider increasing your sample size.
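The repeated-sampling meaning of the confidence level can be illustrated with a small simulation. The following Python sketch is an assumption-laden illustration and not necessarily the method that Minitab uses: it builds a simple order-statistic interval for the median from binomial probabilities, draws many samples from a hypothetical standard normal population (true median 0), and counts how often the interval contains the true median. The function name, sample size, number of trials, and seed are all hypothetical.

```python
import math
import random


def median_confidence_interval(sample, confidence=0.95):
    """Order-statistic interval for the population median.

    Returns (lower, upper, achieved_confidence). The interval trims k
    observations from each end of the ranked sample, where k is the largest
    value that keeps the achieved confidence at or above `confidence`.
    For very small samples, the achieved confidence can be lower.
    """
    ordered = sorted(sample)
    n = len(ordered)

    def binom_cdf(k):
        # P(Binomial(n, 0.5) <= k)
        return sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n

    k = 0
    while k + 1 <= (n - 1) // 2 and 2 * binom_cdf(k + 1) <= 1 - confidence:
        k += 1
    return ordered[k], ordered[n - 1 - k], 1 - 2 * binom_cdf(k)


rng = random.Random(0)   # fixed seed so the run is repeatable
true_median = 0.0        # median of the standard normal population
trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(0, 1) for _ in range(15)]
    lower, upper, _ = median_confidence_interval(sample)
    if lower <= true_median <= upper:
        hits += 1
coverage = hits / trials  # fraction of intervals that contain the true median
```

For a sample of 15 observations, this interval runs from the 4th to the 12th ranked observation and has an achieved confidence of about 96.5%, so the simulated coverage should be close to that value rather than exactly 95%.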
In the interval plot, a temperature of 46 is associated with the heaviest weights. However, you cannot determine from this graph whether any differences are statistically significant. To determine statistical significance, assess the p-value for the test. In this example, the test results are not statistically significant, which indicates that the observed differences could be due to random variation.
The median is the midpoint of the data set. This midpoint value is the point at which half of the observations are above the value and half of the observations are below the value. The median is determined by ranking the observations. If your data contain an odd number of observations, the median is the observation at rank (N + 1) / 2. If your data contain an even number of observations, the median is the average of the observations at ranks N / 2 and (N / 2) + 1.
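The ranking rule above can be written as a short Python sketch. The function name `median_by_rank` is hypothetical; Python's built-in `statistics.median` returns the same values.

```python
def median_by_rank(values):
    """Median found by ranking the observations."""
    ranked = sorted(values)
    n = len(ranked)
    if n == 0:
        raise ValueError("median requires at least one observation")
    if n % 2 == 1:
        # Odd N: the observation at rank (N + 1) / 2 (ranks start at 1)
        return ranked[(n + 1) // 2 - 1]
    # Even N: average of the observations at ranks N / 2 and (N / 2) + 1
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


odd_example = median_by_rank([7, 1, 3])       # ranks: 1, 3, 7
even_example = median_by_rank([4, 1, 3, 2])   # ranks: 1, 2, 3, 4
```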
The sample median is an estimate of the population median of each group. The overall median is the median of all observations.
N≤ (less than or equal to the overall median) is the number of observations in each group that are less than or equal to the overall median. Minitab creates a table with the N≤ values and the N> values. Minitab uses these values to perform the chi-square test of association and to calculate the p-value for the test.
If a group has a large number of observations in this category, the median of the group is likely to be less than the overall median.
N> (greater than the overall median) is the number of observations in each group that are greater than the overall median. Minitab creates a table with the N≤ values and the N> values. Minitab uses these values to perform the chi-square test of association and to calculate the p-value for the test.
If a group has a large number of observations in this category, the median of the group is likely to be greater than the overall median.
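The N≤ and N> counts can be tabulated as in the following Python sketch. The function name, group labels, and data values are hypothetical and serve only to show how each observation is compared with the overall median.

```python
import statistics


def count_about_overall_median(groups):
    """Return the overall median and a table of (N<=, N>) counts per group.

    groups: dict that maps a group label to a list of observations.
    """
    all_values = [v for values in groups.values() for v in values]
    overall_median = statistics.median(all_values)

    table = {}
    for label, values in groups.items():
        n_le = sum(1 for v in values if v <= overall_median)
        n_gt = len(values) - n_le  # every other observation is > the median
        table[label] = (n_le, n_gt)
    return overall_median, table


# Hypothetical data for three groups
example_groups = {
    "A": [1, 2, 3, 4],
    "B": [3, 5, 6, 7],
    "C": [8, 9, 10, 11],
}
overall_median, counts = count_about_overall_median(example_groups)
```

In this example, all observations in group A fall at or below the overall median and all observations in group C fall above it, which matches the interpretation that group A's median is likely below the overall median and group C's median is likely above it.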
The p-value is a probability that measures the evidence against the null hypothesis. Lower probabilities provide stronger evidence against the null hypothesis.
Use the p-value to determine whether any of the differences between the medians are statistically significant.