# Analyzing Data

## Overview

The field of statistics provides principles and methods for collecting, summarizing, and analyzing data, and for interpreting the results. You use statistics to describe data and make inferences. Then, you use the inferences to improve processes and products.

Minitab provides many statistical analyses, such as regression, ANOVA, quality tools, and time series. Built-in graphs help you visualize your data and validate your results. In Minitab, you can also display and store statistics and diagnostic measures.

In this chapter, you assess the number of late orders and back orders, and test whether the differences in delivery times between the three shipping centers are statistically significant.

## Summarize the data

Descriptive statistics summarize and describe the prominent features of data. Use Display Descriptive Statistics to determine how many book orders were delivered on time, how many were late, and how many were initially back ordered for each shipping center.

### Display descriptive statistics

1. Open the sample data, ShippingData.MTW.
2. Choose Stat > Basic Statistics > Display Descriptive Statistics.
3. In Variables, enter Days.
4. In By variables (optional), enter Center Status.
For most Minitab commands, you only need to complete the main dialog box to execute the command. Often, you use sub-dialog boxes to modify the analysis or to display additional output, such as graphs.
5. Click Statistics.
6. Uncheck First quartile, Median, Third quartile, N nonmissing, and N missing.
7. Check N total.
8. Click OK in each dialog box.
###### Note

Changes that you make in the Statistics sub-dialog box affect the current session only. To change the default options for future sessions, choose Tools > Options. Expand Individual Commands and choose Display Descriptive Statistics. Choose the statistics that you want to display. When you open the Statistics sub-dialog box again, it displays your new options.
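
For readers who want to see what this grouped summary computes, the following is a minimal Python sketch, not Minitab's implementation. The rows are hypothetical values invented for illustration (they are not taken from ShippingData.MTW), and the function name `summarize` is an assumption of this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (center, status, days) rows; NOT values from ShippingData.MTW.
# None marks a missing delivery time, as you would expect for a back order.
rows = [
    ("Central", "On time", 3.5),
    ("Central", "On time", 4.5),
    ("Central", "Late", 6.5),
    ("Central", "Back order", None),
    ("Western", "On time", 2.0),
    ("Western", "On time", 3.0),
]

def summarize(rows):
    """Group rows by (center, status); return total count and mean days per group."""
    groups = defaultdict(list)
    for center, status, days in rows:
        groups[(center, status)].append(days)
    summary = {}
    for key, values in groups.items():
        present = [v for v in values if v is not None]
        summary[key] = {
            "n_total": len(values),                      # counts missing values too
            "mean": mean(present) if present else None,  # None stands in for Minitab's *
        }
    return summary

summary = summarize(rows)
for key in sorted(summary):
    print(key, summary[key])
```

The sketch keeps the total count separate from the mean so that a group with only missing delivery times still reports how many orders it contains.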

### Results for Center = Central

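Statistics

| Variable | Status | Total Count | Mean | SE Mean | StDev | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Days | Back order | 6 | * | * | * | * | * |
| | Late | 6 | 6.431 | 0.157 | 0.385 | 6.078 | 7.070 |
| | On time | 93 | 3.826 | 0.119 | 1.149 | 1.267 | 5.983 |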

### Results for Center = Eastern

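Statistics

| Variable | Status | Total Count | Mean | SE Mean | StDev | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Days | Back order | 8 | * | * | * | * | * |
| | Late | 9 | 6.678 | 0.180 | 0.541 | 6.254 | 7.748 |
| | On time | 92 | 4.234 | 0.112 | 1.077 | 1.860 | 5.953 |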

### Results for Center = Western

Statistics

| Variable | Status | Total Count | Mean | SE Mean | StDev | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| Days | Back order | 3 | * | * | * | * | * |
| | On time | 102 | 2.981 | 0.108 | 1.090 | 0.871 | 5.681 |

###### Note

The Session window displays text output, which you can send to Microsoft Word and Microsoft PowerPoint. For more information on sending output to PowerPoint, go to Presenting Results from Minitab.

### Interpret the results

The Session window displays each center’s results separately. Within each center, you can see the number of back orders, late orders, and on-time orders in the Total Count column:

• The Eastern shipping center has the most back orders (8) and late orders (9).
• The Central shipping center has the next most back orders (6) and late orders (6).
• The Western shipping center has the fewest back orders (3) and no late orders.
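
To compare the centers on a common scale, you can turn the counts into shares of each center's orders. The sketch below is only an illustration in Python: the counts are copied from the Total Count column of the output, the zero for Western late orders reflects the absence of a Late row, and the variable names are assumptions of this example.

```python
# Total Count values copied from the Session window output.
# Western has no "Late" row in the output, so its late count is entered as 0.
counts = {
    "Central": {"Back order": 6, "Late": 6, "On time": 93},
    "Eastern": {"Back order": 8, "Late": 9, "On time": 92},
    "Western": {"Back order": 3, "Late": 0, "On time": 102},
}

# Total number of orders per center.
totals = {center: sum(by_status.values()) for center, by_status in counts.items()}

# Orders that were late or back ordered, per center.
problems = {c: s["Back order"] + s["Late"] for c, s in counts.items()}

# Centers ordered from most to fewest problem orders.
ranked = sorted(problems, key=problems.get, reverse=True)

for center in ranked:
    share = problems[center] / totals[center]
    print(f"{center}: {problems[center]} of {totals[center]} orders "
          f"late or back ordered ({share:.1%})")
```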

The Session window output also includes the mean, standard error of the mean, standard deviation, minimum, and maximum of delivery time in days for each center. These statistics are not available for back orders, which have no delivery times, so the output shows an asterisk (*) in their place.
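
The standard error of the mean is the standard deviation divided by the square root of the number of observations. The following Python sketch checks that relationship against the StDev and SE Mean values in the output. It assumes that every order counted in a Late or On time row has a delivery time, so that Total Count equals the number of observations; the function name `se_mean` is an assumption of this example.

```python
import math

def se_mean(stdev, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev / math.sqrt(n)

# (StDev, Total Count, SE Mean as displayed) copied from the output.
# Assumption: Total Count equals the number of nonmissing Days values here.
reported = {
    ("Central", "Late"): (0.385, 6, 0.157),
    ("Central", "On time"): (1.149, 93, 0.119),
    ("Eastern", "Late"): (0.541, 9, 0.180),
    ("Eastern", "On time"): (1.077, 92, 0.112),
    ("Western", "On time"): (1.090, 102, 0.108),
}

for (center, status), (stdev, n, shown) in reported.items():
    computed = round(se_mean(stdev, n), 3)
    print(center, status, "computed:", computed, "displayed:", shown)
```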

## Compare two or more means

One of the most common methods used in statistical analysis is hypothesis testing. Minitab offers many hypothesis tests, including t-tests and ANOVA (analysis of variance). Usually, when you perform a hypothesis test, you assume an initial claim to be true, and then test this claim using sample data.

Hypothesis tests include two hypotheses (claims), the null hypothesis (H0) and the alternative hypothesis (H1). The null hypothesis is the initial claim and is often specified based on previous research or common knowledge. The alternative hypothesis is what you believe might be true.

Given the graphical analysis in the previous chapter and the descriptive analysis above, you suspect that the difference in the average number of delivery days across shipping centers is statistically significant. To verify this, you perform a one-way ANOVA, which tests the equality of two or more means. You also perform a Tukey’s multiple comparison test to see which shipping center means are different. For this one-way ANOVA, delivery days is the response, and shipping center is the factor.
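
As a rough illustration of what a one-way ANOVA computes, the Python sketch below calculates the F statistic as the ratio of between-group to within-group mean squares. It is a textbook calculation under stated assumptions, not Minitab's implementation: the three samples are hypothetical numbers chosen for easy arithmetic, the function name `one_way_anova_f` is invented for this example, and the sketch stops at the F statistic because the Python standard library has no F distribution for the p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples (lists of floats)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical samples, one list per factor level; NOT the shipping data.
samples = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]

f_stat, df_between, df_within = one_way_anova_f(samples)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F statistic means that the group means differ by more than the spread within the groups would suggest, which is what produces a small p-value.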

### Perform an ANOVA

1. Choose Stat > ANOVA > One-Way.
2. Choose Response data are in one column for all factor levels.
3. In Response, enter Days. In Factor, enter Center.
4. Click Comparisons.
5. Under Comparison procedures assuming equal variances, check Tukey.
6. Click OK.
7. Click Graphs. For many statistical commands, Minitab includes graphs that help you interpret the results and assess the validity of statistical assumptions. These graphs are called built-in graphs.
8. Under Data plots, check Interval plot, Individual value plot, and Boxplot of data.
9. Under Residual plots, choose Four in one.
10. Click OK in each dialog box.

### Interpret the Session window output

The decision-making process for a hypothesis test is based on the p-value, which is the probability of obtaining a result at least as extreme as the one observed in the sample, assuming that the null hypothesis is true.

• If the p-value is less than or equal to a predetermined significance level (denoted by α or alpha), then you reject the null hypothesis and claim support for the alternative hypothesis.
• If the p-value is greater than α, then you fail to reject the null hypothesis and cannot claim support for the alternative hypothesis.
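
The decision rule in the two bullets above can be written as a small function. This Python sketch is only an illustration; the function name `decide` and the default `alpha` of 0.05 are assumptions of this example.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    if p_value <= alpha:
        return "reject H0"
    return "fail to reject H0"

# A p-value displayed as 0.000 leads to rejecting H0 at alpha = 0.05.
print(decide(0.000))
# A hypothetical larger p-value does not.
print(decide(0.20))
```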

With an α of 0.05, the p-value in the Analysis of Variance table (displayed as 0.000, which means that it is less than 0.0005) provides enough evidence to conclude that the average delivery times for at least two of the shipping centers are significantly different.

The results of Tukey's test are included in the grouping information table, which highlights the significant and non-significant comparisons. Because each shipping center is in a different group, all shipping centers have average delivery times that are significantly different from each other.

### Interpret the ANOVA graphs

Minitab produced the following graphs:
• Four-in-one residual plot
• Interval plot
• Individual value plot
• Boxplot
• Tukey 95% confidence interval plot

You examine the residual plots first. Then, you examine the interval plot, individual value plot, and boxplot together to assess the equality of the means. Finally, you examine the Tukey 95% confidence interval plot to determine statistical significance.

#### Interpret the residual plots

Use residual plots, which are available with many statistical commands, to verify statistical assumptions.

• Normal Probability Plot: Use this plot to detect nonnormality. Points that approximately follow a straight line indicate that the residuals are normally distributed.
• Histogram: Use this plot to detect multiple peaks, outliers, and nonnormality. Look for a normal histogram, which is approximately symmetric and bell-shaped.
• Versus Fits: Use this plot to detect nonconstant variance, missing higher-order terms, and outliers. Look for residuals that are scattered randomly around zero.
• Versus Order: Use this plot to detect the time dependence of the residuals. Inspect the plot to ensure that the residuals display no obvious pattern.

For the shipping data, the four-in-one residual plots indicate no violations of statistical assumptions. The one-way ANOVA model fits the data relatively well.
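
To make concrete what the residual plots display: in a one-way ANOVA, the fitted value for an observation is the mean of its group, and the residual is the observation minus that fitted value. The Python sketch below computes both. The data are hypothetical and the function name `fits_and_residuals` is an assumption of this example; it is not how Minitab stores these values.

```python
from statistics import mean

def fits_and_residuals(groups):
    """For each observation, return (group name, fitted value, residual).

    groups maps a factor level name to its list of observations.
    The fitted value is the group mean; the residual is observation minus fit.
    """
    result = []
    for name, values in groups.items():
        fit = mean(values)
        for x in values:
            result.append((name, fit, x - fit))
    return result

# Hypothetical delivery times in days; NOT the shipping data.
data = {
    "A": [1.0, 2.0, 3.0],
    "B": [4.0, 6.0],
}

table = fits_and_residuals(data)
for name, fit, residual in table:
    print(name, "fit:", fit, "residual:", residual)
```

The Versus Fits plot corresponds to plotting the third column against the second, and the Versus Order plot corresponds to plotting the residuals in the order the observations were collected.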

###### Note

In Minitab, you can display each of the residual plots on a separate page.

#### Interpret the interval plot, individual value plot, and boxplot

Examine the interval plot, individual value plot, and boxplot. Each graph indicates that the delivery time varies by shipping center, which is consistent with the histograms from the previous chapter. The boxplot for the Eastern shipping center has an asterisk, which identifies an outlier: an order that has an unusually long delivery time.

Examine the interval plot again. The interval plot displays 95% confidence intervals for each mean. Hold the pointer over the points on the graph to view the means. Hold the pointer over the interval bars to view the 95% confidence intervals. The interval plot shows that the Western shipping center has the shortest mean delivery time (2.981 days), with a confidence interval of 2.75 to 3.22 days.

#### Interpret the Tukey 95% confidence interval plot

The Tukey 95% confidence interval plot is the best graph to use to determine the likely ranges for the differences and to assess the practical significance of those differences. The Tukey confidence intervals show the following pairwise comparisons:
• Eastern shipping center mean minus Central shipping center mean
• Western shipping center mean minus Central shipping center mean
• Western shipping center mean minus Eastern shipping center mean

Hold the pointer over the points on the graph to view the middle, upper, and lower estimates. The interval for the Eastern minus Central comparison is 0.068 to 0.868. That is, the mean delivery time of the Eastern shipping center minus the mean delivery time of the Central shipping center is between 0.068 and 0.868 days. Because this interval lies entirely above zero, the Eastern shipping center's deliveries take significantly longer than the Central shipping center's deliveries. You interpret the other Tukey confidence intervals similarly.

Also, notice the dashed line at zero. If an interval does not contain zero, the corresponding means are significantly different. None of the three intervals contains zero, so all the shipping centers have significantly different average delivery times.
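
The check against the dashed line at zero amounts to asking whether zero lies inside the interval. The Python sketch below applies that check to the Eastern minus Central interval reported above. The function name `differs_significantly` is an assumption of this example, and the second interval is hypothetical, included only to show the opposite outcome.

```python
def differs_significantly(lower, upper):
    """Return True when the confidence interval for a difference excludes zero."""
    return not (lower <= 0.0 <= upper)

# Eastern minus Central interval from the Tukey plot: 0.068 to 0.868 days.
print(differs_significantly(0.068, 0.868))

# Hypothetical interval that straddles zero, for contrast.
print(differs_significantly(-0.2, 0.3))
```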

### Access Key Results

Suppose you want more information about how to interpret a one-way ANOVA, specifically Tukey’s multiple comparison method. Minitab provides detailed information about the Session window output and graphs for most statistical commands.

1. Put your cursor anywhere in the one-way ANOVA Session window output.
2. On the Standard toolbar, click the Help button.

### Save the project

Save all your work in a Minitab project.

1. Choose File > Save Project As.
2. Navigate to the folder that you want to save your files in.
3. In File name, enter MyStats.
4. Click Save.

## Use Minitab’s Project Manager

Now you have a Minitab project that contains a worksheet, several graphs, and Session window output from your analyses. The Project Manager helps you navigate, view, and manipulate parts of your Minitab project.

Use the Project Manager to view the statistical analyses that you just performed.

### View the Session window output

Use the Project Manager to review the one-way ANOVA Session window output.

1. On the Project Manager toolbar, click the Show Session Folder button.
2. In the left pane, double-click One-way ANOVA: Days versus Center.

The Project Manager displays the one-way ANOVA Session window output in the right pane.

### View the graphs

You want to view the boxplot again. You can double-click Boxplot of Days in the Session folder or use the Show Graphs Folder button on the toolbar.

1. On the Project Manager toolbar, click the Show Graphs Folder button.
2. In the left pane, double-click Boxplot of Days.

The Project Manager displays the boxplot in the Graph window.

## In the next chapter

The descriptive statistics and ANOVA results indicate that the Western shipping center has the fewest late orders and back orders, and has the shortest delivery time. In the next chapter, you create a control chart and perform a capability analysis to investigate whether the Western shipping center’s process is stable over time and is capable of operating within specifications.
