Use the mean to describe the sample with a single value that represents the center of the data. Many statistical analyses use the mean as a standard measure of the center of the distribution of the data.
The median is another measure of the center of the distribution of the data. The median is usually less influenced by outliers than the mean. Half the data values are greater than the median value, and half the data values are less than the median value.
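To see how differently the two measures react to an extreme value, you can compare them on a small sample. The following is a minimal sketch that uses Python's standard `statistics` module; the data values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample: nine typical values plus one extreme value (95).
data = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]

mean_with = statistics.mean(data)
median_with = statistics.median(data)

# The same sample with the extreme value dropped.
trimmed = data[:-1]
mean_without = statistics.mean(trimmed)
median_without = statistics.median(trimmed)

print(f"with extreme value:    mean={mean_with:.2f}, median={median_with}")
print(f"without extreme value: mean={mean_without:.2f}, median={median_without}")
```

In this sample, the single extreme value raises the mean from about 14.8 to 22.8, while the median stays at 15.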
The confidence interval provides a range of likely values for the population parameter. For example, a 95% confidence level indicates that if you take 100 random samples from the population, you could expect approximately 95 of the samples to produce intervals that contain the population parameter.
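The repeated-sampling interpretation can be checked with a small simulation. The sketch below rests on several assumptions that do not come from this topic: the population is normal with a hypothetical mean and standard deviation, the parameter of interest is the mean, and the interval uses a normal (z) approximation with the sample standard deviation. Statistical software may calculate the interval differently, for example with the t-distribution.

```python
import random
import statistics


def coverage(n_samples=1000, n=50, mu=10.0, sigma=2.0, level=0.95, seed=0):
    """Return the fraction of simulated intervals that contain the true mean mu.

    All parameter values are hypothetical and exist only for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        center = statistics.mean(sample)
        std_err = statistics.stdev(sample) / n ** 0.5
        if center - z * std_err <= mu <= center + z * std_err:
            hits += 1
    return hits / n_samples


print(f"share of intervals containing the mean: {coverage():.3f}")
```

The printed share should be close to the confidence level, though not exactly equal to it, because the result is itself based on a finite number of random samples.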
Use the histogram and boxplot to assess the shape and spread of the data, and to identify any potential outliers.
When data are skewed, the majority of the data are located on the high or low side of the graph. Often, skewness is easiest to detect with a histogram or boxplot.
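If you want a number to accompany the graph, you can calculate a skewness coefficient. The sketch below uses the moment-based coefficient (third central moment divided by the 1.5 power of the second central moment), which is one common definition and is an assumption here; software packages often report an adjusted version that gives slightly different values. The function name and the data are hypothetical.

```python
import statistics


def skewness(values):
    """Moment-based skewness: positive for a long right tail, negative for a long left tail."""
    n = len(values)
    center = statistics.mean(values)
    m2 = sum((v - center) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - center) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # most values are low, with a long tail to the right

print(skewness(symmetric))
print(skewness(right_skewed))
```

A value near 0 suggests a roughly symmetric sample, while a clearly positive or negative value suggests a tail on the high or low side.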
Outliers, which are data values that are far away from other data values, can strongly affect the results of your analysis. Often, outliers are easiest to identify on a boxplot.
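Boxplots commonly flag a value as an outlier when it lies more than 1.5 times the interquartile range beyond the first or third quartile. The following sketch applies that rule with Python's standard `statistics` module. The 1.5 multiplier, the quartile method, and the function name `iqr_outliers` are assumptions for illustration; the rule that your software uses may differ.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values that fall outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical sample with one extreme value.
data = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]
print(iqr_outliers(data))
```

A value flagged this way is only a candidate outlier. Whether to correct it, remove it, or keep it depends on what caused it.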
Try to identify the cause of any outliers. Correct any data-entry errors or measurement errors. Consider removing data values for abnormal, one-time events (also called special causes). Then, repeat the analysis. For more information, go to Identifying outliers.
Multi-modal data have multiple peaks, also called modes. Multi-modal data often indicate that important variables are not yet accounted for.
If you have additional information that allows you to classify the observations into groups, you can create a group variable with this information. Then, you can create the graph with groups to determine whether the group variable accounts for the peaks in the data.
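Before you create the grouped graph, a quick numeric check is to summarize the data separately for each group. The sketch below assumes that each observation is stored as a pair of a group label and a value; the labels, the values, and the function name `summarize_by_group` are hypothetical.

```python
from collections import defaultdict
import statistics

# Hypothetical observations: (group label, measured value).
observations = [
    ("A", 10.1), ("A", 9.8), ("A", 10.4), ("A", 10.0),
    ("B", 15.2), ("B", 14.9), ("B", 15.5), ("B", 15.1),
]


def summarize_by_group(rows):
    """Return {group label: (mean, standard deviation)} for each group."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {
        label: (statistics.mean(vals), statistics.stdev(vals))
        for label, vals in groups.items()
    }


for label, (center, spread) in summarize_by_group(observations).items():
    print(f"group {label}: mean={center:.3f}, stdev={spread:.3f}")
```

If the group means are well separated relative to the spread within each group, as they are in this hypothetical data, the group variable is a plausible explanation for the separate peaks.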