The sample size (N) is the number of plotted points for a pair of variables. N does not include rows for which either the X-value or the Y-value is missing.
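As a minimal sketch of this counting rule, the following Python snippet counts only the rows where both values are present. The function name paired_sample_size, the sample lists, and the choice to treat None and NaN as "missing" are illustrative assumptions, not part of the original material.

```python
import math


def paired_sample_size(xs, ys):
    """Count rows in which both the X-value and the Y-value are present.

    Assumption: a value is "missing" if it is None or a floating-point NaN.
    """
    def present(value):
        if value is None:
            return False
        return not (isinstance(value, float) and math.isnan(value))

    return sum(1 for x, y in zip(xs, ys) if present(x) and present(y))


# Hypothetical data: row 2 lacks a Y-value and row 3 lacks an X-value.
x = [1.0, 2.0, None, 4.0, 5.0]
y = [2.1, float("nan"), 3.3, 4.8, 5.2]

n = paired_sample_size(x, y)
print(n)  # 3
```

Here five rows are supplied, but only three contribute to N because two rows are incomplete.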
The mean is the average of the data, which is the sum of all the observations divided by the number of observations.
For example, the wait times (in minutes) of five customers in a bank are: 3, 2, 4, 1, and 2. The mean wait time is calculated as follows: (3 + 2 + 4 + 1 + 2) / 5 = 12 / 5 = 2.4.
On average, a customer waits 2.4 minutes for service at the bank.
Use the mean to describe the sample with a single value that represents the center of the data. Many statistical analyses use the mean as a standard measure of the center of the distribution of the data.
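The bank example can be checked with a few lines of Python. This is only a sketch using the standard library; the variable names are chosen for illustration.

```python
from statistics import mean

# Wait times (in minutes) of the five bank customers from the example.
wait_times = [3, 2, 4, 1, 2]

# Sum of the observations divided by the number of observations.
manual_mean = sum(wait_times) / len(wait_times)

# The standard library gives the same result.
mean_wait = mean(wait_times)

print(manual_mean)  # 2.4
print(mean_wait)    # 2.4
```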
The standard deviation is the most common measure of dispersion, or how spread out the data are about the mean.
For a normal distribution, approximately 68% of the values fall within one standard deviation of the mean, 95% of the values fall within two standard deviations, and 99.7% of the values fall within three standard deviations.
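The sketch below computes a standard deviation and reproduces the 68%, 95%, and 99.7% figures for a normal distribution. It assumes the sample standard deviation (divisor n - 1), which the text does not specify, and it reuses the bank wait times purely for illustration.

```python
from statistics import NormalDist, stdev

# Bank wait times (in minutes), reused here only as example data.
wait_times = [3, 2, 4, 1, 2]

# Assumption: sample standard deviation, which divides by n - 1.
sd = stdev(wait_times)
print(round(sd, 2))  # 1.14

# Share of a normal distribution within k standard deviations of the mean.
standard_normal = NormalDist(mu=0.0, sigma=1.0)
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}
for k, share in coverage.items():
    print(k, f"{share:.1%}")  # 68.3%, 95.4%, 99.7%
```

The exact shares are slightly different from the rounded values quoted above, which is why the rule is stated as approximate.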
The minimum is the smallest data value.
In these data, the minimum is 7.
Use the minimum to identify a possible outlier or a data-entry error. One of the simplest ways to assess the spread of your data is to compare the minimum and maximum. If the minimum value is very low, even when you consider the center, the spread, and the shape of the data, investigate the cause of the extreme value.
The maximum is the largest data value.
In these data, the maximum is 19.
Use the maximum to identify a possible outlier or a data-entry error. One of the simplest ways to assess the spread of your data is to compare the minimum and maximum. If the maximum value is very high, even when you consider the center, the spread, and the shape of the data, investigate the cause of the extreme value.
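A short sketch of this comparison follows. The data set behind the values 7 and 19 is not shown in the text, so the list below is hypothetical and was chosen only so that its extremes match those values; the value 190 simulates a possible data-entry error.

```python
# Hypothetical data whose extremes match the minimum (7) and maximum (19)
# mentioned in the text. The original data set is not available.
data = [7, 9, 10, 12, 12, 13, 15, 16, 19]

smallest = min(data)
largest = max(data)
data_range = largest - smallest
print(smallest, largest, data_range)  # 7 19 12

# A simulated typing error (190 instead of 19) stands out immediately
# when the maximum is compared with the rest of the data.
data_with_typo = data[:-1] + [190]
print(max(data_with_typo))  # 190
```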
R² is the percentage of variation in the response that is explained by the model.
Use R² to determine how well the model fits your data. The higher the R² value, the better the model fits your data. R² is always between 0% and 100%.
The first plot illustrates a simple regression model that explains 85.5% of the variance in the response. The second plot illustrates a model that explains 22.6% of the variance in the response. The more variance that is explained by the model, the closer the data points fall to the fitted regression line.
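One way to see where this percentage comes from is to compute it by hand for a simple linear regression. The sketch below assumes an ordinary least-squares line with an intercept and uses the common definition R² = 1 - SSE/SST; the function name and the data are hypothetical and are not the data behind the plots described above.

```python
def r_squared(xs, ys):
    """Return R-squared (as a fraction) for a least-squares line through (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Least-squares slope and intercept.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Unexplained variation (SSE) relative to total variation (SST).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Hypothetical data: points that lie close to a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [2.2, 3.9, 6.1, 8.3, 9.8, 12.1]
r2 = r_squared(x, y)
print(f"{r2:.1%}")

# Points exactly on a line give an R-squared of 100%.
r2_perfect = r_squared([1, 2, 3, 4], [3, 5, 7, 9])
```

Multiplying the returned fraction by 100 gives the percentage form used in the text.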
The equation describes the relationship between the Y variable and the X variable.
For example, a company determines that scores on a job skills test can be predicted using a linear model with the following equation: Score = 130 + 4.3 × Training Hours. The intercept is 130, which indicates the average score for an employee with zero hours of training. The coefficient is 4.3, which indicates that, for each hour of training, the employee's score increases, on average, by 4.3 points.
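The equation from this example can be written as a small function to show how the intercept and coefficient are used. The names predicted_score, INTERCEPT, and COEFFICIENT are illustrative only; the numbers come from the example.

```python
INTERCEPT = 130.0    # average score with zero hours of training
COEFFICIENT = 4.3    # average score increase per hour of training


def predicted_score(training_hours):
    """Predicted average score: Score = 130 + 4.3 * Training Hours."""
    return INTERCEPT + COEFFICIENT * training_hours


print(predicted_score(0))  # 130.0

# Each additional hour raises the predicted score by about 4.3 points.
one_hour_gain = predicted_score(11) - predicted_score(10)
print(round(one_hour_gain, 1))  # 4.3
```

Predictions are most reliable for training hours within the range of the data used to fit the model.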