Methods and formulas for the fits and residuals in Analyze Factorial Design

Fit

$$\hat{y} = b_0 + b_1 x_1 + \dots + b_k x_k$$

Notation

Term          Description
$\hat{y}$     fitted value
$x_k$         kth term. Each term can be a single predictor, a polynomial term, or an interaction term.
$b_k$         estimate of the kth regression coefficient

Residuals

The residual is the difference between an observed value and the corresponding fitted value. This part of the observation is not explained by the model. The residual of an observation is:

$$e_i = y_i - \hat{y}_i$$

Notation

Term           Description
$e_i$          ith residual
$y_i$          ith observed response value
$\hat{y}_i$    ith fitted value for the response
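
As a rough illustration of the fit and residual definitions above, the sketch below fits a main-effects model to a made-up replicated 2x2 factorial. The data and variable names are assumptions for this example. The shortcut b_k = (column_k . y) / n is also an assumption that holds only because the coded -1/+1 columns are orthogonal; a general design would need a full least-squares solve.

```python
# Made-up replicated 2x2 factorial in coded units (-1/+1); not real data.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [20.0, 30.0, 25.0, 45.0, 22.0, 28.0, 27.0, 43.0]

# Design matrix with columns: constant, A, B (main-effects model).
X = [[1, a, b] for a, b in zip(A, B)]
n, p = len(X), len(X[0])

# Assumption check: the columns are orthogonal, so X'X = n * I.
for j in range(p):
    for k in range(p):
        dot = sum(row[j] * row[k] for row in X)
        assert dot == (n if j == k else 0)

# Because X'X = n * I, each coefficient is b_k = (column_k . y) / n.
b = [sum(row[k] * yi for row, yi in zip(X, y)) / n for k in range(p)]

# Fitted value: y_hat_i = b_0 + b_1 * x_i1 + b_2 * x_i2.
fits = [sum(bk * xk for bk, xk in zip(b, row)) for row in X]

# Residual: e_i = y_i - y_hat_i.
resid = [yi - fi for yi, fi in zip(y, fits)]

print("coefficients:", b)
print("fits:", fits)
print("residuals:", resid)
```

Because the model includes a constant, the residuals in this sketch sum to zero, which is a quick sanity check on any least-squares fit.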

Standardized residual (Std Resid)

Standardized residuals are also called "internally Studentized residuals."

Formula

$$\text{Std Resid}_i = \frac{e_i}{\sqrt{s^2 (1 - h_i)}}$$

Notation

Term     Description
$e_i$    ith residual
$h_i$    ith diagonal element of $X(X'X)^{-1}X'$
$s^2$    mean square error
$X$      design matrix
$X'$     transpose of the design matrix
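
The sketch below shows one way the standardized residual, e_i / sqrt(s^2 (1 - h_i)), could be computed by hand. It reuses a made-up replicated 2x2 factorial; the data, the variable names, and the shortcut (X'X)^{-1} = I / n are assumptions that apply only to orthogonal -1/+1 coded designs like this one.

```python
import math

# Made-up replicated 2x2 factorial in coded units (-1/+1); not real data.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [20.0, 30.0, 25.0, 45.0, 22.0, 28.0, 27.0, 43.0]

# Design matrix with columns: constant, A, B.
X = [[1, a, b] for a, b in zip(A, B)]
n, p = len(X), len(X[0])

# Orthogonal columns give X'X = n * I, so b_k = (column_k . y) / n.
b = [sum(row[k] * yi for row, yi in zip(X, y)) / n for k in range(p)]
resid = [yi - sum(bk * xk for bk, xk in zip(b, row)) for row, yi in zip(X, y)]

# Mean square error: s^2 = SSE / (n - p).
sse = sum(e * e for e in resid)
s2 = sse / (n - p)

# Leverage h_i = x_i (X'X)^{-1} x_i'. With X'X = n * I this is (x_i . x_i) / n.
h = [sum(xk * xk for xk in row) / n for row in X]

# Standardized (internally Studentized) residual.
std_resid = [e / math.sqrt(s2 * (1 - hi)) for e, hi in zip(resid, h)]

for e, hi, r in zip(resid, h, std_resid):
    print(f"e={e:5.1f}  h={hi:.3f}  std_resid={r:7.4f}")
```

In this balanced design every point has the same leverage, p / n, so the standardized residuals are just the raw residuals divided by a common scale. In unbalanced designs the leverages differ and the scaling varies by point.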

Deleted (Studentized) residuals

Deleted residuals are also called "externally Studentized residuals." The formula is:

$$t_i = \frac{e_i}{\sqrt{s_{(i)}^2 (1 - h_i)}}$$

Another presentation of this formula is:

$$t_i = e_i \sqrt{\frac{n - p - 1}{SSE\,(1 - h_i) - e_i^2}}$$

The model that estimates the ith observation omits the ith observation from the data set. Therefore, the ith observation cannot influence the estimate. Each deleted residual has a Student's t-distribution with $n - p - 1$ degrees of freedom.

Notation

Term           Description
$e_i$          ith residual
$s_{(i)}^2$    mean square error calculated without the ith observation
$h_i$          ith diagonal element of $X(X'X)^{-1}X'$
$n$            number of observations
$p$            number of terms, including the constant
$SSE$          sum of squares for error
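
The following sketch computes deleted residuals in two ways and checks that they agree: from the mean square error without the ith observation, and from the SSE of the full fit. It avoids refitting the model by using the standard leave-one-out identity (n - p - 1) s_(i)^2 = SSE - e_i^2 / (1 - h_i), which is an assumption brought in for this example and is not stated above. The data, names, and the orthogonal-design shortcut are likewise illustrative.

```python
import math

# Made-up replicated 2x2 factorial in coded units (-1/+1); not real data.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [20.0, 30.0, 25.0, 45.0, 22.0, 28.0, 27.0, 43.0]

X = [[1, a, b] for a, b in zip(A, B)]  # constant, A, B
n, p = len(X), len(X[0])

# Orthogonal columns give X'X = n * I, so b_k = (column_k . y) / n.
b = [sum(row[k] * yi for row, yi in zip(X, y)) / n for k in range(p)]
resid = [yi - sum(bk * xk for bk, xk in zip(b, row)) for row, yi in zip(X, y)]
sse = sum(e * e for e in resid)
h = [sum(xk * xk for xk in row) / n for row in X]  # leverages

t_from_s2i = []
t_from_sse = []
for e, hi in zip(resid, h):
    # MSE without observation i, from the leave-one-out identity.
    s2_i = (sse - e * e / (1 - hi)) / (n - p - 1)
    # First form: e_i / sqrt(s_(i)^2 * (1 - h_i)).
    t_from_s2i.append(e / math.sqrt(s2_i * (1 - hi)))
    # Second form: e_i * sqrt((n - p - 1) / (SSE * (1 - h_i) - e_i^2)).
    t_from_sse.append(e * math.sqrt((n - p - 1) / (sse * (1 - hi) - e * e)))

for t1, t2 in zip(t_from_s2i, t_from_sse):
    print(f"{t1:8.4f} {t2:8.4f}")
```

Each value would be compared against a t-distribution with n - p - 1 = 4 degrees of freedom in this example.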

Whole plot residuals

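The whole plot residual is the part of the observation that is due to whole plot variation (after accounting for model terms) in a split-plot design.

$$\text{WP residual} = \hat{y}_{\text{full}} - \hat{y}_{\text{fixed}}$$

Notation

Term                        Description
$\hat{y}_{\text{full}}$     fitted value for the full model (including the whole plot error term as well as fixed terms)
$\hat{y}_{\text{fixed}}$    fitted value using only the fixed effects terms, not the whole plot error term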

Standard error of fitted value (SE Fit)

The standard error of the fitted value in a regression model with one predictor is:

$$s(\hat{y}_0) = \sqrt{s^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right)}$$

The standard error of the fitted value in a regression model with more than one predictor is:

$$s(\hat{y}_0) = \sqrt{s^2\, \mathbf{x}_0' (X'X)^{-1} \mathbf{x}_0}$$

For weighted regression, include the weight matrix in the equation:

$$s(\hat{y}_0) = \sqrt{s^2\, \mathbf{x}_0' (X'WX)^{-1} \mathbf{x}_0}$$

When the data have a test data set or K-fold cross validation, the formulas are the same. The value of $s^2$ is from the training data. The design matrix and the weight matrix are also from the training data.

Notation

Term               Description
$s^2$              mean square error
$n$                number of observations
$x_0$              new value of the predictor
$\bar{x}$          mean of the predictor
$x_i$              ith predictor value
$\mathbf{x}_0$     vector of values that produce the fitted values, one for each column in the design matrix, beginning with a 1 for the constant term
$\mathbf{x}_0'$    transpose of the new vector of predictor values
$X$                design matrix
$W$                weight matrix
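
To make the one-predictor case concrete, here is a minimal sketch that evaluates sqrt(s^2 (1/n + (x0 - x_bar)^2 / sum((x_i - x_bar)^2))) directly. The function name se_fit_simple and the data are hypothetical, chosen only for illustration.

```python
import math

def se_fit_simple(x, y, x0):
    """Standard error of the fitted value at x0 for a one-predictor regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # constant estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)          # MSE with two estimated coefficients
    return math.sqrt(s2 * (1 / n + (x0 - x_bar) ** 2 / sxx))

# Made-up data, not from any real experiment.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

for x0 in (3.0, 5.0, 7.0):
    print(x0, round(se_fit_simple(x, y, x0), 4))
```

The standard error is smallest at the mean of the predictor and grows as x0 moves away from it, which is why intervals widen toward the edges of the data and beyond.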

Standard error of fitted values (SE fit) for a split-plot design

The standard errors of the coefficients are the square roots of the diagonal elements of the covariance matrix:

$$\mathrm{Cov}(\beta) = (X' V^{-1} X)^{-1}, \quad \text{where } V = \sigma^2_{SP} I_n + \sigma^2_{WP} Z Z'$$

The standard error of the fitted value at a given point (used for confidence intervals) is:

$$\sqrt{x\, \mathrm{Cov}(\beta)\, x'}$$

The standard error that is used in the prediction intervals is:

$$\sqrt{x\, \mathrm{Cov}(\beta)\, x' + \sigma^2_{SP} + \sigma^2_{WP}}$$

Notation

Term                   Description
$\sigma^2_{SP}$        subplot variance component, calculated as MSE(SP)
$X$                    n × p design matrix for effects of factors, covariates, blocks, and the whole plot error term
$\sigma^2_{WP}$        the whole plot variance component, which in a balanced design has this formula: $(MSE(WP) - MSE(SP)) / m$
$m$                    the number of subplots within a whole plot
$Z$                    n × w matrix of whole plot indicators (all 1's and 0's)
$n$                    number of rows of data
$p$                    number of coefficients
$w$                    number of whole plots
$x$                    row vector of predictor levels
$\mathrm{Cov}(\beta)$  covariance matrix of β
$\beta$                vector of coefficients

Confidence interval

The range in which the estimated mean response for a given set of predictor values is expected to fall.

Formula

$$\hat{y}_0 \pm t_{(1 - \alpha/2;\, n - p)} \times s(\hat{y}_0)$$

where

$$s(\hat{y}_0) = \sqrt{s^2 X_0' (X'X)^{-1} X_0} = \sqrt{X_0' S^2(b) X_0}$$

Notation

Term           Description
$\hat{y}_0$    fitted response value for a given set of predictor values
$\alpha$       type I error rate
$n$            number of observations
$p$            number of model parameters
$S^2(b)$       variance-covariance matrix of the coefficients
$s^2$          mean square error
$X$            design matrix
$X_0$          vector of given predictor values with 1 column and p rows
$X_0'$         transpose of the new vector of predictor values with 1 row and p columns
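
The sketch below works through a confidence interval for the mean response, fit +/- t * SE(fit), at one design point of a made-up replicated 2x2 factorial. The data and names are assumptions, the (X'X)^{-1} = I / n shortcut applies only to orthogonal -1/+1 coded designs, and the critical value 2.571 is taken from a t-table (0.975 quantile, 5 degrees of freedom) instead of being computed.

```python
import math

# Made-up replicated 2x2 factorial in coded units (-1/+1); not real data.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [20.0, 30.0, 25.0, 45.0, 22.0, 28.0, 27.0, 43.0]

X = [[1, a, b] for a, b in zip(A, B)]  # constant, A, B
n, p = len(X), len(X[0])

# Orthogonal columns give X'X = n * I, so b_k = (column_k . y) / n.
b = [sum(row[k] * yi for row, yi in zip(X, y)) / n for k in range(p)]
resid = [yi - sum(bk * xk for bk, xk in zip(b, row)) for row, yi in zip(X, y)]
s2 = sum(e * e for e in resid) / (n - p)  # mean square error

# Point of interest: constant = 1, A = +1, B = +1.
x0 = [1, 1, 1]
fit0 = sum(bk * xk for bk, xk in zip(b, x0))

# s(fit) = sqrt(s^2 * x0' (X'X)^{-1} x0); here (X'X)^{-1} = I / n.
se_fit = math.sqrt(s2 * sum(v * v for v in x0) / n)

# Table value for alpha = 0.05 with n - p = 5 degrees of freedom (assumed input).
t_crit = 2.571

lower = fit0 - t_crit * se_fit
upper = fit0 + t_crit * se_fit
print(f"fit = {fit0:.2f}, SE fit = {se_fit:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```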

Prediction interval

The prediction interval is the range in which the fitted response for a new observation is expected to fall.

Formula

$$\hat{y}_0 \pm t_{(1 - \alpha/2;\, n - p)} \times s(\text{Pred})$$

Notation

Term              Description
$s(\text{Pred})$  $\sqrt{s^2 \left( 1 + X_0' (X'X)^{-1} X_0 \right)}$
$\hat{y}_0$       fitted response value for a given set of predictor values
$\alpha$          level of significance
$n$               number of observations
$p$               number of model parameters
$s^2$             mean square error
$X$               predictor matrix
$X_0$             vector of given predictor values with 1 column and p rows
$X_0'$            transpose of the new vector of predictor values with 1 row and p columns
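
As a hedged illustration of how the prediction interval differs from the confidence interval, the sketch below computes both standard errors at the same point, using s(Pred) = sqrt(s^2 (1 + x0'(X'X)^{-1} x0)). The data, names, and tabled critical value 2.571 (0.975 quantile, 5 degrees of freedom) are assumptions, and the (X'X)^{-1} = I / n shortcut holds only for orthogonal -1/+1 coded designs.

```python
import math

# Made-up replicated 2x2 factorial in coded units (-1/+1); not real data.
A = [-1, 1, -1, 1, -1, 1, -1, 1]
B = [-1, -1, 1, 1, -1, -1, 1, 1]
y = [20.0, 30.0, 25.0, 45.0, 22.0, 28.0, 27.0, 43.0]

X = [[1, a, b] for a, b in zip(A, B)]  # constant, A, B
n, p = len(X), len(X[0])

# Orthogonal columns give X'X = n * I, so b_k = (column_k . y) / n.
b = [sum(row[k] * yi for row, yi in zip(X, y)) / n for k in range(p)]
resid = [yi - sum(bk * xk for bk, xk in zip(b, row)) for row, yi in zip(X, y)]
s2 = sum(e * e for e in resid) / (n - p)  # mean square error

x0 = [1, 1, 1]  # constant = 1, A = +1, B = +1
fit0 = sum(bk * xk for bk, xk in zip(b, x0))

# Quadratic form x0' (X'X)^{-1} x0, using (X'X)^{-1} = I / n.
q = sum(v * v for v in x0) / n

se_fit = math.sqrt(s2 * q)         # uncertainty in the mean response only
s_pred = math.sqrt(s2 * (1 + q))   # adds the variance of a new observation

t_crit = 2.571  # assumed table value, alpha = 0.05, 5 degrees of freedom
pi_lower = fit0 - t_crit * s_pred
pi_upper = fit0 + t_crit * s_pred
print(f"SE fit = {se_fit:.4f}, s(Pred) = {s_pred:.4f}")
print(f"95% PI: ({pi_lower:.2f}, {pi_upper:.2f})")
```

The extra 1 inside the square root is the variance of a single new observation, so s(Pred)^2 = s^2 + SE(fit)^2 and the prediction interval is always wider than the confidence interval at the same point.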