S is the estimated standard deviation of the error term. The lower the value of S, the better the conditional fitted equation describes the response at the selected factor settings. However, an S value by itself doesn't completely describe model adequacy. Also examine the key results from other tables and the residual plots.
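As a minimal sketch of this definition, S can be computed from the error sum of squares and the error degrees of freedom, S = sqrt(SSE / (n − p)). The response values, fitted values, and parameter count below are hypothetical:

```python
import numpy as np

# Hypothetical response values and fitted values from some model.
y = np.array([3.1, 4.0, 5.2, 6.1, 7.3])
fitted = np.array([3.0, 4.2, 5.0, 6.3, 7.2])
n = len(y)   # number of observations
p = 2        # parameters estimated by the model (e.g., intercept + one slope)

sse = np.sum((y - fitted) ** 2)   # error sum of squares
s = np.sqrt(sse / (n - p))        # S: estimated standard deviation of the error term
```

Smaller residuals shrink SSE and therefore S, which is why a lower S indicates that the fitted equation describes the response more closely.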
R^{2} is the percentage of variation in the response that is explained by the model. It is calculated as 1 minus the ratio of the error sum of squares (which is the variation that is not explained by the model) to the total sum of squares (which is the total variation in the response).
Use R^{2} to determine how well the model fits your data. The higher the R^{2} value, the more variation in the response values is explained by the model. R^{2} is always between 0% and 100%.
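The calculation described above can be sketched as follows; the response and fitted values are hypothetical:

```python
import numpy as np

# Hypothetical response values and fitted values from some model.
y = np.array([3.1, 4.0, 5.2, 6.1, 7.3])
fitted = np.array([3.0, 4.2, 5.0, 6.3, 7.2])

sse = np.sum((y - fitted) ** 2)      # variation not explained by the model
sst = np.sum((y - y.mean()) ** 2)    # total variation in the response
r_sq = 1 - sse / sst                 # proportion of variation explained
```

Because SSE cannot exceed SST for a model that includes an intercept, the ratio stays between 0 and 1, which keeps R^{2} between 0% and 100%.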
Assuming the models have the same covariance structure, R^{2} increases when you add additional fixed factors or covariates. Therefore, R^{2} is most useful when you compare models of the same size.
Small samples do not provide a precise estimate of the strength of the relationship between the response and predictors. If you need a more precise estimate of R^{2}, use a larger sample (typically, 40 or more observations).
Goodness-of-fit statistics are just one measure of how well the model fits the data. Even when a model has a desirable value, you should check the residual plots to verify that the model meets the model assumptions.
Use adjusted R^{2} when you want to compare models that have the same covariance structure but a different number of fixed factors and covariates. Assuming the models have the same covariance structure, R^{2} increases when you add additional fixed factors or covariates. The adjusted R^{2} value incorporates the number of fixed factors and covariates in the model to help you choose the correct model.
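A minimal sketch of the adjustment, using the common formula 1 − (1 − R²)(n − 1)/(n − p), where p counts the estimated parameters including the intercept (the exact formula a given package uses may differ):

```python
def adjusted_r_sq(r_sq, n, p):
    """Adjusted R^2 for n observations and p estimated parameters
    (including the intercept)."""
    return 1 - (1 - r_sq) * (n - 1) / (n - p)

# A tiny gain in R^2 from adding two extra terms can still lower
# adjusted R^2, because the penalty outweighs the improvement.
smaller_model = adjusted_r_sq(0.950, n=20, p=3)
larger_model = adjusted_r_sq(0.951, n=20, p=5)
```

Here the larger model has the higher R^{2} but the lower adjusted R^{2}, illustrating why adjusted R^{2} is the better yardstick when model sizes differ.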
The corrected Akaike’s Information Criterion (AICc) and the Bayesian Information Criterion (BIC) are measures of the relative quality of a model that account for fit and the number of terms in the model.
Use AICc and BIC to compare different models. Smaller values are desirable. However, the model with the smallest value for a set of predictors does not necessarily fit the data well. Also use tests and residual plots to assess how well the model fits the data.
Both AICc and BIC assess the likelihood of the model and then apply a penalty for adding terms to the model. The penalty reduces the tendency to overfit the model to the sample data. This reduction can yield a model that performs better in general.
As a general guideline, when the number of parameters is small relative to the sample size, BIC has a larger penalty for the addition of each parameter than AICc. In these cases, the model that minimizes BIC tends to be smaller than the model that minimizes AICc.
In some common cases, such as screening designs, the number of parameters is usually large relative to the sample size. In these cases, the model that minimizes AICc tends to be smaller than the model that minimizes BIC. For example, for a 13-run definitive screening design, the model that minimizes AICc will tend to be smaller than the model that minimizes BIC among the set of models with 6 or more parameters.
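The penalty difference described in the guideline above can be sketched with the standard formulas AIC = −2 log L + 2k, AICc = AIC + 2k(k + 1)/(n − k − 1), and BIC = −2 log L + k ln(n), where L is the model likelihood, k the number of estimated parameters, and n the sample size; the sample values below are hypothetical:

```python
import numpy as np

def aicc(n, k, log_lik):
    """Corrected AIC: standard AIC plus a small-sample correction term."""
    aic = -2 * log_lik + 2 * k
    return aic + (2 * k * (k + 1)) / (n - k - 1)

def bic(n, k, log_lik):
    """BIC replaces AIC's per-parameter penalty of 2 with ln(n)."""
    return -2 * log_lik + k * np.log(n)

# With n = 100 and k = 5, ln(n) is about 4.6 > 2, so BIC penalizes each
# added parameter more heavily and tends to favor the smaller model.
```

When n is small relative to k, the AICc correction term 2k(k + 1)/(n − k − 1) grows quickly, which is why AICc can impose the larger penalty in screening designs.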
For more information on AICc and BIC, see Burnham and Anderson.^{1}