Data considerations for Regression with Life Data

To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.

The response variable should be continuous
Continuous data are measurements that can take any numeric value within a range, including fractional or decimal values. If your response data are binary (only two possible outcomes) rather than continuous measurements of failure time (or another continuous measure), use Probit Analysis.
The response data are often failure times
To collect data, you usually measure the time until an item fails under different conditions, which are described by one or more covariates or factors. For example, you might measure the time until failure for an item running at different temperatures.
The failure times must be independent
The failure time for one item should not influence the failure time of another item. If the failure times are dependent, the results may not be accurate. For example, times between failures for a repairable system are often not independent.
You must account for censored data

Life data are often censored, which means that the exact failure times of some items are unknown. If you have censored observations, you must include them in your analysis to obtain accurate reliability estimates.

Use right-censoring to credit success time to items that have not yet failed. Use interval- or left-censoring to account for uncertainty when you don’t know the exact failure times. For more information, go to Data censoring.
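
The censoring types described above can be sketched in Python. This is a minimal illustration, assuming each observation is recorded as a (start, end) pair of times and that None marks an unknown bound. The function name censoring_type and the sample times are hypothetical and are not part of any Minitab interface.

```python
from typing import Optional


def censoring_type(start: Optional[float], end: Optional[float]) -> str:
    """Classify one observation recorded as a (start, end) pair of times.

    Assumed convention for this sketch: None marks an unknown bound.
    """
    if start is None and end is None:
        raise ValueError("at least one bound is required")
    if start is None:
        return "left"      # failed at some unknown time before `end`
    if end is None:
        return "right"     # still working at `start`, so it failed later or not yet
    if start == end:
        return "exact"     # failure time observed directly
    if start < end:
        return "interval"  # failed somewhere between two inspections
    raise ValueError("start must not exceed end")


# Hypothetical observations, for illustration only.
observations = [(150.0, 150.0), (200.0, None), (None, 50.0), (100.0, 120.0)]
labels = [censoring_type(s, e) for s, e in observations]
print(labels)
```

Keeping both bounds for every item means a unit that has not yet failed still contributes its accumulated running time to the analysis instead of being dropped.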

The model can include up to 9 factors and 50 covariates
Predictors may be factors (categorical variables) or covariates (continuous variables). Unless you specify a predictor as a factor, the predictor is assumed to be a covariate.
The model terms may be created from the predictor variables and treated as factors, covariates, interactions, or nested terms. Factors may be crossed or nested. Covariates may be crossed with each other or with factors, or nested within factors.
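The difference between a factor and a covariate can be illustrated with a short Python sketch. It assumes reference (indicator) coding, which is one common way to enter a categorical predictor into a regression model; the coding that a given software package uses internally may differ. The function name indicator_columns and the sample values are hypothetical.

```python
def indicator_columns(values, reference=None):
    """Expand a factor into 0/1 indicator columns, omitting a reference level.

    Returns the non-reference levels and one row of indicators per observation.
    """
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: the first sorted level is the reference
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical predictors, for illustration only.
supplier = ["A", "B", "C", "A", "C"]            # factor: categories with no numeric meaning
temperature = [80.0, 80.0, 100.0, 100.0, 120.0]  # covariate: enters the model as a number

kept_levels, supplier_columns = indicator_columns(supplier)
design_rows = [[1.0, t] + cols for t, cols in zip(temperature, supplier_columns)]
print(kept_levels)
for row in design_rows:
    print(row)
```

A factor with k levels contributes k - 1 columns under this coding, while a covariate contributes a single column. This is why a numeric predictor that actually represents categories should be declared as a factor.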
The model should fit the data adequately
To obtain accurate results, the model assumptions must be appropriate for your data. These assumptions include the choice of distribution and an equal shape parameter (Weibull and exponential distributions) or an equal scale parameter (other distributions) across all levels of the predictors. Use engineering or historical knowledge to select a distribution model. Then examine the probability plots for the standardized and Cox-Snell residuals to determine whether the model assumptions are appropriate.
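The two residual types can be sketched in Python for a Weibull model. The sketch uses common textbook definitions, which may differ in detail from what a particular software package reports: the standardized residual is (ln t - fitted linear predictor) / scale, and the Cox-Snell residual is -ln S(t), which for a Weibull model equals the exponential of the standardized residual. The coefficients and data below are hypothetical and are not fitted results.

```python
import math


def standardized_residual(time, linear_predictor, scale):
    """(ln t - x'beta) / sigma for a log-location-scale model."""
    return (math.log(time) - linear_predictor) / scale


def cox_snell_weibull(time, linear_predictor, scale):
    """-ln S(t) under a Weibull model, which equals exp(standardized residual)."""
    return math.exp(standardized_residual(time, linear_predictor, scale))


# Hypothetical coefficients and observations, for illustration only.
intercept, slope, scale = 10.0, -0.04, 0.5
data = [(80.0, 700.0), (80.0, 1100.0), (100.0, 350.0), (100.0, 520.0)]  # (temperature, time)

for temperature, time in data:
    mu = intercept + slope * temperature
    z = standardized_residual(time, mu, scale)
    cs = cox_snell_weibull(time, mu, scale)
    print(f"temp={temperature:6.1f}  time={time:8.1f}  std={z:7.3f}  cox_snell={cs:7.3f}")
```

If the Weibull model is adequate, the Cox-Snell residuals should behave like a sample from an exponential distribution with mean 1, which is what the probability plot of those residuals checks.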
The model must be full rank and hierarchical
In a hierarchical model, if an interaction term is included, all lower-order interactions and predictors that compose the interaction term must also be in the model. A model is full rank when the data contain enough information to estimate all the terms in the model. Missing data, insufficient data, or high collinearity can prevent a model from being full rank. If the model is not full rank, Minitab will alert you when you perform the analysis. You can often resolve this issue by removing unimportant, higher-order interactions from the model. For more information, go to Restrictions on models for regression with life data.
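Both requirements can be checked with a short Python sketch. It assumes that model terms are written as predictor names joined by "*" and that the design matrix is given as a list of numeric rows. The helper names missing_lower_order_terms and matrix_rank are hypothetical, and this is not how Minitab performs its own check.

```python
from itertools import combinations


def missing_lower_order_terms(terms):
    """Return the lower-order terms that a hierarchical model would still need."""
    present = {frozenset(term.split("*")) for term in terms}
    missing = set()
    for term in present:
        for size in range(1, len(term)):
            for subset in combinations(sorted(term), size):
                if frozenset(subset) not in present:
                    missing.add("*".join(subset))
    return sorted(missing)


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix by Gaussian elimination with partial pivoting."""
    m = [[float(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column depends on earlier ones
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


# A*B*C is included without several of the terms that compose it.
print(missing_lower_order_terms(["A", "A*B*C"]))

# Columns: intercept, temperature in Celsius, the same temperature in Fahrenheit.
# The third column is an exact linear combination of the first two.
collinear_design = [[1, 50, 122], [1, 60, 140], [1, 70, 158], [1, 80, 176]]
print(matrix_rank(collinear_design), "of", len(collinear_design[0]), "columns")
```

A rank lower than the number of columns means that at least one term cannot be estimated separately from the others, which is the situation that removing redundant or higher-order terms is meant to resolve.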