To ensure that your results are valid, consider the following guidelines when you collect data, perform the analysis, and interpret your results.
- The response variable should be continuous
- Continuous data are measurements that can take any numeric value within a range along a continuous scale, including fractional or decimal values.
- The response data can be single event times or multiple event times
- To collect data, you usually measure the amount of time until an event occurs. For the counting process form, you must also specify the time when the subject enters the study. For example, a subject has a form of skin cancer at 7, 11, 15, and 27 months of the study. Afterwards, the subject is cancer-free for 6 months. For each time interval, researchers record a fixed predictor, the treatment group, and a time-dependent predictor, the subject's hormone level.
- You can also use data that describe multiple events. For example, an automobile breaks down, is repaired and returned to service, then breaks down again, and so on. The data values represent the time of each failure without considering the repair time.
- The data must be in the counting process style of input
- In the counting process input form, multiple rows represent each subject. Each row describes a time interval during which the values of all the variables are constant. The predictors can be fixed or time-dependent, and the values of time-dependent predictors change between rows. Each interval begins just after the start time and includes the end time. For more information, go to Enter your data for Fit Cox Model in a Counting Process Form.
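The row layout described above can be sketched in Python. This is a minimal illustration, not Minitab's worksheet format: the function name `to_counting_process`, the `Interval` fields, and the follow-up end of 33 months (27 months plus the 6 cancer-free months in the skin cancer example) are assumptions made for the sketch. It splits rows only at event times; a real worksheet would also start a new row whenever a time-dependent predictor changes.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    subject: str
    start: float  # the interval begins just after this time
    stop: float   # the interval includes this time: (start, stop]
    event: int    # 1 = event occurs at stop, 0 = no event (censored) at stop


def to_counting_process(subject: str, entry: float,
                        event_times: List[float],
                        last_followup: float) -> List[Interval]:
    """Split one subject's follow-up into (start, stop] rows, one per event."""
    rows = []
    start = entry
    for t in sorted(event_times):
        # Each event must fall after the current start and within follow-up.
        if not start < t <= last_followup:
            raise ValueError(f"event time {t} is outside ({start}, {last_followup}]")
        rows.append(Interval(subject, start, t, 1))
        start = t
    # Any remaining event-free time becomes a final censored row.
    if start < last_followup:
        rows.append(Interval(subject, start, last_followup, 0))
    return rows


# Hypothetical subject "A": events at 7, 11, 15, and 27 months, then 6 event-free months.
rows = to_counting_process("A", 0, [7, 11, 15, 27], 33)
for row in rows:
    print(row)
```

Each printed row corresponds to one worksheet row: a start time, an end time, and an indicator of whether the event occurred at the end time.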
- You must account for incomplete data
- Because the response data are time-to-event data, they are subject to censoring and truncation. For Cox regression models, the most common form of censoring is right-censoring, and the most common form of truncation is left-truncation. You can specify a column to indicate which response times are censored and which are uncensored.
- Right-censoring: A subject's response time is right-censored if the subject does not experience the event of interest before the study ends, or if the subject is removed from the study before experiencing the event. For example, a test unit might still function after the testing period, or a subject might withdraw early from a study for a reason other than death.
- Left-truncation or delayed entry: Left-truncation occurs when you do not observe a subject from the start of the study. Instead, you include the subject later in the study, when an intermediate event occurs. The time when the subject enters the study is known as the entry time or truncation time. For example, you do not include patients on a waiting list for an organ transplant until an organ is available for transplant.
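To see how delayed entry and censoring affect which subjects contribute information at a given time, the sketch below lists the subjects at risk at a time `t`. The data and the helper `at_risk` are hypothetical. The sketch assumes the usual convention that a subject is at risk at `t` when `start < t <= stop`.

```python
# Each row is (subject, start, stop, event), with the interval (start, stop].
# Hypothetical data: B enters late at month 4 (left-truncated) and is
# right-censored at month 12; A and C are observed from the start.
rows = [
    ("A", 0, 10, 1),
    ("B", 4, 12, 0),
    ("C", 0, 6, 1),
]


def at_risk(rows, t):
    """Return the subjects under observation and event-free just before time t."""
    return [subject for subject, start, stop, event in rows if start < t <= stop]


print(at_risk(rows, 3))   # B has not entered the study yet
print(at_risk(rows, 6))   # B has entered; C is still at risk at its own event time
print(at_risk(rows, 11))  # only B remains under observation
```

A left-truncated subject does not count toward the risk set before the entry time, and a right-censored subject leaves the risk set after the censoring time without contributing an event.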
- Subjects on different treatments experience the event at proportional rates
- The Cox regression model does not require you to specify a parametric distribution for the response data. However, the model assumes that individuals who receive different treatments have proportional hazards, or risks, of experiencing the event. The proportional hazards assumption provides a simple interpretation of the regression coefficients in terms of hazard ratios or relative risks. If the proportional hazards assumption does not hold, then the relative risks table can lead to incorrect conclusions. Use the tests for proportional hazards table, the Andersen plot, and the Arjas plot to verify this assumption.
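The proportional hazards assumption can be written out using the standard form of the Cox model. The notation is an assumption of this sketch and does not come from the Minitab output: $h_0(t)$ is an unspecified baseline hazard, $\mathbf{x}$ is a vector of predictor values, and $\boldsymbol{\beta}$ is the vector of regression coefficients.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)
\qquad\Longrightarrow\qquad
\frac{h(t \mid \mathbf{x}_1)}{h(t \mid \mathbf{x}_2)}
  = \exp\!\left(\boldsymbol{\beta}^{\top}(\mathbf{x}_1 - \mathbf{x}_2)\right)
```

The baseline hazard cancels in the ratio, so the hazard ratio between two subjects with fixed predictor values does not depend on $t$. For a single predictor that increases by one unit, the ratio reduces to $\exp(\beta)$, which is the quantity interpreted as a relative risk.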
- The model must be full rank
- A full rank model includes enough data to estimate all the terms in your model. Missing data, insufficient data, or high collinearity can prevent a model from being full rank. If the model is not full rank, Minitab will alert you when you perform the analysis. You can often resolve this issue by removing unimportant, higher-order interactions from the model.
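As an illustration of how perfect collinearity prevents a model from being full rank, the sketch below computes the rank of a small design matrix and compares it with the number of columns. This is not how Minitab performs the check. The helper `matrix_rank` and the data are hypothetical, and the sketch uses exact fractions to avoid rounding problems.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination with exact arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue  # this column depends on earlier columns
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical design matrix: the third column is the sum of the first two,
# so the three terms cannot all be estimated.
design = [
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 2],
    [2, 1, 3],
]
n_terms = len(design[0])
print(matrix_rank(design), "of", n_terms)  # rank is lower than the number of terms

# Removing the redundant term restores full rank.
reduced = [row[:2] for row in design]
print(matrix_rank(reduced), "of", len(reduced[0]))
```

When the rank is lower than the number of terms, at least one term is a linear combination of the others and must be removed before all remaining terms can be estimated.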