Bone marrow data

Medical researchers want to determine the success rate of recovery from a bone marrow transplant as a treatment for acute leukemia. Recovery depends on factors such as the patient's Risk Category at the time of transplantation, their Disease Stage, and whether their platelet count returned to normal levels. Risk Category and Disease Stage are fixed predictors because they do not change throughout the study. However, a patient's platelet count is a time-dependent predictor because the count can change during the recovery process.

The medical researchers study 137 patients after they receive the transplant and record the number of days they are disease-free. A patient is not disease-free if they die before their platelet count returns to normal or if their leukemia returns after their platelet count returns to normal. A value of Yes in Disease Free indicates a disease-free patient and marks a censored observation. An observation is censored when the event does not occur by the end of the observation time.

The data are in the counting process form, which means that multiple rows represent each patient. Each row describes a time interval when the values of all the variables are constant. Time-dependent predictors change between rows. The intervals begin just after the start time and include the end time.
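The interval convention can be illustrated with a minimal Python sketch. The function names, dictionary keys, and the two rows are hypothetical and chosen only for illustration; they are not part of the data set or of any statistical package. The sketch assumes each interval is open at the start time and closed at the end time, as described above.

```python
def interval_contains(start, end, day):
    """Return True if `day` falls in the interval (start, end]."""
    return start < day <= end


def find_row(rows, day):
    """Return the row whose (start, end] interval covers `day`, or None."""
    matches = [row for row in rows
               if interval_contains(row["start"], row["end"], day)]
    # Adjacent intervals share a boundary value but do not overlap,
    # so at most one row can match a given day.
    return matches[0] if matches else None


# Hypothetical patient whose predictor changes on day 10 (illustrative values).
rows = [
    {"start": 0, "end": 10, "normal_platelets": "No"},
    {"start": 10, "end": 30, "normal_platelets": "Yes"},
]

row_day_10 = find_row(rows, 10)  # day 10 is the end of the first interval
row_day_11 = find_row(rows, 11)  # day 11 falls in the second interval
```

Under this convention, the boundary day belongs to the earlier row, so every day after the start of follow-up maps to exactly one row.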

For example, the following table contains the data for the patient with an ID of 1. The observed values of Risk Category and Disease Stage are the same in every row because those predictors are fixed. Because platelet status can change during the study, each patient requires a new row of data whenever this predictor changes. The first row shows that the patient did not have a normal platelet count during the first 13 days after the transplant. The second row shows that the patient had a normal platelet count from after day 13 until the end of the study on day 2,081.

ID  Risk Category  Start Time  End Time  Disease Free  Normal Platelets  Disease Stage
1   1              0           13        Yes           No                Normal
1   1              13          2081      Yes           Yes               Normal
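As a rough sketch of how rows like these could be built, the following Python function splits one patient's follow-up at the day the platelet count returns to normal. The function name, argument names, and dictionary keys are hypothetical. The sketch also assumes that the row before the change is marked Yes for Disease Free, because the patient is still disease-free when that interval ends; this matches the rows for patient 1 but is an assumption for other patients.

```python
def to_counting_process(patient_id, risk_category, disease_stage,
                        platelet_recovery_day, end_day, disease_free):
    """Split one patient's follow-up into counting process rows.

    platelet_recovery_day is None if the count never returned to normal.
    """
    # Fixed predictors are copied unchanged into every row.
    fixed = {
        "id": patient_id,
        "risk_category": risk_category,
        "disease_stage": disease_stage,
    }
    if platelet_recovery_day is None or platelet_recovery_day >= end_day:
        # No change in the time-dependent predictor: a single row suffices.
        return [dict(fixed, start=0, end=end_day,
                     disease_free=disease_free, normal_platelets="No")]
    return [
        # Before recovery: assumed disease-free at the end of this interval.
        dict(fixed, start=0, end=platelet_recovery_day,
             disease_free="Yes", normal_platelets="No"),
        # After recovery: carries the patient's final status.
        dict(fixed, start=platelet_recovery_day, end=end_day,
             disease_free=disease_free, normal_platelets="Yes"),
    ]


# Patient 1 from the table: platelets normal after day 13, followed to day 2081.
patient_1_rows = to_counting_process(1, 1, "Normal", 13, 2081, "Yes")
```

The end time of the first row equals the start time of the second row, so the two intervals cover the whole follow-up period without overlapping.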

You can use these data to demonstrate Fit Cox Model in a Counting Process Form.

Worksheet column   Description
ID                 Identifies the patient
Risk Category      The risk category of the patient at the time of transplantation
Start Time         The starting day of the interval
End Time           The ending day of the interval
Disease Free       Whether the patient is disease-free
Normal Platelets   Whether the patient has a normal platelet count
Disease Stage      The patient's French-American-British classification


These data were adapted from a public data set from Copelan that appears in Klein and Moeschberger (2003) [1].

[1] Klein, J.P. & Moeschberger, M.L. (2003). Semiparametric proportional hazards regression with fixed covariates. Survival Analysis: Techniques for Censored and Truncated Data (2nd ed., pp. 243-293). Springer.