Statistical Issues in survival analysis (td pred accuracy)


January 14, 2026

Evaluating the performance of a prediction model for time-to-event outcomes is difficult when observations are censored. Interval censoring and competing risks present additional challenges. The authors proposed two methods to deal with interval censoring, a model-based approach and an inverse probability of censoring weighting (IPCW) approach, focusing on three key time-dependent metrics: the area under the receiver operating characteristic curve, the Brier score, and the expected predictive cross-entropy.

They defined the progression-specific sensitivity for a threshold value, c, as the probability that the predicted risk is greater than or equal to c for patients with cancer progression in the interval of interest, [t, t+Δt]. The specificity, which measures the ability to identify negative instances, is defined as the probability that the predicted progression-specific risk is lower than the threshold, c, for patients who “survive” the interval of interest.

In the presence of interval censoring, however, it is often unclear whether a patient is a case, a control, or had the event before the interval of interest. The authors therefore proposed two methods to deal with this uncertainty when estimating the progression-specific sensitivity and specificity: the model-based approach and IPCW.

In the model-based approach, all subjects at risk at time t are included when calculating the sensitivity, but each patient i in the test set contributes with a weight that depends on their estimated probability of experiencing cancer progression during the interval of interest. The weights themselves are derived from the cumulative incidence function estimated by the model.

In the IPCW approach, only the subset of patients whose event is known to fall in the interval of interest (the absolute cases) is used, and these patients are weighted so that they also represent the patients who were censored before experiencing the primary event of interest.
These IPCW weights are the inverse of the probability of not being censored before time t, obtained using the Kaplan-Meier (KM) estimator.

The Brier score was also calculated. It combines discrimination and calibration by quantifying how close the predicted probabilities are to the actual binary outcomes; a lower score indicates better model performance. This score can be calculated by
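
As an illustration only, the sketch below shows how the weighting ideas described above might look in Python with NumPy. It assumes ordinary right censoring, not the interval-censored setting discussed here, and no ties between event and censoring times. The function names (`km_censoring_survival`, `ipcw_brier`, `weighted_sens_spec`) are hypothetical, and the Brier score uses the common IPCW form for right-censored data, which may differ from the formula the authors used.

```python
import numpy as np


def km_censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(u) = P(C > u).

    Censoring is treated as the "event" here, so the event indicator is flipped.
    Assumes no ties between event times and censoring times.
    Returns a function that evaluates the step function G at scalar or array u.
    """
    time = np.asarray(time, dtype=float)
    censored = np.asarray(event, dtype=int) == 0
    cens_times = np.unique(time[censored])  # sorted distinct censoring times
    at_risk = np.array([(time >= u).sum() for u in cens_times])
    n_cens = np.array([((time == u) & censored).sum() for u in cens_times])
    surv = np.cumprod(1.0 - n_cens / at_risk)
    # Prepend a sentinel so that G(u) = 1 before the first censoring time.
    grid = np.concatenate(([-np.inf], cens_times))
    steps = np.concatenate(([1.0], surv))

    def G(u):
        idx = np.searchsorted(grid, u, side="right") - 1
        return steps[idx]

    return G


def ipcw_brier(time, event, risk, t):
    """IPCW Brier score at horizon t for right-censored data.

    risk[i] is the predicted probability that subject i has the event by t.
    Subjects censored before t get weight 0; the others are re-weighted.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    risk = np.asarray(risk, dtype=float)
    G = km_censoring_survival(time, event)

    case = (time <= t) & (event == 1)  # event observed by t
    control = time > t                 # still event-free at t
    w = np.zeros_like(time)
    if case.any():
        w[case] = 1.0 / G(time[case])
    if control.any():
        w[control] = 1.0 / G(t)

    outcome = case.astype(float)
    return float(np.mean(w * (outcome - risk) ** 2))


def weighted_sens_spec(risk, case_weight, control_weight, c):
    """Weighted sensitivity and specificity at threshold c.

    case_weight and control_weight are per-subject weights supplied by the
    caller, e.g. model-based probabilities of being a case or a control.
    """
    risk = np.asarray(risk, dtype=float)
    cw = np.asarray(case_weight, dtype=float)
    nw = np.asarray(control_weight, dtype=float)
    sens = float(np.sum(cw * (risk >= c)) / np.sum(cw))
    spec = float(np.sum(nw * (risk < c)) / np.sum(nw))
    return sens, spec
```

For a model-based estimate, one could pass each subject's estimated probability of progression in the interval as `case_weight` and, as a further assumption, one minus that probability as `control_weight`. For an IPCW-style estimate, the case weights would instead be the inverse censoring probabilities of the known cases, with zero weight for everyone else.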
