Risk Prediction Modeling

Division of Preventive Medicine

Greenwood-Nam-D’Agostino (GND) test of calibration

From Demler, Paynter, and Cook, Statistics in Medicine, 2015. This calibration test is designed for survival outcomes and remains appropriate when observations are censored.
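As a rough illustration of the idea behind a censoring-aware calibration test, the sketch below compares, within each risk group, the Kaplan-Meier estimate of the event probability at a fixed time with the mean predicted risk, and scales the squared difference by the Greenwood variance. This is not the code in GND_test.v2.r: the function names km_event_prob and gnd_sketch, the chi-square form with (number of groups - 1) degrees of freedom, and the simulated data are all assumptions made for illustration, using base R only.

```r
# Kaplan-Meier event probability by time t0, with Greenwood variance.
# Assumes at least one event before t0 and some subjects still at risk after it.
km_event_prob <- function(time, event, t0) {
  event_times <- sort(unique(time[event == 1 & time <= t0]))
  surv <- 1
  gw_sum <- 0
  for (tj in event_times) {
    n_at_risk <- sum(time >= tj)
    d <- sum(time == tj & event == 1)
    surv <- surv * (1 - d / n_at_risk)
    gw_sum <- gw_sum + d / (n_at_risk * (n_at_risk - d))
  }
  list(prob = 1 - surv, var = surv^2 * gw_sum)
}

# Sum over risk groups of (observed - expected)^2 / Greenwood variance.
gnd_sketch <- function(pred, time, event, t0, groups) {
  stat <- 0
  for (g in unique(groups)) {
    idx <- groups == g
    km <- km_event_prob(time[idx], event[idx], t0)
    stat <- stat + (km$prob - mean(pred[idx]))^2 / km$var
  }
  df <- length(unique(groups)) - 1
  c(chi2 = stat, df = df, p = 1 - pchisq(stat, df))
}

# Simulated example: exponential event times with uniform censoring.
set.seed(1)
n <- 2000
x <- rnorm(n)
rate <- 0.05 * exp(0.5 * x)
t_event <- rexp(n, rate)
t_cens <- runif(n, 0, 20)
time <- pmin(t_event, t_cens)
event <- as.numeric(t_event <= t_cens)

t0 <- 10
pred <- 1 - exp(-rate * t0)  # true risk by t0, so the model is well calibrated
groups <- cut(pred, quantile(pred, 0:5 / 5), include.lowest = TRUE, labels = FALSE)

res <- gnd_sketch(pred, time, event, t0, groups)
print(res)
```

Groups with very few events give unstable variance estimates, so in practice sparse risk categories may need to be combined before testing.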

GND_test.v2.r – Includes R function:

GND.calib – for user-specified categories of risk

Here is an example of how to use this program:

Corrected Intermediate-Risk NRI

From Paynter & Cook, Medical Decision Making (MDM), 2013, and Paynter & Cook, BMJ, 2016. This provides a correction to the NRI when it is applied to an intermediate-risk group (the conditional NRI, or cNRI). The correction adjusts for the value expected under the null hypothesis, using estimates from all cells of the reclassification table.

Many R packages are available to compute measures of model fit. Below are some suggestions for computing reclassification statistics. We have not personally tested these packages for accuracy.

Hmisc – Available at http://cran.r-project.org/web/packages/Hmisc/index.html

The Hmisc package contains many functions useful for data analysis, high-level graphics, utility operations, functions for computing sample size and power, importing datasets, imputing missing values, advanced table making, variable clustering, character string manipulation, conversion of R objects to LaTeX code, and recoding variables. The included improveProb function can compute the IDI and the continuous NRI for binary data.
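To make the two quantities concrete, here is a minimal hand computation of the IDI and the continuous NRI for a binary outcome, using their standard definitions. It is a sketch in base R, not the improveProb implementation; the function name idi_cnri and the small data set are made up for illustration.

```r
# IDI and continuous (category-free) NRI for a binary outcome.
# p_old, p_new: predicted risks from the two models; y: 0/1 outcome.
idi_cnri <- function(p_old, p_new, y) {
  diff <- p_new - p_old
  ev <- y == 1

  # IDI: mean change in predicted risk among events minus that among non-events
  idi <- mean(diff[ev]) - mean(diff[!ev])

  # Continuous NRI: any upward move is good for events, any downward move
  # is good for non-events
  nri_events <- mean(diff[ev] > 0) - mean(diff[ev] < 0)
  nri_nonevents <- mean(diff[!ev] < 0) - mean(diff[!ev] > 0)

  c(IDI = idi,
    NRI_events = nri_events,
    NRI_nonevents = nri_nonevents,
    NRI = nri_events + nri_nonevents)
}

# Illustrative data: 3 events, 4 non-events
y     <- c(1, 1, 1, 0, 0, 0, 0)
p_old <- c(0.30, 0.20, 0.50, 0.10, 0.20, 0.30, 0.10)
p_new <- c(0.40, 0.35, 0.45, 0.05, 0.25, 0.20, 0.10)

out <- idi_cnri(p_old, p_new, y)
print(round(out, 4))
```

The sketch gives point estimates only; standard errors and confidence intervals are not computed here.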

nricens – Available at http://cran.r-project.org/web/packages/nricens/index.html

Calculates the categorical or continuous NRI for both binary and survival data. Also computes a risk difference-based NRI based on whether the difference in predicted risk for the two models is above (or below) a threshold value.
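For readers who want to see what the categorical NRI measures, the following base R sketch builds the reclassification tables for events and non-events and computes the NRI from them. It handles a binary outcome only (no censoring) and does not use the nricens interface; the function name categorical_nri, the cutoffs of 10% and 20%, and the data are assumptions chosen for illustration.

```r
# Categorical NRI for a binary outcome with user-specified risk cutoffs.
categorical_nri <- function(p_old, p_new, y, cutoffs) {
  # Categories are [0, c1), [c1, c2), ..., [ck, 1]
  breaks <- c(0, cutoffs, 1)
  cat_old <- cut(p_old, breaks, include.lowest = TRUE, right = FALSE, labels = FALSE)
  cat_new <- cut(p_new, breaks, include.lowest = TRUE, right = FALSE, labels = FALSE)

  up <- cat_new > cat_old
  down <- cat_new < cat_old
  ev <- y == 1

  list(
    table_events = table(old = cat_old[ev], new = cat_new[ev]),
    table_nonevents = table(old = cat_old[!ev], new = cat_new[!ev]),
    NRI_events = mean(up[ev]) - mean(down[ev]),
    NRI_nonevents = mean(down[!ev]) - mean(up[!ev]),
    NRI = (mean(up[ev]) - mean(down[ev])) + (mean(down[!ev]) - mean(up[!ev]))
  )
}

# Illustrative data: 4 events, 6 non-events
y     <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
p_old <- c(0.05, 0.15, 0.15, 0.25, 0.05, 0.15, 0.15, 0.25, 0.05, 0.15)
p_new <- c(0.12, 0.22, 0.15, 0.18, 0.04, 0.08, 0.22, 0.15, 0.05, 0.09)

res_nri <- categorical_nri(p_old, p_new, y, cutoffs = c(0.10, 0.20))
print(res_nri$table_events)
print(res_nri$table_nonevents)
print(res_nri$NRI)
```

With survival data the counts in each cell would need to be replaced by censoring-adjusted estimates, which is what the survival options in the packages listed here are for.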

survIDINRI – Available at http://cran.r-project.org/web/packages/survIDINRI/index.html

Computes the IDI and the continuous NRI for censored survival data comparing a base model to a model with additional predictors. Predicted risk is defined at a fixed time t0. Also computes the median improvement in risk score.

PredictABEL – Available at http://cran.r-project.org/web/packages/PredictABEL/index.html

Contains many functions to assess the performance of binary risk models, including the c-statistic, Hosmer-Lemeshow goodness of fit test, reclassification table, NRI, the IDI, and many plotting functions. It also includes logistic regression functions, with a focus on genetic risk factors. The included reclassification function can compute the continuous or categorical NRI, and the IDI for binary data.

Several packages are also available specifically to plot ROC curves and/or estimate the area under the curve. These include (but are not limited to) the following:

ROCR – Available at http://cran.r-project.org/web/packages/ROCR/index.html

Includes several plots for the ROC curve, accuracy, and calibration, and computes several metrics based on these.
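As a point of reference for what these packages compute, the sketch below derives the ROC curve points and the area under the curve directly in base R, using the rank (Mann-Whitney) form of the AUC. It does not call ROCR; the function names roc_points and auc_rank and the data are hypothetical, and roc_points assumes there are no tied scores.

```r
# ROC curve points: one (fpr, tpr) pair per observed score, highest score first.
# Assumes no tied scores.
roc_points <- function(score, y) {
  ord <- order(score, decreasing = TRUE)
  y_sorted <- y[ord]
  data.frame(
    threshold = score[ord],
    tpr = cumsum(y_sorted == 1) / sum(y == 1),
    fpr = cumsum(y_sorted == 0) / sum(y == 0)
  )
}

# AUC as the probability that a randomly chosen event has a higher score
# than a randomly chosen non-event (ties count one half via average ranks).
auc_rank <- function(score, y) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(score)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Illustrative data: 4 events, 4 non-events
score <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
y     <- c(1,   1,   0,   1,   0,    1,   0,   0)

roc <- roc_points(score, y)
auc <- auc_rank(score, y)
print(roc)
print(auc)
```

Plotting roc$fpr against roc$tpr with plot(..., type = "s") gives a basic ROC curve for the same data.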

cvAUC – Available at http://cran.r-project.org/web/packages/cvAUC/index.html

Computes cross-validated AUC and confidence intervals using the ROCR package. Handles either independent or pooled repeated measures data. The method uses influence curves, which makes it computationally more efficient than bootstrapping.

survivalROC – Available at http://cran.r-project.org/web/packages/survivalROC/index.html

Time-dependent ROC curve estimation from censored survival data.

Brigham & Women’s Hospital

Division of Preventive Medicine | 900 Commonwealth Avenue East | Boston, MA 02215 | 617.278.0796

email: ncook@bwh.harvard.edu