A note on estimating the Cox-Snell R2 from a reported C statistic (AUROC) to inform sample size calculations for developing a prediction model with a binary outcome.


Abstract

In 2019 we published a pair of articles in Statistics in Medicine that describe how to calculate the minimum sample size for developing a multivariable prediction model with a continuous outcome, or with a binary or time-to-event outcome. As for any sample size calculation, the approach requires the user to specify anticipated values for key parameters. In particular, for a prediction model with a binary outcome, the outcome proportion and a conservative estimate for the overall fit of the developed model as measured by the Cox-Snell R2 (proportion of variance explained) must be specified. This proposal raises the question of how to identify a plausible value for R2 in advance of model development. Our articles suggest researchers should identify R2 from closely related models already published in their field. In this letter, we present details on how to derive R2 using the reported C statistic (AUROC) for such existing prediction models with a binary outcome. The C statistic is commonly reported, and so our approach allows researchers to obtain R2 for subsequent sample size calculations for new models. Stata and R code are provided, along with a small simulation study.
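
To make the idea of converting a reported C statistic into a Cox-Snell R2 concrete, the following is a minimal sketch of one possible conversion. It is not the Stata or R code that accompanies the letter. It rests on assumptions not stated in the abstract: the model's linear predictor is taken to be normally distributed with a common variance in events and non-events (a binormal assumption), the existing model is taken to be well calibrated, and the expected log-likelihoods are obtained by simple numerical integration. The function name `approx_cox_snell_r2` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def approx_cox_snell_r2(c_statistic: float, outcome_prop: float, grid: int = 4001) -> float:
    """Approximate the Cox-Snell R2 implied by a C statistic and outcome proportion.

    Assumption (not from the source): the linear predictor is N(0, s^2) in
    non-events and N(s^2, s^2) in events, so that C = Phi(s / sqrt(2)) and a
    logistic model with slope 1 holds exactly.
    """
    if not (0.5 < c_statistic < 1.0) or not (0.0 < outcome_prop < 1.0):
        raise ValueError("need 0.5 < c_statistic < 1 and 0 < outcome_prop < 1")

    # Standard deviation of the linear predictor implied by the C statistic.
    sigma = math.sqrt(2.0) * NormalDist().inv_cdf(c_statistic)
    # Intercept that makes the logistic model consistent with the outcome proportion.
    intercept = math.log(outcome_prop / (1.0 - outcome_prop)) - sigma ** 2 / 2.0

    def log_expit(z: float) -> float:
        # Numerically stable log(1 / (1 + exp(-z))).
        return -math.log1p(math.exp(-z)) if z >= 0 else z - math.log1p(math.exp(z))

    def expected(mean: float, fn) -> float:
        # Trapezoidal estimate of E[fn(intercept + X)] for X ~ N(mean, sigma^2).
        dist = NormalDist(mean, sigma)
        lo, hi = mean - 8.0 * sigma, mean + 8.0 * sigma
        step = (hi - lo) / (grid - 1)
        total = 0.0
        for i in range(grid):
            x = lo + i * step
            weight = 0.5 if i in (0, grid - 1) else 1.0
            total += weight * dist.pdf(x) * fn(intercept + x)
        return total * step

    # Expected log-likelihood per observation under the model and under the null.
    ll_model = (outcome_prop * expected(sigma ** 2, log_expit)
                + (1.0 - outcome_prop) * expected(0.0, lambda z: log_expit(-z)))
    ll_null = (outcome_prop * math.log(outcome_prop)
               + (1.0 - outcome_prop) * math.log(1.0 - outcome_prop))

    # Cox-Snell R2 = 1 - exp(-LR / n), with LR / n = 2 * (ll_model - ll_null).
    return 1.0 - math.exp(-2.0 * (ll_model - ll_null))


if __name__ == "__main__":
    # Illustrative inputs only: C statistic 0.80, outcome proportion 0.10.
    print(round(approx_cox_snell_r2(0.80, 0.10), 4))
```

Under these assumptions the result always lies below the maximum attainable Cox-Snell R2 for the given outcome proportion, 1 - exp(2 * ll_null), and it increases with the C statistic. The value returned could then be supplied as the anticipated R2 in a sample size calculation, bearing in mind that it is only as reliable as the binormal assumption.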

Acceptance Date Oct 23, 2020
Publication Date Dec 7, 2020
Publicly Available Date Mar 29, 2024
Journal Statistics in Medicine
Print ISSN 0277-6715
Publisher Wiley
DOI https://doi.org/10.1002/sim.8806
Keywords C statistic (AUROC), R squared, clinical prediction model, sample size
Publisher URL https://doi.org/10.1002/sim.8806
