Archer, L, Snell, KIE, Ensor, J, Hudda, MT, Collins, GS and Riley, RD (2020) Minimum sample size for external validation of a clinical prediction model with a continuous outcome. Statistics in Medicine. ISSN 0277-6715 (Submitted)

There is a more recent version of this item available.
SSForValidationLinearModels_UPDATED_Final_Clean.docx - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Clinical prediction models provide individualised outcome predictions to inform patient counselling and clinical decision making. External validation is the process of examining a prediction model’s performance in data independent of those used for model development. Current external validation studies often suffer from small sample sizes, and consequently from imprecise estimates of a model’s predictive performance. To address this, we propose how to determine the minimum sample size needed for external validation of a clinical prediction model with a continuous outcome. Four criteria are proposed that target precise estimates of (i) R² (the proportion of variance explained), (ii) calibration-in-the-large (agreement between predicted and observed outcome values on average), (iii) calibration slope (agreement between predicted and observed values across the range of predicted values), and (iv) the variance of observed outcome values. Closed-form sample size solutions are derived for each criterion, which require the user to specify anticipated values of the model’s performance (in particular R²) and the outcome variance in the external validation dataset. A sensible starting point is to base these values on those from the model development study, as obtained from the publication or the study authors. The recommended minimum sample size for the external validation dataset is the largest of the sample sizes required by the four criteria. The calculations can also be applied to estimate the expected precision when an existing dataset with a fixed sample size is available, to help gauge whether it is adequate. We illustrate the proposed methods with a case study predicting fat-free mass in children.
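The abstract does not reproduce the closed-form solutions, so the following Python sketch is illustrative only. It assumes standard large-sample approximations for the standard errors of R², calibration-in-the-large and calibration slope, which may differ from the expressions derived in the article. All function names, parameters and target standard errors are hypothetical, and the fourth criterion (precision of the outcome variance) is not sketched: its sample size is passed in by the user. The only step taken directly from the abstract is the final one, taking the largest sample size across the criteria.

```python
import math


def n_for_r2(r2, target_se):
    # Assumed approximation (not taken from the abstract):
    # var(R^2) ~ 4 * R^2 * (1 - R^2)^2 / n
    return math.ceil(4 * r2 * (1 - r2) ** 2 / target_se ** 2)


def n_for_citl(outcome_var, r2, target_se):
    # Assumed approximation: calibration-in-the-large is the mean residual,
    # with residual variance ~ outcome_var * (1 - R^2)
    return math.ceil(outcome_var * (1 - r2) / target_se ** 2)


def n_for_slope(r2, target_se, slope=1.0):
    # Assumed approximation:
    # var(slope) ~ slope^2 * (1 - R^2) / ((n - 1) * R^2)
    return math.ceil(slope ** 2 * (1 - r2) / (r2 * target_se ** 2) + 1)


def minimum_sample_size(per_criterion):
    # Step stated in the abstract: the recommended minimum is the largest
    # sample size required by any single criterion.
    return max(per_criterion.values())


def required_sizes(r2, outcome_var, se_r2, se_citl, se_slope, n_variance):
    # n_variance: sample size for criterion (iv), supplied by the user
    # because this sketch does not derive it.
    return {
        "r2": n_for_r2(r2, se_r2),
        "citl": n_for_citl(outcome_var, r2, se_citl),
        "slope": n_for_slope(r2, se_slope),
        "variance": n_variance,
    }
```

For example, `minimum_sample_size(required_sizes(...))` with anticipated values taken from a model development study would return the binding sample size, and inspecting the dictionary shows which criterion drives it.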

Item Type: Article
Additional Information: "This is the peer reviewed version of the following article: [FULL CITE], which has been published in final form at [Link to final article using the DOI]. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions."
Subjects: R Medicine > R Medicine (General)
R Medicine > RA Public aspects of medicine
R Medicine > RA Public aspects of medicine > RA0421 Public health. Hygiene. Preventive Medicine
Divisions: Faculty of Medicine and Health Sciences > School of Primary, Community and Social Care
Depositing User: Symplectic
Date Deposited: 13 Oct 2020 14:43
Last Modified: 11 Sep 2021 01:30
