Protocol for a systematic review on the methodological and reporting quality of prediction model studies using machine learning techniques

Andaur Navarro, Constanza L; Damen, Johanna A A G; Takada, Toshihiko; Nijman, Steven W J; Dhiman, Paula; Ma, Jie; Collins, Gary S; Bajpai, Ram; Riley, Richard D; Moons, Karel GM; Hooft, Lotty

doi:10.1136/bmjopen-2020-038832

Authors

Constanza L Andaur Navarro

Johanna A A G Damen

Toshihiko Takada

Steven W J Nijman

Paula Dhiman

Jie Ma

Gary S Collins

Dr Ram Bajpai r.bajpai@keele.ac.uk

Richard D Riley

Karel GM Moons

Lotty Hooft

Abstract

INTRODUCTION: Studies addressing the development and/or validation of diagnostic and prognostic prediction models are abundant in most clinical domains. Systematic reviews have shown that the methodological and reporting quality of prediction model studies is suboptimal. Owing to the increasing availability of larger, routinely collected and complex medical data, and the rising application of artificial intelligence (AI) or machine learning (ML) techniques, the number of prediction model studies is expected to increase even further. Prediction models developed using AI or ML techniques are often labelled as a 'black box', and little is known about their methodological and reporting quality. Therefore, this comprehensive systematic review aims to evaluate the reporting quality, the methodological conduct and the risk of bias of prediction model studies that applied ML techniques for model development and/or validation.

METHODS AND ANALYSIS: A search will be performed in PubMed to identify studies developing and/or validating prediction models using any ML methodology and across all medical fields. Studies will be included if they were published between January 2018 and December 2019, predict patient-related outcomes, use any study design or data source, and are available in English. Screening of search results and data extraction from included articles will be performed by two independent reviewers. The primary outcomes of this systematic review are: (1) the adherence of ML-based prediction model studies to the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) statement, and (2) the risk of bias in such studies as assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). A narrative synthesis will be conducted for all included studies. Findings will be stratified by study type, medical field and prevalent ML methods, and will inform necessary extensions or updates of TRIPOD and PROBAST to better address prediction model studies that used AI or ML techniques.

ETHICS AND DISSEMINATION: Ethical approval is not required for this study because only published data will be analysed. Findings will be disseminated through peer-reviewed publications and scientific conferences.

SYSTEMATIC REVIEW REGISTRATION: PROSPERO, CRD42019161764.
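
The eligibility criteria and the adherence outcome described in the abstract can be illustrated with a small Python sketch. This is not the authors' tooling: in the protocol, screening is performed by two independent human reviewers. The `Record` fields, the function names and the choice to express adherence as the percentage of applicable reporting items that were reported are all assumptions made for illustration; only the publication window and the English-language requirement come from the abstract.

```python
from dataclasses import dataclass
from datetime import date

# Publication window stated in the abstract (January 2018 to December 2019).
WINDOW_START = date(2018, 1, 1)
WINDOW_END = date(2019, 12, 31)


@dataclass
class Record:
    # Hypothetical fields; the protocol does not define a data schema.
    pmid: str
    published: date
    language: str
    uses_ml: bool
    predicts_patient_outcome: bool


def is_eligible(record: Record) -> bool:
    """Apply the inclusion criteria listed in the abstract to one record."""
    return (
        WINDOW_START <= record.published <= WINDOW_END
        and record.language.lower() == "english"
        and record.uses_ml
        and record.predicts_patient_outcome
    )


def adherence_percentage(items_reported: list[bool]) -> float:
    """Percentage of applicable reporting items that a study reported.

    Assumption: adherence is summarised as a simple proportion of items;
    the abstract does not specify the scoring rule.
    """
    if not items_reported:
        raise ValueError("at least one applicable item is required")
    return 100.0 * sum(items_reported) / len(items_reported)
```

A record published in June 2018, written in English, using ML and predicting a patient-related outcome would pass `is_eligible`, whereas the same record published in January 2020 would not.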

Journal Article Type	Article
Acceptance Date	Nov 8, 2020
Online Publication Date	Nov 11, 2020
Publication Date	Nov 11, 2020
Publicly Available Date	May 26, 2023
Journal	BMJ Open
Publisher	BMJ Publishing Group
Peer Reviewed	Peer Reviewed
Volume	10
Issue	11
Article Number	e038832
DOI	https://doi.org/10.1136/bmjopen-2020-038832
Keywords	epidemiology, preventive medicine, statistics & research methods
Publisher URL	https://bmjopen.bmj.com/content/10/11/e038832