Yeates, P, Moult, A, Cope, N, Mccray, G, Cilas, E, Lovelock, T, Vaughan, N, Daw, D, Fuller, R and McKinley, R (2021) Measuring the impact of examiner variability in a multiple-circuit Objective Structured Clinical Exam (OSCE). Academic Medicine, 96 (8). pp. 1189-1196. ISSN 1938-808X

Purpose: Ensuring that examiners in the different parallel circuits of Objective Structured Clinical Exams (OSCEs) collectively judge to the same standard is critical to the chain of validity. Recent work suggests that the examiner-cohort (i.e. the particular group of examiners) a candidate meets could significantly alter outcomes for some candidates. Despite this, examiner-cohort effects are rarely examined, as fully-nested data (i.e. no cross-over between the students judged by different groups of examiners) limit comparisons. This study aimed to replicate and further develop a novel method, Video-based Examiner Score Comparison and Adjustment (VESCA), so that it can be used to enhance quality assurance of distributed or national OSCEs.

Method: Six volunteer students were filmed on all 12 stations of a summative OSCE. In addition to examining live performances, examiners from all 8 separate examiner-cohorts collectively scored the same pool of video performances, with each examiner scoring videos specific to their station. The video scores provided linkage within the otherwise fully-nested data, enabling comparisons by Many Facet Rasch Modelling. The primary analysis compared and adjusted for examiner-cohort effects; additional analyses compared examiners' scoring when videos were embedded (interspersed between live candidates within the OSCE) or judged later via the internet.

Results: Having accounted for differences in students' ability, different examiner-cohorts' scores for a student of the same ability ranged from 18.57 (68.7%) to 20.49 (75.9%), Cohen's d = 1.3. Score adjustment changed the classification (pass-fail or fail-pass) of up to 16% of students, depending on the modelled cut score. Internet and embedded video scoring showed no difference in mean scores or variability, and examiners' accuracy did not deteriorate over the 3-week scoring period allowed for internet-based scoring.
Conclusions: Examiner-cohorts exerted a replicable, significant influence on OSCE scores that was not accounted for by typical assessment psychometrics. VESCA offers a promising means to enhance validity and fairness in distributed OSCEs or national exams, whilst internet-based scoring may enhance VESCA's feasibility.
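The adjustment VESCA performs rests on Many Facet Rasch Modelling, but the underlying linkage idea can be illustrated more simply: because every examiner-cohort scores the same shared pool of video performances, each cohort's relative leniency can be estimated from those videos and removed from its live scores. The sketch below is a deliberately simplified illustration of that idea using plain means (not the study's Rasch analysis), and all numbers in it are hypothetical.

```python
# Simplified illustration of the VESCA linkage idea, NOT the study's
# Many Facet Rasch Modelling. All scores below are hypothetical.

def cohort_offsets(video_scores):
    """Estimate each examiner-cohort's leniency from the scores all
    cohorts gave to the SAME shared pool of video performances.
    video_scores: dict mapping cohort id -> list of video scores."""
    total = sum(sum(v) for v in video_scores.values())
    n = sum(len(v) for v in video_scores.values())
    grand_mean = total / n
    # A positive offset means the cohort scores more leniently overall.
    return {c: sum(v) / len(v) - grand_mean for c, v in video_scores.items()}

def adjust(live_score, cohort, offsets):
    """Remove the estimated cohort leniency from a live OSCE score."""
    return live_score - offsets[cohort]

# Hypothetical shared-video scores for two examiner-cohorts:
videos = {"A": [20.5, 19.8, 21.0], "B": [18.6, 18.0, 19.2]}
offsets = cohort_offsets(videos)
# Cohort A is the more lenient, so its candidates' live scores
# are adjusted downwards relative to cohort B's.
```

In the full method the same logic is carried out within a Rasch framework, which also separates student ability from examiner stringency rather than relying on raw means.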

Item Type: Article
Additional Information: This is the accepted author manuscript (AAM). The final published version (version of record) is available online via Lippincott, Williams & Wilkins. Please refer to any applicable terms of use of the publisher.
Subjects: R Medicine > R Medicine (General) > R735 Medical education. Medical schools. Research
Divisions: Faculty of Medicine and Health Sciences > School of Medicine
Depositing User: Symplectic
Date Deposited: 12 Oct 2020 16:23
Last Modified: 23 Nov 2021 10:24
