P308 Interobserver agreement in assessment of Rutgeerts' score of endoscopic recurrence of ileal Crohn's disease: a substudy of the TOPPIC trial
N.A. Kennedy*1, H. Ennis2, D.R. Gaya3, C. Mowat4, I.D.R. Arnott1, on behalf of TOPPIC Trial Investigators1, J. Satsangi1
1Western General Hospital, Gastrointestinal Unit, Edinburgh, United Kingdom, 2University of Edinburgh, Edinburgh Clinical Trials Unit, Edinburgh, United Kingdom, 3Glasgow Royal Infirmary, Gastroenterology, Glasgow, United Kingdom, 4Ninewells Hospital & Medical School, Gastroenterology, Dundee, United Kingdom
Rutgeerts' score is widely used for the assessment of endoscopic recurrence following ileocaecal (IC) resection for Crohn's disease (CD). Higher scores have been shown to be associated with an increased risk of clinical recurrence. TOPPIC is a double-blind, randomised, placebo-controlled trial of mercaptopurine for the prevention of post-operative recurrence after IC resection for CD and includes a secondary endpoint of endoscopic recurrence. Few published data are available on interobserver agreement of Rutgeerts' score. This study aimed to assess the interobserver agreement of Rutgeerts' score on images from endoscopies carried out as part of the TOPPIC trial.
Five TOPPIC trial investigators were shown endoscopic images taken from 43 colonoscopies performed in Edinburgh as part of the TOPPIC trial. The investigators were blinded to the original report and were shown only the images with a description of the anatomical location from which each image was taken. Each investigator was asked to score each colonoscopy independently with Rutgeerts' score, using a custom-designed application. Statistical analysis was performed using R and the psych package. The five scores for each colonoscopy were compared with each other and with the score made by the original endoscopist. Shrout and Fleiss' intraclass correlation and pairwise weighted Cohen's kappa were calculated.
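The two agreement statistics named above can be illustrated with a minimal Python sketch. This is not the study's analysis code, which was run in R with the psych package. It assumes quadratic disagreement weights for Cohen's kappa (the abstract does not state the weighting scheme) and the Shrout and Fleiss ICC(3,1) formula for single ratings. The function names and the toy ratings are hypothetical and are not trial data.

```python
from itertools import combinations


def weighted_kappa(a, b, n_categories=5):
    """Cohen's kappa with quadratic weights (an assumed weighting scheme).

    a, b: equal-length lists of integer scores 0..n_categories-1
    (i0..i4 mapped to 0..4).
    """
    n = len(a)
    k = n_categories
    # Observed joint proportions of (score by rater A, score by rater B).
    observed = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement weight
            num += w * observed[i][j]        # observed weighted disagreement
            den += w * row[i] * col[j]       # chance-expected disagreement
    return 1 - num / den


def icc3_single(ratings):
    """Shrout and Fleiss ICC(3,1): two-way model, consistency, single rating.

    ratings: list of rows, one per colonoscopy, each holding one score per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)              # between-subject mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


# Hypothetical toy data: 6 colonoscopies scored by 3 raters (NOT trial data).
toy = [[0, 0, 1], [1, 1, 1], [2, 3, 2], [3, 3, 4], [4, 4, 4], [0, 1, 0]]
raters = list(zip(*toy))  # one tuple of scores per rater
for i, j in combinations(range(len(raters)), 2):
    print(f"raters {i} vs {j}: kappa_w = {weighted_kappa(raters[i], raters[j]):.2f}")
print(f"ICC(3,1) = {icc3_single(toy):.2f}")
```

Computing kappa for every pair of raters, as in the loop above, mirrors the "each possible pair of scorers" comparison described in the abstract. A confidence interval for the ICC would need an additional F-distribution step that is omitted here.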
The original scores for the colonoscopies were spread across the possible scores, with 11 i0, 10 i1, 7 i2, 10 i3 and 5 i4. Intraclass correlation for single ratings (ICC3) was 0.82 (95% confidence interval 0.74-0.88). The weighted Cohen's kappa when assessed for each possible pair of scorers ranged from 0.72 to 0.88. A graphical representation of the agreement between the original score and the rescores is shown in figure 1.
When scores were stratified into endoscopic recurrence or no recurrence, with recurrence defined as a score of i2 or greater, all five scorers agreed with the original score in 34/43 colonoscopies (79%). There was no significant difference in this agreement between procedures with an original score ≥i2 and those with an original score <i2.
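The dichotomised comparison can be sketched in the same way. The following Python fragment assumes scores are coded 0 to 4 for i0 to i4 and counts the colonoscopies in which every rescorer falls on the same side of the i2 threshold as the original endoscopist. The function names and example scores are hypothetical and are not trial data.

```python
RECURRENCE_THRESHOLD = 2  # endoscopic recurrence defined as i2 or greater


def has_recurrence(score):
    """Map a Rutgeerts' score (0..4 for i0..i4) to recurrence yes/no."""
    return score >= RECURRENCE_THRESHOLD


def unanimous_agreement(original, rescores):
    """Count colonoscopies where all rescorers match the original category.

    original: list with one original score per colonoscopy
    rescores: list with one list of rescorer scores per colonoscopy
    Returns (number in full agreement, total number of colonoscopies).
    """
    agreed = 0
    for orig, scores in zip(original, rescores):
        target = has_recurrence(orig)
        if all(has_recurrence(s) == target for s in scores):
            agreed += 1
    return agreed, len(original)


# Hypothetical example with 3 colonoscopies and 3 rescorers (NOT trial data).
original = [0, 2, 3]
rescores = [[0, 1, 1], [2, 3, 1], [4, 4, 2]]
agreed, total = unanimous_agreement(original, rescores)
print(f"{agreed}/{total} in full agreement ({100 * agreed / total:.0f}%)")
```

Applied to the reported counts, the same proportion calculation gives 34/43 ≈ 0.79, matching the 79% stated above.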
Figure 1. Comparison of original Rutgeerts' scores with those when rescored. The number of scopes is indicated by the size of the circle. Lines indicate endoscopic recurrence (≥i2) vs no recurrence.
Interobserver agreement for Rutgeerts' score of endoscopic recurrence was generally good in this cohort of patients. Despite the limitation of using still images, agreement with the score of the original endoscopist was also good. However, there was some variation in assessment, even when assessing the presence/absence of endoscopic recurrence. These findings are important when considering the reliability of outcome data in multicentre clinical trials.