P245 Inter-rater reliability of gastrointestinal ultrasound in the assessment of disease activity in patients with inflammatory bowel disease prior to commencing medical therapy

R. Smith1,2, K. Taylor1,2, A. Friedman1,2, H. Su1, D. Con3, P. Gibson1,2

1Department of Gastroenterology, Alfred Health, Melbourne, Australia, 2Department of Gastroenterology, Monash University, Melbourne, Australia, 3Department of General Medicine, Eastern Health, Melbourne, Australia


Gastrointestinal ultrasound (GIUS) is an emerging modality in Australia for the assessment of disease activity in patients with inflammatory bowel disease (IBD). Its utility relies upon reproducibility of key indices, particularly when performed by different operators.


This study aimed to assess inter-rater reliability among GIUS-credentialed gastroenterologists in Australia in their assessment of GIUS indices reflecting disease activity in patients with IBD.


Patients with IBD were prospectively recruited for paired, consecutive, blinded GIUS assessment at the commencement of a new medical therapy. Each patient was examined by two of the four gastroenterologists accredited in GIUS at our centre. The GIUS assessment covered the known disease distribution. Bowel wall thickness (BWT) was taken as the mean of four measurements for each segment of bowel examined. Colonic and small bowel BWT was classed as normal (<3 mm), mild disease (3.1–6 mm), moderate disease (6.1–9 mm) or severe disease (>9.1 mm). Anastomotic BWT was classed as normal (<5 mm) or abnormal (>5.1 mm). Colour Doppler assessment of the bowel wall was graded on the Limberg scale [1] as no colour Doppler signal, mild, moderate or severe. The presence or absence of mesenteric hyperechogenicity surrounding each bowel segment and the presence or absence of reactive lymphadenopathy were also recorded. Inter-rater reliability was tested with Fleiss’ kappa.
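The categorisation and agreement statistic described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code used in the study: the function names `bwt_category` and `fleiss_kappa` are hypothetical, the example ratings are invented for demonstration, and the handling of values that fall between the stated bands (for example exactly 3.0 mm) is an assumption.

```python
from collections import Counter


def bwt_category(bwt_mm):
    """Map a colonic or small bowel BWT (mm) to a severity band.

    Assumption: values on or between the published cut-offs
    (e.g. exactly 3.0 mm) are assigned to the higher band.
    """
    if bwt_mm < 3:
        return "normal"
    if bwt_mm <= 6:
        return "mild"
    if bwt_mm <= 9:
        return "moderate"
    return "severe"


def fleiss_kappa(ratings):
    """Fleiss' kappa for categorical ratings.

    ratings: one sequence per bowel segment, holding the category
    assigned by each rater; every segment needs the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if any(len(r) != n_raters for r in ratings):
        raise ValueError("every subject needs the same number of ratings")

    counts = [Counter(r) for r in ratings]
    categories = sorted({label for r in ratings for label in r})

    # Observed agreement: mean proportion of agreeing rater pairs per subject.
    per_subject = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the pooled category proportions.
    total = n_subjects * n_raters
    p_expected = sum(
        (sum(c[k] for c in counts) / total) ** 2 for k in categories
    )
    if p_expected == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_expected) / (1 - p_expected)


# Invented mean BWT values (mm) from two operators for four segments.
operator_a = [2.4, 4.1, 5.8, 9.6]
operator_b = [2.7, 4.5, 6.3, 10.2]
paired = [
    (bwt_category(a), bwt_category(b))
    for a, b in zip(operator_a, operator_b)
]
kappa = fleiss_kappa(paired)
```

With two raters per segment, the observed agreement term reduces to the fraction of segments on which both operators chose the same category.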


Twenty-six patients were assessed. Sixty-nine per cent had a diagnosis of Crohn’s disease, 19% ulcerative colitis and 12% IBD unclassified. A total of 80 bowel segments were assessed. For BWT, the observed agreement was 77.5% and inter-rater reliability was in the substantial range, with a Fleiss’ kappa of 0.63 (95% CI 0.48–0.78, p < 0.001). Of note, 79% of BWT measures differed by <1 mm between operators. For the degree of Doppler activity, the observed agreement was 82.5% and inter-rater reliability was again in the substantial range, with a Fleiss’ kappa of 0.68 (95% CI 0.54–0.83, p < 0.001). For mesenteric hyperechogenicity, the observed agreement was 96.3% and inter-rater agreement was near-perfect, with a Fleiss’ kappa of 0.89 (95% CI 0.76–0.99, p < 0.001). For the presence or absence of lymphadenopathy, the observed agreement between operators was 84.6% and inter-rater agreement was substantial, with a Fleiss’ kappa of 0.64 (95% CI 0.29–0.98, p < 0.001).

Among gastroenterologists experienced and credentialed in GIUS, the inter-rater reliability of markers of intestinal inflammation was substantial, providing confidence in the reproducibility of findings between different operators.