P236 Interobserver reliability of the Nancy index for ulcerative colitis: An assessment of the practicability and ease of use in a single-centre real-world setting.

Le, H.D.(1);Pflaum, T.(2);Sari, S.(1);Bretschneider, F.(1);Nikolaus, S.(1);Lassen, A.(1);Schreiber, S.(1);Aden, K.(1);Roecken, C.(2);

(1)University Medical Center Schleswig-Holstein Kiel, Department of Medicine I, Kiel, Germany;(2)University Medical Center Schleswig-Holstein Kiel, Institute of Pathology, Kiel, Germany;


Histological assessment of disease severity has become a mainstay of endpoint definition (“histologic remission”) in clinical trials of ulcerative colitis (UC). Among the several scores developed for the microscopic assessment of disease activity, the Nancy index (NI) stands out for requiring the least workload because it has the fewest scoring items. The extent to which histologic assessment using the NI is affected by interobserver variability in a real-world setting is poorly understood. We therefore performed a single-centre retrospective analysis of NI assessment in patients with UC.


We retrospectively evaluated two independent cohorts comprising a total of 1085 biopsy samples (sigmoid colon, rectum) taken from 547 patients with clinically diagnosed UC who underwent colonoscopy between 2007 and 2020. Cohort 1 consisted of 637 biopsies from 312 patients; cohort 2 consisted of 448 biopsies from 235 patients. The NI of each sample was assessed by two blinded pathologists with different levels of experience in pathology. After each cohort, a consensus conference was held in which samples that had received different NI grades were re-assessed and a consensus score was assigned by both observers. We evaluated interobserver reliability and the differences between the observers in how often each NI grade was assigned.


Interobserver agreement for the NI was very good across all 1085 samples (κ = 0.687 [95% CI: 0.653-0.720]). Agreement improved as the observers evaluated more samples (cohort 1: κ = 0.659 [95% CI: 0.615-0.704]; cohort 2: κ = 0.726 [95% CI: 0.675-0.776]). The two observers' grade counts differed most for NI grade 1 (observer 1: n=128; observer 2: n=236) and least for NI grades 0 (observer 1: n=504; observer 2: n=479) and 3 (observer 1: n=71; observer 2: n=66). After consensus scoring, NI grade 0 was the most frequent grade (n=504), followed by NI grade 2 (n=309); NI grade 3 was the least frequent (n=62). The average time for scoring was less than 2 minutes.
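
To illustrate how a κ value with a 95% CI of this kind can be obtained from paired grades, the following is a minimal sketch, assuming unweighted Cohen's κ and a simple large-sample standard error. The abstract does not state which κ variant or CI method was used, so both are assumptions, and the 5×5 table is hypothetical toy data, not study data. The function name `cohen_kappa` is likewise illustrative.

```python
import math


def cohen_kappa(table):
    """Unweighted Cohen's kappa with an approximate 95% CI.

    table[i][j] = number of samples graded i by observer 1
    and j by observer 2 (square agreement table).
    """
    k = len(table)
    n = sum(sum(row) for row in table)

    # Observed agreement: share of samples on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n

    # Chance agreement from the two observers' marginal totals.
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2

    kappa = (p_o - p_e) / (1 - p_e)

    # Simple large-sample standard error (an approximation, assumed here).
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return kappa, (kappa - 1.96 * se, kappa + 1.96 * se)


# Hypothetical toy table for NI grades 0-4 (NOT data from this study).
toy_table = [
    [40, 6, 1, 0, 0],
    [5, 10, 4, 0, 0],
    [1, 5, 20, 2, 0],
    [0, 0, 2, 6, 1],
    [0, 0, 0, 1, 8],
]

kappa, (ci_low, ci_high) = cohen_kappa(toy_table)
print(f"kappa = {kappa:.3f} [95% CI: {ci_low:.3f}-{ci_high:.3f}]")
```

Because the NI is an ordinal scale, a weighted κ that penalises distant disagreements more heavily would be a reasonable alternative; the unweighted form is used here only because it is the simplest to show.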


The NI is an easy-to-use index with very high interobserver reliability for assessing histological disease activity in UC patients in a real-world setting. However, the definitions of the individual grades need to be made stricter to further improve the practicability of the index. While NI grades 0 and 3 showed a very high level of agreement between the observers, agreement was lower for NI grade 1. This highlights the clinical need to specify the histological characteristics that lead to NI grade 1.