Abstracts Search 2019

P001 Multi-omic data integration assisted identification of molecular features contributing to disease heterogeneity in Crohn's disease

P. Sudhakar*1,2,3, B. Verstockt1,4, B. Creyns5, J. Cremer5, G. van Assche1,4, T. Korcsmaros2,3, M. Ferrante1,4, S. Vermeire1,4

1KU Leuven Department of Chronic Diseases, Metabolism and Ageing, Translational Research Center for Gastrointestinal Disorders (TARGID), Leuven, Belgium, 2Earlham Institute, Norwich, UK, 3Quadram Institute, Norwich, UK, 4University Hospitals Leuven, Department of Gastroenterology and Hepatology, KU Leuven, Leuven, Belgium, 5KU Leuven Department of Microbiology and Immunology, Laboratory of Clinical Immunology, Leuven, Belgium


The disease behaviour of Crohn's disease (CD) is heterogeneous, as evidenced by inflammatory, fibrostenotic and penetrating sub-types. Biomarkers that predict these sub-types at diagnosis, and biological mechanisms that explain the differences between them, are lacking. Dysregulated CD4+ cell populations in CD patients have been associated with variation in disease activity. We aimed to identify discriminative features that explain CD behavioural heterogeneity through integrative analysis of genetic risk burden and gene expression from blood-derived, sorted peripheral blood mononuclear cells (PBMCs; CD14+ monocytes and CD4+ T cells).


Sorted populations of circulating CD14+ and CD4+ cells were isolated from the blood of 29 patients with active CD (35% male; median [IQR] disease duration 21.5 [14.0–27.3] years; 24% inflammatory (B1), 48% stenosing (B2) and 28% penetrating (B3) disease). RNA was extracted from the CD14+ and CD4+ cells and sequenced. The genetic risk burden was calculated for known CD GWAS variants using Immunochip genotyping data. We integrated the three -omic data layers described above (CD14+ gene expression, CD4+ gene expression and genetic risk burden) using Multi-Omics Factor Analysis (MOFA). Features were selected from the strongest -omic layers of the explanatory Latent Factors (LFs). To obtain the strongest features, we further selected the top 20% using the multivariate filter RRelief.
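The abstract does not state how the genetic risk burden was computed or how feature scores were ranked. As a minimal sketch only, the Python below assumes a common weighted form (risk-allele count multiplied by the log odds ratio, summed over variants) and a simple cut that keeps the top 20% of features by a precomputed relevance score. The function names, the weighting scheme and all toy values are hypothetical, and the scoring step itself (RRelief in the abstract) is not implemented here.

```python
import math


def genetic_risk_burden(dosages, odds_ratios):
    """Weighted risk score: risk-allele count (0, 1 or 2) times log(odds ratio), summed.

    Assumption: the abstract does not give the formula; this is one common choice.
    """
    if len(dosages) != len(odds_ratios):
        raise ValueError("one odds ratio is needed per variant")
    return sum(d * math.log(o) for d, o in zip(dosages, odds_ratios))


def top_fraction(scores, fraction=0.2):
    """Return the names of the highest-scoring features, keeping the given fraction."""
    n_keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]


# Hypothetical toy values, not data from the study.
burden = genetic_risk_burden([0, 1, 2], [1.2, 1.5, 1.1])
selected = top_fraction({"g1": 0.9, "g2": 0.1, "g3": 0.4, "g4": 0.7, "g5": 0.2})
```

In this sketch the relevance scores are passed in as a plain dictionary, so any filter that produces one score per feature could supply them.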


Nine LFs were identified as contributing at least 2% of the total variance. One of the nine LFs explained disease behaviour (r = 0.45, p = 0.01). Clustering of the samples along this explanatory LF achieved meaningful separation, as evidenced by the enrichment of sub-types in the clusters. We identified gene expression of CD4+ cells as the strongest -omic layer in the explanatory LF. After feature extraction and selection, we identified a panel of 86 genes expressed in CD4+ cells that distinguished the three sub-types. The RRelief-selected top 20% gene set was enriched for immune cell and interleukin signalling genes, as well as for particular genes encoding HLA antigens and genes related to chaperones.
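The two steps reported here, retaining factors above a variance threshold and relating a factor to disease behaviour, can be illustrated with a short Python sketch. It assumes a Pearson correlation and a numeric coding of behaviour (B1 = 1, B2 = 2, B3 = 3). The abstract does not specify either choice, and the factor names and values below are hypothetical toy data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def factors_above_threshold(variance_explained, threshold=0.02):
    """Keep factors whose fraction of variance explained is at least the threshold."""
    return [name for name, frac in variance_explained.items() if frac >= threshold]


# Hypothetical toy values, not results from the study.
variance_explained = {"LF1": 0.10, "LF2": 0.03, "LF3": 0.01}
kept = factors_above_threshold(variance_explained)

behaviour = [1, 1, 2, 2, 3, 3]                  # assumed coding: B1=1, B2=2, B3=3
lf1_scores = [-1.0, -0.5, 0.0, 0.2, 0.8, 1.1]   # one factor value per patient
r = pearson_r(lf1_scores, behaviour)
```

Because disease behaviour is an ordered category, a rank-based correlation would be an equally plausible choice in place of the Pearson coefficient.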


Using multi-omic data integration, we identified gene expression signatures from CD4+ T cells which could explain CD subtypes. Even though HLA loci have been linked to CD susceptibility and CD has been described as a ‘chaperonopathy’, we present the novel finding that the expression of distinct HLA genes and chaperone-associated genes in CD4+ cells could be used as potential biomarkers to distinguish CD subtypes. This finding could lead to surrogate biomarkers in whole blood without the need for additional sample processing. Verification in newly diagnosed cohorts could validate our findings and help to predict disease trajectories and formulate personalised therapies.