Leveraging Big-data Analytics to Understand Inflammatory Bowel Disease

June 4, 2024

By: David Binion, MD, professor of Medicine, Clinical and Translational Science

 

Co-Director, IBD Center
Director, Nutrition Support Service
Medical Director, Intestinal Rehabilitation and Transplant Center
UPMC Presbyterian Hospital

 

Faced with the challenge of making precision medicine an effective reality for patients with inflammatory bowel disease (IBD), David Binion, MD, and his team leveraged the UPMC electronic medical record and big-data analytic techniques to create a metadata platform for scientific discovery. They are now performing cutting-edge research using this resource to understand the clinical features of IBD, better target treatment recommendations, and expand prognostic capabilities for patients with IBD.

 

Inflammatory bowel disease (IBD) is a chronic, lifelong inflammatory disorder of the gastrointestinal tract. Care for patients with IBD is multifaceted and may include preventive measures, which consist predominantly of medical therapeutics (both for maintenance and to treat acute flares and accompanying pain), as well as dietary modifications and surgical treatments. While IBD is commonly divided into two disorders (Crohn’s disease and ulcerative colitis), this classification oversimplifies a heterogeneous disease. IBD is a complex syndrome, and over 200 genetic polymorphisms associated with its development have been identified. To treat IBD effectively, we need to implement strategies informed by each patient’s genetics, symptoms, and risk profile. When we understand the complexity of IBD, we can begin to realize the dream of precision medicine.

Establishing the UPMC IBD Natural History Registry and a Metadata Platform

UPMC’s electronic medical record (EMR) presented a fantastic opportunity to better understand and characterize the heterogeneity of IBD. The UPMC EMR is one of the largest repositories of comprehensive medical data in the country and has been in use longer than many other EMRs. We set out to build tools on UPMC’s vast EMR using big-data analytic techniques, informed by our ability to track outcomes and patient trajectories over time. Our goal was, and remains, to gain an essential understanding of each of the many biologic endotypes of IBD in order to reverse engineer treatments that will drastically improve the quality of life (QoL) for individuals with IBD.

We have accrued over 5,000 patients in the UPMC IBD Natural History Registry, a prospective, longitudinal registry (ClinicalTrials.gov NCT04243525), by inviting most patients who receive care at the UPMC Center for Inflammatory Bowel Disease to participate.1 When patients consent to registry participation, we obtain not only their current clinical data but also medical data from their EMR from 2009 to the present. Moreover, the database builds on each prior project and continuously curates and transforms observational clinical information from the EMR.

Potentially useful metrics are built into clinic encounters that are then documented in the EMR. Additionally, for patients treated at the UPMC Center for Inflammatory Bowel Disease, the EMR contains surveys completed in the clinic, including quantitative pain assessments, dietary surveys, QoL surveys, and surveys assessing anxiety, depression, and other psychosocial measures. Health care charges have also been included in the registry database and are used as a marker of disease severity. Using the EMR as the data source for registry studies reduces the burden of participation for both patients and their physicians and promotes patient retention.

Starting with the Epic and Cerner EMR systems, we learned how to pull patient data, how to extract information from that data, and how to store it securely. We developed a simple platform with programming in Microsoft Excel. In collaboration with Claudia Ramos Rivers, MD, a research analytic scientist at the University of Pittsburgh School of Medicine, and with help from faculty affiliated with the University of Pittsburgh School of Computing and Information and School of Public Health, we tackled each challenge encountered to create a flexible metadata platform to handle data from the EMR responsibly and safely (Figure 1). The platform also uses natural language processing to extract health information that is not entered into the EMR in a standardized format. Examples include patient discharge summaries, surgical notes, pathology reports, and clinic notes.

Figure 1. Flow diagram of the UPMC inflammatory bowel disease research registry. IBD, inflammatory bowel disease; H&P, history and physical. (From Anderson AJ, et al. Dig Dis Sci. 2016;61(11):3236-3245.) 

In developing resources to study IBD, we have taken a different approach to tissue and blood banking than that commonly used. We have cataloged every pathological specimen obtained during care and now have approximately 20,000 colonoscopy samples for easy study access. Additionally, we have well over 3 million time-stamped laboratory results from patients to capture the natural history of IBD. This totals over 3 terabytes of clinical data.

Identifying IBD Phenotypes

The IBD research registry offers a unique opportunity to investigate clinical research questions regarding the natural course of the disease, phenotype associations, effectiveness of treatments, and quality of care. Tracking the natural history of IBD using big data and this metadata platform has provided insight into subsets of patients with IBD who likely have different disease endotypes and who may therefore benefit from different treatments (Figure 2).

Figure 2. Toward precision medicine. A. Variability in patients with IBD remains underappreciated. B. Our goal is to leverage what we learn about individuals to make personalized decisions, such as the correct drug and dose, to improve the health of the overall population.

For example, to identify IBD subtypes using the meta-platform, we examined multiyear patterns of health care charges. IBD phenotypes that necessitate consistently high or consistently low levels of health care utilization were identified, taking multiple utilization measures (office visits, telephone calls, hospitalizations, ER visits, radiological procedures, endoscopic procedures, and surgeries) into account.2
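The idea of separating consistently high from consistently low utilization can be illustrated with a minimal sketch. It is not the group-based trajectory modeling used in the cited study; it is a much simpler stand-in that compares each patient's yearly charges with the cohort median for that year. The patient IDs, the charge figures, and the function name `label_utilization` are all hypothetical.

```python
from statistics import median

def label_utilization(charges_by_patient):
    """Label each patient by comparing yearly charges with the cohort median.

    charges_by_patient maps a patient ID to a list of yearly charges,
    one entry per year, with the same number of years for every patient
    (an assumption made to keep the sketch short).
    """
    n_years = len(next(iter(charges_by_patient.values())))
    # Cohort median charge for each year of follow-up.
    yearly_median = [
        median(charges[year] for charges in charges_by_patient.values())
        for year in range(n_years)
    ]
    labels = {}
    for patient, charges in charges_by_patient.items():
        above = [c > m for c, m in zip(charges, yearly_median)]
        if all(above):
            labels[patient] = "consistently high"
        elif not any(above):
            labels[patient] = "consistently low"
        else:
            labels[patient] = "variable"
    return labels

# Hypothetical yearly charges (in dollars) for four patients over four years.
example = {
    "P1": [52000, 61000, 58000, 70000],
    "P2": [1200, 900, 1500, 1100],
    "P3": [30000, 400, 45000, 300],
    "P4": [2000, 3000, 2500, 40000],
}
labels = label_utilization(example)
```

In this made-up cohort, P1 is above the yearly median every year, P2 is never above it, and P3 and P4 move between the two. A real analysis would combine charges with the other utilization measures listed above.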

Biomarkers of IBD Severity

Biomarkers of IBD severity were another early research priority for this registry. We sought to identify markers obtained from laboratory data or pathology reports that would identify which individuals were at higher risk for health difficulties due to IBD as well as more rapid presentation of the health challenges associated with IBD. Peripheral blood eosinophilia (PBE) and elevated monocyte count are two biomarkers of IBD severity identified from routinely available lab results.

PBE is a defining feature of the type-2 immune response, is a well-characterized marker of asthma severity, and has been linked to IBD severity.3 Using the IBD registry, we demonstrated that laboratory findings of PBE are more common in patients with pediatric-onset IBD than in patients with adult-onset IBD. In adult patients with pediatric-onset IBD, PBE was associated with higher health care utilization, so the presence of PBE further stratified a subset of patients already at risk for severe IBD.4 This study provides insight into medications that may better support IBD management in patients with this profile.

Similarly, using the registry, we found that approximately one-third of patients with IBD had an elevated monocyte count over a six-year observation period. Patients with IBD accompanied by monocytosis had an increased likelihood of hospitalization, IBD-related surgery, or emergency department use. Monocytosis was also predictive of an earlier need for this advanced care.5
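As a rough illustration of how a biomarker such as monocytosis could be flagged from time-stamped laboratory results, the sketch below marks any patient with at least one elevated monocyte count inside an observation window. The cutoff value, the record layout, the dates, and the function name `patients_with_monocytosis` are assumptions for illustration only; they are not the registry's definitions or schema.

```python
from datetime import date

# Placeholder cutoff for an elevated absolute monocyte count (10^9 cells/L).
# This value is an assumption for illustration, not the registry's definition.
MONOCYTE_CUTOFF = 1.0

def patients_with_monocytosis(lab_results, start, end, cutoff=MONOCYTE_CUTOFF):
    """Return IDs of patients with at least one elevated monocyte count
    collected between start and end (inclusive).

    lab_results is a list of (patient_id, collection_date, monocyte_count).
    """
    flagged = set()
    for patient_id, collected, count in lab_results:
        if start <= collected <= end and count > cutoff:
            flagged.add(patient_id)
    return flagged

# Hypothetical time-stamped results for three patients.
labs = [
    ("P1", date(2010, 3, 1), 0.6),
    ("P1", date(2012, 7, 15), 1.3),
    ("P2", date(2011, 5, 9), 0.5),
    ("P3", date(2008, 1, 20), 1.4),  # collected before the window opens
]
# Hypothetical six-year observation window.
flagged = patients_with_monocytosis(labs, date(2009, 1, 1), date(2014, 12, 31))
share = len(flagged) / len({pid for pid, _, _ in labs})
```

Here only P1 is flagged, because P3's elevated value falls outside the window, so one of three patients meets the definition.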

The IBD registry offers a better understanding of which patients may benefit from more aggressive surgical and surveillance approaches. When patients with Crohn’s disease undergo ileocecal resection for strictures caused by cumulative damage from IBD, epithelioid granuloma (a type of immunological scar) is found in 20%-30% of resected specimens in routine pathological analysis. When the registry was utilized to examine the prognostic value of granuloma detection, the presence of a granuloma was associated with higher rates of steroid and narcotic use, higher health care utilization, and the need for repeat surgery during a six-year observation period. In fact, this histologic data was a better predictor of the need for further surgical intervention over an 11-year observation period than colonoscopy findings after surgery.6,7

Toward Individualized Treatment of IBD

Treatment of IBD cannot be a one-size-fits-all endeavor. We currently have tools and medications that may change the clinical trajectory of some patients, but we need to know which patients will benefit, because stronger and more aggressive treatments often have side effects or clinical trade-offs affecting quality of life (Figure 2).

Studies leveraging the registry may also improve recognition of which patients with IBD are most likely to benefit from dietary modifications. Recently, we studied the influence of the consumption of sugar-sweetened beverages, such as non-diet soda, juice drinks, sports drinks, and energy drinks, on the long-term clinical trajectory of patients with IBD. In patients without active disease at the time of study enrollment, high consumption of sugar-sweetened beverages was linked with a shorter time to hospitalization and a shorter time to a first emergency department visit. High consumption of sugar-sweetened beverages was also associated with elevated markers of disease severity (anemia, monocytosis, and eosinophilia) and elevated inflammatory biomarkers.8 Additionally, the metadata platform has been used to identify a subset of patients with IBD who have positive celiac serology but no histopathologic findings of celiac disease. These patients may benefit from a gluten-free diet, since a gluten-free diet was correlated with a reduction in the burden of inflammation in these patients.9

Many potential therapies for IBD have shown only modest effects in clinical trials, with <40% of participants responding to treatment in many trials. We believe this can be attributed to the heterogeneity of patients with IBD and the existence of different disease endotypes. This metadata platform will greatly facilitate the development of treatments for patients with IBD, because it can complement ongoing and completed clinical trials and inform future trials leading to clinical breakthroughs. By identifying the biomarkers that define IBD subtypes, we may be able to revisit the results of completed clinical trials to examine whether the percentage of responders among participants with a particular IBD subtype is higher than that observed in the whole study cohort. Identifying IBD endotypes can also inform new clinical trial design, such that different subgroups of patients with IBD are targeted.
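The kind of re-analysis described here comes down to comparing the response rate within one subtype against the rate in the whole cohort. The sketch below shows that arithmetic on invented data; the subtype labels "A" and "B", the participant counts, and the function name `response_rate` are hypothetical and do not come from any trial.

```python
def response_rate(participants, subtype=None):
    """Fraction of responders, optionally restricted to one subtype.

    participants is a list of (subtype_label, responded) pairs,
    where responded is True or False.
    """
    selected = [
        responded for label, responded in participants
        if subtype is None or label == subtype
    ]
    if not selected:
        raise ValueError("no participants match the requested subtype")
    return sum(selected) / len(selected)

# Hypothetical trial with 10 participants and two made-up subtype labels.
trial = (
    [("A", True)] * 3 + [("A", False)] * 1
    + [("B", True)] * 1 + [("B", False)] * 5
)
overall = response_rate(trial)         # 4 of 10 participants responded
subtype_a = response_rate(trial, "A")  # 3 of 4 subtype A participants responded
```

In this invented example the overall response rate is 40%, while the rate within subtype A is 75%. A genuine post hoc subgroup analysis would also need appropriate statistical testing and caution about multiple comparisons.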

Moreover, when evaluating response to therapy, patients with IBD need to be tracked for three to five years. Clinical trials often do not include this length of follow-up, because only a one-year follow-up is typically required by the United States Food and Drug Administration (FDA). As a result, post-FDA-approval studies are needed to truly evaluate the efficacy of many therapies in patients with IBD. Our metadata platform should greatly facilitate this type of analysis, which is crucial to the management of a lifelong disease.

The longevity of the UPMC EMR was not the only benefit of establishing this registry within the UPMC health care system. The registry contains participants from the entire UPMC network — 40 hospitals and >600 outpatient clinics — and is supported by robust computer informatics expertise at both UPMC and the University of Pittsburgh. Thus, the UPMC network allows us to assess differences in care and clinical trajectories between the tertiary referral and community settings.

The Promise of Precision Medicine for IBD

Establishing the UPMC IBD Natural History Registry and the ability to extract and securely store its metadata was a foundational step toward understanding the complexity of IBD. We have established the infrastructure necessary to examine the disease course of different IBD endotypes and the effectiveness of treatments in a real-world setting. We hope to soon reap the benefits of this investment of time, money, and effort. Through the application of big-data analytics and the clarification of the natural course of IBD subtypes, we are on the brink of implementing precision medicine for IBD.

References / Recommended Reading

  1. Anderson AJ, Click B, Ramos-Rivers C, et al. Development of an Inflammatory Bowel Disease Research Registry Derived from Observational Electronic Health Record Data for Comprehensive Clinical Phenotyping. Dig Dis Sci. Nov 2016;61(11):3236-3245.
  2. Jiang J, Click B, Anderson AM, et al. Group-Based Trajectory Modeling of Healthcare Financial Charges in Inflammatory Bowel Disease: A Comprehensive Phenotype. Clin Transl Gastroenterol. Jul 14 2016;7(7):e181.
  3. Click B, Anderson AM, Koutroubakis IE, et al. Peripheral Eosinophilia in Patients With Inflammatory Bowel Disease Defines an Aggressive Disease Phenotype. Am J Gastroenterol. Dec 2017;112(12):1849-1858.
  4. Prathapan KM, Ramos Rivers C, Anderson A, et al. Peripheral Blood Eosinophilia and Long-term Severity in Pediatric-Onset Inflammatory Bowel Disease. Inflamm Bowel Dis. Nov 19 2020;26(12):1890-1900.
  5. Anderson A, Cherfane C, Click B, et al. Monocytosis Is a Biomarker of Severity in Inflammatory Bowel Disease: Analysis of a 6-Year Prospective Natural History Registry. Inflamm Bowel Dis. Jan 5 2022;28(1):70-78.
  6. Ertem FU, Rivers CR, Watson AR, et al. Granuloma Presence at Initial Surgery Predicts Need for Repeat Surgery Independent of Rutgeerts Score in Crohn’s Disease. Inflamm Bowel Dis. Jan 31 2023. doi:10.1093/ibd/izad008.
  7. Johnson CM, Hartman DJ, Ramos-Rivers C, et al. Epithelioid Granulomas Associate With Increased Severity and Progression of Crohn’s Disease, Based on 6-Year Follow-Up. Clin Gastroenterol Hepatol. Jun 2018;16(6):900-907.e1.
  8. Ahsan M, Koutroumpakis F, Rivers CR, et al. High Sugar-Sweetened Beverage Consumption Is Associated with Increased Health Care Utilization in Patients with Inflammatory Bowel Disease: A Multiyear, Prospective Analysis. J Acad Nutr Diet. Aug 2022;122(8):1488-1498.e1.
  9. Dahar MM, Ramos Rivers C, Ahsan M, et al. Positive Gluten Sensitivity Serologies and the Impact of Gluten-free Diet in Patients With IBD. Inflamm Bowel Dis. Aug 1 2022;28(8):e122-e123.