UIOBigHealth: improving causal inference using machine learning and big data analytics on real-world health data

Contact person: Hedvig Marie Egeland Nordeng
Keywords: Computational Life Sciences, Machine learning, Big Data analytics, Pharmacoepidemiology, Causal inference, Pregnancy
Research groups: PharmaSafe, UiORealArt
Department of Pharmacy, Department of Informatics

Norway is home to some of the world's most comprehensive and well-maintained health care registries and birth cohorts. These contain valuable data on patient demographics, diagnoses, treatments, and outcomes for millions of patients, as well as genetic/epigenetic data from the MoBa biobank. To date, 98 000 samples (trios) have been analyzed in genome-wide association studies (GWAS) and are available to us. Our projects unite researchers who together aim to move beyond conventional pharmacoepidemiological studies by combining state-of-the-art computational methods and big data analytics with causal inference, epidemiology and clinical expertise.

By using machine learning (ML) and big data analytics to turn complex epidemiological, genetic and clinical data into testable hypotheses, we will be able to facilitate the discovery of novel genetic and clinical risk factors for drug toxicity and adverse health outcomes. Moreover, if we are able to demonstrate how ML can improve causal inference in real-world observational studies, it will be an important driver for the uptake of ML in the field of precision medicine. This represents a tremendous opportunity for researchers to gain insight into the effectiveness and safety of pharmaceuticals for patients in real-world settings.

This research has become more relevant than ever now that the European Medicines Agency's Data Analysis and Real-World Interrogation Network (DARWIN) program has been launched to provide timely and reliable evidence on the use, safety and effectiveness of medicines for human use, including vaccines, from real-world healthcare databases across the European Union (EU).

Applied research topics:

  • Genetic confounding in pharmacoepidemiology
  • Bioinformatics for omics integration of pharmacogenetic and pharmacoepigenetic data (UiORealArt)
  • Improving generalizability, data efficiency and transfer learning through knowledge of the underlying causal structure

Methodological research topics:

  • Methods and applications of synthetic data in pharmacoepidemiology
  • Improving reproducibility of register-based pharmacoepidemiological research using supporting tools to standardize study designs and data analyses

External partners:

  • Division of Clinical Neuroscience, Oslo University Hospital
  • Radboud University Medical Center, The Netherlands