Mathematical models are usually built and validated in studies that focus on a single molecular layer (typically signaling, metabolism, or gene regulation) and rely on a single type of omic data (transcriptomic, metabolomic, or (phospho)proteomic). In biological reality, however, the distinction between these layers disappears, because proteins, metabolites and nucleic acids all interact with and influence each other. To better understand the interplay among these processes, it is therefore critical to develop models that span multiple omic layers and account for trans-omic interactions.
Schematic representation of the interactions between signaling pathways, metabolism and gene regulation that result in specific phenotypes.
We focus specifically on bridging signaling pathways and metabolism by developing models that integrate transcriptomic, metabolomic, proteomic and phosphoproteomic data. To this end, we developed COSMOS (Aurelien et al. 2020), a pipeline that integrates these different types of omic data using footprint-based analysis and causal reasoning.
The first step of such an approach usually requires extracting functional information from the abundance measurements available in omic data sets. For example, it is possible to estimate kinase, transcription factor (TF) and pathway activities. To do so, phosphoproteomic and transcriptomic data are combined with prior knowledge. This knowledge captures the known interactions involving TFs, kinases and pathways, as collected in resources developed in our lab or elsewhere, such as Dorothea (Garcia-Alonso et al. 2018), Omnipath (Türei, Korcsmáros, and Saez-Rodriguez 2016) and Progeny (Schubert et al. 2018). These enzyme and pathway activities can then be correlated with each other or with other omic features. We have used this approach (Gonçalves et al. 2018, 2017) to discover new mechanisms of metabolic regulation in yeast and cancer cell lines. More recently, we started developing methodologies to extend these activity estimations to metabolic enzymes.
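The footprint idea described above can be illustrated with a minimal sketch. It assumes that each regulator comes with a signed set of targets (+1 for activation, -1 for repression) and that a differential statistic is available for each measured gene. The gene names, the regulon and the scoring scheme (a signed mean normalised against random permutations) are hypothetical choices made for illustration; they are not the exact method implemented in any of the resources cited above.

```python
import random
from statistics import mean, stdev


def activity_score(stats, regulon):
    """Signed mean of the target statistics.

    stats:   gene -> differential statistic (e.g. a t-value)
    regulon: target gene -> sign of regulation (+1 or -1)
    """
    shared = [gene for gene in regulon if gene in stats]
    if not shared:
        raise ValueError("no regulon target was measured")
    return sum(regulon[gene] * stats[gene] for gene in shared) / len(shared)


def normalised_activity(stats, regulon, n_perm=1000, seed=0):
    """Z-score of the observed score against a gene-label permutation null."""
    rng = random.Random(seed)
    observed = activity_score(stats, regulon)
    genes = list(stats)
    values = list(stats.values())
    null_scores = []
    for _ in range(n_perm):
        rng.shuffle(values)  # break the link between genes and statistics
        null_scores.append(activity_score(dict(zip(genes, values)), regulon))
    return (observed - mean(null_scores)) / stdev(null_scores)


# Hypothetical differential expression statistics and a hypothetical regulon.
stats = {
    "GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": -1.8,
    "GENE_D": 0.1, "GENE_E": -0.2, "GENE_F": 0.3,
}
regulon = {"GENE_A": 1, "GENE_B": 1, "GENE_C": -1}

raw = activity_score(stats, regulon)
z = normalised_activity(stats, regulon)
print(f"raw score: {raw:.3f}, normalised activity: {z:.2f}")
```

In this toy case the activated targets go up and the repressed target goes down, so the regulator receives a positive activity score. The same scheme would apply to kinases if the statistics described phosphosites instead of transcripts.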
Once enzyme activities and molecular entity abundances have been characterised, the second step aims at finding causal links between them. This can be done with tools based on graph theory or with logic modelling tools such as PHONEMeS (Terfve et al. 2015) and CARNIVAL (Liu et al. 2019). This strategy allows us to systematically generate context-specific networks and hypotheses that explain the observed molecular changes across multiple omic layers. We then work in close collaboration with experimental biologists to validate these hypotheses. We are currently applying this strategy in the context of cancer and fibrosis.
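To make the notion of a causal link more concrete, the sketch below searches a signed, directed prior knowledge network for a path that explains why a downstream node changes in a given direction when an upstream node is active or inactive. It is a deliberately simplified graph search and not the optimisation performed by tools such as PHONEMeS or CARNIVAL; the network, node names and function name are hypothetical.

```python
from collections import deque


def consistent_path(edges, source, source_sign, target, target_sign):
    """Shortest sign-consistent path from source to target, or None.

    edges: iterable of (src, dst, sign), sign = +1 (activation) or -1 (inhibition)
    Returns a list of (node, predicted_sign) pairs.
    """
    adjacency = {}
    for src, dst, sign in edges:
        adjacency.setdefault(src, []).append((dst, sign))

    start = (source, source_sign)
    parent = {start: None}  # visited states and how they were reached
    queue = deque([start])
    while queue:
        node, sign = queue.popleft()
        if node == target and sign == target_sign:
            path, state = [], (node, sign)
            while state is not None:  # walk back to the source
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt, edge_sign in adjacency.get(node, []):
            state = (nxt, sign * edge_sign)  # propagate the sign along the edge
            if state not in parent:
                parent[state] = (node, sign)
                queue.append(state)
    return None


# Hypothetical prior knowledge network.
edges = [
    ("KINASE_X", "TF_Y", 1),
    ("KINASE_X", "PHOSPHATASE_Z", 1),
    ("PHOSPHATASE_Z", "TF_W", -1),
    ("TF_Y", "ENZYME_M", -1),
]

# KINASE_X is estimated as active (+1) and TF_W as inactive (-1).
print(consistent_path(edges, "KINASE_X", 1, "TF_W", -1))
```

Tracking the pair (node, sign) rather than the node alone lets the search reject paths that reach the target with the wrong predicted direction. Such a search handles one source and one target at a time, whereas network contextualisation methods consider all measurements jointly.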
Workflow for footprint-based multi-omic integration. On the left, statistical enrichment analysis is used to estimate the activity of kinases, transcription factors and pathways. Multiple types of omic data can then be connected with each other and with these activities by correlation/regression methods. They can also be combined with a prior knowledge network through contextualisation methods (ILP, graph theory and mapping). Finally, the outputs of the network contextualisation and correlation-based methods can be used independently or combined to generate multi-omic context-specific networks. Figure from (Dugourd and Saez-Rodriguez 2019, under CC4.0).
Research in this area was partially funded by the European H2020 MSCA Innovative Training Network SyMBioSys.