Best Practices in Flux Sampling of Constrained-Based Models

Galuzzi B. G.
2023

Abstract

Random sampling of the feasible region defined by knowledge-based and data-driven constraints is increasingly employed for the analysis of metabolic networks. The aim is to identify a set of reactions that are used to a significantly different extent between two conditions of biological interest, such as physiological and pathological conditions. A reference constraint-based model incorporating knowledge-based constraints on reaction stoichiometry and a reasonable mass balance constraint is thus differentially constrained for the two conditions according to different types of -omics data, such as transcriptomics and/or proteomics. The hypothesis that two samples randomly obtained from the two models come from the same distribution is then rejected or confirmed according to standard statistical tests. However, the impact of under-sampling on false discoveries has not been investigated so far. To this end, we evaluated the presence of false discoveries by comparing samples obtained from the very same feasible region, for which the null hypothesis must hold. We compared different sampling algorithms and sampling parameters. Our results indicate that established sampling convergence tests are not sufficient to prevent high false discovery rates. We propose some best practices to reduce the false discovery rate. We advocate the use of the CHRR algorithm, a large value of the thinning parameter, and a threshold on the fold-change between the averages of the sampled flux values.
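The experiment described in the abstract can be illustrated with a toy sketch, under several assumptions that are not taken from the paper: the feasible region is replaced by a unit box, the sampler is a plain hit-and-run (not CHRR), the statistical test is a two-sample Kolmogorov-Smirnov test with an asymptotic p-value, and the fold-change is taken as the relative difference of the means. All function names and parameter values are hypothetical. Since both samples come from the same region, every rejection is a false discovery.

```python
import math
import random


def hit_and_run_box(n_samples, dim, thinning, rng):
    """Hit-and-run on the unit box [0, 1]^dim (toy stand-in for a flux polytope).

    Only every `thinning`-th point of the chain is kept.
    """
    x = [0.5] * dim
    samples = []
    for _ in range(n_samples):
        for _ in range(thinning):
            # Random direction on the unit sphere.
            d = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            norm = math.sqrt(sum(v * v for v in d))
            d = [v / norm for v in d]
            # Range of step sizes t that keep x + t*d inside the box.
            lo, hi = -math.inf, math.inf
            for xi, di in zip(x, d):
                if di > 1e-12:
                    lo, hi = max(lo, -xi / di), min(hi, (1.0 - xi) / di)
                elif di < -1e-12:
                    lo, hi = max(lo, (1.0 - xi) / di), min(hi, -xi / di)
            t = rng.uniform(lo, hi)
            # Clamp to absorb floating-point drift at the boundary.
            x = [min(1.0, max(0.0, xi + t * di)) for xi, di in zip(x, d)]
        samples.append(list(x))
    return samples


def ks_2samp(a, b):
    """Two-sample KS statistic with an asymptotic p-value approximation."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:  # the series below is unreliable here; p is essentially 1
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


def fold_change(a, b):
    """Relative difference between the sample means (one possible definition)."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    if mb == 0.0:
        return 0.0 if ma == 0.0 else math.inf
    return abs(ma - mb) / abs(mb)


def count_false_discoveries(thinning, n_samples=200, dim=5,
                            alpha=0.05, fc_min=0.0, seed=0):
    """Compare two samples of the SAME region; each rejection is a false discovery."""
    rng = random.Random(seed)
    s1 = hit_and_run_box(n_samples, dim, thinning, rng)
    s2 = hit_and_run_box(n_samples, dim, thinning, rng)
    hits = 0
    for r in range(dim):
        a = [s[r] for s in s1]
        b = [s[r] for s in s2]
        _, p = ks_2samp(a, b)
        if p < alpha and fold_change(a, b) >= fc_min:
            hits += 1
    return hits


if __name__ == "__main__":
    for thinning in (1, 100):
        for fc_min in (0.0, 0.2):
            n = count_false_discoveries(thinning, fc_min=fc_min)
            print(f"thinning={thinning:>3} fc_min={fc_min}: {n} false discoveries")
```

Running the script with different seeds, thinning values, and fold-change thresholds gives a rough feel for how strongly autocorrelated chains can inflate rejections; the numbers from this toy box say nothing quantitative about genome-scale models.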
Istituto di Bioimmagini e Sistemi Biologici Complessi (IBSBC)
978-3-031-25890-9
Constrained-based modelling
Flux sampling
Metabolic network
File in this record: Galluzzi-2023-Lect Notes Comput Sci-VoR.pdf (Adobe PDF, 4.54 MB; not available; licence: none declared)

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/539286