CNR Institutional Research Information System

On the Use of Topological Features of Metabolic Networks for the Classification of Cancer Samples

Machicao J; Craighero F; Maspero D; Angaroni F; Damiani C; Graudenzi A; Antoniotti M; Bruno OM

2021

Abstract

Background: The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis.

Introduction: The combination of data integration and machine learning approaches can provide new powerful instruments to tackle the complexity of cancer development and deliver effective diagnostic and prognostic strategies.

Methods: We explore the possibility of exploiting the topological properties of sample-specific metabolic networks as features in a supervised classification task. Such networks are obtained by projecting transcriptomic data from RNA-seq experiments on genome-wide metabolic models to define weighted networks modeling the overall metabolic activity of a given sample.

Results: We show the classification results on a labeled breast cancer dataset from the TCGA database, including 210 samples (cancer vs. normal). In particular, we investigate how the performance is affected by a threshold-based pruning of the networks by comparing Artificial Neural Networks, Support Vector Machines and Random Forests. Interestingly, the best classification performance is achieved within a small threshold range for all methods, suggesting that it might represent an effective choice to recover useful information while filtering out noise from data. Overall, the best accuracy is achieved with SVMs, which exhibit performances similar to those obtained when gene expression profiles are used as features.

Conclusion: These findings demonstrate that the topological properties of sample-specific metabolic networks are effective in classifying cancer and normal samples, suggesting that useful information can be extracted from a relatively limited number of features.
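The threshold-based pruning and feature extraction described in the Methods and Results can be illustrated with a minimal sketch. The abstract does not state how edge weights are computed or which topological measures are used, so everything below is an assumption made for illustration: the toy network, the node names, the helper names `prune` and `topological_features`, the threshold value, and the choice of degree, strength and density as features.

```python
from statistics import mean


def prune(weights, threshold):
    """Keep only the edges whose weight is at least `threshold`.

    `weights` maps an undirected edge (u, v) to a non-negative weight.
    """
    return {edge: w for edge, w in weights.items() if w >= threshold}


def topological_features(weights, nodes):
    """Return a small, illustrative feature vector for one network."""
    degree = {n: 0 for n in nodes}
    strength = {n: 0.0 for n in nodes}
    for (u, v), w in weights.items():
        for n in (u, v):
            degree[n] += 1       # number of incident edges
            strength[n] += w     # sum of incident edge weights
    n_nodes = len(nodes)
    max_edges = n_nodes * (n_nodes - 1) / 2
    return {
        "mean_degree": mean(degree.values()),
        "max_degree": max(degree.values()),
        "mean_strength": mean(strength.values()),
        "density": len(weights) / max_edges if max_edges else 0.0,
    }


# Hypothetical sample-specific network: four nodes, four weighted edges.
nodes = ["A", "B", "C", "D"]
sample = {("A", "B"): 0.9, ("A", "C"): 0.05, ("B", "C"): 0.4, ("C", "D"): 0.01}

# Prune weak edges, then summarise the remaining topology as features.
features = topological_features(prune(sample, threshold=0.1), nodes)
print(features)
```

In a setting like the one the abstract describes, one such feature vector would be computed per sample and per threshold, and the resulting table passed to a classifier such as an SVM. Repeating the procedure over a range of thresholds is one way to look for the narrow range in which classification performance peaks.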

Year: 2021

Organizational units: Istituto di Bioimmagini e Fisiologia Molecolare - IBFM

Keywords: metabolic networks; cancer sample classification; machine learning; RNA-seq data; topological properties; network pruning

Appears in types: 01.01 Journal article

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/397742
