Clustering Analysis of Tumor Metabolic Networks
I Manipur; I Granata; L Maddalena; KP Tripathi
2018
Abstract
Biological networks represent the diverse molecular interactions that occur within cells. Commonly studied biological networks include protein-protein interaction networks, gene regulatory networks, and metabolic pathways. Among these, metabolic networks are probably the best studied, since they directly influence all physiological processes. Exploring biochemical pathways through graph representations is important for understanding complex regulatory mechanisms. Feature extraction and clustering of these networks enable grouping of samples obtained from different biological specimens. Clustering techniques group networks according to the pairwise similarity between the graphs in a set. Each clustering algorithm has benefits and limitations, and which one performs best depends on the nature of the dataset. DBSCAN, spectral clustering, and affinity propagation are among the algorithms used for clustering. We present a comparative analysis of the performance of several clustering algorithms on tissue-specific metabolic networks of samples from three primary tumor sites: breast (GEO database: GSE78958), lung (TCGA-LUSC and TCGA-LUAD), and kidney (TCGA-KIRC and TCGA-KIRP) cancer. The metabolic networks were obtained by integrating metabolic models with gene expression data. We computed several distance measures between the networks of tumor samples and assessed the ability of these algorithms to assign tumor subtypes to clusters. We also performed feature selection for distance measurement to optimize clustering. We aim to identify the algorithms that achieve optimal between-sample clustering of the metabolic networks. This study also helps us understand which biological features are most important for detecting similarity within groups and which allow separation of tumor subtypes.
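To make the distance-then-cluster idea concrete, the following is a minimal sketch under stated assumptions, not the authors' pipeline. The abstract does not say which distance measures were used, so Jaccard distance on edge sets is an illustrative choice; the five toy networks, the metabolite names, and the `eps` and `min_samples` values are hypothetical; and the small `dbscan` function is a simplified stand-in for a library implementation of DBSCAN operating on a precomputed distance matrix.

```python
from itertools import combinations


def jaccard_distance(edges_a, edges_b):
    """Return 1 - |A & B| / |A | B| for two edge sets (0 = identical, 1 = disjoint)."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


def distance_matrix(networks):
    """Build a symmetric matrix of pairwise distances between networks."""
    n = len(networks)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = jaccard_distance(networks[i], networks[j])
    return dist


def dbscan(dist, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for p in range(n):
        if labels[p] is not None:
            continue
        neighbors = [q for q in range(n) if dist[p][q] <= eps]
        if len(neighbors) < min_samples:
            labels[p] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[p] = cluster
        queue = [q for q in neighbors if q != p]
        while queue:
            q = queue.pop()
            if labels[q] == -1:
                labels[q] = cluster  # noise point reachable from a core point
                continue
            if labels[q] is not None:
                continue
            labels[q] = cluster
            q_neighbors = [r for r in range(n) if dist[q][r] <= eps]
            if len(q_neighbors) >= min_samples:
                queue.extend(q_neighbors)  # q is a core point: keep expanding
    return labels


# Hypothetical toy "metabolic networks", each a set of (substrate, product) edges.
networks = [
    {("glc", "g6p"), ("g6p", "f6p"), ("f6p", "fbp"), ("fbp", "pyr")},
    {("glc", "g6p"), ("g6p", "f6p"), ("f6p", "fbp"), ("pyr", "lac")},
    {("gln", "glu"), ("glu", "akg"), ("akg", "suc"), ("suc", "fum")},
    {("gln", "glu"), ("glu", "akg"), ("akg", "suc"), ("fum", "mal")},
    {("ser", "gly")},
]

dist = distance_matrix(networks)
labels = dbscan(dist, eps=0.5, min_samples=2)  # eps and min_samples are assumed values
print(labels)
```

In this toy setting the first two networks share most of their edges, as do the third and fourth, while the last shares none with the others, so a density-based method separates two groups and flags one sample as noise. Other algorithms named in the abstract could be compared on the same distance matrix.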