Novel Data Science Methodologies for Essential Genes Identification Based on Network Analysis

Giordano M; Maddalena L; Guarracino M. R.; Granata I
2023

Abstract

Essential genes (EGs) are fundamental for the growth and survival of a cell or an organism. Identifying EGs is an important issue in many areas of biomedical research, such as synthetic and system biology, drug development, mechanistic and therapeutic investigations. The essentiality is a context-dependent dynamic attribute of a gene that can vary in different cells, tissues, or pathological conditions, and wet-lab experimental procedures to identify EGs are costly and time-consuming. Commonly explored computational approaches are based on machine learning techniques applied to protein-protein interaction networks, but they are often unsuccessful, especially in the case of human genes. From a biological point of view, the identification of the node essentiality attributes is a challenging task. Nevertheless, from a data science perspective, suitable graph learning approaches still represent an open problem. Node classification in graph modeling/analysis is a machine learning task to predict an unknown node property based on defined node attributes. The model is trained based on both the relationship information and the node attributes. Here, we propose the use of a context-specific integrated network enriched with biological and topological attributes. To tackle the node classification task we exploit different machine and deep learning models. An extensive experimental phase demonstrates the effectiveness of both network structure and attributes associated with the nodes for EGs identification.
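
As a purely illustrative sketch of the node classification task described above, the following Python snippet labels unlabeled nodes of a toy network by combining a topological attribute (degree) with a node attribute and the mean attribute of the node's neighbors. It is not the method used in the paper: the gene identifiers, attribute values, labels, and the nearest-centroid classifier are all assumptions chosen to keep the example small and self-contained.

```python
from collections import defaultdict

# Toy undirected network; gene identifiers are hypothetical.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
         ("g2", "g3"), ("g4", "g5"), ("g5", "g6")]

# One made-up "biological" attribute per node (assumed values in [0, 1]).
bio_attr = {"g1": 0.9, "g2": 0.8, "g3": 0.7, "g4": 0.4, "g5": 0.2, "g6": 0.1}

# Known labels (1 = essential, 0 = non-essential); g3 and g5 are unlabeled.
labels = {"g1": 1, "g2": 1, "g4": 0, "g6": 0}

# Build the adjacency structure (the "relationship information").
neighbors = defaultdict(set)
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)


def features(node):
    """Return (degree, own attribute, mean attribute of neighbors)."""
    nbrs = neighbors[node]
    mean_nbr = sum(bio_attr[n] for n in nbrs) / len(nbrs)
    return (len(nbrs), bio_attr[node], mean_nbr)


def centroid(vectors):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def predict(node):
    """Nearest-centroid classification using the labeled nodes."""
    by_class = defaultdict(list)
    for labeled_node, y in labels.items():
        by_class[y].append(features(labeled_node))
    centroids = {y: centroid(vs) for y, vs in by_class.items()}
    x = features(node)
    return min(centroids, key=lambda y: squared_distance(x, centroids[y]))


for node in ("g3", "g5"):
    print(node, "->", predict(node))
```

In this sketch the features are left unscaled, so the degree dominates the distance; a realistic pipeline would normalize the attributes and would likely use a stronger model than nearest-centroid.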
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Napoli
978-3-031-24453-7
Data science
Node classification
Essential genes identification
Integrated network
Files in this record:
File: Manzo2023DataScience.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: NOT PUBLIC - Private/restricted access
Size: 468.96 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/463354