Novel Data Science Methodologies for Essential Genes Identification Based on Network Analysis
Giordano M; Maddalena L; Guarracino MR; Granata I
2023
Abstract
Essential genes (EGs) are fundamental for the growth and survival of a cell or an organism. Identifying EGs is an important issue in many areas of biomedical research, such as synthetic and systems biology, drug development, and mechanistic and therapeutic investigations. Essentiality is a context-dependent, dynamic attribute of a gene that can vary across cells, tissues, or pathological conditions, and wet-lab experimental procedures to identify EGs are costly and time-consuming. Commonly explored computational approaches are based on machine learning techniques applied to protein-protein interaction networks, but they are often unsuccessful, especially in the case of human genes. From a biological point of view, identifying node essentiality attributes is a challenging task; from a data science perspective, suitable graph learning approaches remain an open problem. Node classification in graph modeling and analysis is a machine learning task that predicts an unknown node property from defined node attributes. The model is trained on both the relationship information and the node attributes. Here, we propose the use of a context-specific integrated network enriched with biological and topological attributes. To tackle the node classification task, we exploit different machine learning and deep learning models. An extensive experimental phase demonstrates the effectiveness of both the network structure and the attributes associated with the nodes for EG identification.
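The node classification setting described in the abstract can be illustrated with a minimal sketch. This is not the authors' method or data: the toy network, the `attribute` scores, the `labels`, and all function names are hypothetical, and a nearest-centroid rule stands in for the machine learning and deep learning models the abstract mentions. The sketch only shows how a per-node biological attribute, a topological attribute (node degree), and relationship information (the mean attribute of neighboring nodes) might be combined into features for predicting an unknown node label.

```python
from collections import defaultdict

# Hypothetical toy network: nodes stand for genes, edges for interactions.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5)]
# Hypothetical biological attribute per node (e.g. a normalized score).
attribute = {0: 0.9, 1: 0.8, 2: 0.7, 3: 0.4, 4: 0.2, 5: 0.1}
# Known labels (1 = essential, 0 = non-essential); nodes 2 and 3 are unlabeled.
labels = {0: 1, 1: 1, 4: 0, 5: 0}


def build_neighbors(edges):
    """Return an undirected adjacency map: node -> set of neighbors."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    return nbrs


def build_features(edges, attribute):
    """Feature vector per node: [own attribute, degree, mean neighbor attribute]."""
    nbrs = build_neighbors(edges)
    feats = {}
    for node, value in attribute.items():
        around = nbrs[node]
        mean_nbr = sum(attribute[n] for n in around) / len(around) if around else 0.0
        feats[node] = [value, float(len(around)), mean_nbr]
    return feats


def centroids(feats, labels):
    """Average the feature vectors of the labeled nodes, per class."""
    sums, counts = {}, defaultdict(int)
    for node, y in labels.items():
        acc = sums.setdefault(y, [0.0] * len(feats[node]))
        for i, x in enumerate(feats[node]):
            acc[i] += x
        counts[y] += 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(feats, cents, node):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(cents, key=lambda y: sqdist(feats[node], cents[y]))


features = build_features(edges, attribute)
cents = centroids(features, labels)
predictions = {n: predict(features, cents, n) for n in attribute if n not in labels}
print(predictions)
```

On this toy input the script prints `{2: 1, 3: 0}`: node 2, which sits among high-scoring labeled neighbors, is assigned to the essential class, while node 3 is assigned to the non-essential class. In practice the features would be scaled and the classifier replaced by a model suited to graph data.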
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.