Graph Embedding for Biological Networks

Maurizio Giordano;Ilaria Granata;Mario Rosario Guarracino;Lucia Maddalena;
2021

Abstract

In Computational Biology, several research efforts are devoted to developing efficient techniques to process and analyse large amounts of biological data and their relations. In this context, one of the main problems is how to reduce the complexity of biological networks, represented as graphs, by projecting or transforming them into a more manageable data space. Graph Embedding (GE) techniques pursue this goal by translating large and complex graphs into a reduced vector space called the "latent space". Several GE techniques have been proposed in the literature, relying on different approaches such as graph kernels, graph traversal and random walks, and deep learning. In this talk we present a new GE method, called Netpro2vec. It exploits graph node proximity information to transform graphs into textual documents while preserving their significant structural properties. Netpro2vec relies on an NLP learning model to extract, from the document representing each graph, the meaningful features as vectors, i.e. the "embeddings". This new graph representation can be used for different machine learning tasks, such as unsupervised clustering and supervised classification of graphs. The advantage of Netpro2vec is that it provides efficient embeddings that are completely independent of the task and of the nature of the data.
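To make the graph-to-document idea concrete, here is a minimal sketch of the general pipeline, not the Netpro2vec implementation. The abstract does not say which proximity measure or NLP model is used, so everything specific below is an assumption: proximity is taken to be hop distance, each "word" (a hypothetical encoding such as `d2_n3`) records how many nodes lie at a given distance from some node, and plain bag-of-words counts stand in for a learned NLP embedding model.

```python
from collections import Counter, deque


def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def graph_to_document(adj):
    """Turn a graph into a list of 'words' describing node proximity.

    Assumed encoding (illustrative only): the word 'd2_n3' means
    "some node has exactly 3 nodes at hop distance 2".
    """
    words = []
    for node in sorted(adj):
        per_distance = Counter(
            d for d in hop_distances(adj, node).values() if d > 0
        )
        for d, count in sorted(per_distance.items()):
            words.append(f"d{d}_n{count}")
    return words


def embed(documents):
    """Stand-in for the NLP model: bag-of-words counts over a shared vocabulary."""
    vocab = sorted({word for doc in documents for word in doc})
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


# Two toy graphs as adjacency lists: a 4-node path and a 4-node star.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

documents = [graph_to_document(g) for g in (path, star)]
vocab, vectors = embed(documents)
```

Each graph ends up as one fixed-length vector over the shared vocabulary, so the two toy graphs can be compared or passed to any clustering or classification routine. In a realistic setting the counting step would be replaced by a trained document-embedding model.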
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
graph embedding
biological networks

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/429798