In this paper we propose a new approach to detect clusters in undirected graphs with attributed vertices. The aim is to group vertices which are similar not only in terms of structural connectivity but also in terms of attribute values. We incorporate structural and attribute similarities between the vertices in an augmented graph by creating additional vertices and edges as proposed in [6, 38]. The augmented graph is then embedded in a Euclidean space associated to its Laplacian where a modified K-means algorithm is applied to identify clusters. The modified K-means relies on a vector distance measure where to each original vertex we assign a suitable vector-valued set of coordinates depending on both structural connectivity and attribute similarities, so that each original graph vertex is thought as representative of m + 1 vertices of the augmented graph, if m is the number of vertex attributes. To define the coordinate vectors we employ our recently proposed algorithm based on an adaptive AMG (Algebraic MultiGrid) method, which identifies the coordinate directions in the embedding Euclidean space in terms of algebraically smooth vectors with respect to the augmented graph Laplacian, and thus extending our previous result for graphs without attributes. We analyze the effectiveness of our proposed clustering method by comparison with some well known methods, whose software implementation is freely available, and also with results reported in the literature, on two different types of widely used synthetic graphs and on some real-world attributed graphs.
Network Clustering by Embedding of Attribute-augmented Graphs
D'Ambra P;
2022
Abstract
In this paper we propose a new approach to detect clusters in undirected graphs with attributed vertices. The aim is to group vertices which are similar not only in terms of structural connectivity but also in terms of attribute values. We incorporate structural and attribute similarities between the vertices in an augmented graph by creating additional vertices and edges as proposed in [6, 38]. The augmented graph is then embedded in a Euclidean space associated to its Laplacian where a modified K-means algorithm is applied to identify clusters. The modified K-means relies on a vector distance measure where to each original vertex we assign a suitable vector-valued set of coordinates depending on both structural connectivity and attribute similarities, so that each original graph vertex is thought as representative of m + 1 vertices of the augmented graph, if m is the number of vertex attributes. To define the coordinate vectors we employ our recently proposed algorithm based on an adaptive AMG (Algebraic MultiGrid) method, which identifies the coordinate directions in the embedding Euclidean space in terms of algebraically smooth vectors with respect to the augmented graph Laplacian, and thus extending our previous result for graphs without attributes. We analyze the effectiveness of our proposed clustering method by comparison with some well known methods, whose software implementation is freely available, and also with results reported in the literature, on two different types of widely used synthetic graphs and on some real-world attributed graphs.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.