Validation of community robustness

2014

Abstract

Despite the large body of work on community detection and its applications, one important question remains unaddressed: the statistical validation of the results. We present a methodology for assessing the true significance of the communities identified by a given technique, which allows us to discard those that could be merely a consequence of edge positions in the network. Given a community detection method and a network of interest, our procedure examines the stability of the recovered partition against random perturbations of the original graph structure. To this end, we specify a perturbation strategy and a null model, and build a stringent statistical test on a measure of clustering distance, the Variation of Information. The test determines whether the obtained clustering departs significantly from the null model, which would strongly support the robustness against perturbation of the algorithm that identified the community structure. We show the results obtained with the proposed technique on simulated and real datasets.
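As an illustration of the distance measure named in the abstract, the following is a minimal sketch of the Variation of Information between two partitions of the same node set, computed as VI = H(A) + H(B) - 2 I(A; B) with natural logarithms. The function name and the list-of-labels input format are assumptions made for this example; the perturbation strategy, null model and test from the paper are not described in this record and are not reproduced here.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Variation of Information between two partitions of the same nodes.

    labels_a[i] and labels_b[i] are the community labels of node i in the
    two partitions (hypothetical input format chosen for this sketch).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same set of nodes")
    n = len(labels_a)
    if n == 0:
        return 0.0

    size_a = Counter(labels_a)               # community sizes in partition A
    size_b = Counter(labels_b)               # community sizes in partition B
    joint = Counter(zip(labels_a, labels_b))  # sizes of pairwise overlaps

    vi = 0.0
    for (a, b), n_ab in joint.items():
        # Each non-empty overlap contributes
        # -p_ab * [log(p_ab / p_a) + log(p_ab / p_b)].
        vi -= (n_ab / n) * (log(n_ab / size_a[a]) + log(n_ab / size_b[b]))
    return vi


if __name__ == "__main__":
    original = [0, 0, 1, 1]
    relabelled = [1, 1, 0, 0]   # same grouping, different label names
    crossed = [0, 1, 0, 1]      # grouping independent of the original
    print(variation_of_information(original, relabelled))
    print(variation_of_information(original, crossed))
```

A value of zero means the two partitions coincide up to relabelling; larger values indicate that a perturbed graph yields a more different community structure.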
Istituto Applicazioni del Calcolo ''Mauro Picone''
978-84-937822-4-5
C
community detection · networks · variation of information · multiple testing

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/328009