An algorithm for the microaggregation problem using column generation

Gentile C;
2022

Abstract

The field of statistical disclosure control aims to reduce the risk of re-identifying an individual from disseminated data, a major concern among national statistical agencies. Operations Research (OR) techniques have been widely used in the past for protecting tabular data, but not microdata (i.e., files of individuals and attributes). Few papers apply OR techniques to the microaggregation problem, which is considered one of the best methods for microdata protection and is known to be NP-hard. The heuristic approach proposed here is based on a column generation scheme and, unlike previous (primal) heuristics for microaggregation, it also provides a lower bound on the optimal microaggregation objective. On real datasets commonly used in the literature, our computational results show, first, that solutions with small gaps are often achieved and, second, that dramatic improvements are obtained relative to the literature's most popular heuristics.
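To make the problem concrete, the sketch below evaluates one candidate microaggregation under the standard formulation: records are partitioned into groups of at least k members, each record is replaced by its group centroid, and quality is measured by the within-group sum of squared errors (SSE). It is only an illustration of the objective, not the column generation heuristic described in the abstract. The function name `microaggregate`, its parameters, and the sample records are assumptions made up for this example.

```python
from statistics import fmean


def microaggregate(records, groups, k):
    """Replace each record by its group centroid; return (protected, sse).

    Hypothetical helper for illustration only. `records` is a list of
    equal-length numeric tuples, `groups` a list of lists of record indices.
    """
    # Every record index must appear in exactly one group.
    assigned = sorted(i for g in groups for i in g)
    if assigned != list(range(len(records))):
        raise ValueError("groups must partition the record indices")
    # Minimum group size constraint: each group holds at least k records.
    if any(len(g) < k for g in groups):
        raise ValueError(f"every group needs at least {k} records")

    protected = [None] * len(records)
    sse = 0.0
    for g in groups:
        # Centroid: attribute-wise mean over the records in the group.
        centroid = tuple(fmean(col) for col in zip(*(records[i] for i in g)))
        for i in g:
            protected[i] = centroid
            # Information loss: squared distance from record to centroid.
            sse += sum((x - c) ** 2 for x, c in zip(records[i], centroid))
    return protected, sse


# Made-up sample data: six records with two numeric attributes each.
records = [(1.0, 2.0), (2.0, 2.0), (3.0, 2.0),
           (10.0, 8.0), (11.0, 9.0), (12.0, 10.0)]
groups = [[0, 1, 2], [3, 4, 5]]
protected, sse = microaggregate(records, groups, k=3)
```

A heuristic for the problem searches over partitions to make `sse` small; a lower bound on the optimal SSE, such as the one the abstract attributes to the column generation scheme, lets one judge how far a given partition is from optimal.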
Istituto di Analisi dei Sistemi ed Informatica "Antonio Ruberti" - IASI
Integer Programming
Column Generation
Data Privacy
Clustering
Microaggregation

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/417740