
GPU Algorithms for K-Anonymity in Microdata

F. Lombardi
2019

Abstract

GPU computing, now widely and readily available in the cloud, has opened up novel opportunities for parallelizing computationally intensive tasks such as data anonymization. Effective techniques that help guarantee data anonymity are a critical enabler for data sharing, and they also help enforce compliance with regulations such as the European GDPR. In this scenario, we focus on personal data stored in microdata sets. Before releasing such microdata to the general public, statistical agencies and similar bodies have to sanitize them using a variety of Microdata Protection Techniques (MPTs) that aim to preserve data utility while guaranteeing some form of anonymity. In particular, microaggregation is an MPT that arose in the field of statistical disclosure control. We analyze the issues of microaggregation-based anonymization and propose three GPU-based parallel approaches for a well-known microaggregation technique: the Maximum Distance to Average Vector (MDAV) algorithm. The experimental results demonstrate the feasibility of our proposal and emphasize the benefits of using GPUs to speed up the execution of privacy-preserving algorithms for microdata.
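For readers unfamiliar with MDAV, the following is a minimal sequential sketch in plain Python of the algorithm as it is commonly described in the statistical disclosure control literature. It is not one of the three GPU-based approaches proposed in the paper, which this record does not detail. The function names (`centroid`, `farthest`, `take_group`, `mdav`, `microaggregate`) are hypothetical, records are assumed to be numeric tuples of equal length, and distances are assumed to be Euclidean.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(records):
    """Average vector of a non-empty list of equal-length numeric tuples."""
    n = len(records)
    return tuple(sum(column) / n for column in zip(*records))


def farthest(records, candidates, target):
    """Index (among candidates) of the record farthest from target."""
    return max(candidates, key=lambda i: dist(records[i], target))


def take_group(records, remaining, seed, k):
    """Remove seed and its k-1 nearest neighbours from remaining; return them."""
    others = sorted((i for i in remaining if i != seed),
                    key=lambda i: dist(records[i], records[seed]))
    group = [seed] + others[:k - 1]
    for i in group:
        remaining.remove(i)
    return group


def mdav(records, k):
    """Partition record indices into groups of at least k (assumes len(records) >= k)."""
    remaining = list(range(len(records)))
    groups = []
    while len(remaining) >= 3 * k:
        c = centroid([records[i] for i in remaining])
        r = farthest(records, remaining, c)
        groups.append(take_group(records, remaining, r, k))
        # Assumption: s is chosen after r's group is removed. Outside of
        # degenerate ties this is the same record, and it can never pick
        # a record that has already been grouped.
        s = farthest(records, remaining, records[r])
        groups.append(take_group(records, remaining, s, k))
    if len(remaining) >= 2 * k:
        c = centroid([records[i] for i in remaining])
        r = farthest(records, remaining, c)
        groups.append(take_group(records, remaining, r, k))
    if remaining:
        groups.append(list(remaining))  # leftover records form the last group
    return groups


def microaggregate(records, k):
    """Replace every record with the centroid of its MDAV group."""
    anonymized = [None] * len(records)
    for group in mdav(records, k):
        c = centroid([records[i] for i in group])
        for i in group:
            anonymized[i] = c
    return anonymized
```

Each iteration is dominated by distance computations between one reference point and all remaining records. Those computations are independent of one another, which is presumably what makes the algorithm a natural candidate for GPU parallelization.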
data privacy
microdata
security

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/361767