CNR Institutional Research Information System

This study introduces an explainable Artificial Intelligence (XAI) framework that couples legal-domain NLP with Structural Topic Modeling (STM) and WordNet semantic graphs to rigorously analyze over 1,900 GDPR enforcement decision summaries from a public dataset. Our methodology focuses on demonstrating the pipeline's validity respect to manual analyses by inspecting the results of four well-know research questions: (1) cross-country fine distribution disparities (automated metadata extraction); (2) the violation severity-fine amount relationship (keyness and semantic analysis); (3) structural text patterns (network analysis and STM); and (4) prevalent enforcement triggers (topic prevalence modeling) The pipeline's validity is underscored by its ability to replicate key findings from previous manual analyses while enabling a more nuanced exploration of GDPR enforcement trends. Our results confirm significant disparities in enforcement across EU member states and reveal that monetary penalties do not consistently correlate with violation severity. Specifically, serious infringements, particularly those involving video surveillance, frequently result in low-value fines, especially when committed by individuals or smaller entities. This highlights that a substantial proportion of severe violations are attributed to smaller actors. Methodologically, the framework's ability to quickly replicate such well-known patterns, alongside its transparency and reproducibility, establishes its potential as a scalable tool for transparent and explainable GDPR enforcement analytics.

A semantic approach to understanding GDPR fines: From text to compliance insights

Albina Orlando;Mario Santoro

2025

Abstract

This study introduces an explainable Artificial Intelligence (XAI) framework that couples legal-domain NLP with Structural Topic Modeling (STM) and WordNet semantic graphs to rigorously analyze over 1,900 GDPR enforcement decision summaries from a public dataset. Our methodology focuses on demonstrating the pipeline's validity respect to manual analyses by inspecting the results of four well-know research questions: (1) cross-country fine distribution disparities (automated metadata extraction); (2) the violation severity-fine amount relationship (keyness and semantic analysis); (3) structural text patterns (network analysis and STM); and (4) prevalent enforcement triggers (topic prevalence modeling) The pipeline's validity is underscored by its ability to replicate key findings from previous manual analyses while enabling a more nuanced exploration of GDPR enforcement trends. Our results confirm significant disparities in enforcement across EU member states and reveal that monetary penalties do not consistently correlate with violation severity. Specifically, serious infringements, particularly those involving video surveillance, frequently result in low-value fines, especially when committed by individuals or smaller entities. This highlights that a substantial proportion of severe violations are attributed to smaller actors. Methodologically, the framework's ability to quickly replicate such well-known patterns, alongside its transparency and reproducibility, establishes its potential as a scalable tool for transparent and explainable GDPR enforcement analytics.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno
	
				2025
			
	Strutture organizzative
	
				Istituto per le applicazioni del calcolo - IAC - Sede Secondaria Napoli
			
	Parole chiave
	
				Explainable AI
XAI
Data protection
Privacy
GDPR fines
Topic modeling
Semantic analysis
NLP
			
	Appare nelle tipologie:
	
				01.01 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
1-s2.0-S2212473X25000598-main.pdf accesso aperto Tipologia: Versione Editoriale (PDF) Licenza: Creative commons Dimensione 3.61 MB Formato Adobe PDF Visualizza/Apri	3.61 MB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/562841

Citazioni

ND

ND

1

social impact