READS: REgistrant Anomalies Detection System - An Anomaly Detection System of the .it Registry database

D Sartiano; A Del Soldato
2022

Abstract

Domain registration is an essential activity that allows end users to hold a domain name on the Internet. "Registro .it" is the Registry of .it Internet domains; with about 3.5 million registered domain names, it ranks 6th among national Registries in Europe and 10th in the world by number of domain names. Registro .it must ensure the correctness of the data stored in its database: on the one hand, European laws oblige registries to keep domain name registration data complete and accurate; on the other hand, malicious registrants commonly supply false information that is then used to launch cyber-attacks. For these reasons, we propose READS (REgistrant Anomalies Detection System), an anomaly detection system that annotates false or incoherent registrant information so that registry operators can feasibly correct it. The system is designed with scalability and modularity as key features and uses NLP (Natural Language Processing) techniques, machine learning, heuristics, and external services to annotate anomalies found in the registrant data. Registry operators can analyse the annotations, validate them, and propose them to the registrar operators for correction. The anomalies and the corrected data will be collected into a dataset for building a supervised anomaly detection system.
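
The abstract does not give implementation details, so the following is only a minimal sketch of what a modular, heuristic annotation step might look like. The record fields (`name`, `email`, `city`), the check functions, and the annotation labels are all hypothetical and are not taken from READS itself; the NLP, machine learning, and external-service components are not represented.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical registrant record: a flat mapping of field name to value.
Record = Dict[str, str]


@dataclass
class Annotation:
    field: str   # which registrant field looks anomalous
    label: str   # short, hypothetical anomaly category
    detail: str  # human-readable note for the registry operator


# Each check is an independent module: it takes a record, returns annotations.
Check = Callable[[Record], List[Annotation]]

# Deliberately loose shape test (assumption: not a full address validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_missing_fields(record: Record) -> List[Annotation]:
    """Flag required fields that are absent or blank."""
    required = ("name", "email", "city")  # hypothetical required fields
    return [
        Annotation(f, "missing", "required field is empty")
        for f in required
        if not record.get(f, "").strip()
    ]


def check_email_format(record: Record) -> List[Annotation]:
    """Flag e-mail values that do not look like user@host.tld."""
    email = record.get("email", "").strip()
    if email and not EMAIL_RE.match(email):
        return [Annotation("email", "malformed", f"unexpected shape: {email!r}")]
    return []


def check_placeholder_name(record: Record) -> List[Annotation]:
    """Flag names made of one repeated character or a common placeholder."""
    name = record.get("name", "").strip().lower()
    compact = name.replace(" ", "")
    if name and (len(set(compact)) == 1 or name in {"test", "n/a", "none"}):
        return [Annotation("name", "placeholder", "name looks like filler text")]
    return []


def annotate(record: Record, checks: List[Check]) -> List[Annotation]:
    """Run every check in order and collect all annotations."""
    annotations: List[Annotation] = []
    for check in checks:
        annotations.extend(check(record))
    return annotations


CHECKS: List[Check] = [check_missing_fields, check_email_format, check_placeholder_name]

if __name__ == "__main__":
    suspicious = {"name": "xxxx", "email": "foo@bar", "city": ""}
    for a in annotate(suspicious, CHECKS):
        print(a.field, a.label, a.detail)
```

Keeping each check as a separate function with a shared signature is one simple way to get the modularity the abstract mentions: a new heuristic, model, or external lookup could be appended to the list without touching the others.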
Istituto di informatica e telematica - IIT
Domain Names
Big Data Analysis
Machine Learning
Deep Learning
Cybersecurity
Security Services
Anomaly Detection
Natural Language Processing

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/445785