Novel benchmark for NER in the wastewater and stormwater domain
Cardillo F. A.; Debole F.; Frontini F.
2025
Abstract
Efficient wastewater and stormwater management is essential for sustainable cities. Extracting structured knowledge from reports and regulations is challenging because of domain-specific terminology and multilingual contexts. This work focuses on domain-specific Named Entity Recognition (NER) as a first step towards effective relation and information extraction to support decision making. A multilingual benchmark is crucial for evaluating these methods. This study develops a French-Italian domain-specific text corpus for wastewater management. It evaluates state-of-the-art NER methods, including LLM-based approaches, to provide a reliable baseline for future strategies, and it explores automated annotation projection with a view to extending the corpus to new languages.

| File | Size | Format |
|---|---|---|
| main.pdf (authorized users only). Description: Novel Benchmark for NER in the Wastewater and Stormwater Domain. Type: Pre-print document. License: Not public (private/restricted access). | 112.99 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


