Novel benchmark for NER in the wastewater and stormwater domain
Cardillo F. A.; Debole F.; Frontini F.
2025
Abstract
Efficient wastewater and stormwater management is mandatory for sustainable cities. Extracting structured knowledge from reports and regulations is challenging due to domain-specific terminology and multilingual contexts. This work focuses on domain-specific Named Entity Recognition (NER) as a first step towards effective relation and information extraction to support decision making. A multilingual benchmark is crucial for evaluating these methods. This study develops a French-Italian domain-specific text corpus for wastewater management. It evaluates state-of-the-art NER methods, including LLM-based approaches, to provide a reliable baseline for future strategies and explores automated annotation projection in view of an extension of the corpus to new languages.

| DC field | Value | Language |
|---|---|---|
| dc.authority.anceserie | COLLOQUIUM IN INFORMATION SCIENCE AND TECHNOLOGY | en |
| dc.authority.orgunit | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | en |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Cardillo F. A. | en |
| dc.authority.people | Debole F. | en |
| dc.authority.people | Frontini F. | en |
| dc.authority.people | Aelami M. | en |
| dc.authority.people | Chahinian N. | en |
| dc.authority.people | Conrad S. | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.appartenenza.mi | 973 | * |
| dc.contributor.area | Not assigned | * |
| dc.contributor.area | Not assigned | * |
| dc.contributor.area | Not assigned | * |
| dc.date.accessioned | 2026/01/15 12:16:26 | - |
| dc.date.available | 2026/01/15 12:16:26 | - |
| dc.date.firstsubmission | 2026/01/13 22:43:58 | * |
| dc.date.issued | 2025 | - |
| dc.date.submission | 2026/01/13 22:43:58 | * |
| dc.description.allpeople | Cardillo, F. A.; Debole, F.; Frontini, F.; Aelami, M.; Chahinian, N.; Conrad, S. | - |
| dc.description.allpeopleoriginal | Cardillo F.A.; Debole F.; Frontini F.; Aelami M.; Chahinian N.; Conrad S. | en |
| dc.description.fulltext | restricted | en |
| dc.description.numberofauthors | 6 | - |
| dc.identifier.doi | 10.1109/cist65886.2025.11224095 | en |
| dc.identifier.isbn | 979-8-3315-4384-6 | en |
| dc.identifier.scopus | 2-s2.0-105024952471 | en |
| dc.identifier.source | orcid | * |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/562981 | - |
| dc.identifier.url | https://ieeexplore.ieee.org/document/11224095 | en |
| dc.language.iso | eng | en |
| dc.publisher.country | USA | en |
| dc.publisher.name | Institute of Electrical and Electronics Engineers | en |
| dc.relation.conferencedate | 2025 | en |
| dc.relation.conferencename | Cist 2025 - 8th IEEE International Congress on Information Science and Technology | en |
| dc.relation.conferenceplace | Marrakech, Morocco | en |
| dc.relation.firstpage | 226 | en |
| dc.relation.ispartofbook | Cist 2025 proceedings | en |
| dc.relation.lastpage | 231 | en |
| dc.relation.medium | ELECTRONIC | en |
| dc.relation.numberofpages | 6 | en |
| dc.subject.keywordseng | Annotation projection | - |
| dc.subject.keywordseng | Domain-specific corpus | - |
| dc.subject.keywordseng | LLMs for NER | - |
| dc.subject.keywordseng | Multilingual NLP | - |
| dc.subject.keywordseng | Named Entity Recognition | - |
| dc.subject.singlekeyword | Annotation projection | * |
| dc.subject.singlekeyword | Domain-specific corpus | * |
| dc.subject.singlekeyword | LLMs for NER | * |
| dc.subject.singlekeyword | Multilingual NLP | * |
| dc.subject.singlekeyword | Named Entity Recognition | * |
| dc.title | Novel benchmark for NER in the wastewater and stormwater domain | en |
| dc.type.circulation | International | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| iris.mediafilter.data | 2026/01/16 02:41:38 | * |
| iris.orcid.lastModifiedDate | 2026/01/15 12:27:48 | * |
| iris.orcid.lastModifiedMillisecond | 1768476468876 | * |
| iris.scopus.extIssued | 2025 | - |
| iris.scopus.extTitle | Novel Benchmark for NER in the Wastewater and Stormwater Domain | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.doi | 10.1109/cist65886.2025.11224095 | * |
| iris.unpaywall.isoa | false | * |
| iris.unpaywall.metadataCallLastModified | 16/01/2026 03:34:06 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1768530846275 | - |
| iris.unpaywall.oastatus | closed | * |
| scopus.category | 1711 | * |
| scopus.category | 1706 | * |
| scopus.category | 1803 | * |
| scopus.category | 1802 | * |
| scopus.contributor.affiliation | Consiglio Nazionale Delle Ricerche | - |
| scopus.contributor.affiliation | Consiglio Nazionale Delle Ricerche | - |
| scopus.contributor.affiliation | Consiglio Nazionale Delle Ricerche | - |
| scopus.contributor.affiliation | Inria | - |
| scopus.contributor.affiliation | Cnrs | - |
| scopus.contributor.affiliation | Cnrs | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60085207 | - |
| scopus.contributor.afid | 60008941 | - |
| scopus.contributor.afid | 60108488 | - |
| scopus.contributor.afid | 60108488 | - |
| scopus.contributor.afid | 60108488 | - |
| scopus.contributor.auid | 57191090133 | - |
| scopus.contributor.auid | 22333451000 | - |
| scopus.contributor.auid | 55162070400 | - |
| scopus.contributor.auid | 60012687800 | - |
| scopus.contributor.auid | 8625087900 | - |
| scopus.contributor.auid | 58672437800 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | France | - |
| scopus.contributor.country | France | - |
| scopus.contributor.country | France | - |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.dptid | - | |
| scopus.contributor.name | Franco Alberto | - |
| scopus.contributor.name | Franca | - |
| scopus.contributor.name | Francesca | - |
| scopus.contributor.name | Mitra | - |
| scopus.contributor.name | Nanee | - |
| scopus.contributor.name | Serge | - |
| scopus.contributor.subaffiliation | Ist. di Linguistica Computazionale; | - |
| scopus.contributor.subaffiliation | Ist. di Scienza e Tecnologie Dell'Informazione; | - |
| scopus.contributor.subaffiliation | Ist. di Linguistica Computazionale; | - |
| scopus.contributor.subaffiliation | Hsm Univ. Montpellier;Cnrs;Ird; | - |
| scopus.contributor.subaffiliation | Hsm Univ Montpellier;Ird; | - |
| scopus.contributor.subaffiliation | Hsm Univ Montpellier;Ird; | - |
| scopus.contributor.surname | Cardillo | - |
| scopus.contributor.surname | Debole | - |
| scopus.contributor.surname | Frontini | - |
| scopus.contributor.surname | Aelami | - |
| scopus.contributor.surname | Chahinian | - |
| scopus.contributor.surname | Conrad | - |
| scopus.date.issued | 2025 | * |
| scopus.description.allpeopleoriginal | Cardillo F.A.; Debole F.; Frontini F.; Aelami M.; Chahinian N.; Conrad S. | * |
| scopus.differences | scopus.publisher.name | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.relation.conferencename | * |
| scopus.differences | scopus.identifier.isbn | * |
| scopus.differences | scopus.relation.conferenceplace | * |
| scopus.document.type | cp | * |
| scopus.document.types | cp | * |
| scopus.funding.funders | 501100014596 - Istituto di Scienza e Tecnologie dell'Informazione; 501100000780 - European Commission; 501100001665 - Agence Nationale de la Recherche; | * |
| scopus.funding.ids | GA 101086252; GA ANR-21-CE23-0004; | * |
| scopus.identifier.doi | 10.1109/CiSt65886.2025.11224095 | * |
| scopus.identifier.eissn | 2327-1884 | * |
| scopus.identifier.isbn | 9798331543846 | * |
| scopus.identifier.pui | 649556157 | * |
| scopus.identifier.scopus | 2-s2.0-105024952471 | * |
| scopus.journal.sourceid | 21100400809 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | Institute of Electrical and Electronics Engineers Inc. | * |
| scopus.relation.conferencedate | 2025 | * |
| scopus.relation.conferencename | 8th IEEE International Congress on Information Science and Technology, CiSt 2025 | * |
| scopus.relation.conferenceplace | mar | * |
| scopus.relation.firstpage | 226 | * |
| scopus.relation.lastpage | 231 | * |
| scopus.subject.keywords | Annotation projection; Domain-specific corpus; LLMs for NER; Multilingual NLP; Named Entity Recognition; | * |
| scopus.title | Novel Benchmark for NER in the Wastewater and Stormwater Domain | * |
| scopus.titleeng | Novel Benchmark for NER in the Wastewater and Stormwater Domain | * |

Appears in collections: 04.01 Contribution in conference proceedings

| File | Size | Format |
|---|---|---|
| main.pdf (authorized users only). Description: Novel Benchmark for NER in the Wastewater and Stormwater Domain. Type: Pre-print document. License: NOT PUBLIC - Private/restricted access | 112.99 kB | Adobe PDF |


