<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="static/CINECAstyle.xsl"?><OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd"><responseDate>2026-06-17T13:41:32Z</responseDate><request verb="GetRecord" identifier="oai:iris.cnr.it:20.500.14243/570763" metadataPrefix="oai_dc">https://iris.cnr.it/oai/request</request><GetRecord><record><header><identifier>oai:iris.cnr.it:20.500.14243/570763</identifier><datestamp>2026-04-19T07:03:56Z</datestamp><setSpec>com_20.500.14243_46</setSpec><setSpec>com_20.500.14243_21</setSpec><setSpec>col_20.500.14243_47</setSpec><setSpec>ou_ou239</setSpec></header><metadata><oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:doc="http://www.lyncode.com/xoai" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title>A Novel Real-World Dataset of Italian Clinical Notes for NLP-based Decision Support in Low Back Pain Treatment</dc:title>
<dc:creator>Bonfigli, Agnese</dc:creator>
<dc:creator>Piperno, Ruben</dc:creator>
<dc:creator>Bacco, Luca</dc:creator>
<dc:creator>Dell'Orletta, Felice</dc:creator>
<dc:creator>Brunato, Dominique</dc:creator>
<dc:creator>Crispino, Filippo</dc:creator>
<dc:creator>Papalia, Giuseppe Francesco</dc:creator>
<dc:creator>Russo, Fabrizio</dc:creator>
<dc:creator>Vadalà, Gianluca</dc:creator>
<dc:creator>Papalia, Rocco</dc:creator>
<dc:creator>Merone, Mario</dc:creator>
<dc:creator>Pecchia, Leandro</dc:creator>
<dc:contributor>Bonfigli, Agnese</dc:contributor>
<dc:contributor>Piperno, Ruben</dc:contributor>
<dc:contributor>Bacco, Luca</dc:contributor>
<dc:contributor>Dell'Orletta, Felice</dc:contributor>
<dc:contributor>Brunato, Dominique</dc:contributor>
<dc:contributor>Crispino, Filippo</dc:contributor>
<dc:contributor>Papalia, Giuseppe Francesco</dc:contributor>
<dc:contributor>Russo, Fabrizio</dc:contributor>
<dc:contributor>Vadalà, Gianluca</dc:contributor>
<dc:contributor>Papalia, Rocco</dc:contributor>
<dc:contributor>Merone, Mario</dc:contributor>
<dc:contributor>Pecchia, Leandro</dc:contributor>
<dc:subject>NLP in healthcare</dc:subject>
<dc:subject>Large Language Models (LLMs)</dc:subject>
<dc:subject>Italian Medical Corpus</dc:subject>
<dc:description>Low back pain represents a leading source of disability worldwide and poses a significant challenge for evidence-based clinical decision support. In contexts where Italian-language resources for diversified therapeutic pathways are lacking, we have assembled a novel, annotated dataset comprising up to three pre-treatment documents per patient (MRI report, X-ray report, and patient visit notes), alongside demographic information (age and sex). The cohort consists of 176 patient records, stratified into three therapeutic groups: 50 conservative, 92 regenerative, and 34 surgical. The primary aim is to investigate whether the collected dataset can be harnessed to predict which of the three treatment modalities is most appropriate. To this end, six document-combination scenarios were defined, evaluating each single-report modality as well as all possible pairings. For each scenario, two modeling strategies were contrasted: a traditional Support Vector Machine classifier leveraging TF-IDF features based on unigrams, bigrams, and trigrams, and a fine-tuned Italian BERT model adapted to our corpus. Experimental results indicate that classic n-gram-based approaches achieve the highest performance (macro-F1 up to 71.3%). The BERT model, while outperforming the baseline, encounters limitations in this low-resource scenario. These findings suggest that the present dataset has the potential to catalyze the development of Italian-language clinical decision support systems that account for the distinct signatures of treatment pathways.</dc:description>
<dc:date>2025</dc:date>
<dc:type>info:eu-repo/semantics/conferenceObject</dc:type>
<dc:identifier>https://hdl.handle.net/20.500.14243/570763</dc:identifier>
<dc:language>eng</dc:language>
<dc:relation>ispartofbook:Proceedings of the Eleventh Italian Conference on Computational Linguistics (CLiC-it 2025)</dc:relation>
<dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
<dc:rights>license:Creative commons</dc:rights>
<dc:rights>license uri:http://creativecommons.org/licenses/by-nc-nd/4.0/</dc:rights>
</oai_dc:dc></metadata></record></GetRecord></OAI-PMH>