
Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks

Alessio Miaschi; Chiara Alzetta; Dominique Brunato; Felice Dell'Orletta; Giulia Venturi
2023

Abstract

The outstanding performance recently reached by neural language models (NLMs) across many natural language processing (NLP) tasks has steered the debate towards understanding whether NLMs implicitly learn linguistic competence. Probes, i.e., supervised models trained on NLM representations to predict linguistic properties, are frequently adopted to investigate this issue. However, it is still debated whether probing classification tasks really enable such investigation or whether they simply pick up on surface patterns in the data. This work contributes to this debate by presenting an approach to assessing the effectiveness of a suite of probing tasks aimed at testing the linguistic knowledge implicitly encoded by one of the most prominent NLMs, BERT. To this end, we compared the performance of probes when predicting gold and automatically altered values of a set of linguistic features. Our experiments were performed on Italian and were evaluated across BERT's layers and for sentences of different lengths. As a general result, we observed higher performance in the prediction of gold values, suggesting that the probing model is sensitive to the distortion of feature values. However, our experiments also showed that sentence length is a highly influential factor that can confound the probing model's predictions.
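As a rough illustration of the gold-vs-altered comparison described in the abstract, the sketch below trains a linear probe on synthetic vectors standing in for frozen NLM representations. All names, dimensions, noise levels, and the permutation-based alteration scheme are illustrative assumptions, not the paper's actual setup (which probes BERT layers on Italian treebank features).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen NLM representations: each "sentence" is a
# 64-dim vector, and a "linguistic feature" (e.g. parse-tree depth) is a
# noisy linear function of it. A real probe would use BERT layer outputs.
n, d = 500, 64
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
gold = X @ w_true + rng.normal(scale=0.1, size=n)

# Control task: feature values randomly altered (here, permuted across
# sentences), so any apparent "knowledge" cannot come from the inputs.
altered = rng.permutation(gold)

def probe_r2(X, y, train=400):
    """Fit a linear probe by least squares; return held-out R^2."""
    w, *_ = np.linalg.lstsq(X[:train], y[:train], rcond=None)
    pred = X[train:] @ w
    ss_res = np.sum((y[train:] - pred) ** 2)
    ss_tot = np.sum((y[train:] - y[train:].mean()) ** 2)
    return 1 - ss_res / ss_tot

r2_gold = probe_r2(X, gold)
r2_altered = probe_r2(X, altered)
# A probe that tracks encoded information fits gold values far better
# than altered ones; comparable scores would suggest surface patterns.
print(f"gold R2={r2_gold:.3f}  altered R2={r2_altered:.3f}")
```

The design choice mirrors the paper's logic: if the probe's accuracy drops sharply on distorted feature values, its performance on gold values plausibly reflects information in the representations rather than the probe's own capacity.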
DC Field | Value | Language
dc.authority.ancejournal INFORMATION en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Alessio Miaschi en
dc.authority.people Chiara Alzetta en
dc.authority.people Dominique Brunato en
dc.authority.people Felice Dell'Orletta en
dc.authority.people Giulia Venturi en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Articolo in rivista *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.date.accessioned 2024/02/20 06:08:02 -
dc.date.available 2024/02/20 06:08:02 -
dc.date.firstsubmission 2025/03/03 15:10:56 *
dc.date.issued 2023 -
dc.date.submission 2025/03/04 11:39:17 *
dc.description.affiliations Istituto di Linguistica Computazionale Antonio Zampolli Consiglio Nazionale Delle Ricerche, Pisa, Italy -
dc.description.allpeople Miaschi, Alessio; Alzetta, Chiara; Brunato, Dominique; Dell'Orletta, Felice; Venturi, Giulia -
dc.description.allpeopleoriginal Alessio Miaschi, Chiara Alzetta, Dominique Brunato, Felice Dell'Orletta, Giulia Venturi en
dc.description.fulltext open en
dc.description.numberofauthors 5 -
dc.identifier.doi 10.3390/info14030144 en
dc.identifier.isi WOS:000958210300001 en
dc.identifier.scopus 2-s2.0-85151097172 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/439018 -
dc.identifier.url https://www.mdpi.com/2078-2489/14/3/144 en
dc.language.iso eng en
dc.miur.last.status.update 2025-03-03T14:11:39Z *
dc.relation.issue 3 en
dc.relation.numberofpages 19 en
dc.relation.volume 14 en
dc.subject.keywords Neural language model -
dc.subject.keywords Probing tasks -
dc.subject.keywords Treebanks -
dc.subject.singlekeyword Neural language model *
dc.subject.singlekeyword Probing tasks *
dc.subject.singlekeyword Treebanks *
dc.title Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks en
dc.type.circulation Internazionale en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Contributo su Rivista::01.01 Articolo in rivista it
dc.type.miur 262 -
dc.type.referee Esperti anonimi en
dc.ugov.descaux1 488203 -
iris.isi.metadataErrorDescription 0 -
iris.isi.metadataErrorType ERROR_NO_MATCH -
iris.isi.metadataStatus ERROR -
iris.mediafilter.data 2025/03/23 03:17:04 *
iris.orcid.lastModifiedDate 2025/03/05 11:09:58 *
iris.orcid.lastModifiedMillisecond 1741169398342 *
iris.scopus.extIssued 2023 -
iris.scopus.extTitle Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.3390/info14030144 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj true *
iris.unpaywall.landingpage https://doi.org/10.3390/info14030144 *
iris.unpaywall.license cc-by *
iris.unpaywall.metadataCallLastModified 02/05/2025 05:06:40 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1746155200658 -
iris.unpaywall.oastatus gold *
iris.unpaywall.pdfurl https://www.mdpi.com/2078-2489/14/3/144/pdf?version=1677054922 *
scopus.authority.ancejournal INFORMATION###2078-2489 *
scopus.category 1710 *
scopus.contributor.affiliation ItaliaNLPLab -
scopus.contributor.affiliation ItaliaNLPLab -
scopus.contributor.affiliation ItaliaNLPLab -
scopus.contributor.affiliation ItaliaNLPLab -
scopus.contributor.affiliation ItaliaNLPLab -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.afid 60021199 -
scopus.contributor.auid 57211678681 -
scopus.contributor.auid 57192938832 -
scopus.contributor.auid 55237740200 -
scopus.contributor.auid 57540567000 -
scopus.contributor.auid 27568199800 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 121833164 -
scopus.contributor.dptid 121833164 -
scopus.contributor.name Alessio -
scopus.contributor.name Chiara -
scopus.contributor.name Dominique -
scopus.contributor.name Felice -
scopus.contributor.name Giulia -
scopus.contributor.subaffiliation CNR—Institute for Computational Linguistics “A. Zampolli”; -
scopus.contributor.subaffiliation CNR—Institute for Computational Linguistics “A. Zampolli”; -
scopus.contributor.subaffiliation CNR—Institute for Computational Linguistics “A. Zampolli”; -
scopus.contributor.subaffiliation CNR—Institute for Computational Linguistics “A. Zampolli”; -
scopus.contributor.subaffiliation CNR—Institute for Computational Linguistics “A. Zampolli”; -
scopus.contributor.surname Miaschi -
scopus.contributor.surname Alzetta -
scopus.contributor.surname Brunato -
scopus.contributor.surname Dell’Orletta -
scopus.contributor.surname Venturi -
scopus.date.issued 2023 *
scopus.description.allpeopleoriginal Miaschi A.; Alzetta C.; Brunato D.; Dell'Orletta F.; Venturi G. *
scopus.differences scopus.subject.keywords *
scopus.differences scopus.description.allpeopleoriginal *
scopus.differences scopus.description.abstracteng *
scopus.document.type ar *
scopus.document.types ar *
scopus.identifier.doi 10.3390/info14030144 *
scopus.identifier.eissn 2078-2489 *
scopus.identifier.pui 2022279559 *
scopus.identifier.scopus 2-s2.0-85151097172 *
scopus.journal.sourceid 21100223111 *
scopus.language.iso eng *
scopus.publisher.name MDPI *
scopus.relation.article 144 *
scopus.relation.issue 3 *
scopus.relation.volume 14 *
scopus.subject.keywords BERT; Italian language; neural language models; probing tasks; treebanks; *
scopus.title Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks *
scopus.titleeng Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks *
Appears in collections: 01.01 Articolo in rivista
Files in this record:

File: information-14-00144.pdf
Access: open access
License: Creative Commons
Size: 1.04 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/439018
Citations
  • PMC: n/a
  • Scopus: 2
  • Web of Science: 2