Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks
Alessio Miaschi;Chiara Alzetta;Dominique Brunato;Felice Dell'Orletta;Giulia Venturi
2023
Abstract
The outstanding performance recently reached by neural language models (NLMs) across many natural language processing (NLP) tasks has steered the debate towards understanding whether NLMs implicitly learn linguistic competence. Probes, i.e., supervised models trained using NLM representations to predict linguistic properties, are frequently adopted to investigate this issue. However, it is still questioned if probing classification tasks really enable such investigation or if they simply hint at surface patterns in the data. This work contributes to this debate by presenting an approach to assessing the effectiveness of a suite of probing tasks aimed at testing the linguistic knowledge implicitly encoded by one of the most prominent NLMs, BERT. To this aim, we compared the performance of probes when predicting gold and automatically altered values of a set of linguistic features. Our experiments were performed on Italian and were evaluated across BERT's layers and for sentences with different lengths. As a general result, we observed higher performance in the prediction of gold values, thus suggesting that the probing model is sensitive to the distortion of feature values. However, our experiments also showed that the length of a sentence is a highly influential factor that is able to confound the probing model's predictions.

| DC Field | Value | Language |
|---|---|---|
| dc.authority.ancejournal | INFORMATION | en |
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Alessio Miaschi | en |
| dc.authority.people | Chiara Alzetta | en |
| dc.authority.people | Dominique Brunato | en |
| dc.authority.people | Felice Dell'Orletta | en |
| dc.authority.people | Giulia Venturi | en |
| dc.collection.id.s | b3f88f24-048a-4e43-8ab1-6697b90e068e | * |
| dc.collection.name | 01.01 Articolo in rivista | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.contributor.area | Non assegn | * |
| dc.date.accessioned | 2024/02/20 06:08:02 | - |
| dc.date.available | 2024/02/20 06:08:02 | - |
| dc.date.firstsubmission | 2025/03/03 15:10:56 | * |
| dc.date.issued | 2023 | - |
| dc.date.submission | 2025/03/04 11:39:17 | * |
| dc.description.abstracteng | The outstanding performance recently reached by neural language models (NLMs) across many natural language processing (NLP) tasks has steered the debate towards understanding whether NLMs implicitly learn linguistic competence. Probes, i.e., supervised models trained using NLM representations to predict linguistic properties, are frequently adopted to investigate this issue. However, it is still questioned if probing classification tasks really enable such investigation or if they simply hint at surface patterns in the data. This work contributes to this debate by presenting an approach to assessing the effectiveness of a suite of probing tasks aimed at testing the linguistic knowledge implicitly encoded by one of the most prominent NLMs, BERT. To this aim, we compared the performance of probes when predicting gold and automatically altered values of a set of linguistic features. Our experiments were performed on Italian and were evaluated across BERT's layers and for sentences with different lengths. As a general result, we observed higher performance in the prediction of gold values, thus suggesting that the probing model is sensitive to the distortion of feature values. However, our experiments also showed that the length of a sentence is a highly influential factor that is able to confound the probing model's predictions. | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale Antonio Zampolli Consiglio Nazionale Delle Ricerche, Pisa, Italy | - |
| dc.description.allpeople | Miaschi, Alessio; Alzetta, Chiara; Brunato, Dominique; Dell'Orletta, Felice; Venturi, Giulia | - |
| dc.description.allpeopleoriginal | Alessio Miaschi, Chiara Alzetta, Dominique Brunato, Felice Dell'Orletta, Giulia Venturi | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.doi | 10.3390/info14030144 | en |
| dc.identifier.isi | WOS:000958210300001 | en |
| dc.identifier.scopus | 2-s2.0-85151097172 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/439018 | - |
| dc.identifier.url | https://www.mdpi.com/2078-2489/14/3/144 | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2025-03-03T14:11:39Z | * |
| dc.relation.issue | 3 | en |
| dc.relation.numberofpages | 19 | en |
| dc.relation.volume | 14 | en |
| dc.subject.keywords | Neural language model | - |
| dc.subject.keywords | Probing tasks | - |
| dc.subject.keywords | Treebanks | - |
| dc.subject.singlekeyword | Neural language model | * |
| dc.subject.singlekeyword | Probing tasks | * |
| dc.subject.singlekeyword | Treebanks | * |
| dc.title | Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks | en |
| dc.type.circulation | Internazionale | en |
| dc.type.driver | info:eu-repo/semantics/article | - |
| dc.type.full | 01 Contributo su Rivista::01.01 Articolo in rivista | it |
| dc.type.miur | 262 | - |
| dc.type.referee | Esperti anonimi | en |
| dc.ugov.descaux1 | 488203 | - |
| iris.isi.metadataErrorDescription | 0 | - |
| iris.isi.metadataErrorType | ERROR_NO_MATCH | - |
| iris.isi.metadataStatus | ERROR | - |
| iris.mediafilter.data | 2025/03/23 03:17:04 | * |
| iris.orcid.lastModifiedDate | 2025/03/05 11:09:58 | * |
| iris.orcid.lastModifiedMillisecond | 1741169398342 | * |
| iris.scopus.extIssued | 2023 | - |
| iris.scopus.extTitle | Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks | - |
| iris.sitodocente.maxattempts | 1 | - |
| iris.unpaywall.bestoahost | publisher | * |
| iris.unpaywall.bestoaversion | publishedVersion | * |
| iris.unpaywall.doi | 10.3390/info14030144 | * |
| iris.unpaywall.hosttype | publisher | * |
| iris.unpaywall.isoa | true | * |
| iris.unpaywall.journalisindoaj | true | * |
| iris.unpaywall.landingpage | https://doi.org/10.3390/info14030144 | * |
| iris.unpaywall.license | cc-by | * |
| iris.unpaywall.metadataCallLastModified | 02/05/2025 05:06:40 | - |
| iris.unpaywall.metadataCallLastModifiedMillisecond | 1746155200658 | - |
| iris.unpaywall.oastatus | gold | * |
| iris.unpaywall.pdfurl | https://www.mdpi.com/2078-2489/14/3/144/pdf?version=1677054922 | * |
| scopus.authority.ancejournal | INFORMATION###2078-2489 | * |
| scopus.category | 1710 | * |
| scopus.contributor.affiliation | ItaliaNLPLab | - |
| scopus.contributor.affiliation | ItaliaNLPLab | - |
| scopus.contributor.affiliation | ItaliaNLPLab | - |
| scopus.contributor.affiliation | ItaliaNLPLab | - |
| scopus.contributor.affiliation | ItaliaNLPLab | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.afid | 60021199 | - |
| scopus.contributor.auid | 57211678681 | - |
| scopus.contributor.auid | 57192938832 | - |
| scopus.contributor.auid | 55237740200 | - |
| scopus.contributor.auid | 57540567000 | - |
| scopus.contributor.auid | 27568199800 | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.country | Italy | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.dptid | 121833164 | - |
| scopus.contributor.name | Alessio | - |
| scopus.contributor.name | Chiara | - |
| scopus.contributor.name | Dominique | - |
| scopus.contributor.name | Felice | - |
| scopus.contributor.name | Giulia | - |
| scopus.contributor.subaffiliation | CNR—Institute for Computational Linguistics “A. Zampolli”; | - |
| scopus.contributor.subaffiliation | CNR—Institute for Computational Linguistics “A. Zampolli”; | - |
| scopus.contributor.subaffiliation | CNR—Institute for Computational Linguistics “A. Zampolli”; | - |
| scopus.contributor.subaffiliation | CNR—Institute for Computational Linguistics “A. Zampolli”; | - |
| scopus.contributor.subaffiliation | CNR—Institute for Computational Linguistics “A. Zampolli”; | - |
| scopus.contributor.surname | Miaschi | - |
| scopus.contributor.surname | Alzetta | - |
| scopus.contributor.surname | Brunato | - |
| scopus.contributor.surname | Dell’Orletta | - |
| scopus.contributor.surname | Venturi | - |
| scopus.date.issued | 2023 | * |
| scopus.description.abstracteng | The outstanding performance recently reached by neural language models (NLMs) across many natural language processing (NLP) tasks has steered the debate towards understanding whether NLMs implicitly learn linguistic competence. Probes, i.e., supervised models trained using NLM representations to predict linguistic properties, are frequently adopted to investigate this issue. However, it is still questioned if probing classification tasks really enable such investigation or if they simply hint at surface patterns in the data. This work contributes to this debate by presenting an approach to assessing the effectiveness of a suite of probing tasks aimed at testing the linguistic knowledge implicitly encoded by one of the most prominent NLMs, BERT. To this aim, we compared the performance of probes when predicting gold and automatically altered values of a set of linguistic features. Our experiments were performed on Italian and were evaluated across BERT’s layers and for sentences with different lengths. As a general result, we observed higher performance in the prediction of gold values, thus suggesting that the probing model is sensitive to the distortion of feature values. However, our experiments also showed that the length of a sentence is a highly influential factor that is able to confound the probing model’s predictions. | * |
| scopus.description.allpeopleoriginal | Miaschi A.; Alzetta C.; Brunato D.; Dell'Orletta F.; Venturi G. | * |
| scopus.differences | scopus.subject.keywords | * |
| scopus.differences | scopus.description.allpeopleoriginal | * |
| scopus.differences | scopus.description.abstracteng | * |
| scopus.document.type | ar | * |
| scopus.document.types | ar | * |
| scopus.identifier.doi | 10.3390/info14030144 | * |
| scopus.identifier.eissn | 2078-2489 | * |
| scopus.identifier.pui | 2022279559 | * |
| scopus.identifier.scopus | 2-s2.0-85151097172 | * |
| scopus.journal.sourceid | 21100223111 | * |
| scopus.language.iso | eng | * |
| scopus.publisher.name | MDPI | * |
| scopus.relation.article | 144 | * |
| scopus.relation.issue | 3 | * |
| scopus.relation.volume | 14 | * |
| scopus.subject.keywords | BERT; Italian language; neural language models; probing tasks; treebanks; | * |
| scopus.title | Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks | * |
| scopus.titleeng | Testing the Effectiveness of the Diagnostic Probing Paradigm on Italian Treebanks | * |
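The gold-versus-altered comparison described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the authors' code: synthetic random vectors with a planted linear signal stand in for BERT layer activations, a least-squares linear model stands in for the probe, and "automatically altered" values are simulated by shuffling the gold labels.

```python
# Hypothetical sketch of diagnostic probing: a linear probe should recover
# gold feature values from representations, but not randomly altered ones.
import numpy as np

rng = np.random.default_rng(0)
n, d = 400, 32
X = rng.normal(size=(n, d))                # stand-in for sentence representations
w = rng.normal(size=d)                     # planted linear signal
gold = X @ w + 0.1 * rng.normal(size=n)    # gold feature values (e.g., a syntactic property)
altered = rng.permutation(gold)            # "altered" values: labels shuffled across sentences

def probe_r2(X, y, split=300):
    """Fit a linear probe by least squares on a train split, score held-out R^2."""
    Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]
    coef, *_ = np.linalg.lstsq(Xtr, ytr, rcond=None)
    pred = Xte @ coef
    ss_res = np.sum((yte - pred) ** 2)
    ss_tot = np.sum((yte - yte.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

r2_gold = probe_r2(X, gold)
r2_altered = probe_r2(X, altered)
print(r2_gold, r2_altered)  # gold scores high; altered scores near zero
```

If the probe were merely memorizing surface patterns, it would score similarly on both targets; the gap between the two scores is what the paper's comparison is designed to measure, with the caveat (per the abstract) that confounds such as sentence length can inflate probe performance.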
Appears in collections: 01.01 Articolo in rivista (journal article)
| File | Size | Format | |
|---|---|---|---|
| information-14-00144.pdf (open access, license: Creative Commons) | 1.04 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.