Linguistically-driven Selection of Difficult-to-Parse Dependency Structures

Chiara Alzetta;Felice Dell'Orletta;Simonetta Montemagni;Giulia Venturi
2020

Abstract

This paper presents a novel methodology with a twofold goal: quantifying the reliability of automatically generated dependency relations without using gold data, and identifying the linguistic constructions that negatively affect parser performance. These objectives are typically investigated in different lines of research, with different methods and techniques. Our methodology, at the crossroads of these perspectives, makes it possible not only to quantify the parsing reliability of individual dependency types but also to identify and weight the contextual properties that make relation instances more or less difficult to parse. The methodology was tested in two different and complementary experiments, aimed at assessing the degree of parsing difficulty across (a) different dependency relation types and (b) different instances of the same relation. The results show that it can identify difficult-to-parse dependency relations without relying on gold data, while taking into account a variety of intertwined linguistic factors. These findings pave the way for novel applications of the methodology, both in defining new evaluation metrics based purely on automatically parsed data and in the automatic creation of challenge sets.
DC field Value Language
dc.authority.ancejournal IJCOL en
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Chiara Alzetta en
dc.authority.people Felice Dell'Orletta en
dc.authority.people Simonetta Montemagni en
dc.authority.people Giulia Venturi en
dc.collection.id.s b3f88f24-048a-4e43-8ab1-6697b90e068e *
dc.collection.name 01.01 Journal article *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.contributor.area Non assegn *
dc.date.accessioned 2024/02/19 11:53:33 -
dc.date.available 2024/02/19 11:53:33 -
dc.date.firstsubmission 2025/03/06 14:47:37 *
dc.date.issued 2020 -
dc.date.submission 2025/03/06 14:47:37 *
dc.description.affiliations Istituto di Linguistica Computazionale (ILC-CNR) -
dc.description.allpeople Alzetta, Chiara; Dell'Orletta, Felice; Montemagni, Simonetta; Venturi, Giulia -
dc.description.allpeopleoriginal Chiara Alzetta, Felice Dell'Orletta, Simonetta Montemagni, Giulia Venturi en
dc.description.fulltext open en
dc.description.international no en
dc.description.numberofauthors 4 -
dc.identifier.doi 10.4000/ijcol.719 en
dc.identifier.scopus 2-s2.0-85121271583 en
dc.identifier.uri https://hdl.handle.net/20.500.14243/446043 -
dc.identifier.url https://journals.openedition.org/ijcol/719 en
dc.language.iso eng en
dc.miur.last.status.update 2026-03-02T16:48:16Z *
dc.relation.firstpage 37 en
dc.relation.issue 2 en
dc.relation.lastpage 60 en
dc.relation.medium ELECTRONIC en
dc.relation.numberofpages 23 en
dc.relation.volume 6 en
dc.subject.keywords Linguistic Complexity -
dc.subject.keywords Syntactic Parsing -
dc.subject.keywords Evaluation metrics -
dc.subject.singlekeyword Linguistic Complexity *
dc.subject.singlekeyword Syntactic Parsing *
dc.subject.singlekeyword Evaluation metrics *
dc.title Linguistically-driven Selection of Difficult-to-Parse Dependency Structures en
dc.type.circulation International en
dc.type.driver info:eu-repo/semantics/article -
dc.type.full 01 Contributo su Rivista::01.01 Articolo in rivista it
dc.type.impactfactor yes en
dc.type.miur 262 -
dc.type.referee Anonymous experts en
dc.ugov.descaux1 463828 -
dc.ugov.descaux2 IJCoL is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License -
iris.mediafilter.data 2026/03/04 02:52:25 *
iris.orcid.lastModifiedDate 2026/03/03 18:01:16 *
iris.orcid.lastModifiedMillisecond 1772557276450 *
iris.scopus.extIssued 2020 -
iris.scopus.extTitle Linguistically-driven Selection of Difficult-to-Parse Dependency Structures -
iris.sitodocente.maxattempts 1 -
iris.unpaywall.bestoahost publisher *
iris.unpaywall.bestoaversion publishedVersion *
iris.unpaywall.doi 10.4000/ijcol.719 *
iris.unpaywall.hosttype publisher *
iris.unpaywall.isoa true *
iris.unpaywall.journalisindoaj true *
iris.unpaywall.landingpage https://doi.org/10.4000/ijcol.719 *
iris.unpaywall.license cc-by-nc-nd *
iris.unpaywall.metadataCallLastModified 04/03/2026 04:32:17 -
iris.unpaywall.metadataCallLastModifiedMillisecond 1772595137189 -
iris.unpaywall.oastatus gold *
iris.unpaywall.pdfurl http://journals.openedition.org/ijcol/pdf/719 *
scopus.authority.ancejournal IJCOL###2499-4553 *
scopus.category 1709 *
scopus.category 3310 *
scopus.category 1703 *
scopus.category 1702 *
scopus.contributor.affiliation ItaliaNLP Lab -
scopus.contributor.affiliation ItaliaNLP Lab -
scopus.contributor.affiliation ItaliaNLP Lab -
scopus.contributor.affiliation ItaliaNLP Lab -
scopus.contributor.afid 60025153 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.afid 60008941 -
scopus.contributor.auid 57192938832 -
scopus.contributor.auid 15056781100 -
scopus.contributor.auid 57540567000 -
scopus.contributor.auid 27568199800 -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.country Italy -
scopus.contributor.dptid 126298383 -
scopus.contributor.dptid 114087935 -
scopus.contributor.dptid 114087935 -
scopus.contributor.dptid 114087935 -
scopus.contributor.name Chiara -
scopus.contributor.name Simonetta -
scopus.contributor.name Felice -
scopus.contributor.name Giulia -
scopus.contributor.subaffiliation DIBRIS;Università degli Studi di Genova;Istituto di Linguistica Computazionale “Antonio Zampolli”;CNR; -
scopus.contributor.subaffiliation Istituto di Linguistica Computazionale “Antonio Zampolli”;CNR; -
scopus.contributor.subaffiliation Istituto di Linguistica Computazionale “Antonio Zampolli”;CNR; -
scopus.contributor.subaffiliation Istituto di Linguistica Computazionale “Antonio Zampolli”;CNR; -
scopus.contributor.surname Alzetta -
scopus.contributor.surname Montemagni -
scopus.contributor.surname Dell’orletta -
scopus.contributor.surname Venturi -
scopus.date.issued 2020 *
scopus.description.allpeopleoriginal Alzetta C.; Montemagni S.; Dell'orletta F.; Venturi G. *
scopus.differences scopus.description.allpeopleoriginal *
scopus.differences scopus.description.abstracteng *
scopus.document.type ar *
scopus.document.types ar *
scopus.identifier.doi 10.4000/ijcol.719 *
scopus.identifier.eissn 2499-4553 *
scopus.identifier.pui 2031365028 *
scopus.identifier.scopus 2-s2.0-85121271583 *
scopus.journal.sourceid 21101252471 *
scopus.language.iso eng *
scopus.publisher.name Accademia University Press *
scopus.relation.firstpage 37 *
scopus.relation.issue 2 *
scopus.relation.lastpage 60 *
scopus.relation.volume 6 *
scopus.title Linguistically-driven Selection of Difficult-to-Parse Dependency Structures *
scopus.titleeng Linguistically-driven Selection of Difficult-to-Parse Dependency Structures *
Appears in types: 01.01 Journal article
Files in this item:
ijcol-719.pdf (open access)
Type: Publisher's version (PDF)
License: Creative Commons
Size: 976.31 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/446043