Profiling-UD: a Tool for Linguistic Profiling of Texts

Dominique Brunato; Andrea Cimino; Felice Dell'Orletta; Simonetta Montemagni; Giulia Venturi
2020

Abstract

In this paper, we introduce Profiling-UD, a new text analysis tool inspired by the principles of linguistic profiling that can support language variation research from different perspectives. It allows the extraction of more than 130 features spanning different levels of linguistic description. Beyond the large number of features that can be monitored, a main novelty of Profiling-UD is that it has been specifically devised to be multilingual, since it is based on the Universal Dependencies framework. In the second part of the paper, we demonstrate the effectiveness of these features in a number of theoretical and applied studies in which they were successfully used for text and author profiling.
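
To make the idea of profiling features over Universal Dependencies annotation concrete, the following is a minimal sketch in Python. It is not the Profiling-UD implementation and does not use its API: the function names (`parse_sentence`, `profile`), the toy sentence, and the choice of four features (token count, UPOS counts, average dependency link length excluding punctuation, maximum tree depth) are all assumptions made for illustration. It reads a single sentence in the standard 10-column CoNLL-U format.

```python
from collections import Counter

# Toy CoNLL-U sentence (hypothetical example, 10 tab-separated columns).
CONLLU = """\
1\tThe\tthe\tDET\t_\t_\t2\tdet\t_\t_
2\ttool\ttool\tNOUN\t_\t_\t3\tnsubj\t_\t_
3\textracts\textract\tVERB\t_\t_\t0\troot\t_\t_
4\tmany\tmany\tADJ\t_\t_\t5\tamod\t_\t_
5\tfeatures\tfeature\tNOUN\t_\t_\t3\tobj\t_\t_
6\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_
"""


def parse_sentence(block):
    """Parse one CoNLL-U sentence into a list of token dicts."""
    tokens = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword ranges (e.g. 1-2) and empty nodes (e.g. 1.1)
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    return tokens


def profile(tokens):
    """Compute a few illustrative features (an assumed, tiny feature set)."""
    by_id = {t["id"]: t for t in tokens}

    def depth(tok):
        # Number of head links between this token and the root.
        d = 0
        while tok["head"] != 0:
            tok = by_id[tok["head"]]
            d += 1
        return d

    # Linear distance between each dependent and its head,
    # ignoring the root and punctuation (an assumption of this sketch).
    links = [abs(t["id"] - t["head"]) for t in tokens
             if t["head"] != 0 and t["upos"] != "PUNCT"]

    return {
        "n_tokens": len(tokens),
        "upos_counts": dict(Counter(t["upos"] for t in tokens)),
        "avg_link_length": sum(links) / len(links) if links else 0.0,
        "max_depth": max((depth(t) for t in tokens), default=0),
    }


features = profile(parse_sentence(CONLLU))
print(features)
```

Because the features are computed only from UPOS tags, head indices, and dependency relations, the same code applies unchanged to any language annotated in the Universal Dependencies scheme, which is the property the abstract attributes to the tool's multilingual design.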
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC en
dc.authority.people Dominique Brunato en
dc.authority.people Andrea Cimino en
dc.authority.people Felice Dell'Orletta en
dc.authority.people Simonetta Montemagni en
dc.authority.people Giulia Venturi en
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contributo in Atti di convegno *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/18 22:43:19 -
dc.date.available 2024/02/18 22:43:19 -
dc.date.firstsubmission 2025/01/10 16:14:42 *
dc.date.issued 2020 -
dc.date.submission 2025/01/10 16:14:42 *
dc.description.affiliations Istituto di Linguistica Computazionale "A. Zampolli" (ILC-CNR) -
dc.description.allpeople Brunato, Dominique; Cimino, Andrea; Dell'Orletta, Felice; Montemagni, Simonetta; Venturi, Giulia -
dc.description.allpeopleoriginal Dominique Brunato, Andrea Cimino, Felice Dell'Orletta, Simonetta Montemagni, Giulia Venturi en
dc.description.fulltext open en
dc.description.numberofauthors 5 -
dc.identifier.isbn 979-10-95546-34-4 en
dc.identifier.uri https://hdl.handle.net/20.500.14243/384930 -
dc.identifier.url http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.883.pdf en
dc.language.iso eng en
dc.miur.last.status.update 2024-12-20T15:02:25Z *
dc.publisher.country FRA en
dc.publisher.name European Language Resources Association ELRA en
dc.publisher.place Paris en
dc.relation.conferencedate 11-16/05/2020 en
dc.relation.conferencename Conference on Language Resources and Evaluation (LREC) en
dc.relation.firstpage 7145 en
dc.relation.ispartofbook Proceedings of the 12th Language Resources and Evaluation Conference - LREC 2020 en
dc.relation.lastpage 7151 en
dc.relation.numberofpages 6 en
dc.subject.keywords Computational Language Variation Analysis -
dc.subject.keywords Linguistic Profiling -
dc.subject.keywords Universal Dependencies -
dc.subject.singlekeyword Computational Language Variation Analysis *
dc.subject.singlekeyword Linguistic Profiling *
dc.subject.singlekeyword Universal Dependencies *
dc.title Profiling-UD: a Tool for Linguistic Profiling of Texts en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Contributo in convegno::04.01 Contributo in Atti di convegno it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified en
dc.ugov.descaux1 435966 -
iris.scopus.extIssued 2020 -
iris.scopus.extTitle Profiling-UD: A tool for linguistic profiling of texts -
Appears in the types: 04.01 Contributo in Atti di convegno (contribution in conference proceedings)
Files in this item:
File: 2020.lrec-1.883.pdf
Access: open access
License: Creative Commons
Size: 570.38 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/384930