In this paper, we introduce Profiling-UD, a new text analysis tool inspired by the principles of linguistic profiling that can support language variation research from different perspectives. It allows the extraction of more than 130 features spanning different levels of linguistic description. Beyond the large number of features that can be monitored, a main novelty of Profiling-UD is that it has been specifically devised to be multilingual, since it is based on the Universal Dependencies framework. In the second part of the paper, we demonstrate the effectiveness of these features in a number of theoretical and applied studies in which they were successfully used for text and author profiling.
Profiling-UD: a Tool for Linguistic Profiling of Texts
Dominique Brunato;Andrea Cimino;Felice Dell'Orletta;Simonetta Montemagni;Giulia Venturi
2020
| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | en |
| dc.authority.people | Dominique Brunato | en |
| dc.authority.people | Andrea Cimino | en |
| dc.authority.people | Felice Dell'Orletta | en |
| dc.authority.people | Simonetta Montemagni | en |
| dc.authority.people | Giulia Venturi | en |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/18 22:43:19 | - |
| dc.date.available | 2024/02/18 22:43:19 | - |
| dc.date.firstsubmission | 2025/01/10 16:14:42 | * |
| dc.date.issued | 2020 | - |
| dc.date.submission | 2025/01/10 16:14:42 | * |
| dc.description.abstracteng | In this paper, we introduce Profiling-UD, a new text analysis tool inspired by the principles of linguistic profiling that can support language variation research from different perspectives. It allows the extraction of more than 130 features spanning different levels of linguistic description. Beyond the large number of features that can be monitored, a main novelty of Profiling-UD is that it has been specifically devised to be multilingual, since it is based on the Universal Dependencies framework. In the second part of the paper, we demonstrate the effectiveness of these features in a number of theoretical and applied studies in which they were successfully used for text and author profiling. | - |
| dc.description.affiliations | Istituto di Linguistica Computazionale "A. Zampolli" (ILC-CNR) | - |
| dc.description.allpeople | Brunato, Dominique; Cimino, Andrea; Dell'Orletta, Felice; Montemagni, Simonetta; Venturi, Giulia | - |
| dc.description.allpeopleoriginal | Dominique Brunato, Andrea Cimino, Felice Dell'Orletta, Simonetta Montemagni, Giulia Venturi | en |
| dc.description.fulltext | open | en |
| dc.description.numberofauthors | 5 | - |
| dc.identifier.isbn | 979-10-95546-34-4 | en |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/384930 | - |
| dc.identifier.url | http://www.lrec-conf.org/proceedings/lrec2020/pdf/2020.lrec-1.883.pdf | en |
| dc.language.iso | eng | en |
| dc.miur.last.status.update | 2024-12-20T15:02:25Z | * |
| dc.publisher.country | FRA | en |
| dc.publisher.name | European Language Resources Association ELRA | en |
| dc.publisher.place | Paris | en |
| dc.relation.conferencedate | 11-16/05/2020 | en |
| dc.relation.conferencename | Conference on Language Resources and Evaluation (LREC) | en |
| dc.relation.firstpage | 7145 | en |
| dc.relation.ispartofbook | Proceedings of the 12th Language Resources and Evaluation Conference - LREC 2020 | en |
| dc.relation.lastpage | 7151 | en |
| dc.relation.numberofpages | 6 | en |
| dc.subject.keywords | Computational Language Variation Analysis | - |
| dc.subject.keywords | Linguistic Profiling | - |
| dc.subject.keywords | Universal Dependencies | - |
| dc.subject.singlekeyword | Computational Language Variation Analysis | * |
| dc.subject.singlekeyword | Linguistic Profiling | * |
| dc.subject.singlekeyword | Universal Dependencies | * |
| dc.title | Profiling-UD: a Tool for Linguistic Profiling of Texts | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | en |
| dc.ugov.descaux1 | 435966 | - |
| iris.mediafilter.data | 2025/04/13 03:28:16 | * |
| iris.orcid.lastModifiedDate | 2025/01/10 16:42:16 | * |
| iris.orcid.lastModifiedMillisecond | 1736523736872 | * |
| iris.scopus.extIssued | 2020 | - |
| iris.scopus.extTitle | Profiling-UD: A tool for linguistic profiling of texts | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |
| File | Size | Format | |
|---|---|---|---|
| 2020.lrec-1.883.pdf (open access; license: Creative Commons) | 570.38 kB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


