The PAISÀ Corpus of Italian Web Texts
Felice Dell'Orletta; Vito Pirrelli
2014
Abstract
PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | - |
| dc.authority.people | Verena Lyding | it |
| dc.authority.people | Egon Stemle | it |
| dc.authority.people | Claudia Borghetti | it |
| dc.authority.people | Marco Brunello | it |
| dc.authority.people | Sara Castagnoli | it |
| dc.authority.people | Felice Dell'Orletta | it |
| dc.authority.people | Henrik Dittmann | it |
| dc.authority.people | Alessandro Lenci | it |
| dc.authority.people | Vito Pirrelli | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contribution in conference proceedings | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.date.accessioned | 2024/02/18 06:18:51 | - |
| dc.date.available | 2024/02/18 06:18:51 | - |
| dc.date.issued | 2014 | - |
| dc.description.abstracteng | PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation. | - |
| dc.description.affiliations | EURAC Research Bolzano, EURAC Research Bolzano, Università di Bologna, University of Leeds, Università di Bologna, ILC-CNR Pisa, Institut Jules Bordet, Università di Pisa, ILC-CNR Pisa | - |
| dc.description.allpeople | Lyding, Verena; Stemle, Egon; Borghetti, Claudia; Brunello, Marco; Castagnoli, Sara; Dell'Orletta, Felice; Dittmann, Henrik; Lenci, Alessandro; Pirrelli, Vito | - |
| dc.description.allpeopleoriginal | Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, Vito Pirrelli | - |
| dc.description.fulltext | none | en |
| dc.description.numberofauthors | 9 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/261825 | - |
| dc.identifier.url | http://aclweb.org/anthology/W14-04 | - |
| dc.language.iso | eng | - |
| dc.publisher.country | USA | - |
| dc.publisher.name | Association for Computational Linguistics | - |
| dc.publisher.place | Stroudsburg | - |
| dc.relation.alleditors | Felix Bildhauer, Roland Schäfer | - |
| dc.relation.conferencedate | April 26, 2014 | - |
| dc.relation.conferencename | Corpus annotation, Tree-bank, Corpus design, Corpus harvesting | - |
| dc.relation.conferenceplace | Gothenburg, Sweden | - |
| dc.relation.firstpage | 36 | - |
| dc.relation.ispartofbook | Proceedings of the 9th Web as Corpus Workshop (WaC-9) | - |
| dc.relation.lastpage | 43 | - |
| dc.relation.numberofpages | 8 | - |
| dc.title | The PAISÀ Corpus of Italian Web Texts | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Conference contribution::04.01 Contribution in conference proceedings | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 289308 | - |
| iris.orcid.lastModifiedDate | 2024/04/04 13:37:26 | * |
| iris.orcid.lastModifiedMillisecond | 1712230646599 | * |
| iris.scopus.extIssued | 2014 | - |
| iris.scopus.extTitle | The PAISÀ Corpus of Italian Web Texts | - |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contribution in conference proceedings | |