The PAISÀ Corpus of Italian Web Texts

Felice Dell'Orletta; Vito Pirrelli
2014

Abstract

PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
DC Field Value Language
dc.authority.orgunit Istituto di linguistica computazionale "Antonio Zampolli" - ILC -
dc.authority.people Verena Lyding it
dc.authority.people Egon Stemle it
dc.authority.people Claudia Borghetti it
dc.authority.people Marco Brunello it
dc.authority.people Sara Castagnoli it
dc.authority.people Felice Dell'Orletta it
dc.authority.people Henrik Dittmann it
dc.authority.people Alessandro Lenci it
dc.authority.people Vito Pirrelli it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contribution in conference proceedings *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.date.accessioned 2024/02/18 06:18:51 -
dc.date.available 2024/02/18 06:18:51 -
dc.date.issued 2014 -
dc.description.abstracteng PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation. -
dc.description.affiliations EURAC Research Bolzano, EURAC Research Bolzano, Università di Bologna, University of Leeds, Università di Bologna, ILC-CNR Pisa, Institut Jules Bordet, Università di Pisa, ILC-CNR Pisa -
dc.description.allpeople Lyding, Verena; Stemle, Egon; Borghetti, Claudia; Brunello, Marco; Castagnoli, Sara; Dell'Orletta, Felice; Dittmann, Henrik; Lenci, Alessandro; Pirrelli, Vito -
dc.description.allpeopleoriginal Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, Vito Pirrelli -
dc.description.fulltext none en
dc.description.numberofauthors 9 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/261825 -
dc.identifier.url http://aclweb.org/anthology/W14-04 -
dc.language.iso eng -
dc.publisher.country USA -
dc.publisher.name Association for Computational Linguistics -
dc.publisher.place Stroudsburg -
dc.relation.alleditors Felix Bildhauer, Roland Schäfer -
dc.relation.conferencedate April 26, 2014 -
dc.relation.conferencename Corpus annotation, Tree-bank, Corpus design, Corpus harvesting -
dc.relation.conferenceplace Gothenburg, Sweden -
dc.relation.firstpage 36 -
dc.relation.ispartofbook Proceedings of the 9th Web as Corpus Workshop (WaC-9) -
dc.relation.lastpage 43 -
dc.relation.numberofpages 8 -
dc.title The PAISÀ Corpus of Italian Web Texts en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.01 Contribution in conference proceedings it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 289308 -
iris.orcid.lastModifiedDate 2024/04/04 13:37:26 *
iris.orcid.lastModifiedMillisecond 1712230646599 *
iris.scopus.extIssued 2014 -
iris.scopus.extTitle The PAISÀ Corpus of Italian Web Texts -
iris.sitodocente.maxattempts 1 -
Appears in types: 04.01 Contribution in conference proceedings
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/261825