CNR Institutional Research Information System

The ParlaMint corpora of parliamentary proceedings

Erjavec T;Ogrodniczuk M;Osenova P;Ljubesic N;Simov K;Pancur A;Rudolf M;Kopp M;Barkarson S;Steingrimsson S;Coltekin C;de Does J;Depuydt K;Agnoloni T;Venturi G;Perez MC;de Macedo LD;Navarretta C;Luxardo G;Coole M;Rayson P;Morkevicius V;Krilavicius T;Dargis R;Ring O;van Heusden R;Marx M;Fiser D

2023

Abstract

This paper presents the ParlaMint corpora, which contain transcriptions of the sessions of 17 European national parliaments and total half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project's GitHub repository. The complete corpora are openly available for download via the CLARIN.SI repository, and for online exploration and analysis through the NoSketch Engine and KonText concordancers and the Parlameter interface.

Year: 2023

Organizational units:
Istituto di linguistica computazionale "Antonio Zampolli" - ILC
Istituto di Informatica Giuridica e Sistemi Giudiziari - IGSG

Keywords:
Parliamentary proceedings
Linguistic annotation
Universal Dependencies

Appears in types:
01.01 Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/448001
