Capturing the comparable : a system for querying comparable text corpora

Peters C.; Picchi E.
1995

Abstract

We discuss the importance of bilingual and multilingual text corpora in many types of cross-language investigations and illustrate the differences between parallel and comparable text archives. The advantages of comparable over parallel data for certain kinds of contrastive linguistic studies are outlined. A prototype version of a system for querying comparable text archives is then described and examples of the first results are given. The system will form part of an integrated workstation for mono- and bilingual lexical and text database management and interrogation under development at the Istituto di Linguistica Computazionale, Pisa.
DC field Value Language
dc.authority.orgunit Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI -
dc.authority.people Peters C it
dc.authority.people Picchi E it
dc.collection.id.s 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d *
dc.collection.name 04.01 Contribution in conference proceedings *
dc.contributor.appartenenza Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI *
dc.contributor.appartenenza Istituto di linguistica computazionale "Antonio Zampolli" - ILC *
dc.contributor.appartenenza.mi 918 *
dc.contributor.appartenenza.mi 973 *
dc.date.accessioned 2024/02/20 04:44:21 -
dc.date.available 2024/02/20 04:44:21 -
dc.date.issued 1995 -
dc.description.affiliations CNR-IEI, Pisa, Italy; CNR-ILC, Pisa, Italy -
dc.description.allpeople Peters C.; Picchi E. -
dc.description.fulltext restricted en
dc.description.note Codice PuMa: /cnr.iei/1995-A2-015 -
dc.description.numberofauthors 2 -
dc.identifier.uri https://hdl.handle.net/20.500.14243/390696 -
dc.language.iso eng -
dc.relation.conferencedate 11-13 December 1995 -
dc.relation.conferencename 3. Giornate Internazionali di Analisi Statistica dei Dati Testuali = 3rd international conference of Statistical analysis on Textual data = 3. Journ -
dc.relation.conferenceplace Roma, Italy -
dc.relation.firstpage 247 -
dc.relation.lastpage 254 -
dc.relation.numberofpages 11 -
dc.subject.keywords Textual Databases -
dc.subject.keywords Bilingual Reference Corpora -
dc.subject.keywords Contrastive Textology -
dc.subject.keywords Database management -
dc.subject.keywords Information Search and Retrieval -
dc.title Capturing the comparable : a system for querying comparable text corpora en
dc.type.driver info:eu-repo/semantics/conferenceObject -
dc.type.full 04 Conference contribution::04.01 Contribution in conference proceedings it
dc.type.miur 273 -
dc.type.referee Yes, but type not specified -
dc.ugov.descaux1 409987 -
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/390696