Capturing the comparable : a system for querying comparable text corpora
Peters C; Picchi E
1995
Abstract
We discuss the importance of bilingual and multilingual text corpora in many types of cross-language investigations and illustrate the differences between parallel and comparable text archives. The advantages of comparable over parallel data for certain kinds of contrastive linguistic studies are outlined. A prototype version of a system for querying comparable text archives is then described and examples of the first results are given. The system will form part of an integrated workstation for mono- and bilingual lexical and text database management and interrogation under development at the Istituto di Linguistica Computazionale, Pisa.

| DC field | Value | Language |
|---|---|---|
| dc.authority.orgunit | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | - |
| dc.authority.people | Peters C | it |
| dc.authority.people | Picchi E | it |
| dc.collection.id.s | 71c7200a-7c5f-4e83-8d57-d3d2ba88f40d | * |
| dc.collection.name | 04.01 Contributo in Atti di convegno | * |
| dc.contributor.appartenenza | Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI | * |
| dc.contributor.appartenenza | Istituto di linguistica computazionale "Antonio Zampolli" - ILC | * |
| dc.contributor.appartenenza.mi | 918 | * |
| dc.contributor.appartenenza.mi | 973 | * |
| dc.date.accessioned | 2024/02/20 04:44:21 | - |
| dc.date.available | 2024/02/20 04:44:21 | - |
| dc.date.issued | 1995 | - |
| dc.description.abstracteng | We discuss the importance of bilingual and multilingual text corpora in many types of cross-language investigations and illustrate the differences between parallel and comparable text archives. The advantages of comparable over parallel data for certain kinds of contrastive linguistic studies are outlined. A prototype version of a system for querying comparable text archives is then described and examples of the first results are given. The system will form part of an integrated workstation for mono- and bilingual lexical and text database management and interrogation under development at the Istituto di Linguistica Computazionale, Pisa. | - |
| dc.description.affiliations | CNR-IEI, Pisa, Italy; CNR-ILC, Pisa, Italy | - |
| dc.description.allpeople | Peters C.; Picchi E. | - |
| dc.description.allpeopleoriginal | Peters C.; Picchi E. | - |
| dc.description.fulltext | restricted | en |
| dc.description.note | Codice PuMa: /cnr.iei/1995-A2-015 | - |
| dc.description.numberofauthors | 2 | - |
| dc.identifier.uri | https://hdl.handle.net/20.500.14243/390696 | - |
| dc.language.iso | eng | - |
| dc.relation.conferencedate | 11-13 December 1995 | - |
| dc.relation.conferencename | 3. Giornate Internazionali di Analisi Statistica dei Dati Testuali = 3rd international conference of Statistical analysis on Textual data = 3. Journ | - |
| dc.relation.conferenceplace | Roma, Italy | - |
| dc.relation.firstpage | 247 | - |
| dc.relation.lastpage | 254 | - |
| dc.relation.numberofpages | 11 | - |
| dc.subject.keywords | Textual Databases | - |
| dc.subject.keywords | Bilingual Reference Corpora | - |
| dc.subject.keywords | Contrastive Textology | - |
| dc.subject.keywords | Database management | - |
| dc.subject.keywords | Information Search and Retrieval | - |
| dc.subject.singlekeyword | Textual Databases | * |
| dc.subject.singlekeyword | Bilingual Reference Corpora | * |
| dc.subject.singlekeyword | Contrastive Textology | * |
| dc.subject.singlekeyword | Database management | * |
| dc.subject.singlekeyword | Information Search and Retrieval | * |
| dc.title | Capturing the comparable : a system for querying comparable text corpora | en |
| dc.type.driver | info:eu-repo/semantics/conferenceObject | - |
| dc.type.full | 04 Contributo in convegno::04.01 Contributo in Atti di convegno | it |
| dc.type.miur | 273 | - |
| dc.type.referee | Yes, but type not specified | - |
| dc.ugov.descaux1 | 409987 | - |
| iris.mediafilter.data | 2025/04/19 03:12:58 | * |
| iris.orcid.lastModifiedDate | 2024/03/02 05:09:57 | * |
| iris.orcid.lastModifiedMillisecond | 1709352597971 | * |
| iris.sitodocente.maxattempts | 1 | - |
| Appears in types: | 04.01 Contributo in Atti di convegno | |

| File | Size | Format |
|---|---|---|
| prod_409987-doc_144254.pdf (authorized users only). Description: Capturing the comparable : a system for querying comparable text corpora. Type: Publisher's version (PDF) | 1.23 MB | Adobe PDF |

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


