From parallel to comparable text corpora
Peters C.
1996
Abstract
We present a bilingual corpus management system under development in Pisa. The first component of this system was a set of procedures to create and query parallel text archives; we are now studying the implementation of a second set of procedures to interrogate comparable archives. The approach followed is quite different from that used for parallel data and considerably more complex; the results are also very different. In the paper, we describe the strategy we are adopting to retrieve significant data from comparable corpora, and discuss the preliminary results.


