Cross-Lingual Sentiment Quantification
Esuli A; Moreo Fernandez A D; Sebastiani F
2019
Abstract
We discuss Cross-Lingual Text Quantification (CLTQ), the task of performing text quantification (i.e., estimating the relative frequency p_c(D) of all classes c ∈ C in a set D of unlabelled documents) when training documents are available for a source language S but not for the target language T for which quantification needs to be performed. CLTQ has never been discussed before in the literature; we establish baseline results for the binary case by combining state-of-the-art quantification methods with methods capable of generating cross-lingual vectorial representations of the source and target documents involved. We present experimental results obtained on publicly available datasets for cross-lingual sentiment classification; the results show that the presented methods can perform CLTQ with a surprising level of accuracy.
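To make the quantity p_c(D) concrete for the binary case, the following is a minimal sketch of two well-known quantification baselines, Classify and Count (CC) and Adjusted Classify and Count (ACC). It is an illustration only and is not claimed to be the exact set of methods evaluated in the paper. The function names, the toy predictions, and the `tpr`/`fpr` values are hypothetical; in a cross-lingual setting one would presumably obtain the predictions from a classifier trained on source-language documents and applied to target-language documents mapped into a shared vector space, with `tpr` and `fpr` estimated on held-out source-language data.

```python
def prevalence(labels, positive=1):
    """Relative frequency of the positive class in a list of labels."""
    return sum(1 for y in labels if y == positive) / len(labels)


def classify_and_count(predictions, positive=1):
    """CC: estimate prevalence as the fraction of documents predicted positive."""
    return prevalence(predictions, positive)


def adjusted_classify_and_count(predictions, tpr, fpr, positive=1):
    """ACC: correct the CC estimate using the classifier's true and false
    positive rates (assumed here to be estimated on held-out data)."""
    cc = classify_and_count(predictions, positive)
    if tpr == fpr:
        # The correction is undefined; fall back to the unadjusted estimate.
        return cc
    adjusted = (cc - fpr) / (tpr - fpr)
    # Clip to the valid range of a relative frequency.
    return min(1.0, max(0.0, adjusted))


# Hypothetical classifier outputs on ten unlabelled target-language documents.
predictions = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
cc_estimate = classify_and_count(predictions)
acc_estimate = adjusted_classify_and_count(predictions, tpr=0.8, fpr=0.2)
print(cc_estimate, acc_estimate)
```

The point of the adjustment is that a quantifier is judged on the accuracy of the estimated class frequencies, not on individual document labels, so a systematic classifier bias can be partly corrected at the aggregate level.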