Distributional correspondence indexing for cross-language text categorization

Esuli A; Fernandez AM
2015

Abstract

Cross-Language Text Categorization (CLTC) aims to produce a classifier for a target language when the only available training examples are in a different source language. Existing CLTC methods typically incur high computational costs, require external linguistic resources, or demand considerable human annotation effort. This paper presents a simple yet effective CLTC method that projects features from both the source and target languages into a common vector space, using a computationally lightweight distributional correspondence profile with respect to a small set of pivot terms. Experiments on a popular sentiment classification dataset show that our method compares favorably with state-of-the-art methods while requiring significantly less computation and minimal human intervention.
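
The idea of a distributional correspondence profile can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the correspondence function is cosine similarity between binary document-occurrence vectors, and the toy vocabularies, pivot pairs, and the function names `cosine` and `correspondence_profile` are hypothetical. Each term is mapped to one score per pivot, computed within its own language; because the pivot lists are aligned across languages, the resulting profiles live in the same vector space.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def correspondence_profile(term, pivots, occurrences):
    """Map a term to one similarity score per pivot, within one language.

    occurrences maps each term to a list of 0/1 flags, one per document
    of that language (1 = the term occurs in the document).
    """
    return [cosine(occurrences[term], occurrences[p]) for p in pivots]


# Toy data (hypothetical): four English and four German documents.
en_occ = {
    "good": [1, 1, 0, 0],
    "bad": [0, 0, 1, 1],
    "excellent": [1, 1, 0, 0],
}
de_occ = {
    "gut": [1, 1, 0, 0],
    "schlecht": [0, 0, 1, 1],
    "furchtbar": [0, 0, 1, 1],
}

# Aligned pivot lists (assumed translation pairs): good/gut, bad/schlecht.
en_pivots = ["good", "bad"]
de_pivots = ["gut", "schlecht"]

# Both profiles have one dimension per pivot pair, so they are comparable
# even though the terms come from different languages.
excellent = correspondence_profile("excellent", en_pivots, en_occ)
furchtbar = correspondence_profile("furchtbar", de_pivots, de_occ)
print(excellent, furchtbar)
```

In this toy setting "excellent" scores high on the first pivot dimension and "furchtbar" on the second, so a classifier trained on source-language profiles could in principle be applied to target-language ones.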
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
English
Allan Hanbury, Gabriella Kazai, Andreas Rauber, Norbert Fuhr
ECIR 2015 - Advances in Information Retrieval. 37th European Conference on IR Research
Volume 9022
Pages 104-109
978-3-319-16353-6
http://link.springer.com/chapter/10.1007%2F978-3-319-16354-3_12
Yes, but type not specified
29 March - 2 April 2015
Vienna, Austria
Cross-Language Text Categorization
Distributional Semantics
Sentiment Analysis
2
restricted
Esuli, A; Fernandez, AM
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this item:
File Size Format
prod_329758-doc_101467.pdf

authorized users only

Description: Distributional Correspondence Indexing for Cross-Language Text Categorization
Type: Publisher's version (PDF)
Size: 234.83 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/294391
Citations
  • Scopus 3