Feature selection and negative evidence in automated text categorization

Sebastiani, F.
2000

Abstract

We tackle two different problems of text categorization (TC), namely feature selection (FS) and classifier induction. We propose a novel FS technique, based on a simplified version of the χ² statistic, and a novel variant of the well-known k-NN method, based on the exploitation of negative evidence. We report the results of systematic experiments with these two methods on the standard Reuters-21578 benchmark.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Text Mining
Information extraction

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/427377