A novel mutual information-based feature selection algorithm

Cassara' P.;
2015

Abstract

From a machine learning point of view, identifying a subset of relevant features in a real data set can improve the results achieved by classification methods and reduce their time and space complexity. To achieve this goal, feature selection methods are usually employed. These approaches assume that the data contain redundant or irrelevant attributes that can be eliminated. In this work we propose a novel feature selection technique that exploits Mutual Information. The main advantages of this new approach are the ability to automatically estimate the number of features to retain, and the possibility of ranking the features to select from the most probable to the least probable. Experiments on standard real data sets and a comparison with state-of-the-art feature selection techniques confirm the high quality of our approach.
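As a generic illustration of what ranking features by mutual information with the class label can look like, here is a minimal Python sketch. It is not the algorithm proposed in the paper, whose details (including how the number of features to retain is estimated) are not given in the abstract. The function names and the toy data are hypothetical, and the features are assumed to be discrete.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels):
    """Return (feature name, MI) pairs sorted from most to least informative."""
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (hypothetical): feature "a" copies the label, "b" is independent of it.
labels = [0, 0, 1, 1]
columns = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]}
ranking = rank_features(columns, labels)
```

In this toy example, feature "a" is ranked first because it determines the label exactly, while "b" carries no information about it. A simple ranking like this ignores redundancy between features and does not decide how many to keep, which are the aspects the abstract says the proposed technique addresses.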
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Feature selection
Mutual information
Markov blanket
Cross-entropy
Numerical analysis
Optimization
Pattern recognition
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/274914