Variable ranking feature selection for the identification of nucleosome related sequences

Rizzo R; Fiannaca A; La Rosa M; Urso A
2018

Abstract

Several recent works have shown that the k-mer representation of a DNA sequence can be used to classify or identify sequences related to nucleosome positioning. This representation becomes computationally expensive as k grows, because the feature space has exponential dimension (4^k possible k-mers). This issue significantly affects any general machine learning algorithm used for sequence classification. In this paper, we investigate the advantage offered by the Variable Ranking Feature Selection method in selecting the most informative k-mers associated with a set of DNA sequences, for the final purpose of nucleosome/linker classification by a deep learning network. Results computed on three public datasets show the effectiveness of the adopted feature selection method.
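
As a rough illustration of the idea summarized above, the following minimal sketch builds k-mer count vectors and ranks each k-mer independently by a relevance score. The abstract does not state which ranking criterion the paper uses; the absolute Pearson correlation between a k-mer's count and the class label is assumed here only because it is a common choice for variable ranking. All function names and the toy sequences are hypothetical and are not taken from the paper or its datasets.

```python
from itertools import product
from math import sqrt


def kmer_vocabulary(k):
    """Return all 4**k k-mers over the DNA alphabet, in lexicographic order."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def kmer_counts(seq, k, vocab_index):
    """Count overlapping k-mers of seq; windows with other symbols (e.g. N) are skipped."""
    counts = [0] * len(vocab_index)
    for i in range(len(seq) - k + 1):
        idx = vocab_index.get(seq[i:i + k])
        if idx is not None:
            counts[idx] += 1
    return counts


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; 0.0 when either variable is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov / sqrt(var_x * var_y))


def rank_kmers(sequences, labels, k, top_n):
    """Score every k-mer on its own (variable ranking) and keep the top_n."""
    vocab = kmer_vocabulary(k)
    vocab_index = {kmer: i for i, kmer in enumerate(vocab)}
    matrix = [kmer_counts(s, k, vocab_index) for s in sequences]
    scores = [
        abs_pearson([row[j] for row in matrix], labels)
        for j in range(len(vocab))
    ]
    order = sorted(range(len(vocab)), key=lambda j: scores[j], reverse=True)
    return [(vocab[j], scores[j]) for j in order[:top_n]]


# Toy data (hypothetical): label 1 = nucleosome, label 0 = linker.
toy_sequences = ["AAAAAA", "AAAATA", "CGCGCG", "GCGCGC"]
toy_labels = [1, 1, 0, 0]
selected = rank_kmers(toy_sequences, toy_labels, k=2, top_n=4)
```

Because each k-mer is scored on its own, the cost is linear in the 4^k vocabulary size, and only the retained top_n columns would be passed on to a downstream classifier such as the deep learning network mentioned in the abstract.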
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
978-3-030-00062-2
Deep learning models
Feature selection
DNA sequences
Epigenomic
Nucleosomes

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/348753