Ensemble of Rankers for Efficient Gene Signature extraction in Smoke Exposure Classification

Maurizio Giordano;Mario Rosario Guarracino
2018

Abstract

Background: Systems toxicology aims to understand the mechanisms by which biological systems respond to toxicants. This understanding can be leveraged to assess the risk that chemicals, drugs, and consumer products pose to living organisms. In systems toxicology, machine learning techniques are applied to develop prediction models that classify the toxicant exposure of biological systems. Gene expression data (RNA/DNA microarray) are often used to develop such models.

Results: This work presents an experimental methodology for developing prediction models, based on robust gene signatures, that classify cigarette smoke exposure and cessation in humans. It results from our participation in the sbv IMPROVER SysTox Computational Challenge. By merging different gene selection techniques, we obtain robust gene signatures, and we investigate the prediction capabilities of different off-the-shelf machine learning techniques, such as deep learning, linear models, and support vector machines. We also identify six novel genes in our signature, which we believe merit further investigation as biomarkers of tobacco smoke exposure.

Conclusions: The proposed methodology provides gene signatures that yield top-ranked prediction performance across the investigated classification methods, as well as new candidate genes for biomarkers of smoke exposure in humans.
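The abstract does not say which gene selection techniques are merged or how their outputs are combined. As a rough illustration only, the sketch below assumes two simple filter-style rankers (absolute difference of class means, and a signal-to-noise ratio) and assumes mean-rank aggregation as the merging rule. The gene names, expression values, and function names are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def split_by_class(values, labels):
    """Split one gene's expression values into exposed (1) and control (0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return pos, neg


def mean_difference(expr, labels):
    """Assumed ranker 1: absolute difference of the class means."""
    scores = {}
    for gene, values in expr.items():
        pos, neg = split_by_class(values, labels)
        scores[gene] = abs(mean(pos) - mean(neg))
    return scores


def signal_to_noise(expr, labels):
    """Assumed ranker 2: mean difference divided by the summed class spreads."""
    scores = {}
    for gene, values in expr.items():
        pos, neg = split_by_class(values, labels)
        spread = pstdev(pos) + pstdev(neg) + 1e-9  # avoid division by zero
        scores[gene] = abs(mean(pos) - mean(neg)) / spread
    return scores


def rank_scores(scores):
    """Convert scores to ranks, where rank 0 is the highest-scoring gene."""
    order = sorted(scores, key=lambda g: scores[g], reverse=True)
    return {gene: position for position, gene in enumerate(order)}


def ensemble_signature(expr, labels, rankers, k):
    """Average each gene's rank over all rankers and keep the k best genes."""
    rankings = [rank_scores(ranker(expr, labels)) for ranker in rankers]
    mean_rank = {g: mean(r[g] for r in rankings) for g in expr}
    return sorted(expr, key=lambda g: mean_rank[g])[:k]


# Hypothetical toy data: 3 exposed samples followed by 3 controls.
labels = [1, 1, 1, 0, 0, 0]
expr = {
    "GENE_A": [8.0, 8.2, 7.9, 2.0, 2.1, 1.9],  # large, clean difference
    "GENE_B": [5.0, 5.1, 4.9, 5.0, 5.2, 4.8],  # no difference
    "GENE_C": [6.0, 9.0, 3.0, 4.0, 1.0, 7.0],  # noisy difference
    "GENE_D": [3.0, 3.1, 2.9, 2.0, 2.1, 1.9],  # small, clean difference
}

signature = ensemble_signature(
    expr, labels, rankers=[mean_difference, signal_to_noise], k=3
)
print(signature)
```

In this toy setting the two rankers disagree about the noisy and the small-but-clean gene, and averaging ranks gives a signature that depends less on any single ranker's bias. The resulting gene list would then be used to select the input features of a downstream classifier.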
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
toxicology
gene signature
smoking
supervised learning
feature selection

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/342026