Discovering Typical Transcription-Factors Patterns in Gene Expression Levels of Mouse Embryonic Stem Cells by Instance-Based Classifiers

Angelini Claudia
2013

Abstract

The development of high-throughput genome sequencing technology provides a large amount of raw data for studying the regulatory functions of transcription factors (TFs) on gene expression. One can build a classifier system in which the gene expression level under a given condition is the response variable and TF-related features are the predictive variables. In this paper we consider the family of Instance-Based (IB) classifiers, and in particular the Prototype exemplar learning classifier (PEL-C), because IB classifiers can infer a mixture of representative instances, which can be used to discover the typical epigenetic patterns of transcription factors that explain gene expression levels. As a case study, we consider the gene regulatory system in mouse embryonic stem cells (ESCs). Experimental results show that IB classifier systems can be used effectively for quantitative modelling of gene expression levels: more than 50% of the variation in gene expression can be explained using the binding signals of 12 TFs. Moreover, PEL-C identifies nine typical patterns of transcription-factor activation that provide new insights into the gene expression machinery of mouse ESCs.
Istituto Applicazioni del Calcolo ''Mauro Picone''
978-3-642-41190-8
Knowledge Discovery
Instance-Based Learning
High-throughput Sequencing
ChIP-Seq
RNA-Seq


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/382430