Grid methodology for identifying co-regulated genes and transcription factor binding sites
Milanesi, Luciano
2007
Abstract
Identifying coordinately regulated genes is an important and challenging task in bioinformatics and a first step toward elucidating the topology of transcriptional networks. We first compare, in a grid setting, the performance of the Markov clustering algorithm with that of k-means on microarray test data sets. The expression information of the clustered genes can then be used to annotate transcription factor binding sites upstream of co-regulated genes. The methodology uses a regression model that relates gene expression levels to the matching scores of nucleotide patterns, allowing us to identify DNA-binding sites in a collection of noncoding DNA sequences from co-regulated genes. Finally, we discuss extending the approach to multiple species by exploiting the grid framework.
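To make the regression idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single candidate pattern, a hypothetical matching score (the best ungapped agreement between the pattern and any window of the upstream sequence), and an ordinary least-squares fit with one predictor. The function names `match_score` and `fit_line`, and the toy sequences and expression values, are invented for illustration only.

```python
from statistics import mean


def match_score(sequence, pattern):
    """Hypothetical matching score: the largest number of positions at which
    `pattern` agrees with any same-length window of `sequence`."""
    k = len(pattern)
    if len(sequence) < k:
        return 0
    return max(
        sum(a == b for a, b in zip(sequence[i:i + k], pattern))
        for i in range(len(sequence) - k + 1)
    )


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all matching scores are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


if __name__ == "__main__":
    # Toy upstream sequences and expression levels (made up, not real data).
    upstream = ["ACGTGACC", "TTTTTTTT", "ACGTGTTT", "GGGACGAA"]
    expression = [2.0, 0.0, 2.0, 1.0]

    # Score every sequence against one candidate pattern, then regress
    # expression on the scores; a large positive slope suggests the pattern
    # is associated with higher expression in this cluster.
    scores = [match_score(s, "ACGTG") for s in upstream]
    slope, intercept = fit_line(scores, expression)
    print(scores, slope, intercept)
```

In practice one would repeat the fit for many candidate patterns and rank them by goodness of fit. Because each pattern is scored independently, this step is the kind of work that could be distributed across grid nodes.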