
A bioinformatic approach for the detection of putative Myb orthologues in large plant EST datasets

Catalano D; Finetti Sialer MM; de Virgilio M; Blanco E; Sonnante Ga
2009

Abstract

MYB proteins make up one of the largest families of transcription factors in plants. They are characterised by a conserved DNA-binding domain composed of one to four repeat motifs of about 50 amino acids each, designated R0, R1, R2 and R3. MYB proteins play an important role in the regulation of various processes, including morphogenesis, meristem formation, the cell cycle and secondary metabolism. We developed a bioinformatic pipeline to classify putative MYB transcripts in large sets of plant EST sequences, using the Asteraceae ESTs as a case study. First, we downloaded the complete set of Asteraceae EST sequences stored in the GenBank database; EMBOSS programs were then used to trim poly(A) tails and to remove any vector sequence present in the dataset. The cleaned ESTs were clustered and assembled with the CAP3 program, and the resulting contigs and singletons were used to detect putative open reading frames (ORFs). The ORFs were analysed with HMMER, an implementation of profile hidden Markov models for biological sequence analysis (Krogh et al.). Sequences containing at least two MYB domains in the same open reading frame were considered the most relevant. To identify putative clusters of orthologues, we used the ClustalW program to align the MYB transcription factors of Arabidopsis and Oryza with the ORFs obtained as described above. We then examined the resulting alignment with the Dendroscope program to identify putative orthologous clusters.
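The preprocessing and filtering steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation, which relied on EMBOSS, CAP3 and HMMER; it only mimics three steps in simplified form: poly(A) trimming, six-frame ORF detection, and selection of ORFs carrying at least two domain hits. The function names, the thresholds (`min_run`, `min_len`), the stop-to-stop ORF definition and the `(orf_id, domain_name)` hit format are all assumptions made for illustration.

```python
import re
from collections import Counter

# Standard genetic code, codons enumerated in TCAG x TCAG x TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def trim_poly_a(seq, min_run=10):
    """Remove a trailing poly(A) run of at least min_run bases (assumed threshold)."""
    return re.sub(r"A{%d,}$" % min_run, "", seq.upper())


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate in frame 0; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def stop_free_orfs(seq, min_len=30):
    """Collect stop-free peptide stretches of at least min_len residues
    from all six reading frames (no start codon is required here)."""
    orfs = []
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            for piece in translate(strand[frame:]).split("*"):
                if len(piece) >= min_len:
                    orfs.append(piece)
    return orfs


def orfs_with_min_domains(hits, min_domains=2):
    """Keep ORF ids with at least min_domains hits.

    hits is a hypothetical iterable of (orf_id, domain_name) pairs,
    for example parsed from a domain search report.
    """
    counts = Counter(orf_id for orf_id, _ in hits)
    return sorted(orf_id for orf_id, n in counts.items() if n >= min_domains)
```

For example, `orfs_with_min_domains([("contig1", "Myb"), ("contig1", "Myb"), ("contig2", "Myb")])` keeps only `contig1`, which mirrors the criterion of at least two MYB domains in the same open reading frame.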
Istituto di Bioscienze e Biorisorse
978-88-900622-9-2


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/108453