Soft Topographic Map for Clustering and Classification of Bacteria

Giuseppe Di Fatta; Salvatore Gaglio; Riccardo Rizzo; Alfonso Urso; Massimo La Rosa
2007

Abstract

This work presents a new method for clustering bacteria and building a topographic representation of their taxonomy. The method is based on the analysis of stable parts of the genome, the so-called "housekeeping genes". It generates topographic maps of the bacterial taxonomy in which the relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. The topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene were retrieved from the NCBI public database. In the experimental tests, the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database.
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
English
Michael R. Berthold, John Shawe-Taylor, Nada Lavrač
Advances in Intelligent Data Analysis VII
7th International Symposium on Intelligent Data Analysis, IDA 07
332
343
11
978-3-540-74824-3
http://link.springer.com/chapter/10.1007%2F978-3-540-74825-0_30
Springer
London
United Kingdom
Yes, but type not specified
6 - 8 September 2007
Ljubljana, Slovenia
Genomic Sequence Clustering
3
none
Giuseppe Di Fatta; Salvatore Gaglio; Riccardo Rizzo; Alfonso Urso; Massimo La Rosa; Giovanni Giammanco
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/13020