In automated DNA sequencing, the final algorithmic phase, referred to as basecalling, consists of the translation of four time signals in the form of peak sequences (electropherogram) to the corresponding sequence of bases. The most popular basecaller, Phred, detects the peaks based on heuristics, and is very efficient when the peaks are well distinct and quite regular in spread, amplitude and spacing. Unfortunately, in the practice the data is subject to several degradations, particularly near the end of the sequence. The most frequent ones are peak superposition, peak merging and signal leakage, resulting in secondary peaks. In these conditions the experiment must be repeated and the human intervention is required. Recently, there have been attempts to provide methodological foundations to the problem and use statistical models to solve it. In this paper, we propose exploiting a priori information and Bayesian estimation to remove degradations and recover the signals in an impulsive form which makes the task of basecalling straightforward.

Joint correction of cross-talk and peak spreading in DNA electropherograms

Tonazzini A;
2006

Abstract

In automated DNA sequencing, the final algorithmic phase, referred to as basecalling, consists of the translation of four time signals in the form of peak sequences (electropherogram) to the corresponding sequence of bases. The most popular basecaller, Phred, detects the peaks based on heuristics, and is very efficient when the peaks are well distinct and quite regular in spread, amplitude and spacing. Unfortunately, in the practice the data is subject to several degradations, particularly near the end of the sequence. The most frequent ones are peak superposition, peak merging and signal leakage, resulting in secondary peaks. In these conditions the experiment must be repeated and the human intervention is required. Recently, there have been attempts to provide methodological foundations to the problem and use statistical models to solve it. In this paper, we propose exploiting a priori information and Bayesian estimation to remove degradations and recover the signals in an impulsive form which makes the task of basecalling straightforward.
2006
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
DNA sequencing
Basecalling
Blind source separa
Blind deconvolution
File in questo prodotto:
File Dimensione Formato  
prod_91338-doc_130300.pdf

solo utenti autorizzati

Descrizione: Joint correction of cross-talk and peak spreading in DNA electropherograms
Tipologia: Versione Editoriale (PDF)
Dimensione 82.81 kB
Formato Adobe PDF
82.81 kB Adobe PDF   Visualizza/Apri   Richiedi una copia

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/62248
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact