DNPcall: a new pipeline for accurate double nucleotide polymorphism calling
Pistacchia, Letizia; Bella, Elisa; D'Atanasio, Eugenia; Cruciani, Fulvio
2025
Abstract
Motivation: Among the variants of the human genome, Double Nucleotide Polymorphisms (DNPs) remain understudied. A DNP consists of two adjacent variant nucleotides that arise from a single mutational event. Despite their potential relevance to the study of genetic variation, no method currently exists to call DNP genotypes directly and reliably at the individual level.

Results: We present DNPcall, a new pipeline for accurately genotyping putative DNPs from the pileup file produced by samtools. DNPcall uses the read names to reconstruct the DNP genotype of each individual. A genotype is called only when both positions of the DNP are covered by the same read, which ensures that no spurious calls due to sequencing errors are included. In this way, DNPcall can also discriminate between DNPs that arose from a single mutation and pairs of adjacent SNPs. The latter yield spurious calls of the putative DNP, because the two alternative variants are not linked.

Availability and implementation: DNPcall is a user-friendly pipeline designed to enhance the study of genomic variation. It can also be adapted to study other kinds of Multi Nucleotide Variants (MNVs) or, more generally, microhaplotypes. Source code and documentation are available at https://github.com/fravasini/DNPcall.
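The read-linkage idea described in the Results can be illustrated with a minimal Python sketch. It is not DNPcall's actual code: the function names, the input format (a mapping from read name to the base observed at each of the two positions) and the "unlinked" rule are assumptions made purely for illustration. The sketch keeps only reads that cover both positions and tallies the two-base haplotype each of them carries.

```python
from collections import Counter


def linked_haplotypes(pos1_calls, pos2_calls):
    """Count two-base haplotypes over reads that cover both positions.

    pos1_calls and pos2_calls map read name -> base observed at the first
    and second position of the putative DNP (hypothetical input format).
    Reads seen at only one of the two positions are ignored.
    """
    shared_reads = pos1_calls.keys() & pos2_calls.keys()
    return Counter(
        pos1_calls[name].upper() + pos2_calls[name].upper()
        for name in shared_reads
    )


def has_unlinked_alt(haplotypes, alt):
    """Return True if any read carries exactly one of the two alternative bases.

    This is a toy rule for illustration, not DNPcall's calling criterion.
    """
    return any((hap[0] == alt[0]) != (hap[1] == alt[1]) for hap in haplotypes)


# Putative DNP CC>TT: the alternative bases always travel together on a read.
dnp_pos1 = {"r1": "C", "r2": "T", "r3": "T", "r4": "C"}
dnp_pos2 = {"r1": "C", "r2": "T", "r3": "T", "r5": "T"}
dnp_haps = linked_haplotypes(dnp_pos1, dnp_pos2)

# Two adjacent SNPs: each read carries only one of the alternative bases.
snp_pos1 = {"a": "T", "b": "C", "c": "T"}
snp_pos2 = {"a": "C", "b": "T", "c": "C"}
snp_haps = linked_haplotypes(snp_pos1, snp_pos2)

print(dnp_haps, has_unlinked_alt(dnp_haps, "TT"))
print(snp_haps, has_unlinked_alt(snp_haps, "TT"))
```

In the first example, reads r4 and r5 each cover only one position and are dropped, so only linked observations contribute to the counts. In the second, every read carries a mixed haplotype, which is the signature of two independent adjacent SNPs.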
File: Pistacchia et al 2025.pdf (publisher's version, Adobe PDF, 602.31 kB). Open access, Creative Commons license.