Enriching barcoding markers in environmental samples utilizing a phylogenetic probe design: Insights from mock communities

Marchesini A. (Formal Analysis); 2024

Abstract

Hybridization capture is an emerging method that uses short oligonucleotide baits to enrich DNA libraries for genomic fragments of specific organisms, thus enabling detection of their presence in environmental samples. Although it offers a primer-independent alternative to metabarcoding, little empirical work has been dedicated to characterizing the underlying biases and their implications for biological interpretation. Moreover, few published bioinformatic pipelines are available for designing polynucleotide capture baits from a reference sequence collection. We designed RNA baits specifically targeting two chloroplast barcoding genes, matK and rbcL, to reveal the plant taxonomic diversity present in a given environmental sample. Our approach leverages the sensitivity of hybridization capture and the capacity of high-throughput DNA sequencing instruments. It builds on a new and universal method based on ancestral sequence reconstruction, ultimately limiting the number of bait probes required and reducing experimental costs while accessing high levels of taxonomic diversity. Our bait set selectively targets four main plant orders (Fagales, Pinales, Asterales, and Poales), representing ~18% of all described vascular plants. This is achieved with only 4084 baits, each 80 nucleotides in length (80-mer), capturing ~1.0–1.6 k nucleotide sequences from each taxon. Tests on mock communities revealed important factors influencing capture efficiency and relative abundance estimates, including GC content, the overall target length per taxon, the bait density, and the mean number of mismatches to the bait sequence. Our results show that hybridization capture, like metabarcoding, requires caution when interpreting results quantitatively within (paleo)ecological studies. The biases detected in this work could be mitigated with bait designs that avoid extreme base-composition biases and balance bait targets across taxa. However, we strongly recommend the use of mock communities and read simulations to quantify the accuracy of taxonomic representation when using new bait designs.
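As a rough illustration of the bait properties named in the abstract (80-mer length, GC content, mismatches to the bait), the following minimal Python sketch tiles a target sequence into fixed-length baits and computes those two quantities. It is not the authors' pipeline and does not perform ancestral sequence reconstruction. The function names, the tiling step of 40 nucleotides, and the simple position-by-position mismatch count are all assumptions made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def tile_baits(target: str, bait_len: int = 80, step: int = 40) -> list[str]:
    """Cut a target into overlapping baits of bait_len nucleotides.

    bait_len=80 follows the 80-mer length in the abstract; step=40 is an
    assumed tiling step, not a value taken from the source.
    """
    if len(target) < bait_len:
        return []
    return [target[i:i + bait_len]
            for i in range(0, len(target) - bait_len + 1, step)]


def mismatches(bait: str, fragment: str) -> int:
    """Count mismatching positions between a bait and an equal-length fragment."""
    if len(bait) != len(fragment):
        raise ValueError("bait and fragment must have the same length")
    return sum(a != b for a, b in zip(bait.upper(), fragment.upper()))


if __name__ == "__main__":
    # Hypothetical 160-nt target, used only to exercise the functions.
    target = "ATGC" * 40
    for bait in tile_baits(target):
        print(len(bait), round(gc_content(bait), 2))
```

Summaries of this kind (GC content per bait, mismatches per taxon) are one way to screen a candidate bait set for the extreme base composition that the abstract identifies as a source of bias.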
Istituto di Ricerca sugli Ecosistemi Terrestri - IRET
Keywords: capture bias, DNA barcoding, hybridization capture, shotgun metagenomics, target capture, target enrichment
File: Environmental DNA - 2024 - Nota - Enriching barcoding markers in environmental samples utilizing a phylogenetic probe.pdf (open access; publisher's version, PDF; license: Creative Commons; 2.17 MB)


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/514873