Benchmarking short- and long-read sequencing technologies for metagenomic profiling of microbiomes

Grazia Visci (co-first author); Elisabetta Notario (co-first author); Mariano Francesco Caratozzolo; Bruno Fosso; Marinella Marzano; Graziano Pesole
2026

Abstract

Two culture-independent methods, amplicon-based sequencing and shotgun metagenomics, have significantly advanced the study of microbial communities. To date, short-read sequencing technologies have delivered high accuracy and deep coverage, while long-read approaches are increasingly used to improve genome assembly, despite challenges related to sequencing errors and nucleic acid input requirements. In this benchmark study, we compared shotgun metagenomics across three sequencing technologies, Illumina (short reads) and PacBio and Nanopore (long reads), using a commercial 20-species mock microbial community with even species representation. Specifically, we evaluated how effectively the data from each platform supported genome reconstruction, identification of the known taxa, and characterization of their functional potential, considering the annotated genes, the length of predicted proteins, and the number and types of inferred functions. Illumina sequencing provided high-throughput, high-quality data, but its limited read length precluded complete genome assembly, which in turn affected the functional analysis and led to an underestimation of coding and non-coding genes. Nanopore sequencing yielded the longest reads and therefore more contiguous assemblies, although its results were affected by higher error rates and by the choice of assembly method. PacBio offered the best balance between read length and base accuracy but produced fewer reads. The lower read count affected genome coverage for certain taxa, influencing the quality of their assemblies, the completeness of their metagenome-assembled genomes (MAGs), and the accuracy of functional annotation. Nevertheless, PacBio successfully retrieved MAGs for all mock community species, and the genome annotation was consistent with the reference.
By evaluating the strengths and limitations of different NGS technologies and assembly strategies, this benchmark provides a practical framework for selecting the approach best suited to optimizing data quality in microbiome genome characterization, according to study-specific goals.
Istituto di Biomembrane, Bioenergetica e Biotecnologie Molecolari (IBIOM)
Shotgun metagenomics, microbiome, next-generation sequencing, third-generation sequencing, MAGs, functional analysis, mock community analysis.
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/582666