Increased mutation and gene conversion within human segmental duplications

Buonaiuto S; Colonna V
2023

Abstract

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data [1,2]. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions [3,4]. We find that human SNVs are elevated by 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC), with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts, and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences [5,6].
Istituto di genetica e biofisica "Adriano Buzzati Traverso"- IGB - Sede Napoli
Single-nucleotide variants (SNVs)
segmental duplications (SDs)
Files in this item:
File: prod_481894-doc_198232 (1)_compressed.pdf
Access: open access
Type: Publisher's version (PDF)
License: Creative Commons
Size: 4.43 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/460105
Citations
  • PubMed Central: not available
  • Scopus: 32
  • Web of Science: 33