Whole genome sequencing increases the power to detect trait-associated rare variants shifted towards high frequencies in the Sardinian island population.

Carlo Sidore; Magdalena Zoledziewska; Fabio Busonero; Andrea Maschio; Eleonora Porcu; Antonella Mulas; Giorgio Pistis; Maristella Steri; Silvia Naitza; Maristella Pitzalis; Andrea Angius; Serena Sanna; Francesco Cucca
2015

Abstract

Large sequencing studies based on next-generation technologies have profoundly improved the resolution of genome-wide association studies (GWAS). Although this approach has allowed the detection of a large number of rare variants, considerable statistical power is still needed to identify their associations with strong statistical support. Families and founder populations, where variants that are rare elsewhere can occur at moderate frequencies, provide an alternative to large meta-analyses and help overcome this limitation. Here we performed a large whole-genome sequencing study of 3,514 individuals from the Sardinian population, whose demographic history provides a unique opportunity to study the effect of variants enriched due to isolation or selection. We detected >23M single nucleotide polymorphisms (SNPs) and generated a reference panel for imputation in 6,602 individuals genotyped at ~900K SNPs. This approach allowed us to reach extremely high imputation accuracy even for low-frequency variants (r² with directly measured genotypes = 0.90 for variants with minor allele frequency (MAF) 0.5-5%). The effects of isolation are clear from the extent of genetic differentiation from mainland Europeans (allele sharing ratio <0.6 for MAF 1-5%) and from the significantly higher deleteriousness of variants enriched in Sardinians (p = 0.02). We then performed GWAS scans on several traits: height, 4 lipid levels (LDL, HDL, TG and TC), 5 inflammatory markers (ESR, hsCRP, ADPN, MCP-1, IL-6) and 3 hemoglobin levels (HbA1, HbA2, HbF). Overall, we identified 58 independently associated variants, including 18 not described in prior GWAS.
The advantages of analyzing this founder population are particularly evident in two scenarios: a) signals with strong effects that are extremely rare in Europe (MAF <0.01%) but enriched in Sardinia (MAF 0.5-5%), such as APOA5 associated with TG, GHR associated with height, and a long stretch of variants in a chromosome 12 region associated with hsCRP and ESR; b) signals that are rare in Europe (MAF <1%) but common in Sardinia (MAF >5%), such as CCND3 associated with HbA2 and KCNQ1 associated with height. Overall, these results demonstrate the benefits of our sequencing-based approach for the discovery of rare variants with strong effects enriched in the informative population of Sardinia.
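
The imputation accuracy quoted above (r2 with directly measured genotypes, stratified by MAF) is commonly computed as the squared Pearson correlation between imputed allele dosages and directly typed genotypes. The sketch below illustrates that calculation under stated assumptions: the function names and the toy data are hypothetical and are not taken from the study, whose actual pipeline is not described in this abstract.

```python
# Illustrative sketch only: function names and example data are hypothetical.

def minor_allele_frequency(genotypes):
    """Return the MAF given alternate-allele counts (0, 1 or 2) per individual."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between typed genotypes and imputed dosages."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n
    # Sums of squares and cross-products around the means.
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)
    if var_t == 0 or var_d == 0:
        return float("nan")  # monomorphic site: correlation is undefined
    return cov * cov / (var_t * var_d)


# Toy example: four individuals at one variant.
typed = [0, 1, 2, 0]
dosage = [0.1, 0.9, 1.8, 0.2]
print(minor_allele_frequency(typed), imputation_r2(typed, dosage))
```

In practice this value would be averaged over many variants within a MAF bin (for example 0.5-5%) to obtain a summary figure comparable to the one reported.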
Istituto di Ricerca Genetica e Biomedica - IRGB
Sequencing; Sardinia
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/275616