Article, Open access, Peer reviewed

Extremely low-coverage sequencing and imputation increases power for genome-wide association studies

2012; Nature Portfolio; Volume: 44; Issue: 6; Language: English

10.1038/ng.2283

ISSN

1546-1718

Authors

Bogdan Paşaniuc, Nadin Rohland, Paul J. McLaren, Kiran Garimella, Noah Zaitlen, Heng Li, Namrata Gupta, Benjamin M. Neale, Mark J. Daly, Pamela Sklar, Patrick F. Sullivan, Sarah E. Bergen, Jennifer L. Moran, Christina M. Hultman, Paul Lichtenstein, Patrik K. E. Magnusson, Shaun Purcell, David W. Haas, Liming Liang, Shamil R. Sunyaev, Hon‐Cheong So, Paul I. W. de Bakker, David Reich, Alkes L. Price

Topic(s)

Cancer-related molecular mechanisms research

Abstract

Bogdan Pasaniuc, David Reich, Alkes Price and colleagues report analyses considering the potential of genome-wide association studies (GWAS) based on extremely low-coverage sequence data sets combined with imputation using data sets from the 1000 Genomes Project. They show with simulations and real exome-sequencing data that low-coverage sequencing can increase power for GWAS relative to genotyping arrays.

Genome-wide association studies (GWAS) have proven to be a powerful method to identify common genetic variants contributing to susceptibility to common diseases. Here, we show that extremely low-coverage sequencing (0.1–0.5×) captures almost as much of the common (>5%) and low-frequency (1–5%) variation across the genome as SNP arrays. As an empirical demonstration, we show that genome-wide SNP genotypes can be inferred at a mean r² of 0.71 using off-target data (0.24× average coverage) in a whole-exome study of 909 samples. Using both simulated and real exome-sequencing data sets, we show that association statistics obtained using extremely low-coverage sequencing data attain similar P values at known associated variants as data from genotyping arrays, without an excess of false positives. Within the context of reductions in sample preparation and sequencing costs, funds invested in extremely low-coverage sequencing can yield several times the effective sample size of GWAS based on SNP array data and a commensurate increase in statistical power.
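The cost argument in the last sentence can be illustrated with a minimal sketch. It assumes the common approximation that a study of N samples whose genotypes are inferred at squared correlation r² has roughly the power of N·r² perfectly genotyped samples; this is not necessarily the exact calculation the authors used. Only the 909 samples and the mean r² of 0.71 come from the abstract. The budget, the per-sample costs, and the array r² are hypothetical placeholders chosen purely for illustration.

```python
def effective_sample_size(n_samples: int, mean_r2: float) -> float:
    """Approximate effective sample size as N * r^2 (an assumed approximation)."""
    if not 0.0 <= mean_r2 <= 1.0:
        raise ValueError("mean_r2 must lie in [0, 1]")
    return n_samples * mean_r2


def samples_affordable(budget: float, cost_per_sample: float) -> int:
    """Number of whole samples that fit within a fixed budget."""
    if cost_per_sample <= 0:
        raise ValueError("cost_per_sample must be positive")
    return int(budget // cost_per_sample)


# Figures taken from the abstract: 909 exome samples, mean r^2 of 0.71.
n_eff_exome = effective_sample_size(909, 0.71)  # roughly 645 samples

# Everything below is HYPOTHETICAL and not taken from the paper.
budget = 300_000.0        # hypothetical total budget
array_cost = 300.0        # hypothetical cost per sample, SNP array
low_cov_cost = 100.0      # hypothetical cost per sample, low-coverage sequencing
array_r2 = 0.90           # hypothetical mean r^2 for array plus imputation
low_cov_r2 = 0.71         # reuses the abstract's r^2 as a stand-in

n_eff_array = effective_sample_size(
    samples_affordable(budget, array_cost), array_r2
)
n_eff_low_cov = effective_sample_size(
    samples_affordable(budget, low_cov_cost), low_cov_r2
)
gain = n_eff_low_cov / n_eff_array

if __name__ == "__main__":
    print(f"Effective N, exome off-target data: {n_eff_exome:.0f}")
    print(f"Effective N, array (hypothetical costs): {n_eff_array:.0f}")
    print(f"Effective N, low coverage (hypothetical costs): {n_eff_low_cov:.0f}")
    print(f"Ratio low coverage / array: {gain:.2f}")
```

Under these placeholder costs, the lower per-sample accuracy of low-coverage sequencing is outweighed by the larger number of samples the same budget buys; with different cost assumptions the comparison could come out differently.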

Reference(s)