ACKR1 Alleles at 5.6 kb in a Well-Characterized Renewable US Food and Drug Administration (FDA) Reference Panel for Standardization of Blood Group Genotyping
2020; Elsevier BV; Volume: 22; Issue: 10; Language: English
10.1016/j.jmoldx.2020.06.014
ISSN 1943-7811
Authors: Kshitij Srivastava, Pavel P. Khil, Emília Sippert, Evgeniya Volkova, John P. Dekker, María Rios, Willy A. Flegel
Topic(s): Complement system in diseases
Abstract: The glycoprotein encoded by the ACKR1 gene expresses the Duffy blood group antigens and is a receptor for malaria parasites. We recently described 18 long-range ACKR1 alleles in an autochthonous population of a malaria-endemic region. Extending this work, we sequenced the gene in a 53-sample repository established by the US Food and Drug Administration (FDA) as reference reagents for blood group genotyping. The FDA samples have been characterized for 19 genes; however, long-range haplotype information for these genes, including ACKR1, was lacking. We used a hybrid approach, novel for this type of gene, to characterize ACKR1 by combining two next-generation sequencing technologies: short-read massively parallel sequencing and long-read nanopore sequencing. The expedient integration of data from both next-generation sequencing systems was necessary and sufficient to accurately determine all 25 long-range ACKR1 alleles found in the 53 samples. All 25 alleles identified in our current FDA cohort were novel and, unexpectedly, none had been observed among the 18 alleles in our previous study. The alleles will be useful for validation, calibration, and proficiency testing of red cell genotyping. The lack of any overlap between the ACKR1 alleles in the two studies documents differences in mutation rate and recombination frequency among populations. The exact haplotypes and their interethnic or interpopulation dissimilarities can influence disease susceptibility and therapy.

Use of lymphoblastoid cell lines as an unlimited renewable DNA source for external quality assessment schemes has been proposed since 2001 [1: Carl B., Kroll H., Bux J., Bein G., Santoso S. B-lymphoblastoid cell lines as a source of reference DNA for human platelet and neutrophil antigen genotyping. Transfusion. 2000; 40: 62-68] and has been implemented sporadically for red cell, human platelet, and neutrophil antigen genotyping [2: Flegel W.A., Chiosea I., Sachs U.J., Bein G. External quality assessment in molecular immunohematology: the INSTAND proficiency test program. Transfusion. 2013; 53: 2850-2858]. The US Food and Drug Administration (FDA) recently developed a panel of 18 DNA reference reagents from Epstein-Barr virus–transformed cell lines, which can be used as a validated reference for standardization of blood group genotyping, and these reagents were added to the existing collection of 4 World Health Organization International Reference Reagents for blood group genotyping [3: Volkova E., Sippert E., Liu M., Mercado T., Denomme G.A., Illoh O., Liu Z., Rios M. Validated reference panel from renewable source of genomic DNA available for standardization of blood group genotyping. J Mol Diagn. 2019; 21: 525-537; 4: Boyle J., Thorpe S.J., Hawkins J.R., Lockie C., Fox B., Matejtschuk P., Halls C., Metcalfe P., Rigsby P., Armstrong-Fisher S., Varzi A.M., Urbaniak S., Daniels G. International reference reagents to standardise blood group genotyping: evaluation of candidate preparations in an international collaborative study. Vox Sang. 2013; 104: 144-152; 5: Kroll H., Carl B., Santoso S., Bux J., Bein G. Workshop report on the genotyping of blood cell alloantigens. Transfus Med. 2001; 11: 211-219]. This FDA reference panel is designed to comprise the smallest number of samples representing the greatest number of genotypes for use as controls for prediction of blood group antigens. In 2018, these 18 reagents were tested in an international collaborative validation study designed to determine genotypes of predefined genomic loci associated with distinct red cell antigens, thus providing limited sequence information [3]. For example, in the Duffy blood group system, the antigens are predicted based on genotyping only three positions of the ACKR1 gene, denoted as c.−67T>C, c.125G>A, and c.265C>T.

Since 1992 [6: Ugozzoli L., Wallace R.B. Application of an allele-specific polymerase chain reaction to the direct determination of ABO blood group genotypes. Genomics. 1992; 12: 670-674], various studies have been published using red cell genotypes for extended blood group typing [7: St-Louis M. Molecular blood grouping of donors. Transfus Apher Sci. 2014; 50: 175-182], and since 2010 [8: Stabentheiner S., Danzer M., Niklas N., Atzmuller S., Proll J., Hackl C., Polin H., Hofer K., Gabriel C. Overcoming methodical limits of standard RHD genotyping by next-generation sequencing. Vox Sang. 2011; 100: 381-388] also using next-generation sequencing (NGS) chemistries [9: Orzinska A., Guz K., Brojer E. Potential of next-generation sequencing to match blood group antigens for transfusion. Int J Clin Transfus Med. 2019; 7: 11-22]. Reference sequences for blood group genes are important for effective red cell genotyping using NGS [10: Srivastava K., Lee E., Owens E., Rujirojindakul P., Flegel W.A. Full-length nucleotide sequence of ERMAP alleles encoding Scianna (SC) antigens. Transfusion. 2016; 56: 3047-3054]. Unlike for HLA [11: Robinson J., Halliwell J.A., Hayhurst J.D., Flicek P., Parham P., Marsh S.G. The IPD and IMGT/HLA database: allele variant databases. Nucleic Acids Res. 2015; 43: D423-D431], experimentally confirmed allele information is lacking for blood group genes; such information is needed to improve inference accuracy [12: Srivastava K., Wollenberg K.R., Flegel W.A. The phylogeny of 48 alleles, experimentally verified at 21 kb, and its application to clinical allele detection. J Transl Med. 2019; 17: 43]. A database of full-length alleles of all blood group genes among the samples used by the FDA to develop reference panels will aid in the development, validation, and proficiency testing of new blood group genotyping assays using NGS [12].

The ACKR1 gene encodes a multipass transmembrane glycoprotein that carries the five antigens of the Duffy (Fy) blood group system [13: Tournamille C., Le Van Kim C., Gane P., Cartron J.P., Colin Y. Molecular basis and PCR-DNA typing of the Fya/fyb blood group polymorphism. Hum Genet. 1995; 95: 407-410]. The two major and clinically most significant antithetical antigens, Fya and Fyb, have been implicated in severe hemolytic transfusion reactions and hemolytic disease of the fetus and newborn [14: Poole J., Daniels G. Blood group antibodies and their significance in transfusion medicine. Transfus Med Rev. 2007; 21: 58-71; 15: Meny G.M. The Duffy blood group system: a review. Immunohematology. 2010; 26: 51-56; 16: Meny G.M. An update on the Duffy blood group system. Immunohematology. 2019; 35: 11-12]. Our previous work on the Duffy blood group system established long-range ACKR1 reference alleles in a native East-African population at 5178 nucleotides using Sanger sequencing [17: Yin Q., Srivastava K., Gebremedhin A., Makuria A.T., Flegel W.A. Long-range haplotype analysis of the malaria parasite receptor gene ACKR1 in an East-African population. Hum Genome Var. 2018; 5: 26]. Recently, Fichou et al [18: Fichou Y., Berlivet I., Richard G., Tournamille C., Castilho L., Férec C. Defining blood group gene reference alleles by long-read sequencing: proof of concept in the ACKR1 gene encoding the Duffy antigens. Transfus Med Hemother. 2020; 47: 23-32] defined 19 haplotypes at 2488 nucleotides using the third-generation, single-molecule, real-time (SMRT; Pacific Biosciences, Menlo Park, CA) sequencing platform to aid in the imputation and phasing of high-throughput sequencing data.

Genotype phasing determines whether variants found in a gene sequence constitute one allele or haplotype (in cis, on the same chromosome) or belong to two separate alleles (in trans, on the two chromosomes of an individual). Phasing is critical for diagnostic purposes and is also an objective of the analysis of data derived from NGS platforms, especially to detect rare allele combinations. Phasing of variant sites in long-range nucleotide sequences has been accomplished previously using a combination of massively parallel sequencing (MiSeq; Illumina, San Diego, CA), which has a low error rate, and single-molecule sequencing (GridION; Oxford Nanopore Technologies, Oxford, UK), which has a higher error rate [19: Duke J.L., Mosbruger T.L., Ferriola D., Chitnis N., Hu T., Tairis N., Margolis D.J., Monos D.S. Resolving MiSeq-generated ambiguities in HLA-DPB1 typing by using the Oxford Nanopore Technology. J Mol Diagn. 2019; 21: 852-861]. We integrated the application of both platforms and rapidly established long-range ACKR1 alleles without ambiguity in the 18 samples of the FDA reference panel and 35 additional samples.

The Center for Biologics Evaluation and Research of the FDA provided genomic DNA from 53 volunteer blood donor samples used for the production of B-lymphoblastoid cell lines (Supplemental Table S1).
A subset of 18 cell lines was grouped as a reference panel from a renewable source of genomic DNA that was evaluated in a collaborative study [3] and established by the World Health Organization Expert Committee for Biological Standardization as additional International Reference Reagents for Blood Group Genotyping [20: WHO. Report of the international collaborative study to evaluate eighteen additional candidates for addition to the existing collection of four WHO international reference reagents for blood group genotyping. In: WHO Expert Committee on Biological Standardization. World Health Organization, Geneva, 2019]. This FDA reference panel [3] is a publicly available resource and was designed to encompass 41 genetic variants associated with 17 blood group systems present in the original group of 53 blood donors.

The blood samples were collected previously with written informed consent [3] under protocol BC12-15 approved by the Institutional Review Board of the BloodCenter of Wisconsin (Milwaukee, WI) and by the Research Involving Human Subjects Committee of the FDA (protocol number: 11-089B). The blood samples are unlinked from any personal donor information other than the donor's blood group, age, sex, and race, if available. Under the approved protocol, all methods of genetic characterization for blood group genes could be used to study these samples. Thus, no additional approval or consent was needed for the current study.

An ACKR1 amplicon, 5782 bp long, encompassing the whole ACKR1 gene including upstream and downstream noncoding regions, was amplified by PCR using the universally tailed primers 5′-TTTCTGTTGGTGCTGATATTGC-CAACCACTCCTCCCATGGCATT-3′ and 5′-ACTTGCCTGTCGCTCTATCTTC-GATGAGGAGGGGTTTCTGTCC-3′ (Eurofins MWG Operon, Louisville, KY) as described previously [17]. The PCR products were purified using Agencourt AMPure XP (Beckman Coulter, Brea, CA) and quantified using the Qubit double-stranded DNA high-sensitivity Quantification Kit on a Qubit 2.0 Fluorometer (Life Technologies, Carlsbad, CA). The nucleotide sequencing covered 5615 nucleotides of the ACKR1 amplicon; 118 nucleotides at the 5′ end and 49 nucleotides at the 3′ end were missing because of insufficient sequencing coverage.

For MiSeq sequencing, each sample was normalized to 300 ng in 30 μL purified water to ensure an equal depth of coverage across the ACKR1 amplicon. Libraries were prepared (Nextera DNA Flex Library Preparation Kit with Nextera DNA CD Indexes, 96 plex; Illumina) and sequenced (MiSeq Reagent Kit v2; Illumina) using a read length of 2 × 150 bp.

For nanopore sequencing, each sample was normalized to 45 ng in 24 μL purified water to ensure an equal depth of coverage across the ACKR1 amplicon.
Libraries were prepared (Ligation Sequencing Kit 1D, SQK-LSK109 with the PCR Barcoding Expansion Pack 1-96, EXP-PBC096; Oxford Nanopore Technologies); the samples were combined in equal proportions, loaded onto a single FLO-MIN106 R9 flow cell (Oxford Nanopore Technologies), and sequenced on the GridION X5 platform (Oxford Nanopore Technologies).

The MiSeq reads were demultiplexed and pairs of FASTQ files were generated (MiSeq software version 2.5.0.5; Illumina). Nanopore reads were base-called in real time on a GridION X5 system and the data were subsequently demultiplexed (qcat, version 1.0.7; https://github.com/nanoporetech/qcat) [21: Munnink B.B.O., Nieuwenhuijse D.F., Stein M., O'Toole Á., Haverkate M., Mollers M., Kamga S.K., Schapendonk C., Pronk M., Lexmond P., van der Linden A., Bestebroer T., Chestakova I., Overmars R.J., van Nieuwkoop S., Molenkamp R., van der Eijk A.A., GeurtsvanKessel C., Vennema H., Meijer A., Rambaut A., van Dissel J., Sikkema R.S., Timen A., Koopmans M. Dutch-Covid-19 response team: rapid SARS-CoV-2 whole-genome sequencing and analysis for informed public health decision-making in the Netherlands. Nat Med. 2020; 26: 1405-1410]. BWA-MEM [22: Li HW: Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. [ePub] arXiv 2013:1303.3997v1301] was used to align the data to the human reference genome assembly 38 (hg38) with default settings for the MiSeq reads and modified settings (-x ont2d) for the GridION reads. Aligned reads (BAM files) were examined visually for variants (Integrative Genomics Viewer; Broad Institute, Boston, MA) [23: Thorvaldsdottir H., Robinson J.T., Mesirov J.P. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013; 14: 178-192]. In samples containing more than one heterozygous variant, nanopore reads spanning the full-length amplicons were explored visually in Integrative Genomics Viewer to detect phase information (cis/trans relationship) of single-nucleotide variants (SNVs). We used Sanger sequencing [17] at three variant positions, c.-1896G>A, c.-1606C>T, and c.125G>A, to determine the accuracy of the MiSeq data.

The unphased ACKR1 genotype data from the 53 samples were used with the Markov chain-based haplotyper MaCH software version 1.0 [24: Li Y., Willer C.J., Ding J., Scheet P., Abecasis G.R. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol. 2010; 34: 816-834] to statistically infer alleles [17]. Because of the inherent uncertainty of computational phasing, the analysis was performed with MaCH program settings of 2000 rounds and 500 states [17]. The 95% CIs for allele frequencies were calculated using the Poisson distribution [25: Sachs L. Angewandte Statistik - Anwendung statistischer Methoden. ed 7. Springer-Verlag, Berlin, 1992: 446-447]. The observed genotype frequencies were examined for deviation from the Hardy–Weinberg equilibrium using a goodness-of-fit χ2-test with 1 df.

ACKR1 variants, including SNVs and repeats, were detected and their phase information was accurately determined at 5615 bp in 53 DNA samples from the FDA repository. The sequencing covered 1011 nucleotides of the coding sequence, 480 nucleotides of the single intron, 947 nucleotides of the 5′-untranslated region, 50 nucleotides of the 3′-untranslated region, 2035 nucleotides of the 5′-flanking region, and 1092 nucleotides of the 3′-flanking region (Figure 1).

MiSeq sequencing identified 30 SNVs and 3 dinucleotide repeats in the 53 samples (Table 1). One SNV (rs55872368) was observed as tri-allelic. One SNV was novel, whereas the remaining 29 variants were already listed in the dbSNP database (National Library of Medicine, Bethesda, MD). Besides the GATA box mutation (c.-67T>C), no other SNV indicative of a nonfunctional allele was detected. Sanger sequencing showed 100% concordance with the MiSeq data for the three variant positions: c.-1896G>A, c.-1606C>T, and c.125G>A.

Table 1. Genetic Variations Detected in the ACKR1 Gene
Observations (n = 53) are given as homozygote reference, heterozygote, and homozygote variant counts. [?] marks a cell that is not recoverable from the source text.

Location | Nucleotide change* | dbSNP reference number† | Protein residue change‡ | Homozygote reference | Heterozygote | Homozygote variant | VAF | HWE (P)
5′ Flanking region | -2872_-2871TG>del | rs5778112 | NA | 41 | 6 | 6 | 0.170 | [?]
 | [?]>T | rs41313908 | NA | 45 | 8 | 0 | 0.076 | 0.552
 | -2539C>T | NA | NA | 52 | 1 | 0 | 0.009 | 0.945
 | -2456T>G | rs35432289 | NA | 52 | 1 | 0 | 0.009 | 0.945
 | -1896G>A | rs35333710 | NA | 47 | 6 | 0 | 0.057 | 0.662
 | -1606C>T | rs6676002 | NA | 39 | 14 | 0 | 0.132 | 0.268
 | -1400C>T | rs2746047 | NA | 51 | 1 | 1 | 0.028 | [?]
 | [?]>T | rs3027008 | NA | 39 | 14 | 0 | 0.132 | 0.268
5′ UTR | -947delC | rs11364458 | NA | 39 | 14 | 0 | 0.132 | 0.268
 | -863A>G | rs3027009 | NA | 45 | 8 | 0 | 0.075 | 0.552
 | -673G>A | rs41264467 | NA | 52 | 1 | 0 | 0.009 | 0.945
 | -655A>G | rs3027011 | NA | 49 | 3 | 1 | 0.047 | 0.007
 | -627C>T | rs3027012 | NA | 39 | 14 | 0 | 0.132 | 0.268
 | -541C>T | rs3027013 | NA | 45 | 8 | 0 | 0.075 | 0.552
 | -399_-398CT>del | rs71782098 | NA | 49 | 3 | 1 | 0.047 | 0.007
 | -67T>C | rs2814778 | NA | 39 | 1 | 13 | 0.255 | [?]
 | [?]>T | rs200907215 | NA | 52 | 1 | 0 | 0.009 | 0.945
 | +115T>C | rs7550207 | NA | 36 | 15 | 2 | 0.179 | 0.781
 | +150C>T | rs863002 | NA | 29 | 18 | 6 | 0.283 | 0.235
 | -243T>del | rs17838198 | NA | 4 | 15 | 34 | 0.783 | 0.224
 | -58A>G | rs3027016 | NA | 47 | 6 | 0 | 0.057 | 0.662
Exon 2 | 125G>A | rs12075 | Gly42Asp | 10 | 18 | 25 | 0.642 | 0.057
 | 265C>T | rs34599082 | Arg89Cys | 51 | 2 | 0 | 0.019 | 0.889
 | 298G>A | rs13962 | Ala100Thr | 39 | 12 | 2 | 0.151 | 0.396
 | 714G>A | rs36007769 | Gly238= | 52 | 1 | 0 | 0.009 | 0.945
 | 977C>T | rs17851570 | Ser326Phe | 52 | 1 | 0 | 0.009 | 0.945
3′ Flanking region | +250C>T | rs12042349 | NA | 42 | 11 | 0 | 0.104 | 0.399
 | +268A>G | rs863003 | NA | 36 | 14 | 3 | 0.188 | 0.318
 | +591C>T | rs863004 | NA | 18 | 20 | 15 | 0.472 | 0.077
 | +596_+597delGT | rs72387739 | NA | 52 | 1 | 0 | 0.009 | 0.945
 | +599T>C | rs2281301 | NA | 43 | 10 | 0 | 0.094 | 0.448
 | +616G>T or A | rs55872368 | NA | 31 | 13 | 9 | 0.292 | 0.003
 | +1083T>C | rs863005 | NA | 5 | 15 | 33 | 0.764 | 0.118

HWE, Hardy-Weinberg equilibrium; NA, not applicable; UTR, untranslated region; VAF, variant allele frequency.
* Nucleotide substitutions are shown relative to the reference sequence (NG_011626.3). Nucleotide positions are defined using the first nucleotide of the coding sequence of the NM_002036.3 isoform as nucleotide position 1.
† Publicly available nucleotide sequences, as reported in the National Center for Biotechnology Information Nucleotide database (https://www.ncbi.nlm.nih.gov/nuccore, last accessed June 29, 2020).
‡ Relative to the National Center for Biotechnology Information Reference Sequence NP_002027.2.

Detecting phase information in long amplicons with several distant heterozygous variants may be impossible through Sanger sequencing with primer walking. MiSeq sequencing, with short reads averaging 300 bp in our study, cannot phase variants that are more than 300 bp apart. By using GridION, full-length, 5.6-kb ACKR1 amplicons could be sequenced as single reads. Long-range phased ACKR1 data were thus obtained, and the data sets from the two NGS technologies were analyzed in combination. This hybrid approach, novel for red cell genotyping, integrating MiSeq and GridION data, allowed accurate determination of 25 ACKR1 alleles at 5615 nucleotides each in the 106 chromosomes analyzed (Table 2).

Table 2. ACKR1 Allele Distribution in the FDA Reference Samples

GenBank number* | Alignment of variant positions† | Observations, n | Allele frequency, % (mean‡) | Allele frequency, % (95% CI§) | Ethnicity
NG_011626.3 | TGCCTGCCCCAGACCCTTGTCTAGCGGCCACGTTGT | NA | NA | NA | NA
MN813501 | .................................... | 21 | 19.8 | 12.1–29.9 | Caucasian, Hispanic
MN813502 | ..................................A. | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813503 | ....................T-.............C | 4 | 3.8 | 1.3–9.1 | Caucasian
MN813504 | ...T................T-.A......T....C | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813505 | .....................-......T......C | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813506 | .....................-......T....C.C | 8 | 7.5 | 3.1–14.1 | Caucasian, African American
MN813507 | .....................-....A.T....C.C | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813508 | .....................-.....TT....C.C | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813509 | .....................-.A......T...TC | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813510 | .................C...-.A......T...TC | 4 | 3.8 | 1.3–9.1 | African American
MN813511 | --...............C...-.A......T...TC | 14 | 13.2 | 7.6–21.6 | Caucasian, African American
MN813512 | --..G............C...-.A......T...TC | 1 | 0.9 | 0.05–5.02 | African American
MN813513 | --.....T.........C...-.A......T...TC | 3 | 2.8 | 0.8–7.6 | African American
MN813514 | ...................C.-GA......T...TC | 4 | 3.8 | 1.3–9.1 | Caucasian, Native American
MN813515 | ...........A.......C.-GA......T...TC | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813516 | ..................TC.-GA......T...TC | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813517 | ....................T-.A.A...GT--.TC | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813518 | ....................T-.A.....GT....C | 4 | 3.8 | 1.3–9.1 | Caucasian
MN813519 | ....................T-.A.A...GT....C | 13 | 12.3 | 6.3–20.2 | Caucasian
MN813520 | ....................T-.ATA...GT....C | 2 | 1.9 | 0.3–6.3 | Caucasian
MN813521 | .....AT.T-...T......T-.A...........C | 3 | 2.8 | 0.8–7.6 | Caucasian, Hispanic, African American
MN813522 | .....AT.T-...T......T-.A............ | 2 | 1.9 | 0.3–6.3 | Caucasian
MN813523 | .....AT.T-...T...................... | 1 | 0.9 | 0.05–5.02 | Caucasian
MN813524 | ..T...T.T-G..TT....C.-.A...........C | 8 | 7.5 | 3.1–14.1 | Caucasian
MN813525 | ............G..--C.C.-.A...........C | 5 | 4.7 | 1.8–10.5 | African American
Total | | 106 | 100 | NA |

FDA, Food and Drug Administration; NA, not applicable.
* Publicly available nucleotide sequences, as reported in GenBank (https://www.ncbi.nlm.nih.gov/genbank, last accessed June 29, 2020).
† The nucleotides at the 30 SNV and three dinucleotide repeat (rs5778112, rs71782098, and rs72387739; underlined) positions are shown in 5′- to 3′-orientation (Table 1). The remaining 5579 nucleotide positions had no variation relative to the reference sequence NG_011626.3. All nucleotide variants in the ACKR1 reference are shown. For all other alleles, only nucleotides that differed from the reference are shown. The nucleotide in bold is the GATA box mutation (c.-67T>C). The dot symbols underneath represent nucleotide positions conserved in all alleles.
‡ Number of observed alleles × 100/total number of alleles.
§ 95% CI, Poisson distribution, two sided.

By using ACKR1 genotype information (Supplemental Table S2) as input data, the MaCH software predicted 24 ACKR1 alleles (Supplemental Table S3). The experimentally confirmed ACKR1 alleles were compared with the results simulated by computational phasing. Of the 25 experimentally confirmed alleles, only 19 alleles (76.0%) were predicted correctly by MaCH, whereas 6 alleles (MN813502, MN813504, MN813508, MN813517, MN813520, and MN813523) were missed (Supplemental Table S3). Another five alleles (MaCH-01 to MaCH-05) (Supplemental Table S3), not present in any of the 53 samples, were predicted incorrectly by MaCH as single occurrences (Supplemental Tables S3 and S4).
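As an illustration of why reads spanning the full amplicon resolve phase directly, where statistical inference can miss rare alleles, the following minimal Python sketch tallies the base combinations seen on individual reads at the heterozygous sites of one sample and reports the two best-supported haplotypes. It is not the procedure used in the study, where phase was read off visually in Integrative Genomics Viewer; the function name `phase_from_reads`, the dictionary-based read representation, and the toy read counts are all assumptions made for this example.

```python
from collections import Counter


def phase_from_reads(reads, het_sites):
    """Return the two best-supported haplotypes across het_sites.

    reads: iterable of dicts mapping a site label to the base observed on
    that read. Only reads that cover every heterozygous site are counted,
    because partial reads carry no cis/trans information for the full set.
    """
    counts = Counter()
    for read in reads:
        if all(site in read for site in het_sites):
            counts[tuple(read[site] for site in het_sites)] += 1
    top_two = [haplotype for haplotype, _ in counts.most_common(2)]
    return top_two, counts


# Hypothetical toy reads (not data from the study) for two heterozygous sites.
sites = ["c.-67T>C", "c.125G>A"]
reads = (
    [{"c.-67T>C": "T", "c.125G>A": "G"}] * 40    # first haplotype
    + [{"c.-67T>C": "C", "c.125G>A": "A"}] * 38  # second haplotype
    + [{"c.-67T>C": "T", "c.125G>A": "A"}] * 3   # minority, e.g. read errors
    + [{"c.-67T>C": "C"}] * 5                    # partial reads, ignored
)

haplotypes, support = phase_from_reads(reads, sites)
print(haplotypes)
```

In this toy sample the two variants are called in cis, because the combinations T/G and C/A dominate the spanning reads; a minority combination with little read support is treated as noise, which is one simple way to tolerate the higher per-base error rate of long reads.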
Relying on computational allele calling alone would result in 3.8% incorrect allele calls, potentially affecting 1 of 27 individuals (Table 3).

Table 3. ACKR1 Alleles: Computer Prediction by MaCH Compared with Physical Sequencing

Two alleles per individual | Predicted alleles, n | Individuals, n | Rate, %
Both correct | 102 | 51 | 96.2
Both incorrect | 4 | 2 | 3.8
Total | 106 | 53 | 100

With the exception of rs863005 (c.+1083T>C), the other 32 SNV positions were sequenced in our previous study among 60 autochthonous Ethiopian individuals (Table 1) [17]. However, all ACKR1 alleles detected in the present study were novel and not observed among the 18 Ethiopian alleles (Supplemental Table S5). As expected, the present study, with a mixed population, had a lower frequency (20%) of the Duffy-null allele (FY∗02N.01) than the Ethiopian study population (89%) (Supplemental Table S5).

When comparing the number of variant sites in the ACKR1 alleles between the Ethiopian and FDA samples, the FDA allele MN813501 was found to have no difference from the ACKR1 reference allele (NG_011626.3) (Figure 2 and Supplemental Table S6). The FDA allele MN813501 was found to be closest to the Ethiopian allele MG932635, with only two differences at positions c.-243T>del (rs17838198) and c.125G>A (rs12075). The density of SNVs in the coding compared with the noncoding regions did not differ for the 25 alleles present in the FDA samples (P = 0.670; χ2 test, 2-sided), the 18 Ethiopian alleles [17] (P = 0.642), and the 43 alleles combined (P = 0.671).
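The Hardy–Weinberg check reported in the HWE (P) column of Table 1 can be illustrated with a short Python sketch. It assumes a biallelic site and a Pearson goodness-of-fit statistic with 1 df, and it uses the identity that the upper tail of a χ2 distribution with 1 df equals erfc(√(χ2/2)). The function name `hwe_chi_square` is hypothetical, and the software the authors used for this test is not stated in the text.

```python
import math


def hwe_chi_square(n_ref_hom, n_het, n_var_hom):
    """Goodness-of-fit test for Hardy-Weinberg equilibrium at a biallelic site.

    Returns (variant allele frequency, chi-square statistic, P value, 1 df).
    Assumes both alleles are observed; a monomorphic site would divide by zero.
    """
    n = n_ref_hom + n_het + n_var_hom
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference allele frequency
    q = 1 - p                              # variant allele frequency (VAF)
    observed = (n_ref_hom, n_het, n_var_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Genotype counts for c.125G>A (rs12075) as listed in Table 1.
vaf, chi2, p_value = hwe_chi_square(10, 18, 25)
print(round(vaf, 3), round(chi2, 2), round(p_value, 3))
```

With the c.125G>A genotype counts from Table 1 (10, 18, and 25), the sketch returns values close to the tabulated VAF of 0.642 and P of 0.057.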