Editorial; Open access; Peer reviewed

From 1949 to 2018: R. A. Fisher's Theory of Junctions

2018; Wiley; Volume: 135; Issue: 5; Language: English

10.1111/jbg.12351

ISSN

1439-0388

Authors

E. A. Thompson

Topic(s)

Genomics and Phylogenetic Studies

Abstract

Although this year we celebrate the centenary of Fisher's monumental foundation of quantitative genetics in 1918, I here celebrate a much later and lesser-known contribution: his Theory of Junctions. This theory was first introduced in his monograph, Theory of Inbreeding (1949, Oliver & Boyd, Edinburgh), and was developed further by Bennett (1953, Genetica 26: 392–406) and by Fisher himself (1954, Heredity 8: 187–197 and 1959, Heredity 13: 179–186). Although Fisher described this topic as important, it is only with current genomic data that it has become more broadly appreciated.

Fisher introduced his theory in the context of the production, under close inbreeding, of recombinant inbred lines of mice, whether by sib mating or by parent–offspring mating. The development is an algebraic tour de force, but it is Fisher's intuition and insight that are more relevant today. At each point in the genome, the DNA is copied from some founder chromosome; junctions are the points at which the founder chromosome of origin changes. Thus, junctions are formed by recombination events in meiosis, but cannot be formed in regions where the parent is already identical by descent (IBD), having two copies of the same founder DNA in that region. Thus in an inbred line, in which the entire genome is eventually IBD, only a small number of junctions are formed.

Fisher's insight was that junctions, once formed, behave like any novel variant: they are subject to drift, extinction or fixation under basic single-locus Mendelian computations. Thus, for example, under sib mating, 24 junctions are expected to be formed per Morgan of genome, of which six will become fixed. The chromosomes of the inbred line then consist of segments of the four founder chromosomes of average length about 17 cM, that is, 100 cM divided by six fixed junctions (Fisher, 1954, ibid.).

In the process towards complete IBD, Fisher's interest was in the numbers and lengths of non-IBD tracts. He therefore distinguished internal and external junctions.
External junctions bound segments of genome that are IBD in the lineage only on one side of the junction. Internal junctions are either within IBD tracts or within non-IBD tracts. For example, if a junction j is between founder chromosomes a and b, it is internal if both a and b still exist in the population on both sides of the junction. If a is then lost on one side, the junction becomes external, bounding a non-IBD region where both a and b survive, and an IBD region where there is only b. External junctions and those internal junctions within non-IBD regions are transient, as eventually all the chromosomes in the line become IBD over their entire length. The remaining junctions, internal to IBD regions, are then fixed and define the segments of the inbred line that are of different ancestral origins. By considering the total number of junctions, and the formation and loss of internal junctions, Fisher was able to derive the expected number of non-IBD tracts. Together with standard formulae for the proportion of genome remaining non-IBD, this provides the expected lengths of tracts remaining non-IBD.

Interestingly, in more recent analyses of the genomes of recombinant inbred lines (e.g., Broman, 2005, Genetics 169: 1133–1146), authors have followed the two-locus analyses of Haldane and Waddington (1931, Genetics 16: 357–374), rather than the continuous-genome model of Fisher. Only more recently have Martin and Hospital (2011, Genetics 189: 645–654) considered the process in terms of Fisher's junctions.

However, in population genetics, use of continuous-genome models, and hence of Fisher's theory of junctions, has been more prevalent, starting with Stam (1980, Genet. Res. Camb. 35: 131–155), who extended Fisher's theory to a random mating population, deriving the expected numbers of internal and hence also external junctions as a function of time and population size. A later direct application of Fisher's theory and the work of Stam (1980, ibid.) was in new approaches to the study of linkage disequilibrium (LD) and the lengths of haplotype blocks in structured populations (Chapman and Thompson 2002, Genetics 162: 449–458). While population structure and subdivision do affect the expected numbers of junctions, the large variance of the number of junctions predominates.

In this context of the apparent block structure of LD, Macleod et al. (2005, Genet. Res. Camb. 85: 69–79) extended the work of Stam (1980, ibid.) and considered the total number of junctions, both internal and external, in a partially inbred population. They also focused on the distribution of junctions along a chromosome. Junctions are formed only in non-IBD parental material, which leads to a clustering of junctions in a highly inbred population. This overdispersion leads to an excess of both long and short segments, which in turn leads to difficulties in detecting junctions from genetic marker data. Even with fully informative markers, a marker density at least an order of magnitude greater than the overall junction density is required to detect 90% of junctions.

Detection of IBD within and among individuals does not, however, require the detection of all junctions. The boundary points of segments of genome shared IBD are precisely Fisher's external junctions. Donnelly (1983, Theor. Pop. Biol. 23: 34–63) derived probabilities of the existence of segments of IBD between many types and degrees of relatives. As demonstrated by Donnelly, in remote relatives such IBD segments are “rare but not short”. It is this that makes it possible to detect such segments using modern DNA (SNP) marker data. The increasing availability of dense genetic marker data led to ideas of genome mismatch scanning for genetic linkage analysis (Nelson et al. 1993, Nature Genetics 4: 11–18). Methods for its application in human pedigrees followed (Thomas et al. 1994, J. Math. Appl. Med. Biol. 11: 1–16).
A particular case of interest is the detection of IBD segments between the two haploid genomes of an individual. In IBD segments, if there are no data errors or mutations, the individual is homozygous at all markers, and such segments can be detected from runs of homozygosity, or ROH (Broman and Weber 1999, Am. J. Hum. Genet. 65: 1493–1500). The ROH approach is often used to estimate inbreeding in livestock populations, where the well-defined population structures, accurate marker genotyping and typical SNP marker density give clear results. In small natural populations, the distinction between LD and IBD becomes blurred, and genotyping failures and small deletions may lead to both false-negative and false-positive estimates of IBD segments. Moreover, on the base-pair scale, per-meiosis mutation and recombination rates are of the same order of magnitude, so that of the order of one mutation is expected in every IBD segment. For dense markers or DNA sequence data, adjustments to ROH methods are needed (Browning and Browning 2011, Am. J. Hum. Genet. 88: 173–182).

Although Fisher's theory is often not cited, Fisher's junctions are now pervasive in the statistical genetics literature. Ancient junctions bound segments of introgressed Denisovan and Neanderthal DNA in our genomes (Browning et al. 2018, Cell 173: 53–61). More recent junctions delimit genome segments of different ethnic origin and are key to both association and admixture mapping (Thornton and Bermejo 2014, Genet. Epid. 38 Suppl 1: S5–S12). At any point in the genome, the most recent common ancestor (MRCA) of a collection of genomes is well-defined. Junctions bound the segment of DNA that is descended from this MRCA to the current genomes without recombination: the length distributions of these segments reveal fine-scale demographic history (Palamara et al. 2012, Am. J. Hum. Genet. 91: 809–822).
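The runs-of-homozygosity idea described above can be illustrated with a deliberately naive sketch. It is not the method of any of the cited papers: the genotype coding (0 and 2 homozygous, 1 heterozygous), the thresholds `min_snps` and `min_length_cm`, and the function name `find_roh` are all assumptions made for illustration, and no allowance is made for genotyping errors, mutations or LD.

```python
def find_roh(genotypes, positions_cm, min_snps=3, min_length_cm=1.0):
    """Return (start_cM, end_cM, n_snps) for each maximal run of homozygous calls.

    genotypes    : per-marker calls, assumed coded 0/2 = homozygous;
                   anything else (heterozygous 1, or a missing-call code)
                   ends the current run.
    positions_cm : marker positions in cM, in the same order as genotypes.
    """
    runs = []
    start = None  # index of the first marker of the run currently open
    # A sentinel heterozygous call closes a run that reaches the last marker.
    for i, g in enumerate(list(genotypes) + [1]):
        if g in (0, 2):
            if start is None:
                start = i
        elif start is not None:
            end = i - 1
            n_snps = end - start + 1
            length = positions_cm[end] - positions_cm[start]
            # Keep only runs that are long enough in both markers and cM.
            if n_snps >= min_snps and length >= min_length_cm:
                runs.append((positions_cm[start], positions_cm[end], n_snps))
            start = None
    return runs


# Hypothetical toy data: 11 markers spaced 0.5 cM apart.
genotypes = [0, 2, 2, 1, 0, 0, 0, 0, 2, 1, 2]
positions_cm = [0.5 * k for k in range(len(genotypes))]
roh = find_roh(genotypes, positions_cm)
```

In this toy example the two heterozygous calls split the chromosome into three homozygous runs, and the final single-marker run is discarded by the thresholds. The end points of each reported run are crude estimates of the junctions bounding a candidate IBD segment.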
Across the genome, relative to a time-point or founding population, segments of IBD among related individuals are key to all inference from genetic data on individuals, whether or not their genealogical relationships are known (Thompson 2013, Genetics 194: 301–326).

After almost 70 years, it is time to recognize and appreciate Fisher's Theory of Junctions. The algebraic tour de force and myriad tables of numerical results have likely deterred rather than encouraged this recognition, but Fisher's two key insights remain. First, junction formation can be analysed through simple models of recombination events occurring in non-IBD parental DNA. Second, single-locus Mendelian genetics governs the expectations of descent, loss or fixation of these junctions. All genomes, whether in experimental, natural, agricultural, livestock or human populations, consist of segments of DNA from different ancestral origins. These segments dictate the patterns of observable genetic variation in extant populations.
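The second insight can be illustrated with a small simulation. This is a sketch under stated assumptions, not Fisher's algebra: a newly formed junction is treated as a neutral variant carried by one of the four gene copies of a sib pair at an autosomal position, and the line is followed under repeated sib mating until the junction is lost or fixed. The function names are hypothetical. Under these assumptions the fixation probability should be close to 1/4, which agrees with the sib-mating figures quoted earlier (six of 24 junctions fixed per Morgan).

```python
import random


def junction_fixes(rng):
    """Follow one new junction in a sib-mated line until it is lost or fixed."""
    # Each parent is a pair of gene copies at the junction's position:
    # 1 = the copy carries the junction, 0 = it does not.
    mother, father = (1, 0), (0, 0)
    while True:
        total = sum(mother) + sum(father)
        if total == 0:
            return False  # junction lost from the line
        if total == 4:
            return True   # junction fixed in the line
        # The next sib pair: each offspring receives one randomly chosen
        # copy from each parent (single-locus Mendelian segregation).
        daughter = (rng.choice(mother), rng.choice(father))
        son = (rng.choice(mother), rng.choice(father))
        mother, father = daughter, son


def estimate_fixation_probability(n_lines=20000, seed=1):
    """Monte Carlo estimate of the probability that a new junction is fixed."""
    rng = random.Random(seed)
    fixed = sum(junction_fixes(rng) for _ in range(n_lines))
    return fixed / n_lines


if __name__ == "__main__":
    p_fix = estimate_fixation_probability()
    print(f"estimated fixation probability: {p_fix:.3f}")
    # Combine with the quoted 24 junctions formed per Morgan under sib mating.
    print(f"implied fixed junctions per Morgan: {24 * p_fix:.1f}")
```

The simulation tracks only a single genome position, which is the point of the insight: once a junction exists, its fate needs no further recombination modelling.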

Reference(s)