Initiation of DNA replication at CpG islands in mammalian chromosomes
1998; Springer Nature; Volume: 17; Issue: 8; Language: English
10.1093/emboj/17.8.2426
ISSN1460-2075
Topic(s): Epigenetics and DNA Methylation
Article. 15 April 1998. Free access.
Sonia Delgado1, María Gómez1, Adrian Bird2 and Francisco Antequera1
1 Instituto de Microbiología Bioquímica, CSIC/Universidad de Salamanca, Edificio Departamental, Campus Miguel de Unamuno, 37007 Salamanca, Spain
2 Institute of Cell and Molecular Biology, Darwin Building, King's Buildings, Edinburgh, EH9 3JR UK
‡ S.Delgado and M.Gómez contributed equally to this work
* Corresponding author. E-mail: [email protected]
The EMBO Journal (1998) 17: 2426-2435 https://doi.org/10.1093/emboj/17.8.2426
Abstract
CpG islands are G+C-rich regions ∼1 kb long that are free of methylation and contain the promoters of many mammalian genes. Analysis of in vivo replication intermediates at three hamster genes and one human gene showed that the CpG island regions, but not their flanks, were present in very short nascent strands, suggesting that they are replication origins (ORIs). CpG island-like fragments were enriched in a population of short nascent strands from human erythroleukaemic cells, suggesting that islands constitute a significant fraction of endogenous ORIs. Correspondingly, bulk CpG islands were found to replicate coordinately early in S phase. Our results imply that CpG islands are initiation sites for both transcription and DNA replication, and may represent genomic footprints of replication initiation.
Introduction
The biochemistry of DNA replication is relatively well understood in several viruses, organelles and prokaryotes whose genomes have unique and well-defined origins of replication (Kornberg and Baker, 1992; Kelman and O'Donell, 1994). The larger genomes of eukaryotes require large numbers of ORIs that must be coordinately activated once per cell cycle at the beginning of S phase. The best characterised eukaryotic ORIs are the Autonomously Replicating Sequences (ARS) of the yeast Saccharomyces cerevisiae which were identified by their ability to confer autonomous replication on plasmids.
A comparable experimental strategy has been only partially successful in mammalian systems, because of the difficulty in isolating short DNA fragments capable of maintaining stable replication of plasmids (Heinzel et al., 1991). A different approach based upon the isolation of DNA fragments that replicate early during S phase, led to the identification of the first mammalian ORI downstream of the dihydrofolate reductase gene (DHFR) in Chinese hamster ovary (CHO) cells (Heintz and Hamlin, 1982). This ORI has been studied in detail over the years using a wide variety of methods (Vassilev et al., 1990; Hamlin and Dijkwel, 1995; Pelizon et al., 1996; Wang et al., 1998). The link between DNA replication and gene transcription was first recognised in viral systems where mutations in binding sites for transcription factors affected both transcription and replication (DeVilliers et al., 1984). Since then, many transcriptional activators have been reported to activate DNA replication (reviewed in DePamphilis, 1993a; van der Vliet, 1996). In S.cerevisiae, the B3 auxiliary element in the ARS1 origin binds the transcription factor ABF1 and mutations that affect binding reduce the efficiency of replication initiation at ARS1, both in plasmids and in the chromosome. The defect can be corrected by substituting the B3 element with binding sites for unrelated transcription factors (Marahrens and Stillman, 1992). These studies have revealed that ORIs contain a core element at which origin recognition proteins bind, plus auxiliary sequences that are necessary for efficient initiation. This modular structure is similar to that of eukaryotic promoters, where the preinitiation complex binds to an essential core element and a number of partially redundant elements are recognised by transcriptional activators. 
Mammalian ORIs have not been characterised in such detail and the link between gene transcription and DNA replication relies mainly on the correlation between early replication and transcriptional activity for a large number of genes (Goldman et al., 1984; Holmquist, 1989). This correlation has been further supported by the mapping of some ORIs close to gene promoters. For example, ORIs have been detected at promoters of the human ppv1 gene (Giacca et al., 1994), the Hsp70 gene (Taira et al., 1994), the c-myc gene (Vassilev and Johnson, 1990), the β-globin gene (Kitsberg et al., 1993) and the rat aldolase B gene (Zhao et al., 1994). CpG islands are associated with the promoters of ∼50% of all mammalian genes (Larsen et al., 1992; Antequera and Bird, 1993) and often contain multiple binding sites for transcription factors (Pfeifer et al., 1990; Somma et al., 1991) and an open chromatin conformation (Antequera et al., 1989; Tazi and Bird, 1990). They are regions of ∼1 kb that differ from the rest of the genome by being G+C-rich (65%) and free of methylation (Bird et al., 1985; Bird, 1986). By contrast, bulk genomic DNA is comparatively G+C-poor (40% on average) and heavily methylated at CpG. We have asked in this study whether CpG islands could be associated with replication origins in mammalian chromosomes. Predictions of this hypothesis are that CpG islands will be contained in nascent DNA strands smaller than their flanking regions and that they will replicate synchronously at the beginning of S phase. Our results confirm these predictions and indicate that CpG island regions are used to initiate transcription and DNA replication. These results suggest a possible origin of the CpG islands as genomic footprints left on the chromosome by the replication initiation event. 
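The first prediction rests on simple geometry, which the following minimal sketch makes explicit. It assumes a point origin and two forks that move away from it, and the `fork_symmetry` parameter (a hypothetical name, not from the paper) expresses how far the opposite fork has travelled relative to the fork heading towards the probed site. With equal fork rates, the shortest nascent strand containing a site at distance d from the origin is about 2d. The example distances are the APRT flank positions examined in Results.

```python
def min_nascent_strand_kb(distance_from_origin_kb, fork_symmetry=1.0):
    """Shortest nascent strand (kb) expected to contain a site.

    Assumption (illustrative, not from the paper): the origin is a point and
    the fork moving away from the site has covered `fork_symmetry` times the
    distance covered by the fork moving towards it (1.0 = equal rates).
    """
    if distance_from_origin_kb < 0 or fork_symmetry < 0:
        raise ValueError("distance and symmetry must be non-negative")
    # The strand must reach the site on one side, plus whatever the other
    # fork has synthesised on the opposite side in the same time.
    return distance_from_origin_kb * (1.0 + fork_symmetry)


# Flank positions relative to the APRT island, as reported in Results.
for label, distance in [("upstream", 4.5), ("downstream", 1.8)]:
    print(label, min_nascent_strand_kb(distance), "kb")
```

Under these assumptions the expected minimum sizes are 9.0 kb and 3.6 kb, which fall within the ∼8–9 kb and 3–4 kb ranges reported for the APRT flanks. A site inside the origin region itself has a minimum size near zero, which is why island sequences are expected in the shortest strands.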
Results
Replication order of CpG islands relative to their flanking regions
Figure 1A shows the genomic maps of the hamster Adenine Phosphoribosyl Transferase gene (APRT), the Growth Arrest and DNA Damage inducible gene (GADD) and the Thymidine Kinase gene (TK). The positions of CpG islands (brackets) and the location of sequences that were amplified by PCR are indicated. To test the replication order of CpG islands relative to their adjacent regions, nascent DNA strands derived from these CpG islands and their flanking sequences in CHO cells were examined. We followed the method of Vassilev et al. (1990) as modified by Virta-Pearlman et al. (1993), which determines the size of nascent strands emanating from a replication origin in an asynchronous population of cells. The closer a particular sequence is to an ORI, the smaller the size of the nascent strands in which it will be contained. Nascent DNA strands from unsynchronised CHO cells were labelled by a 30 min pulse with bromodeoxyuridine (BrdU). Density-labelled single strands were then isolated in alkaline cesium gradients and size-fractionated by agarose gel electrophoresis. DNA was recovered from gel slices and used as input for PCR amplification with the appropriate pairs of primers. PCR products were electrophoresed on gels, blotted onto filters and hybridised to internal primers. Appropriate controls were set up to ascertain that reactions took place under non-saturating conditions and that all primer pairs amplify with comparable efficiencies from total genomic DNA (data not shown).
Figure 1. Replication of CpG islands relative to their flanking regions. (A) Genomic maps of the APRT, GADD and TK hamster genes. Black rectangles represent exons and arrows indicate the initiation site and direction of transcription. Brackets covering the 5′ end of the genes show the position of the CpG islands. White bars represent the fragments amplified by PCR. Numbers indicate their distance in kb from the transcription initiation point. Amplified regions within the islands are labelled CpG. The 1 kb scale bar is different in each case. (B) BrdU-labelled nascent strands were isolated in alkaline cesium gradients and size-fractionated by agarose gel electrophoresis before being used as input DNA to PCR-amplify the fragments indicated above. Amplified products were blotted onto filters and hybridised to internal primers. The size of nascent strands is indicated. Top, middle and bottom panels show results for the APRT, GADD and TK genes respectively.
The results showed that in all three cases, sequences within the CpG islands were amplified from nascent strands of small size that contained little or no sequences derived from the island flanks (Figure 1B). In fact, the island primers detected nascent strands smaller than 1 kb, which locates the origin of replication within the island region. The smallest nascent strands that contained sequences 4.5 kb upstream or 1.8 kb downstream of the APRT CpG island were ∼8–9 and 3–4 kb, respectively, which is consistent with replication proceeding bidirectionally from the CpG island. In the case of the GADD gene, there were no significant differences between the sizes of the smallest nascent strands that contained sequences 1.5 kb upstream and 2.6 kb downstream of the island, but neither flank could be amplified from the small fragments that contained the CpG island. In the case of the TK gene, upstream sequences were not tested but the island region was also represented in nascent strands smaller than 1 kb, while those containing the 1.5 kb downstream region were at least 3.0 kb long.
Measurement of nascent strands abundance by competitive PCR
The above results were confirmed by an independent approach using alkaline sucrose gradients to fractionate DNA.
Two parallel gradients were run containing identical amounts of total CHO DNA derived from asynchronous exponentially growing cells and from non-cycling cells arrested at the G1/S phase with mimosine (Mosca et al., 1992). This inhibitor effectively prevents entry into S phase (Figure 6A). Six fractions were collected from the gradients and an equal volume from each fraction was electrophoresed in an agarose gel, blotted and hybridised to total DNA to monitor fractionation. Under these conditions the great majority of DNA pelleted in fraction 6 and therefore only 1/100 of the volume of this fraction relative to fractions 1–5 was used both in the gel and as input for PCR (see below). Figure 2A shows that fractions 1–5 of the gradient derived from cycling cells contained DNA strands that were undetectable in the fractions of mimosine-arrested cells indicating that they are nascent strands derived from DNA replication intermediates. When fractions from the gradient were assayed with the primer pairs within and flanking the APRT CpG island (Figure 1), fraction 1 was found to contain the island region, but not sequences upstream and downstream of the island. The downstream primer pair, which is closer to the island, amplified fractions that were somewhat smaller than those amplified by the upstream primers. This agrees with the data in Figure 1, and again suggests that the replication origin is located at the CpG island. No signal was detected in non-cycling cells indicating that PCR products were amplified from replication intermediates.
Figure 2. Measurement of nascent strand abundance by competitive PCR. (A) Alkaline sucrose gradient fractionation of total CHO DNA from asynchronous exponentially growing cells and non-cycling cells arrested at the G1/S boundary. Six fractions were collected and an equal volume from each was electrophoresed in an alkaline agarose gel, except for fraction 6 which contained parental DNA, where only 1/100 of the volume relative to fractions 1–5 was loaded and used as input for PCR. Electrophoresed DNA was blotted and visualised by hybridisation using total CHO DNA as a probe. Panels below show amplified PCR products from each fraction hybridised to internal primers as in Figure 1. (B) Quantitative measurement by competitive PCR of the CpG island, upstream and downstream flanking regions in gradient fraction 3. The ratio between the amount of competitor and genomic target is linearly related to the number of competitor molecules. The correlation coefficient of the fitted straight line was 0.991 for the upstream, 0.904 for the CpG island and 0.992 for the downstream regions, respectively. Arrows point to the number of molecules of competitor at which the ratio competitor/template is 1.0. (C) Histograms show the number of molecules for the CpG island (black), upstream (white) and downstream (striped) regions in gradient fractions 1, 2 and 3. The number of molecules containing the upstream and downstream regions was below the level of detection in fractions 1 and 2.
The abundance of the nascent strands containing the APRT CpG island and flanks was accurately measured by competitive PCR as reported by Diviacco et al. (1992) (Figure 2B and C). Competitors were constructed as described in Materials and methods. Arrows in Figure 2B point to the number of molecules of competitor in gradient fraction 3 at which the competitor/template ratio is 1.0. Figure 2C shows the number of molecules for each of the three APRT regions in gradient fractions 1, 2 and 3. The differences in the relative number of molecules agree with the intensities of the PCR bands in Figure 2A.
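The quantitation described for Figure 2B can be sketched as a straight-line fit followed by solving for the point where the competitor/template ratio equals 1.0. At that point the number of competitor molecules added equals the number of template molecules in the sample. The sketch below is a minimal illustration, not the authors' analysis: the function names are hypothetical and the titration values are invented solely to show the arithmetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


def template_molecules(competitor_molecules, ratios):
    """Competitor input at which the fitted competitor/template ratio is 1.0."""
    slope, intercept, _ = fit_line(competitor_molecules, ratios)
    return (1.0 - intercept) / slope


# Hypothetical titration: competitor molecules added vs. measured ratio.
competitor = [100, 200, 400, 800]
ratio = [0.25, 0.5, 1.0, 2.0]
print(template_molecules(competitor, ratio))
```

With these invented values the fitted line crosses a ratio of 1.0 at 400 competitor molecules, so the sample would be estimated to hold about 400 template molecules. Real titrations are noisier, which is why the figure legend reports a correlation coefficient for each fitted line.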
Short DNA nascent strands are enriched in CpG islands
The finding that three CpG islands selected at random were associated with ORIs raised the possibility that CpG island regions could represent a significant fraction of all chromosomal ORIs. To test this we isolated a population of short DNA nascent strands that should be enriched in ORIs from human K562 erythroleukaemia cells and asked whether fragments with the characteristics of CpG islands were also enriched. Asynchronous exponentially growing cells were pulsed with BrdU for 30 min and total genomic DNA was fractionated in an alkaline sucrose gradient (Figure 3A). Fraction 1 contained fragments up to 1.5 kb long and DNA nascent strands were further purified from fraction 1 by two rounds of immunoprecipitation with a monoclonal anti-BrdU antibody (Vassilev et al., 1990). To monitor for enrichment in ORIs relative to bulk DNA, we amplified by PCR a fragment at the 5′ end of the β-globin gene (Figure 3C and D). An active ORI has been found to overlap the β-globin promoter in this cell line (Kitsberg et al., 1993) and, therefore, this region would be expected to be present in fraction 1. As a control, we used primers distal to the ORI corresponding to a region 4 kb upstream of the gene (Figure 3C). These primers were unable to amplify from the same fraction, thereby confirming the enrichment for the β-globin ORI. Both pairs of primers amplified the correct fragments with equal efficiency from total genomic DNA (Figure 3B) and from fractions 2–6 of the gradient after immunoprecipitation (data not shown). We also tested whether the CpG island spanning the promoter of the TrkA proto-oncogene (Figure 3C and D), which is actively transcribed in this cell line (Martín-Zanca et al., 1986), was contained among the immunoprecipitated strands from fraction 1.
As shown in Figure 3B, primers from the island region amplified the expected fragment, but primers corresponding to a downstream region across exons 9 and 10 did not. Thus, the TrkA CpG island is contained within DNA nascent strands […] 0.6 (compared with an average of 0.25 for bulk DNA) and therefore qualify as candidate CpG islands (Bird, 1986; Gardiner-Garden and Frommer, 1987). Another property of CpG islands is their lack of methylation. To assess the methylation status of the clones we hybridised them to filters of K562 genomic DNA digested with MspI and HpaII. Both enzymes recognise the same sequence (CCGG), but HpaII is inhibited when the internal cytosine is methylated whereas MspI is insensitive to methylation. Figure 4A shows examples of clones derived from methylated or unmethylated genomic sequences (clones 64 and 77, respectively). The methylation data (Table I) confirmed the CpG island-like character of the six G+C-rich clones with an observed/expected CpG ratio >0.6. In addition, database searching revealed that clone 74A was derived from the CpG island associated with the human Insulin-like Growth Factor II gene (IGF II). Clone 79 was 100% homologous to the 5′ end of a basic type II keratin mRNA, suggesting that it is derived from its promoter region. Several other members of the keratin gene family have CpG islands at their promoters (Larsen et al., 1992), and it is therefore likely that this non-methylated, CpG-rich sequence is also included in a CpG island. Altogether, six out of 35 clones (17%) isolated from a genomic fraction enriched in DNA nascent strands […] 10-fold in the nascent strand library. It is likely that the observed proportion of clones that are derived from CpG islands represents a minimum estimate (see also below), as any contamination from bulk genomic DNA will usually contribute to the non-island category (for example, clone 8).
Figure 4. Methylation analysis of short nascent DNA strands. (A) Genomic DNA of K562 cells was digested with PstI (P), PstI + HpaII (P/H), PstI + MspI (P/M), HpaII (H) and MspI (M). Restriction fragments were electrophoresed, blotted and hybridised with clones 77 and 64 and with LINE-1 clones 30, 31 and 38 as probes. Arrowheads point to bands specifically detected with each LINE-1 clone. m, size markers in kb. (B) Genomic map of a full-length human LINE-1 element. The two open reading frames (ORF1 and ORF2) and the 5′ untranslated region containing the promoter (5′ UT) are indicated. Diagrams below represent the three LINE elements from which clones 30, 31 and 38 (white rectangles) are derived. Vertical lines represent CpG dinucleotides and PstI sites. Brackets indicate the approximate position of the CpG island.
Short nascent strands derived from the promoter of LINE elements
Inserts of clones 30, 31 and 38 were not obviously CpG island-like, but surprisingly gave similar patterns of multiple bands on Southern blots (Figure 4A; see below). DNA sequence comparisons showed that they were 85–90% identical to one another. Database analysis further showed that clones 30, 31 and 38 were derived from equivalent positions in LINE-1 (Long Interspersed Elements) repeats. Initially, we did not consider that they were derived from CpG islands because their observed/expected CpG frequencies were well below 0.6 (Table I), but when aligned relative to a full-length LINE element their close proximity to the promoter CpG island became evident (Figure 4B). Most restriction fragments generated by MspI and HpaII were identical, confirming that they were derived from non-methylated DNA. In addition to common bands, each probe hybridised to a specific fragment (arrowheads in Figure 4A) which probably corresponds to the genomic source for that clone. The clone-specific bands were also non-methylated.
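The sequence criteria used to classify clones, G+C content and the observed/expected CpG ratio, can be computed as in the sketch below. It uses the Gardiner-Garden and Frommer (1987) definition, observed/expected = (number of CpG × sequence length) / (number of C × number of G). The function names are hypothetical, the 0.6 ratio cutoff is the one quoted in the text, and the default G+C cutoff of 0.5 is an assumption taken from that definition, not a value stated in this section.

```python
def cpg_stats(sequence):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = sequence.upper()
    length = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c_count + g_count) / length if length else 0.0
    if c_count == 0 or g_count == 0:
        return gc_fraction, 0.0
    obs_exp = (cpg_count * length) / (c_count * g_count)
    return gc_fraction, obs_exp


def is_island_like(sequence, min_gc=0.5, min_obs_exp=0.6):
    """Apply the two sequence cutoffs; min_gc is an assumed default."""
    gc_fraction, obs_exp = cpg_stats(sequence)
    return gc_fraction > min_gc and obs_exp > min_obs_exp


# Toy sequences, far shorter than any real clone, only to show the arithmetic.
print(cpg_stats("CCGGAATT"))       # one CpG in a sequence with 50% G+C
print(is_island_like("GGGGCCCCAA"))  # G+C-rich but with no CpG at all
```

Sequence composition alone does not establish island status, which is why the clones were also tested for lack of methylation. The LINE-1 clones illustrate the converse case: their own ratios fell below the cutoff even though they lie next to a promoter CpG island.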
LINE-1 elements are often heavily methylated and functionally inactive due to truncations and nonsense mutations (Hutchinson et al., 1989). Some full-length elements, however, have been reported to have a non-methylated CpG island at their 5′ end (Nur et al., 1988). Our data indicate that clones 30, 31 and 38 are close to origins of DNA replication and appear to be derived from the non-methylated CpG islands of intact elements which are potentially active (Sassaman et al., 1997). Including the three LINE sequences, it follows that nine out of 35 clones (26%) in the nascent strand library are within or immediately adjacent to CpG islands.
Replication of CpG islands during S phase
If many CpG islands are associated with ORIs, they would be expected to replicate coordinately in early S phase. To test this prediction we used CHO cells because they can be easily and tightly synchronised. CHO cells were arrested in G1 by isoleucine deprivation and then released by addition of the amino acid (Johnson et al., 1993). Samples were labelled at 40 min intervals during the first 8.0 h upon release and synchrony was monitored by 3H-thymidine incorporation (Figure 5A, black circles). To quantitate incorporation at CpG islands relative to total DNA, genomic DNA from each sample was digested to completion with HpaII and the resulting restriction fragments were fractionated in neutral sucrose gradients. A representative gradient profile is shown in Figure 5B. As HpaII is inhibited by methylation of CpG, it generates a population of very small fragments derived mainly from CpG islands (Figure 5B, fractions 1 and 2) and another population of very large fragments comprising the highly methylated remainder of the genome (Figure 5B, fractions 5–7). We have previously shown that fragments in fraction 1 are derived almost exclusively from CpG islands (Antequera and Bird, 1993).
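The reason HpaII digestion separates islands from bulk DNA can be illustrated with a small in-silico digest. This is a deliberately simplified sketch: the function name and the toy sequence are hypothetical, both enzymes are modelled as cutting CCGG after the first C, HpaII is blocked when the internal cytosine is listed as methylated, and MspI is treated as insensitive to that methylation.

```python
def digest_ccgg(sequence, methylated_positions=frozenset(), enzyme="HpaII"):
    """Return fragment lengths after cutting at CCGG sites (C^CGG).

    methylated_positions holds 0-based indices of methylated cytosines.
    Simplifying assumption: only methylation of the internal C of CCGG
    matters, and it blocks HpaII but not MspI.
    """
    seq = sequence.upper()
    cut_points = []
    site = seq.find("CCGG")
    while site != -1:
        internal_c = site + 1
        blocked = enzyme == "HpaII" and internal_c in methylated_positions
        if not blocked:
            cut_points.append(site + 1)  # cut between the two cytosines
        site = seq.find("CCGG", site + 1)
    edges = [0] + cut_points + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]


toy = "AAAACCGGTTTTCCGGAAAA"  # two CCGG sites, internal C at indices 5 and 13
print(digest_ccgg(toy, enzyme="MspI", methylated_positions={5, 13}))
print(digest_ccgg(toy, enzyme="HpaII", methylated_positions={5, 13}))
print(digest_ccgg(toy, enzyme="HpaII"))
```

In this toy case the fully methylated sequence stays intact with HpaII but is cut into three fragments by MspI, while the unmethylated sequence is cut by both. Scaled up to a genome, densely spaced unmethylated sites in islands yield very small HpaII fragments, and methylated bulk DNA yields very large ones.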
The proportion of total genomic radioactivity that was in fraction 1 allowed measurement of the proportion of all CpG islands that were replicated at each time interval. The result indicates that CpG island replication is essentially complete within the first third of the S phase, reaching a maximum at ∼2.0 h into S phase (Figure 5A, white circles).
Figure 5. Replication of bulk DNA and CpG islands during S phase in CHO cells. (A) Cells were synchronised by isoleucine starvation. 3H-thymidine incorporation (in thousands of c.p.m.) in total DNA (black circles) and in CpG islands (white circles) was measured at the indicated times. Incorporation values in the CpG island fraction have been scaled up by a factor of 50 to fit within the same diagram. As isoleucine starvation arrests cells in G1, no 3H-thymidine incorporation was detected during the first 3 h upon addition of the amino acid. (B) Representative neutral sucrose gradient used to estimate 3H-thymidine incorporation at CpG islands. The figure shows one of the time samples and illustrates how correct fractionation was monitored by end-labelling a small amount of DNA in each of the seven fractions with [α-32P]dCTP and visualisation by autoradiography after agarose gel electrophoresis. m, size markers (kb).
Figure 6. Replication of individual CpG islands during S phase. (A) CHO cells were synchronised with mimosine and progress through S phase was followed by 3H-thymidine incorporation (black circles). Synchronous entry into S phase was monitored by BrdU incorporation and immunofluorescence of nuclei with anti-BrdU antibodies (white circles). (B) Cells were pulsed with BrdU for 30 min prior to the indicated times and immunoprecipitated nascent strands were used as input for PCR. The replication profile of the APRT (circles), EF2 (triangles) CpG island regions and a 3′ region of a LINE repetitive sequence (squares) corresponds to the measurement of radioactivity in the autoradiographs shown below. (C) APRT, EF2 and LINE PCR-amplified fragments were hybridised with internal primers as a control for specificity prior to quantitation.
This result was confirmed by monitoring the replication profile of two individual hamster CpG islands, one associated with the APRT gene and another with the Elongation Factor 2 (EF2) gene. For this experiment CHO cells were synchronised with mimosine which arrests close to the G1/S phase boundary (Figure 6A, black circles) (Mosca et al., 1992). Cells were labelled for 30 min with BrdU at various times after release from the mimosine block and nascent DNA strands were isolated by two rounds of immunoprecipitation (Vassilev et al., 1990). The resulting DNA was used as input for PCR amplification of regions of the APRT and EF2 CpG islands (Figure 6C). Quantitative results in Figure 6B show that the two islands were amplified preferentially from samples that were pulse-labelled early after release of the mimosine block, and gave a peak of replication at 2.5 h after the onset of S phase. Replicati