Article Open access Peer-reviewed

The quest for suitable reference standards in genome size research

2010; Wiley; Volume: 77A; Issue: 8 Language: English

10.1002/cyto.a.20907

ISSN

1552-4930

Authors

Jan Suda, Ilia J. Leitch

Topic(s)

Chromosomal and Genetic Variations

Abstract

Cytometry Part A, Volume 77A, Issue 8, p. 717-720. Communication to the Editor. Free Access.

J. Suda: Department of Botany, Faculty of Science, Charles University in Prague, Benátská 2, CZ-128 01 Prague, Czech Republic; Institute of Botany, Laboratory of Flow Cytometry, Academy of Sciences of the Czech Republic, Průhonice 1, CZ-252 43, Czech Republic

Ilia J. Leitch (Corresponding Author): Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AB, United Kingdom

First published: 20 May 2010. https://doi.org/10.1002/cyto.a.20907
Even in the molecular genomics era, as the flood gates open to deliver a deluge of DNA sequence data, knowledge of the total amount of DNA that comprises an organism's genome (= genome size) remains vital for many diverse fields of biology. Indeed, it may be considered to be a key biodiversity character with both practical and biological consequences for the organism. Knowledge of genome size is important, for example, in ongoing and planned sequencing programs to assess costs and to know how much DNA to sequence, or for determining how many large insert clones (e.g., bacterial artificial chromosomes [BACs] or fosmids) are needed for constructing genomic libraries. In addition, various genetic fingerprinting tools (e.g., microsatellites and AFLPs) have been shown to be sensitive to genome size so knowledge of this character is important before embarking on such studies. From a biological perspective variation in genome size has been shown to have diverse yet predictable consequences for an organism, influencing, for example, how it will respond to changes in carbon dioxide, rising temperatures, and pollution (1). While people have been estimating genome sizes in plants and animals for over 50 years, the last decade has seen a huge growth in the number of estimates published, not only to provide data for molecular studies but also for large scale comparative analyses seeking to understand the biological and evolutionary significance of the 40,000-fold range of genome sizes encountered across eukaryotes (2). Indeed, data are now available for over 10,000 species, accessible through the internet in three databases (Animal Genome Size Database: www.genomesize.com; Fungal Genome Size Database: www.zbi.ee/fungal-genomesize; and the Plant DNA C-values Database: http://data.kew.org/cvalues).
ISSUES FOR GENOME SIZE STUDIES USING FLOW CYTOMETRY

Nevertheless, as Doležel and Greilhuber point out in their paper "Nuclear genome size: Are we getting closer?" (3), despite this surge in data generation it has become increasingly clear that estimating genome size in absolute units (either picograms (pg) of DNA or number of base pairs (bp), usually Mbp) is still far from trivial, with numerous factors influencing the reliability of the data. The problem has been particularly exacerbated by the replacement of Feulgen microdensitometry with flow cytometry (FCM) as the method of choice for genome size measurement (4), especially in plants where 80% of estimates made since 2000 used FCM. This is because FCM is often perceived to be a quick, easy, and reliable method and hence, in many cases, has been used uncritically, resulting in the generation of dubious data. Fortunately, the last decades have also seen carefully conducted research into the application of FCM for genome size studies and this has revealed several important factors which must be carefully considered, both for sample preparation and in the interpretation of the results, if reliable FCM data are to be generated (4, 5). Some of these issues, which are discussed by Doležel and Greilhuber, have been properly settled in recent years including: (i) widespread agreement on the conversion factor to use when changing between picograms and megabase pairs, i.e., 1 pg = 978 Mbp (6), (ii) refinement of terminology to avoid ambiguity (7), and (iii) the recognition that only intercalating fluorochromes are suitable for genome size studies of which propidium iodide and ethidium bromide are the most widely used (4). However, other issues remain unresolved and the focus of ongoing work. For example, it is now well recognized that cytosolic compounds can act as staining inhibitors, affecting the stoichiometry of fluorochrome binding to DNA, and hence, the accuracy of genome size estimates.
However, solutions are less forthcoming. Over the years, more than 50 buffers or buffer modifications have been developed for plants (8) and animals (9) in an attempt to overcome such problems. Nevertheless, there is clearly still much work ahead to optimize buffer composition so that it fully protects the DNA, avoids the negative effect of staining inhibitors, and keeps the DNA-fluorochrome complex stable for long enough to obtain reliable genome size estimates. Another aspect of which we are still profoundly ignorant is the effect of using different tissue types (e.g., leaves versus seeds in plants and blood cells versus hepatocytes in animals), which may differ in DNA compaction and other properties. Yet, this is an area which clearly needs to be addressed, particularly as FCM researchers are increasingly extending the range of tissues used, often in an attempt to overcome problems arising from cytosolic inhibitors. Such studies are essential if meaningful insights are to be gained from comparative studies between species whose genome sizes have been estimated from different tissues.

THE NEED FOR RELIABLE REFERENCE STANDARDS

Probably one of the most important issues currently facing the field of genome size research is standardization. There is now no doubt that internal standardization (i.e., coprocessing an unknown sample with the reference material) is the sole option for genome size studies (4). However, the lack of agreement between genome size practitioners as to which species to use for calibration reference standards and what genome size to assume for converting relative units into absolute DNA amounts is a serious and controversial issue. Indeed, it is likely that this issue alone has contributed much more to the artefactual genome size variation apparent in the FCM literature than any other factor, with the possible exception of bias introduced because of the presence of interfering secondary metabolites.
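The dependence of an estimate on the value assumed for the standard can be made concrete with a short calculation. The sketch below assumes the usual internal-standard arithmetic (sample 2C-value = standard 2C-value multiplied by the ratio of the G0/G1 peak positions), which the text implies but does not spell out. The function names and the peak positions are hypothetical. The 978 Mbp per pg factor and the 9.09 pg value for Pisum sativum are the ones quoted in this article; the revised value of 8.0 pg is invented purely for illustration.

```python
PG_TO_MBP = 978.0  # conversion factor cited in the text: 1 pg = 978 Mbp


def estimate_2c_pg(sample_peak, standard_peak, standard_2c_pg):
    """Estimate a sample 2C-value (pg) from G0/G1 peak positions.

    Assumes fluorescence is proportional to DNA content, so the ratio of
    peak positions equals the ratio of genome sizes.
    """
    if sample_peak <= 0 or standard_peak <= 0 or standard_2c_pg <= 0:
        raise ValueError("peak positions and standard size must be positive")
    return standard_2c_pg * sample_peak / standard_peak


def pg_to_mbp(pg):
    """Convert a DNA amount in picograms to megabase pairs."""
    return pg * PG_TO_MBP


def rescale_estimate(old_estimate_pg, old_standard_pg, new_standard_pg):
    """Re-express an estimate after the standard's assumed value changes."""
    if old_standard_pg <= 0 or new_standard_pg <= 0:
        raise ValueError("standard sizes must be positive")
    return old_estimate_pg * new_standard_pg / old_standard_pg


if __name__ == "__main__":
    # Hypothetical peak positions (channel numbers), not measured data.
    estimate = estimate_2c_pg(150.0, 300.0, standard_2c_pg=9.09)
    print(f"2C estimate: {estimate:.3f} pg = {pg_to_mbp(estimate):.0f} Mbp")
    # Hypothetical revision of the standard from 9.09 pg to 8.0 pg.
    print(f"rescaled: {rescale_estimate(estimate, 9.09, 8.0):.3f} pg")
```

Because the estimate is linear in the assumed standard value, any disagreement about that value carries over to the sample in the same proportion, and published estimates can only be compared or rescaled if the standard and its assumed size are reported.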
For example, while the use of alternative isolation buffers can result in differences in genome size estimates, these are usually small, even for challenging plant samples [<4% our own data, 8% in bryophytes (10)]. In contrast, the genome size assigned to a reference species can differ by more than 3-fold [e.g., Arabidopsis thaliana (8)], leading to apparent differences in the calculated genome size depending on what value is used for conversion. The seriousness of the problem is dealt with in detail by Doležel and Greilhuber (3) who outline the issues at stake and suggest possible ways forward for ongoing genome size research. The ideal situation for standardizing the field would be to have one calibration standard (i) which meets the criteria outlined in Table 1, (ii) whose genome size has been accurately determined chemically or by sequencing techniques and agreed upon, and (iii) is suitable for both animal and plant studies. However, it is clear from both a biological and methodological perspective that this is impossible.

Table 1. Characteristics of an ideal calibration standard for genome size research

Wide availability as a pure line from more than one source
Cytologically stable and preferably a diploid with a low number of somatic chromosomes (to minimize the risk of genome size variation due to aneuploidy, supernumerary segments, B-chromosomes, etc.)
Absence of secondary metabolites interfering with DNA staining or sample preparation (such as polyphenolics, anthocyanins, caffeine, mucus, etc.)
Quick and easy to grow with low demands on cultivating/culturing requirements (enabling it to be grown successfully in a wide range of conditions in different geographic regions)
Suitable for FCM protocols (e.g., plants forming large and soft leaves to make the tissue homogenization convenient, and long retaining turgor pressure; animals with reasonably large bodies supplying enough cells for numerous experiments)
Production of high resolution and reproducible flow histograms (i.e., fluorescence uniformity and stability both within and among runs, insensitive to different isolation/staining buffers, formation of distinct peaks with low coefficients of variation, low background signal, etc.)
Appropriate genome size which is representative of the analyzed group of organisms (e.g., close to the median value, which is 2.5 pg/1C and 1.91 pg/1C for angiosperms and vertebrates, respectively), reasonable proportion of heterochromatin (i.e., neither devoid of nor rich in it)
Optional additional characteristic: Suitability for both fluorometric and densitometric techniques

First, genome size varies extensively across many animal and plant groups [e.g., Chlorophyta (green algae): 2,300-fold; angiosperms: c. 2,000-fold; crustaceans: 460-fold; fishes: 380-fold; flatworms: 340-fold (1, 11)]. A single standard is, therefore, unsuitable for estimating genome sizes in all species of such groups because of problems of nonlinearity in the flow cytometer when samples with large differences in genome size are coprocessed. Ideally, the 2C peak of the target species should be located between the 2C and 4C peaks of the internal reference standard and in no case should the standard and sample differ by more than ∼4-fold in genome size (note that the threshold for linearity guaranteed by the flow cytometer manufacturers for samples differing 2-fold, e.g., G2/G1 nuclei, varies around 1.98–2.02).
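The selection rules in the preceding paragraph can be written as a simple check. This is a minimal sketch under stated assumptions, not a validated protocol: the function names are hypothetical, the standard's 4C peak is taken to sit at exactly twice its 2C value, and the 3.0 pg standard in the usage example is invented. The 4-fold limit, the 1.98–2.02 linearity window, the 33.5 pg value for Allium cepa and the 5.0 pg angiosperm median come from this article.

```python
def standard_is_suitable(sample_2c_pg, standard_2c_pg, max_fold=4.0):
    """Return (within_fold_limit, in_ideal_window) for a sample/standard pair.

    within_fold_limit: the two genome sizes differ by no more than max_fold.
    in_ideal_window:   the sample 2C peak falls between the standard's 2C
                       and 4C peaks (4C assumed to be exactly 2 x 2C).
    """
    if sample_2c_pg <= 0 or standard_2c_pg <= 0:
        raise ValueError("genome sizes must be positive")
    larger = max(sample_2c_pg, standard_2c_pg)
    smaller = min(sample_2c_pg, standard_2c_pg)
    within_fold_limit = larger / smaller <= max_fold
    in_ideal_window = standard_2c_pg < sample_2c_pg < 2.0 * standard_2c_pg
    return within_fold_limit, in_ideal_window


def linearity_ok(g1_peak, g2_peak, low=1.98, high=2.02):
    """Check that the G2/G1 peak ratio lies within the quoted window."""
    if g1_peak <= 0:
        raise ValueError("G1 peak position must be positive")
    return low <= g2_peak / g1_peak <= high


if __name__ == "__main__":
    # A sample near the angiosperm median (5.0 pg/2C) against Allium cepa.
    print(standard_is_suitable(5.0, 33.5))
    # The same sample against a hypothetical 3.0 pg standard.
    print(standard_is_suitable(5.0, 3.0))
    # Hypothetical G1 and G2 peak positions from one run.
    print(linearity_ok(100.0, 200.0))
```

The sample size passed to such a check is necessarily a rough prior estimate, so in practice the check would be repeated once a first measurement is available.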
Second, the suitability of using animal standards to estimate genome sizes in plants and vice versa has been called into question, and indeed in plants it is recommended that this should not be done (5).

Third, only a small number of species have had their genome sizes accurately determined using chemical methods. In the plant community there is general agreement that the 2C-value of Allium cepa (onion), estimated using chemical approaches, is 33.5 pg (12). However, given that most angiosperms have considerably smaller genomes (the median 2C-value for 6,425 species is 5.0 pg), Allium is an unsuitable calibration standard for the majority of species.

It is certainly clear that in the near future no help in the accurate determination of standards' genome size values can be expected from any of the "complete" genome sequencing programs. This is because the concept of a "complete" genome sequence often reflects what can be sequenced rather than what needs to be sequenced to cover all the DNA in the nucleus. For example, all nine of the "completely sequenced" plant genomes have noted gaps in their sequencing scaffolds. These are considered to reflect regions of repetitive DNA that are difficult to sequence and align. In some cases only the euchromatic portion of the genome is reported to be sequenced and assembled into scaffolds (e.g., Populus trichocarpa), leaving up to 30% of the genome unsequenced or unassembled and presumed to correspond to heterochromatic DNA. In others no attempt has been made to estimate a genome size from the sequence data. Instead the amount of DNA sequenced is based on a prior estimate of genome size made using flow cytometry (e.g., Cucumis sativus, Vitis vinifera, Zea mays).
The situation in animals is more controversial with various genome size estimates reported for human (Homo sapiens) made using chemical approaches (2C-values range 6.0–7.0 pg, with no mention of whether the samples analyzed were male or female even though the presence of sex chromosomes will lead to genome size differences between the sexes). In many studies a 2C-value of 7.0 pg has been used although, as Doležel and Greilhuber (3) note, this is likely to be an overestimate. Nevertheless, in contrast to plants, there is more optimism that the great efforts being made to close the remaining gaps in the assembled "completely sequenced" genome will result in a precisely known size of the human genome within a reasonable time period.

As the need for calibration standards whose genome size falls close to the species of interest has increased in importance, several plant and animal genome size researchers have used Allium or human as a primary standard to measure a selection of secondary standards in a cascade-like manner (see Table 2 for the list of the most widely used plant and animal standards). This has produced some stability in the field, with many researchers adopting the values given, for example, by Doležel et al. (13) for plants and those of Tiersch et al. (14) for animals. Nevertheless, increasingly researchers have made their own reference standards by calibrating them against another species whose genome size has already been estimated. This has led to a huge expansion in the number of plant and animal species used for calibration. Indeed from the Plant and Animal Genome Size databases (aforementioned) over 50 plant and 70 animal species have been used with a wide range of assumed DNA amounts (8).

Table 2. A list of commonly used plant and animal standards with their advantageous and disadvantageous characteristics

Reference standard | Key advantage(s) | Key disadvantage(s)

Plants (a)
Allium cepa (onion) | Reliably determined genome size | Very large genome size, low nuclei yield
Arabidopsis thaliana | Nearly-sequenced species, some mutants with low levels of secondary metabolites (e.g., flavonoid-free) | Very small genome size, intraspecific variation up to 1.1-fold reported, high level of endopolyploidy (low amount of G0/G1 nuclei)
Solanum lycopersicum (tomato) | Rapid growth, easy to cultivate | Presence of polyphenolic compounds, often large proportion of G2 nuclei
Hordeum vulgare (barley) | Rapid growth, easy to cultivate, stable genome size | Genome size above optimum for the majority of angiosperms
Oryza sativa (rice) | Nearly-sequenced species | Very small genome size
Petunia hybrida (petunia) | Rapid growth, easy to cultivate | Not sufficiently tested for genome size stability
Pisum sativum (pea) | Genome size stability, low level of secondary metabolites, well proven for both fluorometric and densitometric techniques | Genome size above optimum for the majority of angiosperms
Zea mays (maize) | Rapid growth, easy to cultivate, suitable genome size | Intraspecific variation up to 30% reported, highly dynamic transposable elements

Animals (b)
Betta splendens (Siamese fighting fish) | Easy to culture | Small genome size
Caenorhabditis elegans (nematode) | Fully-sequenced species | Very small genome size
Gallus domesticus (chicken) | Easily available, suitable genome size | Considerable variation between different strains and variation due to sex chromosomes
Homo sapiens (human) | Nearly-sequenced, easily available | Sex-related variation
Oncorhynchus mykiss (rainbow trout) | Easily available, suitable genome size | Chromosomal polymorphism between different strains
Rana pipiens (northern leopard frog) | Easy to culture | Genome size above optimum

(a) The five most commonly used plant standards listed in the FLOWer database (8) together with model dicot (Arabidopsis thaliana) and monocot (Oryza sativa) with nearly-sequenced genomes, and a species with an accurate, chemically-determined genome size (Allium cepa).
(b) The five most commonly used animal standards listed in the Animal Genome Size Database (http://www.genomesize.com) and the first fully sequenced animal (Caenorhabditis elegans).

Clearly this situation is untenable given the knock-on consequences of assuming different DNA amounts for a calibration standard noted above. Doležel and Greilhuber (3), therefore, suggest a sensible and appropriate holding solution to bring some stability to the field: "the logical strategy is to calibrate a primary 'gold' reference standard and then perform a series of experiments to calibrate secondary reference standards in an ordered sequence." They recommend using Homo sapiens (human) with a 2C-value of 7.0 pg as the overall primary "gold" standard together with Pisum sativum (garden pea) as the primary plant standard using 2C = 9.09 pg (based on calibrating it against human with 2C = 7.0 pg). While Doležel and Greilhuber accept that the value for human may be too high, the possibility to link animal and plant studies in this way opens up the opportunity to make realistic comparisons across eukaryotes. Further, once the genome of human is truly completely sequenced then all values can be adjusted accordingly. As for which species to adopt as the secondary standards, there are many to choose from but a number of commonly used ones are listed in Table 2 (together with their potential advantages and disadvantages). Given the current situation, the recommendations of Doležel and Greilhuber are timely and appropriate and it is urged that they are adopted by all practitioners in the field of genome size research.

LITERATURE CITED

1 Leitch IJ, Bennett MD. Genome size and its uses: the impact of flow cytometry. In: J Doležel, J Greilhuber, J Suda, editors. Flow cytometry with plant cells. Weinheim: Wiley-VCH; 2007. pp 153–176.
2 Gregory TR, Nicol JA, Tamm H, Kullman B, Kullman K, Leitch IJ, Murray BG, Kapraun DF, Greilhuber J, Bennett MD. Eukaryotic genome size databases. Nucl Acids Res 2007; 35(Database issue): D332–D338.
3 Doležel J, Greilhuber J. Nuclear genome size: Are we getting closer? Cytometry Part A 2010; 77A: in press [DOI: 10.1002/cyto.a.20915].
4 Doležel J, Greilhuber J, Suda J. Estimation of nuclear DNA content in plants using flow cytometry. Nat Prot 2007; 2: 2233–2244.
5 Greilhuber J, Temsch EM, Loureiro J. Nuclear DNA content measurement. In: J Doležel, J Greilhuber, J Suda, editors. Flow cytometry with plant cells. Weinheim: Wiley-VCH; 2007. pp 67–102.
6 Doležel J, Bartoš J, Voglmayr H, Greilhuber J. Nuclear DNA content and genome size of trout and human. Cytometry Part A 2003; 51A: 127–128.
7 Greilhuber J, Doležel J, Lysák MA, Bennett MD. The origin, evolution and proposed stabilization of the terms 'Genome size' and 'C-value' to describe nuclear DNA contents. Ann Bot 2005; 95: 255–260.
8 Loureiro J, Rodriguez E, Santos C, Doležel J, Suda J. FLOWer: A Plant DNA Flow Cytometry Database (release 1.0, May 2008). Available at: http://flower.web.ua.pt/ [accessed 29 March 2010].
9 Kron P, Suda J, Husband BC. Applications of flow cytometry to evolutionary and population biology. Annu Rev Ecol Evol Syst 2007; 38: 847–876.
10 Bainard J, Fazekas A, Newmaster S. Methodology significantly affects genome size estimates: quantitative evidence using bryophytes. Cytometry Part A 2010; 77A: 725–732 (this issue).
11 Gregory TR. Genome size evolution in animals. In: TR Gregory, editor. The evolution of the genome. San Diego: Elsevier; 2005. pp 3–87.
12 Van't Hof J, Sparrow AH. A relationship between DNA content, nuclear volume, and minimum mitotic cycle time. Proc Natl Acad Sci USA 1963; 49: 897–902.
13 Doležel J, Greilhuber J, Lucretti S, Meister A, Lysák MA, Nardi L, Obermayer R. Plant genome size estimation by flow cytometry: inter-laboratory comparison. Ann Bot 1998; 82(Suppl. A): 17–26.
14 Tiersch TR, Chandler RW, Wachtel SS, Elias S. Reference standards for flow cytometry and application in comparative studies of nuclear DNA content. Cytometry 1989; 10: 706–710.
