Review; Open access; Peer reviewed

Meta'omic Analytic Techniques for Studying the Intestinal Microbiome

2014; Elsevier BV; Volume: 146; Issue: 6; Language: English

10.1053/j.gastro.2014.01.049

ISSN

1528-0012

Authors

Xochitl C. Morgan, Curtis Huttenhower

Topic(s)

Diet and metabolism studies

Abstract

Nucleotide sequencing has become increasingly common and affordable, and is now a vital tool for studies of the human microbiome. Comprehensive microbial community surveys such as MetaHit and the Human Microbiome Project have described the composition and molecular functional profile of the healthy (normal) intestinal microbiome. This knowledge will increase our ability to analyze host and microbial DNA (genome) and RNA (transcriptome) sequences. Bioinformatic and statistical tools then can be used to identify dysbioses that might cause disease, and potential treatments. Analyses that identify perturbations in specific molecules can leverage thousands of culture-based isolate genomes to contextualize culture-independent sequences, or may integrate sequence data with whole-community functional assays such as metaproteomic or metabolomic analyses. We review the state of available systems-level models for studies of the intestinal microbiome, along with analytic techniques and tools that can be used to determine its functional capabilities in healthy and unhealthy individuals.
The human microbiome comprises approximately 10^14 bacteria and archaea, as well as fungi and viruses, which together account for roughly one kilogram of the average adult's body weight [1. Savage D.C. Microbial ecology of the gastrointestinal tract. Ann Rev Microbiol. 1977; 31: 107-133]. Our ability to study this remarkable system has changed dramatically in recent years: the cost of sequencing 1 million nucleotide bases of DNA decreased from $10,000 to $0.10 (US dollars) between 2001 and 2011 as a result of next-generation sequencing techniques [2. Sboner A. Mu X.J. Greenbaum D. et al. The real cost of sequencing: higher than you think! Genome Biol. 2011; 12: 125]. This decrease in cost enabled large-scale surveys of human microbial diversity and function such as MetaHit [3. Qin J. Li R. Raes J. et al. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010; 464: 59-65] and the Human Microbiome Project [4. Human Microbiome Project Consortium. Structure, function and diversity of the healthy human microbiome. Nature. 2012; 486: 207-214], which would have been prohibitively expensive as recently as 15 years ago. These projects defined microbial compositions and their biomolecular functions in large populations. For the intestinal microbiota in particular, they showed hundreds of species, thousands of strains, and millions of bacterial genes [3, 4]. Importantly, they also spurred rapid growth in analytic techniques, education programs, and numbers of trainees equipped to tackle complex sequence data from microbial communities. Now that these studies have surveyed the wide degree of baseline microbial variation and sequenced thousands of human-associated microbe reference strains, the field is in an unprecedented position to conduct well-informed investigations of the contribution of microbes to human health.

Meta'omic studies may use a single technique or combine several to address a multitude of questions, including but not limited to: whether changes in the composition of the microbiome are associated with, precede, follow, or cause the onset of disease; which microbial biochemical functions change in disease at the DNA, RNA, protein, and metabolite levels; how metabolic processes change in disease; and how interventions affect the composition and biomolecular function of the microbial community. Shotgun-sequencing techniques, which include both DNA-focused metagenomic and RNA-focused metatranscriptomic analyses, are especially useful in integrating microbial membership with biomolecular potential and activity in the human intestine. The low cost of nucleotide-based approaches makes them an important component of any high-throughput experimental toolbox. Shotgun sequencing, in particular, provides strain- and gene-level information that is difficult to obtain from other technologies. The rapid evolution of sequencing technologies has been paralleled by corresponding increases in the availability and diversity of analytic tools and pipelines, and these have been followed by increased use of metaproteomic and metabolomic data.
We review a subset of current analytic approaches for the rapidly evolving new field of meta'omics, as well as benefits and pitfalls of these techniques.

Culture of an organism has long provided, and still provides, one of the most detailed environments for study. However, most intestinal microbes are anaerobic and therefore are difficult to culture. In the mid-1970s, Woese et al [5. Woese C.R. Fox G.E. Zablen L. et al. Conservation of primary structure in 16S ribosomal RNA. Nature. 1975; 254: 83-86] noted that portions of the gene encoding the small subunit 16S ribosomal RNA (rRNA) (hereafter referred to as 16S) were highly conserved among bacteria. Other internal regions of the gene are highly variable, possessing almost entirely unique sequences in most bacterial clades. Polymerase chain reaction (PCR) amplification of a pool of DNA from a community of microbes with universal primers, followed by cloning and sequencing, therefore yields marker gene sequences (molecular nametags) that can be used to quantify the bacterial taxa present within a sample. The Sanger sequencing method originally applied to this system has given way first to 454 technologies [6. Margulies M. Egholm M. Altman W.E. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005; 437: 376-380], [7. Sogin M.L. Morrison H.G. Huber J.A. et al. Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc Natl Acad Sci U S A. 2006; 103: 12115-12120] and then to Illumina (San Diego, CA) sequencing [8. Bentley D.R. Balasubramanian S. Swerdlow H.P. et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008; 456: 53-59], [9. Lazarevic V. Whiteson K. Huse S. et al. Metagenomic study of the oral microbiota by Illumina high-throughput sequencing. J Microbiol Methods. 2009; 79: 266-271], which can sequence tens of thousands of 16S genes from a single sample with unprecedented affordability.

Methods for analyzing 16S sequencing data from the human microbiome and other environments are now well developed and have been reviewed elsewhere [10. Hamady M. Knight R. Microbial community profiling for human microbiome projects: tools, techniques, and challenges. Genome Res. 2009; 19: 1141-1152], [11. Jumpstart Consortium Human Microbiome Project Data Generation Working Group. Evaluation of 16S rDNA-based community profiling for human microbiome research. PLoS One. 2012; 7: e39315], [12. Degnan P.H. Ochman H. Illumina-based analysis of microbial community diversity. ISME J. 2012; 6: 183-194], [13. Kuczynski J. Stombaugh J. Walters W.A. et al. Using QIIME to analyze 16S rRNA gene sequences from microbial communities. Curr Protoc Microbiol. 2012; (Chapter 1:Unit 1E 5)], [14. Kozich J.J. Westcott S.L. Baxter N.T. et al. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl Environ Microbiol. 2013; 79: 5112-5120]. The most common approaches rely on microbial ecologic techniques, such as diversity analysis and ordination [15. Caporaso J.G. Kuczynski J. Stombaugh J. et al. QIIME allows analysis of high-throughput community sequencing data. Nat Methods. 2010; 7: 335-336], [16. Schloss P.D. Westcott S.L. Ryabin T. et al. Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol. 2009; 75: 7537-7541]; the potential imprecisions of 16S-based microbial classification and the associated sequence data have made it difficult to precisely identify microbes that might cause specific diseases, although this is improving [11, 12], [17. Quince C. Lanzen A. Davenport R.J. et al. Removing noise from pyrosequenced amplicons. BMC Bioinformatics. 2011; 12: 38]. Increasingly, 16S-based analyses of intestinal microbiomes of patients with specific disorders rely on reproducible diagnostic and prognostic biomarker discovery methods, which were developed from molecular epidemiology approaches in gene expression and genetic studies [18. Willing B.P. Dicksved J. Halfvarson J. et al. A pyrosequencing study in twins shows that gastrointestinal microbial profiles vary with inflammatory bowel disease phenotypes. Gastroenterology. 2010; 139: 1844-1854.e1], [19. Joossens M. Huys G. Cnockaert M. et al. Dysbiosis of the faecal microbiota in patients with Crohn's disease and their unaffected relatives. Gut. 2011; 60: 631-637], [20. Segata N. Izard J. Waldron L. et al. Metagenomic biomarker discovery and explanation. Genome Biol. 2011; 12: R60].

Although 16S sequencing is the most widely used platform for studies of the gut microbiome because of its low cost (Table 1), it has several notable limitations.
First, its accuracy depends on whether the observed proportions of 16S gene sequences reflect the proportion of bacteria in the sample, but the 16S gene is subject to copy number variation, as well as PCR primer and amplification bias. These biases range from being relatively minor to preventing detection of entire species or clades [21. Lee C.K. Herbold C.W. Polson S.W. et al. Groundtruthing next-gen sequencing for microbial ecology-biases and errors in community structure estimates from PCR amplicon pyrosequencing. PLoS One. 2012; 7: e44224], [22. Turnbaugh P.J. Quince C. Faith J.J. et al. Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins. Proc Natl Acad Sci U S A. 2010; 107: 7503-7508]. Furthermore, although 16S sequencing provides information about the overall composition of the microbial community, it does not provide information about the genomes of its members or their functions. For communities with many available reference genomes, such as the human intestine, it is possible to infer an approximate metagenome using methods such as PICRUSt [23. Langille M.G. Zaneveld J. Caporaso J.G. et al. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat Biotechnol. 2013; 31: 814-821]. These methods couple functions of gene products encoded by the most closely related sequenced genomes with observed taxonomic profiles to produce a functional profile.

Table 1. Microbiome Sequencing Approaches: Costs and Caveats

| Microbial data of interest | Relevant technologies | Current approximate cost (US$/sample) | Notes |
| --- | --- | --- | --- |
| Bacterial/archaeal composition | 16S on stool or mucosa | $10–$100 | Primer bias; see recommendations such as 515F/806R [96] |
| Fungal composition | Internal transcribed spacer analysis of stool or mucosa | $10–$100 | Fungi frequently are difficult to lyse; use of a bead-beater and optimization of DNA extraction protocols is recommended [97, 98] |
| Bacterial/archaeal/viral/fungal composition and function | WMS on stool | $100–$500 | WMS on mucosa will have extensive host contamination, but WMS on stool typically contains <1% host reads |
| Viral composition and function | WMS on stool DNA or complementary DNA prepared from RNA, after enrichment for viral fraction | $100–$500 | Viral DNA can range from <0.1% to a significant fraction of stool samples; enrichment before sequencing can improve detection consistency [29, 99] |
| Bacterial/archaeal/viral/fungal transcriptome | WMS on stool complementary DNA prepared from RNA | $200–$1000 | Mucosa will have extensive host material and low bacterial biomass is likely; stool is substantially technically simpler; depletion of rRNA is crucial |
| Host genomics, transcriptomics, proteomics, or immunology | Sample from mucosa | Varies | Mucosal biopsy specimens can be paired with stool or multiplexed to joint host and microbial assays for parallel multi'omic data |

References cited in Table 1: [96. Kuczynski J. Lauber C.L. Walters W.A. et al. Experimental and analytical tools for studying the human microbiome. Nat Rev Genet. 2012; 13: 47-58], [97. Henderson G. Cox F. Kittelmann S. et al. Effect of DNA extraction methods and sampling techniques on the apparent structure of cow and sheep rumen microbial communities. PLoS One. 2013; 8: e74787], [98. Borman A.M. Palmer M. Johnson E.M. Rapid methods for the extraction and archiving of molecular grade fungal genomic DNA. Methods Mol Biol. 2013; 968: 55-62], [29. Reyes A. Haynes M. Hanson N. et al. Viruses in the faecal microbiota of monozygotic twins and their mothers. Nature. 2010; 466: 334-338], [99. Thurber R.V. Haynes M. Breitbart M. et al. Laboratory procedures to generate viral metagenomes. Nat Protoc. 2009; 4: 470-483].

Most importantly, 16S sequencing identifies only bacterial components of a community, not other types of microbes. However, amplicon approaches that incorporate the 18S ribosomal subunit gene and internal transcribed spacer sequences for analysis of eukaryotes are rapidly improving [24. Nilsson R.H. Kristiansson E. Ryberg M. et al. Intraspecific ITS variability in the kingdom fungi as expressed in the international sequence databases and its implications for molecular species identification. Evol Bioinform Online. 2008; 4: 193-201], [25. Hoffmann C. Dollive S. Grunberg S. et al. Archaea and fungi of the human gut microbiome: correlations with diet and bacterial residents. PLoS One. 2013; 8: e66019]. Now that the cost difference has narrowed between 16S and whole-metagenome or metatranscriptome sequencing (WMS) (Table 1), it is becoming feasible to design 2-stage experiments that incorporate both approaches in a cost-effective manner [26. Tickle T.L. Segata N. Waldron L. et al. Two-stage microbial community experimental design. ISME J. 2013; 7: 2330-2339], or even to exclude 16S analysis altogether.

Metagenome or metatranscriptome sequencing, also referred to as shotgun sequencing, DNA-seq, or RNA-seq, is the process of sequencing the entire nucleotide pool isolated from a culture-independent sample.
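Two of the 16S points made above, copy number variation and the inference of an approximate metagenome from a taxonomic profile, can be made concrete with a small sketch. The Python below is illustrative only and is not PICRUSt: it assumes every taxon already has a known 16S copy number and a matching reference genome, whereas real tools must also estimate these values for taxa without sequenced relatives. The taxon names, gene-family names, and all numbers are hypothetical.

```python
from typing import Dict


def copy_number_corrected_abundance(
    read_counts: Dict[str, float], copies_16s: Dict[str, float]
) -> Dict[str, float]:
    """Divide each taxon's 16S read count by its 16S copy number, then renormalize.

    Assumes a copy number is known for every taxon in read_counts.
    """
    corrected = {taxon: read_counts[taxon] / copies_16s[taxon] for taxon in read_counts}
    total = sum(corrected.values())
    return {taxon: value / total for taxon, value in corrected.items()}


def predicted_gene_profile(
    abundance: Dict[str, float], gene_content: Dict[str, Dict[str, float]]
) -> Dict[str, float]:
    """Weight each reference genome's gene-family copy numbers by taxon abundance."""
    profile: Dict[str, float] = {}
    for taxon, fraction in abundance.items():
        for gene, copies in gene_content.get(taxon, {}).items():
            profile[gene] = profile.get(gene, 0.0) + fraction * copies
    return profile


if __name__ == "__main__":
    # Hypothetical sample: taxon_A carries 6 copies of the 16S gene, taxon_B carries 2.
    reads = {"taxon_A": 600, "taxon_B": 200}
    copies = {"taxon_A": 6, "taxon_B": 2}
    abundance = copy_number_corrected_abundance(reads, copies)

    # Hypothetical gene-family copy numbers taken from each taxon's reference genome.
    genomes = {
        "taxon_A": {"gene_X": 2},
        "taxon_B": {"gene_X": 1, "gene_Y": 1},
    }
    print(abundance)
    print(predicted_gene_profile(abundance, genomes))
```

With these made-up numbers, raw read fractions of 0.75 and 0.25 become corrected abundances of 0.5 and 0.5, which shows how an uncorrected 16S profile can overstate taxa with many copies of the gene. The predicted profile is only as good as the reference genomes behind it, which is why the approach suits well-sequenced communities such as the human intestine.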
When DNA from a human sample such as stool is sequenced by metagenomic methods, the sample typically includes a small amount ( [...]

[...] 99%); shotgun DNA and RNA approaches are viable [25], [100. Walter J. Ley R. The human gut microbiome: ecology and recent evolutionary changes. Ann Rev Microbiol. 2011; 65: 411-429], [101. Sommer F. Backhed F. The gut microbiota–masters of host development and physiology. Nat Rev Microbiol. 2013; 11: 227-238], [103. Aguirre de Carcer D. Cuiv P.O. Wang T. et al. Numerical ecology validates a biogeographical distribution and gender-based effect on mucosa-associated bacteria along the human colon. ISME J. 2011; 5: 801-809]

Mucosa
Small intestine: TM7, Enterobacteriaceae, Pasteurellaceae, Lachnospiraceae, Eubacterium, Streptococcus, Faecalibacterium
Large intestine: Bacteroidetes, Firmicutes, Actinobacteria, Fusobacteria, Akkermansia
Shotgun sequencing will yield large numbers of host reads; joint host and microbial data readily obtained [101], [104. Kuehbacher T. Rehman A. Lepage P. et al. Intestinal TM7 bacterial phylogenies in active inflammatory bowel disease. J Med Microbiol. 2008; 57: 1569-1576], [105. Cecchini D.A. Laville E. Laguerre S. et al. Functional metagenomics reveals novel pathways of prebiotic breakdown by human gut bacteria. PLoS One. 2013; 8: e72766], [106. Baumgart M. Dogan B. Rishniw M. et al. Culture independent analysis of ileal mucosa reveals a selective increase in invasive Escherichia coli of novel phylogeny relative to depletion of Clostridiales in Crohn's disease involving the ileum. ISME J. 2007; 1: 403-418], [107. McHardy I. Goudarzi M. Tong M. et al. Integrative analysis of the microbiome and metabolome of the human intestinal mucosal surface reveals exquisite inter-relationships. Microbiome. 2013; 1: 17]

Species and strains of microbes can also be identified readily from WMS data, whereas 16S studies often are unable to provide confident species-level resolution of taxa [17], [35. Vetrovsky T. Baldrian P. The variability of the 16S rRNA gene in bacterial genomes and its consequences for bacterial community analyses. PLoS One. 2013; 8: e57923]. This is particularly relevant to gut microbiology studies. For example, the 16S sequences of some benign Clostridium species (eg, C irregulare and C bartlettii) are similar to those of the potentially problematic C difficile. Likewise, Escherichia and Shigella cannot be distinguished reliably by 16S analyses [36. Christensen H. Nordentoft S. Olsen J.E. Phylogenetic relationships of Salmonella based on rRNA sequences. Int J Syst Bacteriol. 1998; 48: 605-610]. In contrast, taxonomies can be resolved at the species level using tools that include MetaPhlAn [37. Segata N. Huttenhower C. Toward an efficient method of identifying core genes for evolutionary and functional microbial phylogenies. PLoS One. 2011; 6: e24704], PhymmBL [38. Brady A. Salzberg S.L. Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models. Nat Methods. 2009; 6: 673-676], Ray Communities [39. Boisvert S. Raymond F. Godzaridis E. et al. Ray Meta: scalable de novo metagenome assembly and profiling. Genome Biol. 2012; 13: R122], and other approaches. Combinations of genetic markers [40. Segata N. Bornigen D. Morgan X.C. et al. PhyloPhlAn is a new method for improved phylogenetic and taxonomic placement of microbes. Nat Commun. 2013; 4: 2304], single-nucleotide polymorphisms [41. Schloissnig S. Arumugam M. Sunagawa S. et al. Genomic variation landscape of the human gut microbiome. Nature. 2013; 493: 45-50], or assembled contigs [42. Nijkamp J.F. Pop M. Reinders M.J. et al. Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold. Bioinformatics. 2013; 29: 2826-2834] further distinguish between strains of the same species. With any of these taxonomic profiling methods, most of which now are computationally straightforward, WMS can be used in place of almost any 16S ecologic or compositional analysis. WMS provides the added benefits of decreased bias, cross-kingdom sensitivity, and increased specificity.

In addition to higher-resolution descriptions of the compositions of communities, WMS provides information about their functional capabilities. However, functional profiling is a more complex process than taxonomic profiling and brings additional computational challenges. The first choice that must be made is whether to assemble a WMS data set into contigs before functional analysis or whether to annotate reads directly, without assembly. Both options have benefits and drawbacks. First, assembled contigs make it possible to study genes in the context of their neighbor genes, which is vital for identifying genomic rearrangements and can be helpful in determining functions regulated by uncharacterized genes.
Second, larger contigs are more likely to contain entire genes, including the transcriptional stop and start sites, which are amenable to studies of regulatory networks and more tractable for runs through gene-discovery programs. However, assembly is also a challenging computational problem, and given a metagenome from an environment as complex as the human intestine, it often is impossible to assemble very large genome fragments except from the most highly abundant members. To derive the largest contigs possible, reads may be digitally normalized or partitioned to make them more tractable to assembly [43. Howe AC, Jansson J, Malfatti SA, et al. Assembling large, complex environmental metagenomes. http://arxiv:1212.2832], [44. Sharon I. Morowitz M.J. Thomas B.C. et al. Time series community genomics analysis reveals rapid shifts in bacterial species, strains, and phage during infant gut colonization. Genome Res. 2013; 23: 111-120]. Alternatively, assemblers specifically built for metagenomes with varied species abundances can be applied directly (eg, Ray Meta [39], MetaVelvet [45. Namiki T. Hachiya T. Tanaka H. et al. MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads. Nucleic Acids Res. 2012; 40: e155], or MetaIDBA [46. Peng Y. Leung H.C. Yiu S.M. et al. Meta-IDBA: a de novo assembler for metagenomic data. Bioinformatics. 2011; 27: i94-i101]). After metagenomic assembly, contig analysis proceeds much as it would for single microbial genomes, with gene discovery [47. Delcher A.L. Bratke K.A. Powers E.C. et al. Identifying bacterial genes and endosymbiont DNA with Glimmer. Bioinformatics. 2007; 23: 673-679], [48. Besemer J. Lomsadze A. Borodovsky M. GeneMarkS: a self-training method for predict
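The digital normalization step mentioned above can be illustrated with a minimal sketch. The idea, as commonly described, is to stream through the reads and discard any read whose k-mers have already been seen often enough, so that highly abundant genomes stop dominating the assembly input. The Python below is a simplified illustration and not the cited authors' implementation: it assumes exact in-memory k-mer counts (production tools typically use probabilistic counting), ignores reverse complements and sequencing errors, and uses hypothetical values for `k` and `cutoff` that are far smaller than realistic settings.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List


def kmers(read: str, k: int) -> List[str]:
    """Return all overlapping substrings of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def digital_normalize(reads: Iterable[str], k: int = 5, cutoff: int = 2) -> List[str]:
    """Keep a read only if the median count of its k-mers seen so far is below cutoff."""
    counts: Counter = Counter()
    kept: List[str] = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read is shorter than k, so it carries no k-mers
        # Unseen k-mers count as 0; Counter lookups do not insert missing keys.
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)  # only retained reads contribute to coverage
    return kept


if __name__ == "__main__":
    # Five copies of a read from an abundant genome, one read from a rare genome.
    sample = ["ACGTTGCAAG"] * 5 + ["TTGGCCAATT"]
    print(digital_normalize(sample, k=5, cutoff=2))
```

In this toy input, the abundant read is kept twice and then discarded once its k-mers reach the cutoff, while the single rare read is retained. The effect is to flatten coverage toward the cutoff, which reduces data volume for the assembler at the cost of discarding abundance information that must be recovered later by mapping the original reads back to the contigs.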
