Review · Open access · Peer reviewed

Seeing the PDB

2021; Elsevier BV; Volume: 296; Language: English

10.1016/j.jbc.2021.100742

ISSN

1083-351X

Authors

Jane S. Richardson, David Richardson, David S. Goodsell

Topic(s)

Enzyme Structure and Function

Abstract

Ever since the first structures of proteins were determined in the 1960s, structural biologists have required methods to visualize biomolecular structures, both as an essential tool for their research and also to promote 3D comprehension of structural results by a wide audience of researchers, students, and the general public. In this review to celebrate the 50th anniversary of the Protein Data Bank, we present our own experiences in developing and applying methods of visualization and analysis to the ever-expanding archive of protein and nucleic acid structures in the worldwide Protein Data Bank. Across that timespan, Jane and David Richardson have concentrated on the organization inside and between the macromolecules, with ribbons to show the overall backbone "fold" and contact dots to show how the all-atom details fit together locally. David Goodsell has explored surface-based representations to present and explore biological subjects that range from molecules to cells. This review concludes with some ideas about the current challenges being addressed by the field of biomolecular visualization.

Across most of the history of the PDB, we have shared a common passion for understanding and communicating the beauty and functional complexity of macromolecular 3D structures. Our approaches to this endless and satisfying challenge differ in a fundamental way, but because of that are highly complementary. Goodsell is fascinated with creating intuitive representations of what a single protein or a set of interacting molecules would look like, at a range of size scales, if it were possible to see their surfaces directly. The Richardsons are fascinated with how to show the internal organization: the connected backbone "fold" in 3D and the specific atom–atom contacts that determine the anatomy and individuality of the molecule, or the amazing specificity of interaction with another molecule. To celebrate the 50th anniversary of the Protein Data Bank (PDB) archive, we will give personal tours of our own experiences in seeing the PDB.

In the beginning, even before the PDB, visualizations of proteins in scientific reports were mostly either smoothed blobs of density to show helices and connectivity or full-detail Watson–Kendrew brass stick models (1. Perutz M.F. Rossmann M.G. Cullis A.F. Muirhead H. Will G. North A.C. Structure of haemoglobin: A three-dimensional Fourier synthesis at 5.5-Å resolution, obtained by X-ray analysis. Nature. 1960; 185: 416-422; 2. Kendrew J.C. Dickerson R.E. Strandberg B.E. Hart R.G. Davies D.R. Phillips D.C. Shore V.C. Structure of myoglobin: A three-dimensional Fourier synthesis at 2 Å resolution. Nature.
1960; 185: 422-427), photographed from the physically built models, and limited to black and white. The professional artist and scientific illustrator who first turned that information into appealing, convincingly understandable drawings was Irving Geis. Figure 1A shows Geis' painting of the myoglobin model at 2-Å resolution done for Scientific American (3. Kendrew J.C. The three-dimensional structure of a protein molecule. Sci. Am. 1961; 205: 96-110), which conveys the depth cueing and 3D relationships, atom types, H-bonds, and even hydrogens much more effectively than a photograph of the actual brass model. Figure 1B shows his representation of the deoxyhemoglobin tetramer (4. Dickerson R.E. Structure and Action of Proteins. Addison-Wesley, Boston, MA, 1969), with the numbered Cα trace inside a partly transparent, smoothed tube of the low-resolution electron density map. He also drew oxyhemoglobin separately, but since animation was not possible on the printed page, he could only suggest the large conformational change by outlining in red the smaller central opening in the oxy form.

Those myoglobin and hemoglobin structures were what motivated Dave Richardson to take on solving the structure of Staphylococcal nuclease as his PhD project in inorganic (!) chemistry with Al Cotton at MIT, where Jane then joined the group as a technician. Stories about our punch-card days in the 1960s can be found in (5. Richardson J.S. Richardson D.C. Studying and polishing the PDB's macromolecules. Biopolymers. 2013; 99: 170-182; 6. Richardson J.S. Richardson D.C. Doing molecular biophysics: Finding, naming, and picturing signal within complexity. Annu. Rev. Biophys. 2013; 42: 1-28).
Later, when Chris Anfinsen persuaded Jane to take on the task of drawing and classifying all the extant protein structures for Advances in Protein Chemistry (7. Richardson J.S. The anatomy and taxonomy of protein structure. Adv. Protein Chem. 1981; 34: 167-339), she got a number of useful tips from Geis, such as showing backbone and side chain in different colors, and that sometimes inconsistent representations can work better: e.g., Geis' full-bond red for backbone carbonyls and all of the heme versus half-bond colors elsewhere (see Fig. 1A) and Jane's thick β ribbons versus thin α ribbons versus round-rope loops (see ribbon figures below). In the other direction, Geis later made a painting for his friend Fred Richards that included a moonlit ribonuclease S in Jane's ribbon style (8. Lillie B. What Wikipedia taught me about my grandfather. Atlantic Monthly. 2014). His ribonuclease by itself can be seen in the digital Geis archive at the RCSB (http://pdb101.rcsb.org/sci-art/geis-archive/gallery/rcsb-0002-ribonuclease-s), and the entire painting can be seen in (8), by Richards' grandson Ben Lillie, who owns the painting.

When Dave and Jane Richardson built a brass model of their Staphylococcal nuclease structure (1sns) (9. Arnone A. Bier C.J. Cotton F.A. Day V.W. Hazen E.E. Richardson D.C. Yonath A. Richardson J.S. A high resolution structure of an inhibitor complex of the extracellular nuclease of Staphylococcus aureus. I. Experimental procedures and chain tracing. J. Biol. Chem. 1971; 246: 2302-2316), they were struggling to show the backbone "fold" clearly. Chris Anfinsen, for whom the nuclease was a disulfide-free model system to study protein folding, suggested we tie 1/4" Tygon tubing along the backbone, fill it with fluorescent dye, and view it under UV light (Fig. 
2, A and B). The result looks much like the earliest computer graphics. Jane made a primitive "worm" drawing from that UV-lit image (Fig. 2C), later done in a form more transitional toward ribbons for a few small structures such as insulin (Fig. 2D), and Dave learned to draw a worm of the hemoglobin tetramer on the chalkboard for his class. Those were precursors to the ribbon drawings, further inspired by analogy with MC Escher's "ribbon" or "rind" drawings of human heads that alternate a spiral strip of surface with an equal opening, cleverly placed to include just enough of eye, nose, mouth, hair, and shoulders so that you perceive the face as intact and also see the back side of the head (see explanation in (10. Richardson J.S. Art and analogy help scientists "see" big biological molecules in 3D. SciArt Mag. 2018), or just google "MC Escher ribbon head"). Jane worked until she could see, and then hopefully make viewers see, her β ribbons as continuously H-bonded β sheets, not as a handful of separate arrows.

Not being even an amateur artist, Jane spent an entire year developing effective conventions for the ribbons, practicing how to draw them, and drawing the 75 distinct protein domain structures then known, plus another year writing the long review article they illustrate (7). Fortunately, she was then a nearly invisible Associate whose time was not closely scrutinized except by Dave. She had built or helped build the initial models for four proteins (1sns staphylococcal nuclease, 1gch γ-chymotrypsin, 1sod/2sod Cu,Zn superoxide dismutase, and 1ebx erabutoxin B), and so she was able to visualize in her head the peptide orientations and H-bonding from stereo images of the Cα-trace.
The pioneering computer graphics system in Richard Feldman's laboratory at the National Institutes of Health in Bethesda allowed interactive choice of viewing direction for a Cα-trace, mono printout at a consistent scale, plus printout of small Cα stereo pairs. Jane drove up there often from Duke, displaying coordinates of different structures either from the PDB or from Feldman's microfiche atlas (11. Feldman R.J. AMSOM: Atlas of Macromolecular Structure on Microfiche. Tracor Jitco, Inc, Rockville, MD, 1976) and bringing home the printouts. The ribbon-drawing conventions for α helix, β sheet, and nonrepetitive loops are shown in Figure 3A, with the Cα-trace behind for the loop case; Figure 3B shows the pencil sketch for Staphylococcal nuclease on tracing paper over the mono Cα printout; each such sketch involved much trial-and-error erasing and redrawing until it looked right. The final pen-and-ink drawing was done on heavy tracing paper over the pencil drawing, then photographed by Dave to make the high-contrast, touched-up negative used for production of the review article (Fig. 3D shows a page of similar Greek-key fold structures) and distributed by us as a coloring book. Looking right also involves allowing for perceptual illusions caused by binocular vision, such as seeing both sides of an edge-on ribbon at once and seeing a bit under each side of the front strand when a strand behind crosses at a low angle (Fig. 3C); both of those cases occur in the drawing of our Cu,Zn superoxide dismutase ("SOD"; (12. Richardson J.S. Thomas K.A. Richardson D.C. Alpha-carbon coordinates for bovine Cu,Zn superoxide dismutase. Biochem. Biophys. Res. Commun. 1975; 63: 986-992)), shown in Figure 3D. The first shaded versions were done with stick-on plastic cut to fit with an X-Acto knife, in halftone dots for black and white and in color for slides. Stick figures were often added for ligands or for critical sidechains.
Dave's mother, a professional artist (see (13. Chuse A.R. Costume Design. Bridgman Publishing, Pelham, NY, 1935) for an example of her work), encouraged Jane to try other media: pastels, which are wonderful for enlarged versions but were done only for TIM (Fig. 4A; 1tim; (14. Banner D.W. Bloomer A.C. Petsko G.A. Phillips D.C. Wilson I.A. Atomic coordinates for triose phosphate isomerase from chicken muscle. Biochem. Biophys. Res. Commun. 1976; 72: 146-155)) and for SOD (Fig. 3D) because it is so hard to keep from smudging them; scratchboard (Fig. 4B); stained glass; and even a small sculpture of glued-together pieces cut from a 1 ¼-in diameter wooden closet pole with our miter-bevel saw (Fig. 4C), which made the cover of Biophysical Journal (15. Richardson J.S. Richardson D.C. Tweedy N.B. Gernert K.M. Quinn T.P. Hecht M.H. Erickson B.W. Yan Y. McClain R.D. Donlan M.E. Looking at proteins: Representations, folding, packing, and design. Biophysical Society National Lecture, 1992. Biophys. J. 1992; 63: 1185-1209). Other people also make ribbon sculptures, such as Byron Rubin (http://5reed.edu/reed-magazine/articles/2019/byron-rubin-protein-sculptor.html), Julian Voss-Andreae (http://julianvossandrea.com), and Bathsheba Grossman (https://bathsheba.com/crystal/). A recent Duke Library exhibit (https://exhibits.library.duke.edu/exhibits/show/invisible/case04) about our work showed that ribbon drawings go surprisingly well with gothic arches (Fig. 4D).

Many of our hand-drawn ribbons and our computer graphics images are available with open license on Wikimedia Commons at https://commons.wikimedia.org/wiki/User:Dcrjsr. The TIM barrel drawing (Fig. 4A) was Wikipedia Picture of the Day on November 19, 2009. More information and links are available on Wikipedia pages for Ribbon diagram, Kinemage, and Jane S. Richardson.
Our first exposure to interactive use of computer graphics for crystallography itself was the graphics laboratory in Fred Brooks' computer science department at UNC Chapel Hill. The user console of GRIP-75 (Fig. 5A) had both stereo and smooth rotation, and a wonderful array of five knob, slider, and "toothpick" controls; it was the first computer system capable of fitting an atomic model into an electron density map (16. Britton E.G. Lipscomb J.S. Pique M.E. Making nested rotations convenient for the user. SIGGRAPH Comput. Graph. 1978; 12: 222-227). The calculations were done on an IBM 360 that filled the room behind. We were guinea pigs in its development, and our 2-Å resolution structure of Cu,Zn superoxide dismutase (2sod; "SOD") was the first protein crystallographic model built on computer graphics before building a physical model (17. Richardson D.C. The three-dimensional structure of Cu,Zn superoxide dismutase. In: Michelson, McCord, Fridovich. Superoxide and Superoxide Dismutase. Academic Press, Cambridge, MA, 1977: 217-223; 18. Tainer J.A. Getzoff E.D. Beem K.M. Richardson J.S. Richardson D.C. Determination and analysis of the 2 Å structure of copper, zinc superoxide dismutase. J. Mol. Biol. 1982; 160: 181-217). Figure 5B is a snapshot from the GRIP-75 screen, showing the single SS bond in SOD. A parade of users followed until GRIP-75 was gradually superseded by Frodo (19. Jones T.A. A graphics model building and refinement system for macromolecules. J. Appl. Crystallogr. 1978; 11: 268-272), which could run on hardware accessible to individual laboratories. We stayed on for 20 years, providing driving problems for the UNC graphics laboratory and collaborative graphics projects. That was enormous fun and taught us a great deal about making our own graphics systems.
When we got our own Evans & Sutherland PS300 calligraphic display, people were starting to try out a variety of ribbon-like representations, but most were too literal and choppy. The first version we really liked was in Mike Carson's Ribbons program (20. Carson M. Bugg C.E. Algorithm for ribbon models of proteins. J. Mol. Graph. 1986; 4: 121-122). He gave us his algorithm for smooth B-splines with offsets for helix and turn radius, and Dave coded it as multistrand vector ribbons such as the intertwined dimer of Trp repressor in Figure 6A.

Then in 1990 we realized that the first Apple home desktop computer could smoothly rotate 500 vectors, enough to show interesting molecular visualizations in 3D or simple animations of conformational change, accessible to anyone. On a hiking trip in the Sierras we worked out the framework for "kinemage" format, including that it is human editable and that any display point clicked on will tell you its identity. Dave then wrote the Mage program to display kinemages, with many interactive viewing and even editing options (21. Richardson D.C. Richardson J.S. The kinemage: A tool for scientific communication. Protein Sci. 1992; 1: 3-9); about the same time RasMol was also developed for home computers (22. Sayle R.A. Milner-White E.J. RASMOL: Biomolecular graphics for all. Trends Biochem. Sci. 1995; 20: 374). Kinemages became a major part of the then-revolutionary electronic supplement to the new Protein Science journal, distributed each month on 4K diskettes. The first examples were very simple: an active site or overall Cαs, such as animating the hinge motions of T4 lysozyme domains to illustrate a paper in that journal issue (Fig. 6B). As the capabilities of personal computers rapidly evolved, larger models and even ribbons could be shown interactively (Fig. 6C).
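To make the two ideas in this passage concrete (a smooth B-spline guided by backbone points, and a human-editable vector format in which every point carries its own identity), here is a minimal Python sketch. It is not Carson's algorithm and not the Mage code: it uses a plain uniform cubic B-spline with no helix or turn offsets, the guide coordinates are invented, and the output follows the kinemage keywords (`@kinemage`, `@group`, `@vectorlist`, and `P`/`L` point flags) as we understand them, so treat the exact syntax as an assumption to check against the format documentation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def bspline_point(p0: Point, p1: Point, p2: Point, p3: Point, t: float) -> Point:
    """Evaluate one uniform cubic B-spline segment at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6.0
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6.0
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0
    b3 = t**3 / 6.0
    return tuple(b0 * a + b1 * b + b2 * c + b3 * d
                 for a, b, c, d in zip(p0, p1, p2, p3))


def smooth_trace(guides: Sequence[Point], steps: int = 4) -> List[Point]:
    """Sample a smooth curve guided by (not passing through) the guide points."""
    if len(guides) < 4:
        raise ValueError("need at least four guide points")
    curve: List[Point] = []
    for i in range(len(guides) - 3):
        seg = guides[i:i + 4]
        for s in range(steps):
            curve.append(bspline_point(*seg, s / steps))
    curve.append(bspline_point(*guides[-4:], 1.0))  # end point of the last segment
    return curve


def to_kinemage(curve: Sequence[Point], title: str = "ribbon sketch") -> str:
    """Write the curve as one connected polyline in kinemage-style text."""
    lines = ["@kinemage 1", f"@group {{{title}}}",
             "@vectorlist {smoothed trace} color= sea"]
    for n, (x, y, z) in enumerate(curve):
        flag = "P" if n == 0 else "L"  # P starts a new polyline, L draws to the point
        # The text in braces is the point ID reported when the point is picked.
        lines.append(f"{{pt {n}}}{flag} {x:.3f}, {y:.3f}, {z:.3f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical guide points standing in for successive C-alpha positions.
    guides = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 1.0, 0.0),
              (11.4, 1.0, 1.0), (15.2, 0.0, 1.0)]
    print(to_kinemage(smooth_trace(guides)))
```

Because the output is plain text with one labeled point per line, it can be edited by hand, which is the property the format was designed around.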
These days, ribbon graphics are limited in size and complexity mainly by what can be usefully perceived by the viewer, rather than by what can be calculated and displayed interactively, and other new representations have taken over for very large molecules, complexes, and even whole cells (see the Goodsell perspective). A gallery of some favorite mid-size ribbons of proteins and complexes is shown in Figure 7, A–F, from a small trimer to protein–DNA complexes and membrane proteins. The first two of these were created in Mage, and the rest in KiNG (23. Davis I.W. Murray L.W. Richardson J.S. Richardson D.C. MOLPROBITY: Structure validation and all-atom contact analysis for nucleic acids and their complexes. Nucleic Acids Res. 2004; 32: W615-W619; 24. Chen V.B. Davis I.W. Richardson D.C. KING (Kinemage, Next Generation): A versatile interactive molecular and scientific visualization program. Protein Sci. 2009; 18: 2403-2409), a Java kinemage viewer for online as well as local use and with many additional editing and remodeling features.

An interesting sideline is that the kinemage format is not limited to molecules but is expressed as geometric primitives (line, point, triangle, etc.). Kinemages have been made for social networks, food webs, geodesic domes, a map of the world, etc. One example plotted measurements of a vulture flight path in 3D, which unexpectedly showed that vultures do not actually circle but fly in squared-off patterns parallel and perpendicular to wind direction, just as a small plane would.

After working on early protein design, which could then achieve correct secondary structures and topologies but not well-ordered native-like structure (15. Richardson J.S. Richardson D.C. Tweedy N.B. Gernert K.M. Quinn T.P. Hecht M.H. Erickson B.W. Yan Y. McClain R.D. Donlan M.E. Looking at proteins: Representations, folding, packing, and design. 
Biophysical Society National Lecture, 1992. Biophys. J. 1992; 63: 1185-1209), we realized that one of the inherent problems was something we could work to overcome: if we changed from the standard implicit "united-atom" treatment to including all explicit H atoms, and then paid attention to their contacts, we could provide a realistic, quantifiable, and visualizable analysis of detailed atom–atom internal packing. The atom–atom contact dots we use are more or less the inverse of Connolly dots that outline the accessible surface of molecules (25. Connolly M. Solvent-accessible surfaces of proteins and nucleic acids. Science. 1983; 221: 709-713). Our Reduce program places all explicit H atoms and optimizes their rotations and interaction networks for H-bonds, van der Waals (vdW), and clashing overlaps (26. Word J.M. Lovell S.C. LaBean T.H. Taylor H.C. Zalis M.E. Presley B.K. Richardson J.S. Richardson D.C. Visualizing and quantifying molecular goodness-of-fit: Small-probe contact dots with explicit hydrogen atoms. J. Mol. Biol. 1999; 285: 1711-1733). This all-atom contact analysis visualizes how the atomic surfaces interact, in contrast to traditional energetics, which is computed pairwise from atom centers and is only numerical, not visual. Figure 8A shows the undulating, interdigitated dots for vdW contacts around the H atoms on a proline sidechain at high resolution (1ezm; (27. Thayer M.M. Flaherty K.M. McKay D.B. Three-dimensional structure of the elastase of Pseudomonas aeruginosa at 1.5-Å resolution. J. Biol. Chem. 1991; 266: 2864-2871)). Green dots are for perfect atom–atom contact, blue for very close but not quite touching, and yellow for just a bit tight. Figure 8B shows a transition-state inhibitor held very tightly in ribonuclease (1ruv; (28. Ladner J.E. Wladkowski B.D. Svensson L.A. Sjölin L. Gilliland G.L. 
X-ray structure of a ribonuclease A-uridine vanadate complex at 1.3 Å resolution. Acta Crystallogr. D Biol. Crystallogr. 1997; 53: 290-301)) by 13 good H-bonds (pillows of pale green dots). Figure 8C shows the base-stacking vdW contacts and base-pair H-bonds in an RNA double helix of a mutant G riboswitch at 1.35-Å resolution (4fen; (29. Stoddard C.D. Widmann J. Trausch J.J. Marcano-Velázquez J.G. Knight R. Batey R.T. Nucleotides adjacent to the ligand-binding pocket are linked to activity tuning in the purine riboswitch. J. Mol. Biol. 2013; 425: 1596-1611)). The all-atom contact system is confirmed by the fact that the well-ordered parts of high-resolution experimental structures have almost no clashes, good density of H-bonds, and interdigitated hydrogens to give excellent vdW contacts. That system also provided a new and powerful way to find misfit local conformations in the experimental models, primarily by "clashes," defined as unfavorable overlaps ≥0.4 Å. That discovery motivated our establishment of the MolProbity validation web service, discussed below.

After the first ribosomes came out in 2000, our new student Laura Murray led us into revisiting our early fascination with RNA at MIT in the 1960s, when our friend Sung-Hou Kim solved the first structure of a complexly folded RNA (Fig. 9A), dramatically revising expectations about RNA structure beyond flat secondary-structure layouts such as the tRNA "cloverleaf." The ribosome added more than an order of magnitude more data, enabling a realistic analysis of the hard-to-see and inherently high-dimensional RNA backbone conformations.
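Backbone conformation in this sense is described by dihedral (torsion) angles, so each nucleotide or suite becomes a point in a six- or seven-dimensional space. The Python sketch below is an illustration, not the authors' software: it computes one dihedral from four atom positions and regroups per-residue angles from a nucleotide-based list into sugar-to-sugar suites. The Greek-letter angle names and the suite ordering (δ, ε, ζ of one residue followed by α, β, γ, δ of the next) follow standard nucleic acid nomenclature rather than anything spelled out in this text, and the function and dictionary-key names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1: Vec, p2: Vec, p3: Vec, p4: Vec) -> float:
    """Signed dihedral angle in degrees about the p2-p3 bond."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Assumed suite layout: three angles from residue i-1, four from residue i.
SUITE_ORDER = (("delta", -1), ("epsilon", -1), ("zeta", -1),
               ("alpha", 0), ("beta", 0), ("gamma", 0), ("delta", 0))


def suites(residues: Sequence[Dict[str, float]]) -> List[Tuple[float, ...]]:
    """Turn per-residue dihedral dictionaries into 7-angle suite tuples."""
    out = []
    for i in range(1, len(residues)):
        out.append(tuple(residues[i + off][name] for name, off in SUITE_ORDER))
    return out
```

A chain of N residues yields N - 1 suites, each of which can then be plotted or clustered as a single 7D point.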
There are six variable dihedral angles in a PO4-to-PO4 RNA nucleotide, but we discovered that the seven-dihedral sugar-to-sugar suite parsing of the backbone shows better clustering because it includes more backbone interaction than the nucleotide as well as being directly influenced by the interaction of successive bases. After trying a system of cross-picking between two 3D windows, Dave in Mage and Vincent Chen in KiNG developed an integrated high-dimensional technique for displaying and analyzing that data. It works by defining "views" for specific triples of parameters, choosing and coloring the points in an apparent cluster (Fig. 9B), and then checking cluster validity in other 3D combinations and also in parallel coordinates (Fig. 9C). That work culminated in a collaboration of five different laboratories to agree on a consensus library of RNA backbone conformers with two-character names or !! (bang-bang) for outliers (30. Richardson J.S. Schneider B. Murray L.W. Kapral G.J. Immormino R.M. Headd J.J. Richardson D.C. Ham D. Hershkovits E. Williams L.D. Keating K.S. Pyle A.M. Micallef D. Westbrook J. Berman H.M. et al. RNA backbone: Consensus all-angle conformers and modular string nomenclature (an RNA Ontology Consortium contribution). RNA. 2008; 14: 465-481). The 7D graphics are for doing the research, whereas the results allow anyone to meaningfully name and visualize an RNA motif such as a stack switch, an S-motif, or a GNRA tetraloop (Fig. 9D). When combined with all-atom contacts, such an image is even more informative and can guide correction of RNA validation outliers.

The DiVE (Duke immersive Virtual Environment) 3-m³ virtual reality lets us explore whether immersive virtual reality could be set up to enable molecular research as well as to enhance intuition through a compelling visual and interactive experience.
Working with the Visualization Technology group at Duke, we jointly programmed KinImmerse display of the kinemage format in the DiVE, with atom selection, identification, and measurements (Fig. 10A; (31. Block J.N. Zielinski D.J. Chen V.B. Davis I.W. Vinson E.C. Brady R. Richardson J.S. Richardson D.C. KinImmerse: Macromolecular VR for NMR ensembles. Source Code Biol. Med. 2009; 4: 3)). Our test case was bringing to NMR structural ensembles the direct visual comparison of model to data that we rely on in crystallography by looking at the model within the electron density map. Basic NMR data measures very local atomic relationships: atom-pair distance by nuclear Overhauser effect (NOE) values and atom-pair orientation to the magnetic field by residual dipolar coupling (RDC) values, and so is well suited to an aligned and zoomed-in perspective within the ensemble. We provided that perspective by a simple but novel tool called cocentering (on a specific atom type, which can be toured through the sequence) to then compare in each residue how well each model fits the relevant local data. NOE data are easily represented as a vector between the two specific atoms, but an RDC measurement means the atom–atom orientation lies somewhere along a pair of curves. Those curves can be drawn around one of the atoms in the ensemble of models (such as the backbone N for an NH RDC), but that was never done until cocentering made it understandable. Figure 10B shows the elegant RDC curves around the cocentered N of Glu 36 in 2jng. In this case some of the models must be wrong, since the H atoms should all be on the same branch of the RDC curve. Which group of models is correct (the white ones) can be decided by other validation criteria such as H-bonding, clashes, and φ,ψ values. This system was later ported back to single-screen viewing, but might not have been thought of there.
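The all-atom contact categories described earlier in this review (green for perfect contact, blue for close but not quite touching, yellow for slightly tight, and a clash for overlaps of 0.4 Å or more) amount to classifying the gap between two atoms' van der Waals surfaces. The Python sketch below shows that classification under stated assumptions. Only the 0.4 Å clash cutoff and the color meanings come from the text; the other numeric boundaries (`TOUCH_TOL`, `MAX_GAP`) and the function names are illustrative choices, not the values used by the authors' programs, and H-bonds are ignored entirely.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]

CLASH_OVERLAP = 0.4  # from the text: overlaps >= 0.4 A count as clashes
TOUCH_TOL = 0.1      # assumed: |gap| within this counts as "perfect" contact
MAX_GAP = 0.5        # assumed: beyond this gap the atoms are not in contact


def surface_gap(a: Vec, b: Vec, radius_a: float, radius_b: float) -> float:
    """Distance between vdW surfaces; negative values mean overlap."""
    return math.dist(a, b) - (radius_a + radius_b)


def classify_contact(gap: float) -> str:
    """Map a surface gap in Angstroms to a contact category."""
    if gap > MAX_GAP:
        return "none"
    if gap > TOUCH_TOL:
        return "blue"    # very close but not quite touching
    if gap >= -TOUCH_TOL:
        return "green"   # surfaces essentially touching
    if gap > -CLASH_OVERLAP:
        return "yellow"  # slightly tight
    return "clash"       # overlap of 0.4 A or more
```

For example, two atoms 3.0 Å apart with assumed radii of 1.5 Å each have a gap of zero and fall in the green category, while the same pair at 2.5 Å overlaps by 0.5 Å and is reported as a clash.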
Our MolProbity web service adds other validation measures to the all-atom contacts (bond length, bond angle, and Cβ deviations, Ramachandran φ,ψ dihedrals, sidechain rotamers, and RNA sugar puckers and backbone conformers). In addition to sequence-based charts and summary statistics, it outputs visual markup for the local outliers on an interactive image of the 3D model. Each type of outlier has a distinct color and shape, and they stand out clearly on a simple Cα-backbone model (Fig. 11A). The user can then zoom in and overlay the electron density map to see the full details, especially for a cluster of multiple outliers close in 3D, such as the helix start in Figure 11B. MolProbity gradually became considered state of the art in model validation. However, model validation, if considered just as flagging problems, would be a very depressing and unpopular activity; what makes it positive and even exciting is the prospect of helping to fix many of those problems.

The cluster of clashes and rotamer outliers in 1LpL (32. Li S. Finley J. Liu Z.-J. Qiu S.-H. Chen H. Luan C.-H. Carson M. Tsao J. Johnson D. Lin G. Zhao J. Thomas W. Nagy L.A. Sha B. DeLucas L.J. et al. Crystal structure of the cytoskeleton-associated protein glycine-rich (CAP-Gly) domain. J. Biol. Chem. 2002; 277: 48596-48601) at upper left in Figure 11A, and expanded in Figure 11B along with the map and difference-density peaks, was diagnosed as an incorrect switch of backbone versus sidechain at the 136 Cα atom. Local rebuilding with the correct switch choice at 136 and alternate conformations for Lys 137 removed those outliers and enabled building an additional turn of α helix into the previously unoccupied density (Fig. 11C). Our proof-of-concept correction of this 1.77-Å structure lowered the Rwork by only 1% but the Rfree cross-validation measure by 4% (33. Arendall W.B. Tempel W. Richardson J.S. Zhou W. Wang S. Davis I.W. Liu Z.-J. Rose J.P. 
Carson W.M. Luo M. Richardson D.C. Wang B.-C. A test of enhancing model accuracy in high-throughput crystallography. J. Struct. Funct. Genomics. 2005; 6: 1-11), indicating that the model changes are true improvements: only two tiny clashes and a new sulfate replacing a tight cluster of waters (Fig. 11D).

Over the past 20 years, this system of validations and visualizations has enabled both individual structural biologists and automated programs to significantly improve the quality of worldwide PDB (wwPDB) depositions at the most common resolutions from sub-1 to about 2.5 Å. Recently, however, both X-ray crystallography and most especially cryo-EM (34. Chiu W. Downing K.H. Editorial overview: Cryo electron microscopy: Exciting advances in CryoEM herald a new era in structural biology. Curr. Opin. Struct. Biol. 2017; 46: iv-viii) have succeeded in solving large numbers of huge, dynamic "molecular machines," mostly at resolutions poorer than 2.5 Å. These new structures make revolutionary contributions to our knowledge of biology, but that lower resolution introduces new systematic errors insensitive to traditional validation metrics.
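For readers unfamiliar with the Rwork and Rfree values quoted above for the 1LpL correction, the Python sketch below shows the conventional crystallographic R-factor, the summed absolute difference between observed and calculated structure-factor amplitudes divided by the summed observed amplitudes, evaluated separately on the reflections used in refinement (work) and on a held-out set (free). This is a textbook definition rather than something stated in this review; the sketch assumes the calculated amplitudes are already on the same scale as the observed ones, and the function names and toy numbers are hypothetical.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), amplitudes pre-scaled."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    num = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    den = sum(abs(o) for o in f_obs)
    return num / den


def r_work_and_free(f_obs: Sequence[float], f_calc: Sequence[float],
                    is_free: Sequence[bool]) -> Tuple[float, float]:
    """Split reflections by the free-set flag and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


if __name__ == "__main__":
    # Toy amplitudes, not real data.
    print(r_work_and_free([10.0, 20.0, 30.0, 40.0],
                          [9.0, 22.0, 30.0, 36.0],
                          [False, False, False, True]))
```

Because the free reflections are never used to adjust the model, a drop in Rfree is harder to obtain by overfitting than a drop in Rwork, which is why the larger Rfree improvement is read as evidence of a genuinely better model.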

Reference(s)