Identification and classification of papain-like cysteine proteinases
2023; Elsevier BV; Volume: 299; Issue: 6; Language: English
10.1016/j.jbc.2023.104801
ISSN: 1083-351X
Authors: Fatih Ozhelvaci, Kamil Steczkiewicz
Topic(s): Enzyme Production and Characterization
Abstract: Papain-like cysteine peptidases form a large and highly diverse superfamily of proteins involved in many important biological functions, such as protein turnover, deubiquitination, tissue remodeling, blood clotting, virulence, defense, and cell wall remodeling. The high sequence and structure diversity observed within these proteins hinders their comprehensive classification as well as the identification of new representatives. Moreover, in general protein databases, many families already classified as papain-like lack details regarding their mechanism of action or biological function. Here, we use transitive remote homology searches and 3D modeling to newly assign 21 families to the papain-like cysteine peptidase superfamily. We attempt to predict their biological function and provide structural characterization of 89 protein clusters, defined based on sequence similarity, altogether spanning 106 papain-like families. Moreover, we systematically discuss the observed diversity in sequences, structures, and catalytic sites. Finally, we expand the list of human papain-related proteins by seven representatives, including dopamine receptor-interacting protein 1 as a potential deubiquitinase and the centriole duplication-regulating protein CEP76 as retaining a catalytically active peptidase-like domain. The presented results not only provide structure-based rationales for already existing peptidase databases but may also inspire further experimental research focused on peptidase-related biological processes.
Peptidases constitute a class of enzymes catalyzing peptide bond breakage in either proteins or other biologically relevant molecules. They are indispensable for protein turnover, digestion, signaling, tissue remodeling, and many other processes. Peptidases adopt two major catalytic mechanisms involving a water molecule or hydrophilic residue to attack the peptide bond. Hence, the most general classification of peptidases is based on the type of catalytic residue: cysteine, serine, threonine, aspartate, and glutamate. A separate class of peptidases utilizes metal ions for catalysis; these define a separate clade named metalloproteases. This classification has been detailed at the level of protein families and clans and compiled into today's peptidase reference, the MEROPS database.
The currently known peptidase universe is organized according to detectable similarity in sequence and function, as well as experimental data. Each peptidase catalytic type may be scaffolded by a range of structural folds. For instance, cysteine peptidases may retain cysteine proteinase, caspase-like, sortase, eIF1-like, Ntn hydrolase–like, or trypsin-like serine protease folds. In other words, they may belong to multiple protein superfamilies. Furthermore, each peptidase-related superfamily may cover many protein families, which is especially the case for cysteine proteinases, zincin-like metallopeptidases, and trypsin-like serine proteases. On the other hand, a given superfamily may also group peptidases of multiple catalytic types, like trypsin-like proteases (cysteine and serine type) and Ntn hydrolase–like peptidases (cysteine, serine, and threonine), which might indicate a convergent evolution of catalytic sites. For such broader superfamilies, despite the clear-cut classifications provided, it is still challenging to fully comprehend their diversity and rationally describe it in the light of the biological functions performed.
Papain-like cysteine proteinases, a subclass of cysteine peptidases, constitute a large and diverse superfamily of proteins from all kingdoms of life, including many viruses (1. Page M.J. Di Cera E. Evolution of peptidase diversity. J. Biol. Chem. 2008; 283: 30010-30014; 2. Rawlings N.D. Bateman A. Origins of peptidases. Biochimie. 2019; 166: 4-18). Papain, the archetypal member of the superfamily, is a protease extracted from the papaya plant used in native Indian countries for tenderizing uncooked meat, and it was one of the first discovered enzymes (3. Martin S.H. Papaïn-digestion. J. Physiol. 1885; 5: 213-230).
Papain-like enzymes act as endopeptidases, dipeptidases, and exopeptidases (carboxypeptidases and aminopeptidases) or cleave amide groups outside the protein's main chain, for example, glutaminases, acyltransferases, and peptidoglycan amidases. They may be secreted to play roles as tissue remodelers (4. Vidak E. Javoršek U. Vizovišek M. Turk B. Cysteine cathepsins and their extracellular roles: shaping the microenvironment. Cells. 2019; 8: 264) or virulence/defense factors (5. Peng C.W. Peremyslov V.V. Mushegian A.R. Dawson W.O. Dolja V.V. Functional specialization and evolution of leader proteinases in the family Closteroviridae. J. Virol. 2001; 75: 12153-12160; 6. Jensen M.H. Otten H. Christensen U. Borchert T.V. Christensen L.L.H. Larsen S. et al. Structural and biochemical studies elucidate the mechanism of rhamnogalacturonan lyase from Aspergillus aculeatus. J. Mol. Biol. 2010; 404: 100-111) or function as deubiquitinases important for multiple regulatory pathways (7. Lin S.-C. Chung J.Y. Lamothe B. Rajashankar K. Lu M. Lo Y.-C. et al. Molecular basis for the unique deubiquitinating activity of the NF-kappaB inhibitor A20. J. Mol. Biol. 2008; 376: 526-540). They function as accessory peptidases in multiple viruses (8. Choi G.H. Pawlyk D.M. Nuss D.L. The autocatalytic protease p29 encoded by a hypovirulence-associated virus of the chestnut blight fungus resembles the potyvirus-encoded protease HC-Pro. Virology. 1991; 183: 747-752; 9. Liang Y. Yao J. Gillam S. Rubella virus nonstructural protein protease domains involved in trans- and cis-cleavage activities. J. Virol. 2000; 74: 5412-5423; 10. Snijder E.J. Wassenaar A.L. van Dinten L.C. Spaan W.J. Gorbalenya A.E. The arterivirus nsp4 protease is the prototype of a novel group of chymotrypsin-like enzymes, the 3C-like serine proteases. J. Biol. Chem. 1996; 271: 4864-4871) and are used as toxins by many bacterial species (11. Ashida H. Sasakawa C. Suzuki T. A unique bacterial tactic to circumvent the cell death crosstalk induced by blockade of caspase-8. EMBO J. 2020; 39: e104469; 12. Gavin H.E. Beubier N.T. Satchell K.J.F. The effector domain region of the Vibrio vulnificus MARTX toxin confers biphasic epithelial barrier disruption and is essential for systemic spread from the intestine. PLoS Pathog. 2017; 13: e1006119; 13. Kitadokoro K. Kamitani S. Miyazawa M. Hanajima-Ozawa M. Fukui A. Miyake M. et al. Crystal structures reveal a thiol protease-like catalytic triad in the C-terminal region of Pasteurella multocida toxin. Proc. Natl. Acad. Sci. U. S. A. 2007; 104: 5139-5144; 14. Zhang D. de Souza R.F. Anantharaman V. Iyer L.M. Aravind L. Polymorphic toxin systems: comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics. Biol. Direct. 2012; 7: 18). Some families also underwent significant expansions, for example, deubiquitinating enzymes, with paralogs highly specialized for particular substrates (15. Komander D. Clague M.J. Urbé S. Breaking the chains: structure and function of the deubiquitinases. Nat. Rev. Mol. Cell Biol. 2009; 10: 550-563; 16. Nijman S.M.B. Luna-Vargas M.P.A. Velds A. Brummelkamp T.R. Dirac A.M.G. Sixma T.K. et al. A genomic and functional inventory of deubiquitinating enzymes. Cell. 2005; 123: 773-786).
Papain-like cysteine proteases share a common structural core classified in the SCOP database (17. Murzin A.G. Brenner S.E. Hubbard T. Chothia C. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995; 247: 536-540) as an α + β fold (d.3, cysteine proteinases) comprising an α-helix followed by an antiparallel β-sheet composed of four or five β-strands (depending on database version) with 43215 topology (Fig. 1).
The active site is located between the helix, which provides the catalytic cysteine residue at its N terminus, and β-strands 2 and 3, which deliver histidine and a polar group, respectively. These two strands are also slightly bent away from each other, creating a convenient pocket for aligning the incoming substrate for catalysis. A minimal canonical catalytic site of cysteine proteinases is formed by a cysteine–histidine dyad (18. Barrett A.J. Rawlings N.D. Evolutionary lines of cysteine peptidases. Biol. Chem. 2001; 382: 727-733; 19. Storer A.C. Ménard R. Catalytic mechanism in papain family of cysteine peptidases. Met. Enzymol. 1994; 244: 486-500). The thiol group of the cysteine acts as a nucleophile, whereas histidine acts as a base and forms a reactive thiolate–imidazolium ion pair during catalysis (20. Rzychon M. Chmiel D. Stec-Niemczyk J. Modes of inhibition of cysteine proteases. Acta Biochim. Pol. 2004; 51: 861-873). In some known proteins, however, the distance between histidine and cysteine is too large for this effect, suggesting that the nucleophilicity of the cysteine, located at the N terminus of the α-helix, might be primarily increased by the α-helix dipole effect itself (21. Davies C. Heath R.J. White S.W. Rock C.O. The 1.8 A crystal structure and active-site architecture of beta-ketoacyl-acyl carrier protein synthase III (FabH) from Escherichia coli. Structure. 2000; 8: 185-195). Still, the majority of cysteine proteinases deploy a catalytic triad with an additional polar residue orienting histidine's imidazole ring toward cysteine and increasing its basicity (22. Moult J. Sussman F. James M.N. Electron density calculations as an extension of protein structure refinement. Streptomyces griseus protease A at 1.5 A resolution. J. Mol. Biol. 1985; 182: 555-566). The topology of the dyad–triad is conserved across the superfamily with only a few exceptions known from structural studies.
For instance, GP42 transglutaminase from Phytophthora sojae lacks histidine at its canonical position; this residue is located immediately after the cysteine on the α-helix and forms a salt bridge with aspartate from another nearby α-helix (23. Reiss K. Kirchner E. Gijzen M. Zocher G. Löffelhardt B. Nürnberger T. et al. Structural and phylogenetic analyses of the GP42 transglutaminase from Phytophthora sojae reveal an evolutionary relationship between oomycetes and marine Vibrio bacteria. J. Biol. Chem. 2011; 286: 42585-42593); human Atg4b has its histidine located on the third β-strand, two positions after aspartate (24. Kumanomidou T. Mizushima T. Komatsu M. Suzuki A. Tanida I. Sou Y.-S. et al. The crystal structure of human Atg4b, a processing and de-conjugating enzyme for autophagosome-forming modifiers. J. Mol. Biol. 2006; 355: 612-618); transglutaminase from Streptomyces mobaraense has histidine and aspartate residues swapped (25. Kashiwagi T. Yokoyama K.-I. Ishikawa K. Ono K. Ejima D. Matsui H. et al. Crystal structure of microbial transglutaminase from Streptoverticillium mobaraense. J. Biol. Chem. 2002; 277: 44252-44260).
The observed abundance of functions, specificities, taxonomic distributions, and ecological contexts entails a great diversity in structures and sequences of papain-like peptidases. Despite sharing a common structural core, these proteins display a remarkably rich repertoire of insertions (e.g., the "fingers" domain of deubiquitinases (26. Avvakumov G.V. Walker J.R. Xue S. Finerty Jr., P.J. Mackenzie F. Newman E.M. et al. Amino-terminal dimerization, NRDP1-rhodanese interaction, and inhibited catalytic domain conformation of the ubiquitin-specific protease 8 (USP8). J. Biol. Chem. 2006; 281: 38061-38070)), deletions (e.g., viral accessory proteins (27. Sun Y. Xue F. Guo Y. Ma M. Hao N. Zhang X.C. et al. Crystal structure of porcine reproductive and respiratory syndrome virus leader protease Nsp1alpha. J. Virol. 2009; 83: 10931-10940)), permutations (28. Iyer L.M. Koonin E.V. Aravind L. Novel predicted peptidases with a potential role in the ubiquitin signaling pathway. Cell Cycle. 2004; 3: 1440-1450; 29. Xu Q. Rawlings N.D. Chiu H.-J. Jaroszewski L. Klock H.E. Knuth M.W. et al. Structural analysis of papain-like NlpC/P60 superfamily enzymes with a circularly permuted topology reveals potential lipid binding sites. PLoS One. 2011; 6: e22013; 30. Anantharaman V. Aravind L. Evolutionary history, structural features and biochemical diversity of the NlpC/P60 superfamily of enzymes. Genome Biol. 2003; 4: R11), and overall spatial deteriorations. This hinders the classification and identification of new representatives of this superfamily, even at the level of their structures, which tend to display diversity surpassing the capabilities of automated structure comparison approaches. This project aims to collect and classify sequences and structures of all known cysteine proteinases, distill a set of features describing these proteins, and identify new superfamily members.
The SCOP database assigns 24 families to the "cysteine proteinases" superfamily (d.3.1), whereas Pfam, a protein family database, classifies 78 protein families into the Peptidase_CA clan (CL0125), including proteases, acetyltransferases, transglutaminases, deubiquitinases, amidases, and multiple domains of unknown function (DUFs). MEROPS defines five clans (CA, CE, CN, CO, and CP) spanning 54 families crosslinked to 40 Pfam families from the Peptidase_CA clan. In addition, five more MEROPS families, that is, C7, C23, C27, C36, and C42, remain unassigned to any MEROPS clan despite linking to the aforementioned Pfam clan.
There have been few reviews discussing papain-like families to date. In 1994, Rawlings and Barrett (31. Rawlings N.D. Barrett A.J. Families of cysteine peptidases. Met. Enzymol. 1994; 244: 461-486) identified 21 cysteine peptidase families, of which, according to the authors, three were related to papain (C1, C2, and C10). From the current point of view, that dataset included 12 papain-like, three caspase-like, and two trypsin-like families. They also recognized that cysteine peptidases may differ in His–Cys order within amino acid sequences; however, because of the lack of structural data, it was hardly possible to recognize circular permutations at that time. A year later, they classified 35 peptidase families into five clans: CA, CB, CC, CD, and CE (32. Barrett A.J. Rawlings N.D. Families and clans of cysteine peptidases. Perspect. Drug Discov. Des. 1996; 6: 1-11), and the next follow-up article covered seven clans and 48 MEROPS families, of which two clans and 27 families were related to papain (18). In 1999, Koonin's group (33. Makarova K.S. Aravind L. Koonin E.V. A superfamily of archaeal, bacterial, and eukaryotic proteins homologous to animal transglutaminases. Protein Sci. 1999; 8: 1714-1719) described a novel papain-like superfamily of transglutaminases containing seven protein families; according to the current Pfam database, five of those families are now included in Transglut_core PF01841 (families numbered 1, 4, 5, 6, and 7 in the original article), one in DUF553 PF04473 (no. 3), and one family has not been defined in Pfam yet (no. 2 therein). Further work on the identification and classification of papain-like enzymes was done by Anantharaman and Aravind (30), who described the NlpC/P60 superfamily as related to papain.
The report included four major protein families, NLPC_P60, CHAP, C92, and lecithin retinol acyltransferase (LRAT), and systematically discussed circular permutations and evolutionary relationships within the superfamily. Furthermore, based on sequence similarity searches and biological context analyses, PPPDE proteins could be identified as papain-like enzymes (28). Finally, two major works aimed to provide a general view of the evolutionary history of the peptidase universe. In 2008, Page and Di Cera (1) analyzed the distribution of MEROPS families within sequenced genomes to identify sets of ubiquitous (16 families) and kingdom-specific peptidases. Similarly, in 2019, Rawlings and Bateman (2) surveyed the distribution of 271 peptidase families across the tree of life and found 33 to be common to all cellular organisms. To date, there has been neither a comprehensive analysis of papain-like cysteine peptidase diversity that would discuss the complex sequence-to-structure-to-function relationships nor a systematic search for new superfamily members.
The aim of this report is to identify new members of the papain-like superfamily as well as to categorize the diversity of these proteins. By using sequence-based remote homology detection methods, we newly classify 28 Pfam and three MEROPS families, seven human proteins, as well as 16 Protein Data Bank representatives (PDB90) not yet belonging to any family as papain-like.
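The idea of a transitive homology search, in which families detected in one round become queries for the next, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the `max_rounds` parameter, and the family names in `toy_hits` are hypothetical, and the hit table stands in for whatever significance-filtered search results a real run would produce.

```python
from collections import deque


def transitive_search(seeds, hits, max_rounds=3):
    """Collect families reachable from the seeds through chains of hits.

    seeds: starting family names (assumed already known superfamily members).
    hits: dict mapping a query family to the families it matched; filtering
          by a significance threshold is assumed to have happened upstream.
    max_rounds: maximum number of search rounds (hops) away from a seed.
    Returns a dict {family: number of hops from the nearest seed}.
    """
    depth = {seed: 0 for seed in seeds}
    queue = deque(depth)
    while queue:
        query = queue.popleft()
        if depth[query] >= max_rounds:
            continue  # do not search further from families at the hop limit
        for target in hits.get(query, ()):
            if target not in depth:  # first time this family is reached
                depth[target] = depth[query] + 1
                queue.append(target)
    return depth


# Invented toy data; these are placeholders, not real family relationships.
toy_hits = {
    "seed_family": ["family_A"],
    "family_A": ["family_B", "seed_family"],
    "family_B": ["family_C"],
}
found = transitive_search(["seed_family"], toy_hits, max_rounds=2)
print(found)
```

With two rounds allowed, `family_B` is reached only through `family_A`, while `family_C` stays outside the hop limit. Limiting the number of rounds is one simple way to keep such chains from drifting into unrelated families; in practice each new candidate would still need manual verification.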
Based on manually curated superimpositions, we provide structure-based multiple sequence alignments, for the first time outlining the most complete view of the diversity within all known families retaining the cysteine proteinase fold.
Starting from the initial set of 78 Pfam and 59 MEROPS families of papain-like proteins, we identified an additional 28 Pfam and 30 MEROPS families, 25 Clusters of Orthologous Groups of proteins (COG) and 47 EuKaryotic Orthologous Group (KOG) families, as well as 21 PDB90 structures and seven human proteins not belonging to any Pfam family (Table S1). The dataset comprises over 4.8 million nonredundant protein sequences, 2.2 million of which belong to the single family PF08715 of coronavirus papain-like peptidases. The Markov clustering (MCL) algorithm based on hmmscan scores defined 89 groups of closely related families represented by Pfam, PDB90, COG/KOG, MEROPS, or human proteins (Fig. 2 and Table S1). Further clustering based on HHsearch scores, which reflect more distant similarities, allowed us to define 10 bigger "meta" groups, each containing at least two Pfam families, which share multiple structural features.
Figure 2. Structure-guided multiple sequence alignment of structural core elements for proteins representing all identified papain-like families. The columns from the left: group number, cluster number, MEROPS clan, MEROPS family, Pfam family, representative protein identifier (either National Center for Biotechnology Information GenBank entry or Protein Data Bank code). Tags for families newly classified as papain-like in this article are written in red font. The numbers within the alignment area denote residues omitted from the alignment: in parentheses, insertions between two elements; in square brackets, the C-terminal part; without any brackets, the number of the first residue in the following block. Sequences written in italics indicate permuted elements, whereas sequences in turquoise are written in reverse and correspond to structural elements oriented in a reverse direction in 3D space; these sequences are followed by their first residue number. Sequence conservation is marked with highlights as follows: yellow, nonpolar; gray, charged; black/green/red, confirmed or predicted active-site residues (red for proteins with serine instead of cysteine); magenta, Asp/Arg residues characteristic of NlpC/P60-like proteins.
Despite being assigned to the Pfam CL0125 clan, eight protein families were still annotated as DUFs with no additional information regarding their probable biological roles and sequence–structure alignment to already known papain-like domains. Peptidase family C8 (PF03569), belonging to the CL0125 Pfam clan, could not be confirmed as papain-like according to the methods used. Moreover, two of the identified families, PF18021 and PF18022, were assigned to the Trefoil clan (CL0066), probably because of the beta-trefoil lectin-like domain preceding the peptidase domain; here, we classify them as papain-like. Finally, 16 Pfam families, 13 of which retain conserved catalytic residues, had not been previously classified or annotated as related to papain-like cysteine peptidases. Seven human proteins, BIVM (Swiss-Prot [SP]: Q86UB2, basic immunoglobulin-like variable motif-containing protein), DRC7 (SP: Q8IY82, dynein regulatory complex subunit 7), CEP76 (SP: Q8TAP6, centrosomal protein), C2D2B (SP: Q6DHV5), C2D2A (SP: Q9P2K1), C14orf28 (SP: Q4W4Y0), and VRTN (SP: Q9H8Y1, vertnin), of which only three lack potential catalytic sites (C2D2A, C2D2B, and vertnin), had also not been previously recognized to contain papain-like domains.
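The Markov clustering step mentioned above alternates two operations on a column-stochastic matrix: expansion (matrix squaring, which spreads flow along paths) and inflation (element-wise powering, which strengthens strong connections). The following pure-Python sketch shows that loop on an invented toy score matrix. It is an illustration only: the study used the MCL program on hmmscan scores, and the self-loop weight, inflation value, iteration count, and cutoff `eps` here are assumptions chosen for the example.

```python
def normalise_columns(matrix):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(matrix)
    sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def mcl(scores, inflation=2.0, iterations=30, eps=1e-6):
    """Cluster items from a symmetric similarity matrix with a basic MCL loop."""
    n = len(scores)
    # Add self-loops (weight 1.0 is an arbitrary choice) and normalise.
    m = normalise_columns(
        [[scores[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(iterations):
        # Expansion: square the matrix to let flow spread along paths.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power and renormalise the columns.
        m = normalise_columns([[v ** inflation for v in row] for row in m])
    # Rows that still carry flow act as attractors; the columns they reach
    # form one cluster.
    clusters = {frozenset(j for j, v in enumerate(row) if v > eps) for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Invented toy scores: items 0-2 match each other, items 3-4 match each other.
toy_scores = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
clusters = mcl(toy_scores)
print(clusters)  # [[0, 1, 2], [3, 4]]
```

A higher inflation value generally produces finer clusters, which is why the same data can yield closely related groups at one setting and broader groups at another.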
Papain-like cysteine proteinases share a common structural core comprising an α-helix followed by, depending on the definition, four to five β-strands forming an antiparallel β-sheet. Based on structural alignments for the whole superfamily, we add a further β-strand to this definition (β1′), which, despite being highly variable, is still present in the majority of the families (Fig. 3A, PDB ID: 5JT8). Quite unusually for protein structural classifications, where fold definitions are generally clear and adequately represent common parts of protein structures, members of this superfamily adopt an extraordinary repertoire of alterations, which in many cases means a lack of the alleged core elements. Commonly recognized are circular permutations of the α-helix alone or together with β1′ (Fig. 3B, PDB ID: 3EBQ, and C, PDB ID: 3KW0), as has been thoroughly described, for example, for deubiquitinases (28) and NlpC/P60-like enzymes (29). The permutation events may also affect other elements and are often associated with a reversed direction of β-strands within the β-sheet. For instance, agglutinin from the mushroom Marasmius oreades (34. Cordara G. Manna D. Krengel U. Family of papain-like fungal chimerolectins with distinct Ca2+-dependent activation mechanism. Biochemistry. 2017; 56: 4689-4700) (Fig. 3D, PDB ID: 5MU9) has β-strand 1′ permuted and reversed; Legionella pneumophila SdeA deubiquitinase (Fig. 3E, PDB ID: 5CRA (35. Sheedlo M.J. Qiu J. Tan Y. Paul L.N. Luo Z.-Q. Das C. Structural basis of substrate recognition by a bacterial deubiquitinase important for dynamics of phagosome ubiquitination. Proc. Natl. Acad. Sci. U. S. A. 2015; 112: 15090-15095)) lacks β-strand 1′, and its β-strands 4 and 5 have reversed direction, the latter being in addition permuted; in human OTUD5 protein (Fig. 3F, PDB ID: 3PFY), β-strand 1′ is reversed, the fourth β-strand is missing, and β-strands 3 and 5 are permuted, the latter in addition reversed. Interestingly, the third β-strand of OTUD5 still harbors the active-site glutamine. Many viral papain-like peptidases are severely reduced in structure, missing β-strands 3 and 4, for example, Nsp1 protease from porcine reproductive and respiratory syndrome virus (27) (Fig. 3G, PDB ID: 3IFU), or are minimized to contain only an α-helix and two β-strands, like potyvirus helper–component proteinase (Fig. 3H, PDB ID: 3RNV). Another viral peptidase/deubiquitinase, from turnip yellow mosaic virus (36. Jupin I. Ayach M. Jomat L. Fieulaine S. Bressanelli S. A mobile loop near the active site acts as a switch between the dual activities of a viral protease/deubiquitinase. PLoS Pathog. 2017; 13: e1006714) (Fig. 3I, PDB ID: 5LWA), lacks β-strands 3 and 4, and its β-strands 1′ and 5 are reversed, the latter being in addition permuted.
The canonical catalytic triad of cysteine peptidases consists of cysteine, histidine, and a third polar residue (asparagine, glutamine, aspartate, or glutamate) and is indeed present in the majority of papain-like families (Fig. 4A, PDB ID: 5JT8). Although the cysteine residue is always located at the very N terminus of the core α-helix regardless of other structural aberrations, many alternative active-site architectures may be observed.
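The canonical definition just given can be turned into a small labeling rule, which is sometimes handy when scanning alignment columns for the three core positions. The sketch below is a hypothetical helper, not part of the published analysis: the function name, its arguments (one-letter residue codes at the helix N terminus, β-strand 2, and β-strand 3), and the returned labels are all assumptions made for illustration.

```python
# Third-position residues named in the text: Asn, Gln, Asp, Glu.
THIRD_RESIDUES = {"N", "Q", "D", "E"}


def classify_active_site(helix_nterm, strand2, strand3=None):
    """Label a putative active site from residues at the canonical positions.

    helix_nterm: residue at the N terminus of the core alpha-helix.
    strand2: residue at the canonical position on beta-strand 2.
    strand3: residue at the canonical position on beta-strand 3, or None
             when that strand is missing or carries no candidate residue.
    """
    if helix_nterm != "C":
        return "no catalytic cysteine"
    if strand2 != "H":
        # Covers swapped or relocated residues; these need manual inspection.
        return "noncanonical"
    if strand3 in THIRD_RESIDUES:
        return "canonical triad"
    return "Cys-His dyad"


print(classify_active_site("C", "H", "N"))  # canonical triad
print(classify_active_site("C", "H"))       # Cys-His dyad
print(classify_active_site("C", "D", "H"))  # noncanonical
```

A position-based rule like this only recognizes the canonical layout; sites whose residues have moved to other structural elements fall into the "noncanonical" bin and still require structural inspection.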
The most obvious is the reduction to a catalytic cysteine–histidine dyad, characteristic of multiple peptidases associated with virulence, for example, C32, C31, C07, C42, C06, C09, C23, C34, and C36, as well as of OTU-like deubiquitinases (clusters V and VII). Many of these proteins lack papain-like structural core elements beyond the second β-strand (harboring histidine) or retain the following elements in remnant or permuted form. Nevertheless, the structural permutation of β3 does not imply the lack of a third catalytic residue, as can be observed in C33, Ceg23, and LotA, nor does the presence of a complete structural core require the triad, for example, in C10, Amidase_6 (PF12671), C115, DUF2459 (PF09601), C97, and Calici_PP_N viral polyprotein (PF08405). The catalytic histidine may migrate from its canonical position to nearby structural elements and even lose its catalytic function. The microbial transglutaminase from Streptomyces (Fig. 4B, PDB ID: 3IU0, PDB ID: 1IU4) has its histidine residue switched with aspartate (forming a CDH triad), and aspartate now plays an essential role in catalysis (25). Similarly, transglutaminase from Bacillus (Fig. 4C, PDB ID: 4P8I) uses glutamate instead of histidine; however, this protein may also operate based on an alternative catalytic dyad of cysteine and another glutamate directly preceding it (37. Fernandes C.G. Plácido D. Lousa D. Brito J.A. Isidro A. Soares C.M. et al. Structural and functional characterization of an ancient bacterial transglutaminase sheds light on the minimal requirements for protein cross-linking. Biochemistry. 2015; 54: 5723-5734). DUF1839 (Fig. 4D, PF08893) proteins retain a C(T/S)H catalytic triad in which histidine has switched places with threonine or serine.
According to 3D modeling, the hydroxyl group of threonine/serine interacts with histidine and orients it toward cysteine. In the human Atg4b (Fig. 4E, PDB ID: 2D1I) (24) and UfSP1 (38. Ha B.H. Ahn H.-C. Kang S.H. Tanaka K. Chung C.H. Kim E.E.K. Structural basis for Ufm1 processing by UfSP1. J. Biol. Chem. 2008; 283: 14893-14900) deconjugating enzymes, the catalytic histidine is located on strand β3 after aspartate, resulting in a CDH triad where aspartate and histidine form an "Asp–Pro–His" box defining a unique group of the papain-like superfamily. A highly uncommon catalytic site may be found in GP42 transglutaminase (Fig. 4F, PDB ID: 3TW5), which has its histidine moved away from the canonical location to the position right after cysteine to form the "CH" motif interacting with a catalytically important asparagine (located on β2) or aspartate (located on another noncore α-helix) (23. Reiss K. Kirchner E. Gijzen M. Zocher G. Löffelhardt B. Nürnberger T. et al. Structur