The Annotation of Both Human and Mouse Kinomes in UniProtKB/Swiss-Prot
2008; Elsevier BV; Volume: 7; Issue: 8; Language: English
10.1074/mcp.r700001-mcp200
ISSN 1535-9484
Authors: S. Braconi Quintaje, Sandra Orchard
Topic(s): Glycosylation and Glycoproteins Research
Abstract

Biomolecule phosphorylation by protein kinases is a fundamental cell signaling process in all living cells. Following the comprehensive cataloguing of the protein kinase complement of the human genome (Manning, G., Whyte, D. B., Martinez, R., Hunter, T., and Sudarsanam, S. (2002) The protein kinase complement of the human genome. Science 298, 1912–1934), this review will detail the state-of-the-art human and mouse kinase proteomes as provided in the UniProtKB/Swiss-Prot protein knowledgebase. The sequences of the 480 classical and up to 24 atypical protein kinases now believed to exist in the human genome and 484 classical and up to 24 atypical kinases within the mouse genome have been reviewed and, where necessary, revised. Extensive annotation has been added to each entry. In an era when a wealth of new databases is emerging on the Internet, UniProtKB/Swiss-Prot makes available to the scientific community the most up-to-date and in-depth annotation of these proteins with access to additional external resources linked from within each entry. Incorrect sequence annotations resulting from errors and artifacts have been eliminated. Each entry will be constantly reviewed and updated as new information becomes available, with the orthologous enzymes in related species being annotated in a parallel effort and complete kinomes being assembled as sequences become available. This ensures that the mammalian kinomes available from UniProtKB/Swiss-Prot are of a consistently high standard with each separate entry acting both as a valuable information resource and a central portal to a wealth of further detail via extensive cross-referencing.

In the late 1950s, the role of reversible phosphorylation in enzymatic regulation was recognized by Fischer et al. (1: Fischer E.H., Kent A.B., Krebs E.G. The muscle phosphorylase b kinase reaction. J. Biol. Chem. 1958; 231: 73-83). Phosphorylation events are most commonly mediated by protein kinases, which transfer the γ-phosphate from nucleotides, usually ATP, via a phosphoester bond (O-phosphate) to the hydroxyl side chain of serine, threonine, or tyrosine residues on their protein substrates.
Phosphates are bulky, negatively charged groups, and their addition to a protein can profoundly change its interactions with other molecules, its subcellular location, and/or the conformation of the protein itself. Kinase-mediated protein phosphorylation can be reversed through dephosphorylation, after which the protein switches back to its original charge state and conformation. As protein conformation often determines function, the phosphorylation event may be considered a type of molecular switch, turning the activity of the molecule on or off. These reversible and dynamic phosphorylation events are under tight control, being governed by the opposing activities of protein kinases and protein phosphatases.

Footnote 1: The abbreviations used are: ePK, eukaryotic protein kinase; aPK, atypical protein kinase; UniProtKB, UniProt KnowledgeBase; CDK, cyclin-dependent kinase; CAMK, Ca2+/calmodulin-dependent protein kinase; EC, Enzyme Classification; PH, pleckstrin homology; SH, SRC homology; NEK, NIMA-related kinase; MAPK, mitogen-activated protein kinase; MAP, mitogen-activated protein; CK, casein kinase; AID, autoinhibitory domain; FGFR2, fibroblast growth factor receptor 2; DOK1, docking protein 1; SORBS1, sorbin and SH3 domain-containing protein 1; CABLES1, CDK5 and ABL1 enzyme substrate 1; INPPL1, inositol polyphosphate phosphatase-like protein 1; PRKDC, DNA-dependent protein kinase catalytic subunit.

Eukaryotic protein kinases (ePKs; abbreviations are listed in footnote 1) play a key role in cell communication pathways and in the transmission of information from outside the cell or between subcellular components within the cell. The ePKs constitute one of the largest mammalian gene families, comprising ∼1.7–2.5% of genes in eukaryotic genomes. Most protein kinases belong to a single superfamily, containing a conserved ePK catalytic domain that consists of a mainly β-sheet, NH2-terminal subdomain and a larger α-helical COOH-terminal subdomain with the ATP-binding pocket situated between the two subdomains. Serine/threonine protein kinases constitute the majority of kinases (67%) within the human kinome; however, tyrosine protein kinases (17%) also play a key role in signaling mechanisms, particularly in cell-cell communication in multicellular organisms (2: Manning G., Whyte D.B., Martinez R., Hunter T., Sudarsanam S. The protein kinase complement of the human genome. Science. 2002; 298: 1912-1934).
The remainder, the atypical protein kinases (aPKs), lack sequence similarity to the ePK catalytic domain but are known to have catalytic activity.

THE CONTENT OF A UniProt KnowledgeBase (UniProtKB)/Swiss-Prot KINASE ENTRY

The UniProtKB consists of two sections (3: The UniProt Consortium. The universal protein resource (UniProt). Nucleic Acids Res. 2007; 35: D193-D197). UniProtKB/Swiss-Prot contains records combining full manual annotation with computer-assisted, manually verified annotation performed by biologists and biochemists and based on published literature and sequence analysis. UniProtKB/TrEMBL contains records with computationally generated annotation and large scale functional characterization. UniProtKB/Swiss-Prot records provide an integrated presentation of annotations such as protein name and function, taxonomy, enzyme-specific information (catalytic activity, cofactors, metabolic pathway, and regulatory mechanisms), domains and sites, post-translational modifications, subcellular locations, tissue-specific or developmentally specific expression, interactions, and diseases. Literature citations provide evidence from experimental data, which, along with feedback information from contacted authors, is regarded as information of the highest value and is constantly being added to each record as it becomes available.

To aid the user and to enable text miners to make maximal use of this wealth of knowledge, this information is added to a UniProtKB/Swiss-Prot record in specific comment fields and, where possible, following a defined syntax and utilizing a controlled vocabulary. It is clearly indicated within the record when experimental evidence has been transferred from an orthologous protein in a closely related species. The references from which the data have been collected are retained within the entry, and the information extracted from each publication is also clearly described (Fig. 1).

As part of the manual curation process to create the UniProtKB/Swiss-Prot record for each gene product, each related sequence in the database is examined; splice variants and amino acid polymorphisms are identified, and sequencing errors, such as frameshifts or premature stop codons, are corrected. All the information is documented within the Swiss-Prot record such that the user may trace it back to its original source if required. A feature table at the end of each entry maps protein domains, active sites, binding sites, modified residues, and other sequence features onto the given sequence.

THE PROTEIN KINASE CONTENT OF THE HUMAN GENOME

In 2002 Manning et al. (2) predicted the existence of 518 typical protein-coding protein kinase genes within the human kinome based on the then current public and proprietary genomic DNA, complementary DNA, and expressed sequence tag sequences. These 518 kinases were further subdivided into 478 ePKs and 40 aPKs. At the time of the publication of that study, many of the sequences were not available in the public domain, and the sequencing of the human genome was far from complete. We now believe the number of ePKs to be 480 with the divergent isoforms of PRKG1 accessible in UniProtKB release 13.1 (March 18, 2008) from two separate entries (Q13976 and P14619). One sequence described by Manning et al. (2), SgK424, may have been an erroneous prediction in that we can find no firm evidence for its existence, although part of the sequence is identifiable within the current genome build. A further three proteins were recognized subsequent to the Manning et al. (2) publication. Two gene products have now been identified within a duplicated genomic region that gives rise to two proteins, CDC2L1 (P21127) and CDC2L2 (Q9UQ88), that differ by only 15 amino acids (4: Gururajan R., Lahti J.M., Grenet J.A., Easton J., Gruber I., Ambros P.F., Kidd V.J. Duplication of a genomic region containing the Cdc2L1–2 and MMP21–22 genes on human chromosome 1p36.3 and their linkage to D1Z2. Genome Res. 1998; 8: 929-939). The second of these, CDC2L2, was not on the original list. PAN3 (Q58A45), a subunit of a poly(A)-specific ribonuclease complex (5: Uchida N., Hoshino S., Katada T. Identification of a human cytoplasmic poly(A) nuclease complex stimulated by poly(A)-binding protein. J. Biol. Chem. 2004; 279: 1383-1391), also contains a previously unidentified kinase domain, although this appears to be catalytically inactive. Finally, PLK5 was originally found only in mouse, but a human homologue (Q496M5) has now been identified.

THE PROTEIN KINASE CONTENT OF THE MOUSE GENOME

An initial analysis of the mouse kinome was published by Caenepeel et al. (6: Caenepeel S., Charydczak G., Sudarsanam S., Hunter T., Manning G. The mouse kinome: discovery and comparative genomics of all mouse protein kinases. Proc. Natl. Acad. Sci. U. S. A. 2004; 101: 11707-11712) in 2004, in which they identified a complement of 540 gene products, 510 of which are orthologues of the human enzymes. Our analysis suggests a total of 484 ePKs. As with the human kinome, the 97 pseudogenes identified by Caenepeel et al. (6) have not been annotated in the UniProtKB/Swiss-Prot database. Eight kinases were originally identified as being present in human but not mouse; however, one of these, cyclin-dependent kinase 3 (CDK3) (Q80YP0), was identified as a transcribed mouse pseudogene with a single premature-termination mutation within the kinase domain, near the T loop that is involved in activation by CDK-activating kinase; the mutation deletes motif X, which is known to be required for kinase function (7: Ye X., Zhu C., Harper J.W. A premature-termination mutation in the Mus musculus cyclin-dependent kinase 3 gene. Proc. Natl. Acad. Sci. U. S. A. 2001; 98: 1682-1686). Consequently, the mutation yields a truncated protein and hence a null allele. This mutation is found in laboratory strains but not in wild mouse species such as Mus spretus and Mus musculus castaneus. Because of the existence of at least one mRNA (BC116895 + BC119894) and one expressed sequence tag (BY709505) confirming the full-length protein CDK3, an annotated version in the UniProtKB/Swiss-Prot database has been made publicly available.

Finally, Caenepeel et al. (6) described a group of microtubule affinity-regulating kinase-related CAMKs that could not be separated because of high sequence similarity and the then poor genome assembly. This group of kinases is encoded on the t-complex, a region of 20–30 Mb on the proximal third of mouse chromosome 17. Naturally occurring variant forms of the t-complex, known as complete t-haplotypes, are found in wild mouse populations.
The t-haplotypes contain at least four nonoverlapping inversions that suppress recombination with the wild-type chromosome and lock into strong linkage disequilibrium loci affecting normal transmission of the chromosome, male gametogenesis, and embryonic development. To date, 10 protein kinases have been identified in this region, all appearing to play a role in sperm motility.

ATYPICAL PROTEIN KINASES

Atypical protein kinases lack sequence similarity with the classical ePKs and are often deficient in the usual kinase motifs, although many possess a common kinase-like structural fold described by a SUPERFAMILY hidden Markov model (InterPro entry IPR011009) (8: Mulder N.J., Apweiler R., Attwood T.K., et al. New developments in the InterPro database. Nucleic Acids Res. 2007; 35: D224-D228), suggesting a shared ancestry for these proteins. A recent study of the structural evolution of the kinase family suggested that these atypical kinases diverged early in evolution to form a distinct phylogenetic group. The study encompassed a broader group of kinases, including enzymes capable of phosphorylating small molecules, such as choline kinase, and lipids. The exception to this appeared to be the α-kinases, such as α-protein kinase 1 (Q96QP1), a small family of enzymes that recognize phosphorylation sites in which the surrounding peptides have an α-helical conformation and that contain a zinc finger motif.
This group appears to have arisen fairly recently in eukaryotes, and the authors speculate that this may be due to a single gene event such as a deletion of the COOH-terminal end of the gene or a gene fusion event. Members of this family are identified within a UniProtKB/Swiss-Prot record by the SIMILARITY comment “Belongs to the protein kinase superfamily.”

The aPKs can only be identified by functional experimentation, and since the publication of the original list by Manning et al. (2), several more proteins that appear to display protein kinase activity have been identified, for example COL4A3BP (Q9Y5P4), CPNE3 (O75131), and GTF2F1 (P35269). For some of these, only a single report exists, and the observation has yet to be confirmed; in these cases, the reported kinase activity may only be added as a CAUTION comment. As a result, our classification of a protein as an aPK is somewhat more conservative than that of Manning et al. (2), including only 24 human proteins and 24 mouse proteins. However, where a related domain has been found within a sequence, this fact is recorded within the entry: many proteins identified by Manning et al. (2) as atypical fall within this category. It is anticipated that, as time progresses, bench scientists may find more proteins with an atypical kinase mechanism. Manning et al. (2) also identified 106 pseudogenes with similarity to either an ePK or aPK; however, UniProtKB is only concerned with the annotation of proteins, so the existence and the significance of these have not been investigated any further by this particular group.

INACTIVE PROTEIN KINASES

Manning et al. (2) predicted that up to 50 human protein kinases would be catalytically inactive because of the loss of one or more highly conserved amino acids of either the HRD motif (that precedes the catalytic loop at the active site) or DFG motif (in the activation segment) or would contain, in addition to a fully functional protein kinase domain, a second pseudokinase domain that has lost its catalytic function but may have acquired some new but unknown functions (2; 9: Wilks A.F., Harpur A.G., Kurban R.R., Ralph S.J., Zuercher G., Ziemiecki A. Two novel protein-tyrosine kinases, each with a second phosphotransferase-related catalytic domain, define a new class of protein kinase. Mol. Cell. Biol. 1991; 11: 2057-2065). Within UniProtKB/Swiss-Prot, those ePK proteins that lack catalytic activity are tagged with the DOMAIN comment “The protein kinase domain is predicted to be catalytically inactive”; 23 proteins are recognized as such within the human kinase set and 22 within the mouse set. Where the alternative function of these proteins is known, this has been annotated in full.
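Because the inactive-kinase tag is a fixed sentence in the DOMAIN comment, entries carrying it can in principle be picked out by a simple text scan. The following Python sketch illustrates the idea under stated assumptions: it assumes the classic UniProtKB flat-file layout, in which comment blocks start with `CC   -!- TOPIC:`, continuation lines start with `CC`, accessions sit on `AC` lines, and records end with `//`. The two sample records, their identifiers (`EXAMPLE1_HUMAN`, `X00001`, and so on), and the helper names are invented for illustration and are not real database entries or part of any UniProt tool.

```python
import re

# Exact DOMAIN comment wording quoted in the text above.
INACTIVE_TEXT = "The protein kinase domain is predicted to be catalytically inactive"

def iter_records(flat_text):
    """Yield each '//'-terminated record as a list of lines."""
    record = []
    for line in flat_text.splitlines():
        if line.strip() == "//":
            if record:
                yield record
            record = []
        else:
            record.append(line)
    if record:
        yield record

def comments(record):
    """Return (topic, text) pairs for the CC '-!-' comment blocks."""
    blocks = []
    for line in record:
        if not line.startswith("CC"):
            continue
        body = line[2:].strip()
        if body.startswith("---"):  # assumed start of a trailing notice block
            break
        match = re.match(r"-!-\s+([A-Z][A-Z -]*):\s*(.*)", body)
        if match:
            blocks.append([match.group(1), match.group(2)])
        elif blocks:
            blocks[-1][1] += " " + body  # continuation of the previous comment
    return [(topic, text) for topic, text in blocks]

def primary_accession(record):
    """Return the first accession on the first AC line, or None."""
    for line in record:
        if line.startswith("AC"):
            return line[2:].strip().split(";")[0]
    return None

def predicted_inactive(flat_text):
    """List accessions whose DOMAIN comment carries the inactive-kinase tag."""
    hits = []
    for record in iter_records(flat_text):
        if any(topic == "DOMAIN" and INACTIVE_TEXT in text
               for topic, text in comments(record)):
            hits.append(primary_accession(record))
    return hits

# Hypothetical records, invented for this sketch.
SAMPLE = """\
ID   EXAMPLE1_HUMAN          Reviewed;         500 AA.
AC   X00001;
CC   -!- DOMAIN: The protein kinase domain is predicted to be catalytically
CC       inactive.
CC   -!- SIMILARITY: Belongs to the protein kinase superfamily.
//
ID   EXAMPLE2_HUMAN          Reviewed;         400 AA.
AC   X00002; X00003;
CC   -!- SIMILARITY: Belongs to the protein kinase superfamily.
//
"""

print(predicted_inactive(SAMPLE))  # ['X00001']
```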
However, it should be noted that there are kinases that would be predicted to be inactive but do in fact retain full catalytic activity, for example members of the WNK (with no lysine) family, so some of these predictions of inactivity may yet prove to be overpredictions. Those proteins with an active kinase domain and a second pseudokinase domain, such as JAK1 (P23458), are described as such within the entry.

PROTEIN KINASE NOMENCLATURE

As part of the annotation process, UniProtKB strives to provide a unified name for both gene and protein product that describes the function of the enzyme while retaining a unique identification that is recognizable by the scientific community; it is most commonly based on the gene name (Fig. 2). The primary name follows a series of rules, which have been made public by the UniProtKB consortium, that enable a protein to be assigned a recommended name providing the maximal amount of information about that gene product while still enabling its propagation across orthologues in other organisms. The database works closely with other nomenclature groups, such as the HUGO Gene Nomenclature Committee (10: Bruford E.A., Lush M.J., Wright M.W., Sneddon T.P., Povey S., Birney E. The HGNC Database in 2008: a resource for the human genome. Nucleic Acids Res. 2008; 36: D445-D448) and the Mouse Genome Informatics (11: Eppig J.T., Bult C.J., Kadin J.A., et al. The Mouse Genome Database (MGD): from genes to mice - a community resource for mouse biology. Nucleic Acids Res. 2005; 33: D471-D475), which are cross-referenced from within the appropriate entries. However, many kinases are already known by more than one well accepted name, and where it proves impossible to include these within the primary name, they are retained within the entry as synonyms to allow ease of searching. All entries have a stable accession number and a human-readable identifier. The latter, however, should not be regarded as stable because identifiers are occasionally updated, for example when a gene name changes. The activity of the kinase, as described by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology, is captured within the protein description as the Enzyme Classification (EC) number (12: Fleischmann A., Darsow M., Degtyarenko K., et al. IntEnz, the integrated relational enzyme database. Nucleic Acids Res. 2004; 32: D434-D437).

Fig. 2. Protein nomenclature: ABL1 (P00519).

PROTEIN KINASE FUNCTION AND REGULATION

Protein kinases can be said to share a single function: the transfer of a phosphate group from ATP to a protein substrate molecule. However, the range of protein substrates targeted by these several hundred molecules is very broad, and the downstream consequences of each phosphorylation event are correspondingly varied. This level of detail is impossible to predict and must be collected by careful and thorough reading of the literature. In many cases, the exact target of a kinase may not be known, although the process regulated by the enzyme has been identified.
Both the target and downstream effect are summarized in the FUNCTION comment in a UniProtKB/Swiss-Prot entry, but this field is currently empty for many kinases, reflecting the huge amount of work that still remains to be undertaken in the laboratory. Additional information, such as the tissues in which these proteins are expressed and the subcellular locations at which they are found, is also annotated in the appropriate fields.

Additional annotation is included by the incorporation of Gene Ontology cross-references (13: Harris M.A., Clark J., Ireland A., et al.; Gene Ontology Consortium. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004; 32: 258-261), via the Gene Ontology Annotation project (14: Camon E., Magrane M., Barrell D., et al. The Gene Ontology Annotation (GOA) Database: sharing knowledge in UniProt with Gene Ontology. Nucleic Acids Res. 2004; 32: D262-D266), to which UniProtKB curators as well as curators from the model organism databases contribute.

However, the generic function of a protein kinase can be described, and indeed this is done within the CATALYTIC ACTIVITY statement added to every enzyme classified by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology EC system. The regulation of protein kinases tends to be at the level of protein binding to, and/or phosphorylation of, each kinase enzyme in a manner that is often either family- or subfamily-specific rather than via changes in protein expression. The existence and potential significance of these phosphorylation sites are again collected from the experimental literature and are detailed in both the ENZYME REGULATION comment and the feature table. Where a protein is known to be post-translationally modified, for example by amino acid phosphorylation, but the exact amino acid position and/or the effect of this modification is unknown, this information is stated in a POST-TRANSLATIONAL MODIFICATION comment rather than in the feature table. Further post-translational modifications may be required before a kinase can be activated by phosphorylation; for example, both the palmitoylation and myristoylation of some kinases are believed to position the protein at a membrane surface, possibly resulting in a required conformation change. In addition, there are secondary mechanisms that can adjust the level of protein kinase activity of a phosphorylated protein and may even target it for destruction in the case of protein ubiquitination.

In any process governed by protein phosphorylation, a fine balance between the activities of protein kinases and protein phosphatases is vital to cellular physiologic function. Dysregulation of this balance may lead to abnormal cell growth giving rise to complex diseases such as cancer.
Protein phosphatases are classified into subfamilies such as serine/threonine-specific, tyrosine-specific, and dual specificity phosphatases. Each of these subclasses has a well conserved but distinct catalytic protein domain, all of which are described in the UniProtKB/Swiss-Prot database.

SPLICE VARIANTS

As previously stated, one major task of any UniProtKB curator is the annotation of protein sequences, which includes the identification of splice isoforms. Each of these is given a stable and unique identifier and may be recreated from the feature table of a single UniProtKB entry using the freely available tool VARSPLIC (15: Kersey P., Hermjakob H., Apweiler R. VARSPLIC: alternatively-spliced protein sequences derived from Swiss-Prot and TrEMBL. Bioinformatics. 2000; 11: 1048-1049) or is directly accessible within a UniProt entry viewed over the Web (Fig. 3). A FASTA-formatted file containing all splice variants annotated in UniProtKB/Swiss-Prot can be downloaded for use with similarity search programs. The differing domain composition of these isoforms can be viewed within InterPro. Protein kinases appear to be highly alternatively spliced with a further 510 isoforms identified, at time of going to press, adding to the 480 human ePKs. This gives an average of 1.06 additional splice variants per entry for the kinome (510/480 ≈ 1.06) as opposed to 0.64 for the entire proteome, suggesting that this highly important family of signal transduction proteins requires an increased level of protein variation to maintain a subtle degree of control over cellular processes. It must be noted, however, that this family of proteins has been more intensively studied than the proteome as a whole, and this may account for the higher number of splice isoforms identified. However, as many of the transcripts identified in both sets come from high thro