Downloaded from https://academic.oup.com/database/article/doi/10.1093/database/baab012/6214407 by Arnold and Porter user on 17 January 2022
`
`Database, 2021, 1–20
`doi:10.1093/database/baab012
`Review
`
`Post-translational modifications in proteins:
`resources, tools and prediction methods
`Shahin Ramazi1,† and Javad Zahiri1,2,3,*,†
`1Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of
`Biological Sciences Tarbiat Modares University, Jalal Ale Ahmad Highway, P.O. Box: 14115-111,
`Tehran, Iran, 2Department of Neuroscience, University of California San Diego, La Jolla, CA, USA and
`3Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
`
`*Corresponding author: Email: Zahiri@modares.ac.ir
`†These authors contributed equally to this work.
`Citation details: Ramazi, S., Zahiri, J. Post-translational modifications in proteins: resources, tools and prediction methods.
`Database (2021) Vol. 2021: article ID baab012; doi:10.1093/database/baab012
`
`Received 12 July 2020; Revised 20 February 2021
`
`Abstract
`Posttranslational modifications (PTMs) refer to amino acid side chain modification in
`some proteins after their biosynthesis. There are more than 400 different types of PTMs
`affecting many aspects of protein functions. Such modifications happen as crucial molec-
`ular regulatory mechanisms to regulate diverse cellular processes. These processes
`have a significant impact on the structure and function of proteins. Disruption in PTMs
`can lead to the dysfunction of vital biological processes and hence to various diseases.
`High-throughput experimental methods for discovery of PTMs are very laborious and
`time-consuming. Therefore, there is an urgent need for computational methods and
`powerful tools to predict PTMs. There are vast amounts of PTMs data, which are publicly
`accessible through many online databases. In this survey, we comprehensively reviewed
`the major online databases and related tools. The current challenges of computational
`methods were reviewed in detail as well.
`
`Introduction
Posttranslational modifications (PTMs) are covalent processing events that change the
properties of a protein by proteolytic cleavage or by adding a modifying group, such as
acetyl, phosphoryl, glycosyl or methyl, to one or more amino acids (1). PTMs play a key
role in numerous biological processes by significantly affecting the structure and
dynamics of proteins (2, 3). Generally, a PTM can be reversible or irreversible (4).
Reversible PTMs are covalent modifications, whereas irreversible ones, which proceed in
one direction, include proteolytic modifications (5). PTMs occur on a single type of
amino acid or on multiple amino acids and change the chemical properties of the modified
sites (6). PTMs are usually seen in proteins with important structures or functions, such
as secretory proteins, membrane proteins and histones. These modifications affect a wide
range of protein behaviors and characteristics, including enzyme function and assembly
(7), protein lifespan, protein–protein interactions (8), cell–cell and cell–matrix
interactions, molecular trafficking, receptor activation, protein solubility (9–14),
protein folding (15) and protein localization (16).
`
`© The Author(s) 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
`Therefore, these modifications are involved in various bio-
`logical processes such as signal transduction, gene expres-
`sion regulation, gene activation, DNA repair and cell cycle
`control (17–19). PTMs occur in various cellular organelles
`including the nucleus, cytoplasm, endoplasmic reticulum
`and Golgi apparatus (5).
`Proximity ligation assay (PLA) is a novel immunoassay
`technology that can be used to study PTMs (20). In addi-
`tion to PLA, immunoprecipitation (IP) is utilized in several
`different PTM detection assays (21). However, the com-
`bination of mass spectrometry with IP strategy is a more
`effective method (22). Nevertheless, large-scale detection of
PTMs is very costly and challenging. In recent years, computational methods for
predicting PTMs have attracted considerable attention (5, 16, 17, 23–26).
The rest of this paper is structured as follows. The section ‘The 10 most studied PTMs’
describes the 10 most studied PTMs. The section ‘Involvement of PTMs in diseases and
biological processes’ discusses the involvement of PTMs in diseases and biological
processes. Major PTM databases are reviewed in the section ‘Main PTM databases’.
Computational methods for predicting PTMs are then described in the section
‘Computational methods for predicting PTMs’. Finally, tools for PTM prediction are
reviewed in the section ‘Tools for PTM prediction’.
`
`The 10 most studied PTMs
There are more than 400 different types of PTMs (27) affecting many aspects of protein
function. According to dbPTM (6), one of the most comprehensive PTM databases, there are
24 major PTMs, each with more than 80 experimentally verified modified sites. Figure 1
provides a visual summary of the current major PTM data according to dbPTM. It shows that
some of these major PTMs occur more frequently and have been studied much more than
others. The three main PTMs in dbPTM are phosphorylation, acetylation and ubiquitination,
which together comprise more than 90% (∼827 000 sites out of ∼908 000) of all the
reported PTMs. Each amino acid undergoes at least three different PTMs, and Lys undergoes
the largest number of PTMs (15 PTM types). Moreover, based on the whole dbPTM data, Cys
and Ser are each modified by at least 10 PTM types. Finally, phosphorylation on Ser is
the most reported PTM type.
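The share quoted above can be checked directly from the two site counts. The snippet below is only an arithmetic sketch; the variable names are illustrative, and the counts are the approximate figures given in the text.

```python
# Approximate site counts quoted in the text (dbPTM).
top_three_sites = 827_000  # phosphorylation + acetylation + ubiquitination
all_sites = 908_000        # all experimentally verified sites

# Fraction of all reported sites covered by the three main PTMs.
share = top_three_sites / all_sites
print(f"{share:.1%}")
```

The result is slightly above 90%, consistent with the "more than 90%" statement.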
Figure 1A shows a clustergram that divides the PTMs into four clusters. Phosphorylation
and acetylation each form a separate cluster because of their distinct patterns of
modification across the amino acids. On the other hand, ubiquitination, methylation and
amidation are PTMs with many different target residues and have been clustered as a
group. According to the clustergram, the amino acids fall into five clusters. Lys is the
most distinct amino acid in terms of its PTM pattern.
`Panels B and C in Figure 1 show the frequency of PTM
`types and amino acids in the dbPTM database in log scale,
`respectively. According to Figure 1, it is observed that phos-
`phorylation, acetylation and ubiquitination are the most
`frequent PTMs.
Roughly speaking, according to the type of modification, these PTMs can be categorized
into three main groups. The first and second groups comprise PTMs that add simple
chemical groups and complex groups to the target residue, respectively. The second group
includes glycosylation, prenylation, myristoylation and palmitoylation, which leaves
phosphorylation, acetylation, methylation and sulfation, among the PTMs described below,
in the first group. PTMs that add polypeptides to the target residue comprise the last
group; these are ubiquitylation and SUMOylation. Figure 2 shows a graphical timeline for
the discovery of these major PTMs. The timeline also depicts the organisms in which each
PTM was first discovered. In the following subsections, the 10 most studied PTMs, out of
these major ones, are described in more detail.
`
`Phosphorylation
Protein phosphorylation was first reported in 1906 by Phoebus Levene with the discovery
of phosphate in the protein vitellin (phosvitin) (28). However, it took nearly 50 years
before Eugene Kennedy described the first enzymatic phosphorylation of proteins (43).
This process is an important reversible regulatory mechanism that plays a key role in the
activities of many enzymes, membrane channels and many other proteins in prokaryotic and
eukaryotic organisms (44, 45). Phosphorylation target sites are Ser, Thr, Tyr, His, Pro,
Arg, Asp and Cys residues (6), but this modification mainly happens on Ser, Thr, Tyr and
His residues (46). In this PTM, kinase enzymes transfer a phosphate group from adenosine
triphosphate to the acceptor residues (Figure 3A). Conversely, dephosphorylation, the
removal of a phosphate group, is an enzymatic reaction catalyzed by different
phosphatases (47). Phosphorylation is the most studied PTM and one of the essential types
of PTM; it often occurs on target proteins in the cytosol or nucleus (48). This
modification can rapidly change the function of proteins in one of two principal ways: by
allostery or by binding to interaction domains (49).
`
`Figure 1. Summarized information of major PTMs (24 PTMs with more than 80 experimentally verified reported modified sites) according to the
`dbPTM databank (October 2020). All frequencies are shown in log scale. (A) Clustergram indicating the frequency of each PTM on different amino
`acids. (B) Frequency of major PTMs. (C) Frequency of each amino acid that was reported as a modified site.
`
`Phosphorylation has a vital role in significant cellular
`processes such as replication, transcription, environmental
`stress response, cell movement, cell metabolism, apop-
`tosis and immunological responsiveness (12, 50, 51). It
`has been shown that disruption in the pathway of phos-
`phorylation can lead to various diseases such as can-
`cer, Alzheimer’s disease, Parkinson’s disease and heart
`disease (24, 52, 53).
`
`Acetylation
`The first acetylation modification in proteins was discov-
`ered by V.G. Allfrey in 1964 in isolated calf thymus nuclei
`in vitro (31). Acetylation is catalyzed via lysine acetyl-
`transferase (KAT) and histone acetyltransferase (HAT)
`enzymes. Acetyltransferases use acetyl CoA as a cofac-
`tor for adding an acetyl group (COCH3) to the ε-amino
`group of lysine side chains, whereas deacetylases (HDACs)
`
`
`Figure 2. Schematic PTM discovery timeline for 10 major PTMs: phosphorylation (28), methylation (29), sulfation (30), acetylation (31), ubiqui-
`tylation (32), prenylation (33), myristoylation (34), SUMOylation (35), palmitoylation (36), different types of glycosylation (N-glycosylation (37),
`O-glycosylation (38), C-glycosylation (39) and S-glycosylation (40)), phosphoglycosylation (41) and glycosylphosphatidylinositol (GPI anchored)
`(42). For each PTM, target residue(s) and the organism in which the related PTM was discovered for the first time are shown.
`
remove an acetyl group from lysine side chains (Figure 3B) (54). There are three forms of
acetylation: Nα-acetylation, Nε-acetylation and O-acetylation. Nα-acetylation is an
irreversible modification, and the other two types of acetylation are reversible (55).
These three forms of acetylation occur on Lys, Ala, Arg, Asp, Cys, Gly, Glu, Met, Pro,
Ser, Thr and Val residues with different frequencies (6), although acetylation is most
often reported on lysine residues. Nε-acetylation is more biologically significant than
the other types of acetylation (55).
`Acetylation has an essential role in biological processes
`such as chromatin stability, protein–protein interaction,
`cell cycle control, cell metabolism, nuclear transport and
`actin nucleation (56–58). According to the available evi-
`dence, acetylated lysine is vital for cell development, and
`its dysregulation would lead to serious diseases such as
`cancer, aging,
`immune disorders, neurological diseases
`(Huntington’s disease and Parkinson’s disease) and cardio-
`vascular diseases (56, 59, 60, 61).
`
`Ubiquitylation
Ubiquitylation is one of the most important reversible PTMs. This modification was first
studied in 1975 by Gideon Goldstein (32). It is a versatile PTM and can occur on all 20
amino acids (Figure 2); however, it occurs most frequently on lysine. This PTM has a
major role in the degradation of intracellular proteins via the ubiquitin
(Ub)–proteasome pathway in all tissues (62). In ubiquitylation, a covalent bond forms
between the C-terminus of an active ubiquitin protein (a polypeptide of 76 amino acids)
and the Nε of a lysine residue of the substrate protein (63). Ubiquitin can be attached to
substrate proteins in mono- or poly-ubiquitin forms through specific isopeptide bonds,
which are recognized by receptors containing ubiquitin-binding domains. Ubiquitylation is
catalyzed by an enzymatic cascade that comprises ubiquitin-activating (E1),
ubiquitin-conjugating (E2) and ubiquitin ligase (E3) enzymes (Figure 3C). Ubiquitinated
proteins may also be acetylated on Lys or phosphorylated on Ser, Thr or Tyr residues,
which can dramatically alter the signaling outcome (64). Ubiquitylation of substrate
proteins can be removed by several specialized families of proteases called
deubiquitinases (64).
Ubiquitination plays important roles in stem cell preservation and differentiation by
regulating pluripotency (65). Ubiquitylation also plays a vital role in many cell
activities such as proliferation, regulation of transcription, DNA repair, replication,
intracellular trafficking, virus budding, control of signal transduction, protein
degradation, innate immune signaling, autophagy and apoptosis (12, 66, 67). Dysfunction
in the ubiquitin pathway can lead to diverse diseases such as different cancers,
metabolic syndromes, inflammatory disorders, type 2 diabetes and neurodegenerative
diseases (68–70).
`
`Methylation
Research on methylation dates back to 1939 (29). Nonetheless, only recently, with the
identification of new methyltransferases (such as protein arginine methyltransferases
(PRMTs) and histone lysine methyltransferases (HKMTs)), has it attracted increasing
attention (71). Methylation is
`a reversible PTM, which often occurs in the cell nucleus and
`on the nuclear proteins such as histone proteins (1, 72).
`Methylation occurs on the Lys, Arg, Ala, Asn, Asp, Cys,
`Gly, Glu, Gln, His, Leu, Met, Phe and Pro residues in tar-
`get proteins (6). However, lysine and arginine are the two
`main target residues in methylation, at least in eukaryotic
`cells (73, 74). One of the most biologically important roles
`of methylation is in histone modification. Histone proteins,
`after synthesis of their polypeptide chains, are methylated
`
`Exhibit 2057
`Page 04 of 20
`
`

`

`Database, Vol. 00, Article ID baab012
`
`Page 5 of 20
`
`Downloaded from https://academic.oup.com/database/article/doi/10.1093/database/baab012/6214407 by Arnold and Porter user on 17 January 2022
`
Figure 3. Schematic illustration of the 10 most studied PTMs including Phosphorylation (A), Acetylation (B), Ubiquitylation (C), Methylation (D),
N-glycosylation (E), O-glycosylation (F), SUMOylation (G), S-palmitoylation (H), N-myristoylation (I), Prenylation (J) and Sulfation (K).
`
`
`at Lys, Arg, His, Ala or Asn residues (75). Nε-lysine methy-
`lation is one of the most abundant histone modifications
`in eukaryotic chromatin, which includes transferring the
`methyl groups from S-adenosylmethionine to histone pro-
`teins via methyltransferase enzyme (Figure 3D). In eukary-
`otes, methylated arginine has been observed in histone and
`non-histone proteins (76).
Recent studies have shown that methylation is associated with the fine-tuning of various
biological processes, ranging from transcriptional regulation to epigenetic silencing via
heterochromatin assembly (77). Defects in this modification can lead to various diseases
such as cancer, mental retardation (Angelman syndrome), diabetes mellitus, lipofuscinosis
and occlusive disease (12, 78, 79).
`
`Glycosylation
One of the most complex PTMs in the cell is glycosylation, which is a reversible
enzyme-directed reaction
`Exhibit 2057
`Page 06 of 20
`
`

`

`Downloaded from https://academic.oup.com/database/article/doi/10.1093/database/baab012/6214407 by Arnold and Porter user on 17 January 2022
`
`Database, Vol. 00, Article ID baab012
`
`Page 7 of 20
`
(12). Glycosylation occurs in multiple subcellular locations, such as the endoplasmic
reticulum, the Golgi apparatus, the cytosol and the sarcolemma (80). It occurs on
eukaryotic and prokaryotic membrane and secreted proteins, and nearly 50% of plasma
proteins are glycosylated (14). In this modification, oligosaccharide chains are
covalently linked to specific residues (see Figures 3E and F). This enzymatic process,
which is catalyzed by glycosyltransferase enzymes, usually occurs on the side chain of
residues such as Trp, Ala, Arg, Asn, Asp, Ile, Lys, Ser, Thr, Val, Glu, Pro, Tyr, Cys and
Gly (6); however, it occurs more frequently on Ser, Thr, Asn and Trp residues in proteins
and lipoproteins (13). According to the target residues, glycosylation can be classified
into six groups: N-glycosylation, O-glycosylation, C-glycosylation, S-glycosylation,
phosphoglycosylation and glypiation (GPI-anchored) (5, 12). N-glycosylation and
O-glycosylation are the two major types of glycosylation and have important roles in the
maintenance of protein conformation and activity (81).
`Glycosylation has a great role in many important bio-
`logical processes such as cell adhesion, cell–cell and cell–
`matrix interactions, molecular trafficking, receptor activa-
`tion, protein solubility effects, protein folding and signal
`transduction, protein degradation, and protein intracel-
`lular trafficking and secretion (9–14). It has been shown
`that the defect in this process has a significant effect on
`the development of various diseases like cancer, liver cir-
`rhosis, diabetes, HIV infection, Alzheimer’s disease and
`atherosclerosis (12, 14, 82).
`
`SUMOylation
Small Ubiquitin-Related Modifier (SUMO) protein was first discovered in 1996 by Rohit
Mahajan in the Ran GTPase-activating protein (RanGAP) (35). SUMOylation takes place via
SUMO (83), which has a three-dimensional structure similar to that of ubiquitin and has
been found in a wide range of eukaryotic organisms (84). SUMOylation can occur in both
the cytoplasm and the nucleus on lysine residues (85). The SUMO family has three isoforms
in mammals, four isoforms in humans, two isoforms in yeasts and eight isoforms in plants
(1). SUMO is attached to the ε-amino group of lysine residues in the target protein
through a multi-enzymatic cascade (86). In this reaction, SUMO is covalently linked to a
lysine residue in the substrate protein by three enzymes, namely activating (E1),
conjugating (E2) and ligase (E3) enzymes. It is removed from the target protein by a
SUMO-specific protease (Figure 3G) (87). SUMOylation often occurs at the consensus motif
ΨKxE (where Ψ represents a large hydrophobic residue such as Leu, Ile, Val or Phe and x
is any amino acid) (88).
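A consensus motif of this kind can be scanned for with a regular expression. The following is a minimal sketch, not a SUMOylation predictor: the function name and the toy sequence are hypothetical, and the hydrophobic position is assumed to be one of L, I, V or F (adjust the character class to the residue set you want to allow).

```python
import re

# Hydrophobic residue (assumed: L, I, V or F), Lys, any residue, Glu.
# The lookahead lets overlapping candidate motifs be reported.
SUMO_MOTIF = re.compile(r"(?=([LIVF]K[A-Z]E))")

def find_sumo_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of the candidate Lys, matched motif) pairs."""
    return [(m.start() + 2, m.group(1)) for m in SUMO_MOTIF.finditer(sequence.upper())]

toy_sequence = "MSDLKQEAAVKTEGG"  # invented sequence for illustration
print(find_sumo_motifs(toy_sequence))
```

A motif match only flags a candidate lysine; it says nothing about whether the site is actually modified.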
`
`SUMOylation plays a major role in many basic cellular
`processes like transcription control, chromatin organiza-
`tion, accumulation of macromolecules in cells, regulation
of gene expression and signal transduction (89, 90). It is also necessary for the
maintenance of genome integrity (91). There are also many reports of a major role of
SUMOylation in the development of a variety of human diseases including cancer,
Alzheimer’s disease, Parkinson’s disease, viral infections, heart diseases and diabetes
(83, 91–93).
`
`Palmitoylation
An important class of PTMs, called lipidation, comprises the covalent attachment of
lipids to proteins. The first report of the covalent modification of proteins with lipids
dates back to 1951 (94). These PTMs involve a great variety of lipids, such as octanoic
acid, myristic acid, palmitic acid, palmitoleic acid, stearic acid and cholesterol.
Myristoylation, palmitoylation and prenylation can be considered the three main types of
these lipid modifications (95, 96). Palmitoylation is described in this subsection, and
the other two are described in the subsequent subsections.
Palmitoyltransferases (PATs) were first identified in yeast in 1999 by Doug J. Bartels
(36). Palmitoylation is the covalent attachment of fatty acids, such as palmitic acid, to
Cys, Gly, Ser, Thr and Lys residues (6). S-palmitoylation is the reversible covalent
addition of a 16-carbon fatty acid chain, palmitate, to a cysteine via a thioester
linkage (Figure 3H) (97). Palmitate from palmitoyl-CoA (the lipid substrate) is attached
to the target protein by a PAT and removed by acyl protein thioesterases (98).
`Mostly, S-palmitoylation occurs in eukaryotic cells and
`plays critical roles in many different biological processes
`including protein function regulation, protein–protein
`interaction, membrane–protein associations, neuronal
`development, signal transduction, apoptosis and mitosis
`(98–100). Dysfunction of palmitoylation has been linked
`to many diseases including neurological diseases (Hunting-
`ton’s disease, schizophrenia and Alzheimer’s disease) and
`different cancers (101–105).
`
`Myristoylation
Myristoylation (N-myristoylation) was discovered by Alastair Aitken in 1982 in bovine
brain (34). Although myristoylation is often referred to as a PTM, it usually occurs
co-translationally (106). This modification is irreversible and occurs mainly on
cytoplasmic eukaryotic proteins. Myristoylation has been reported in some integral
membrane proteins as well (107). It occurs in approximately 0.5–1.5% of eukaryotic
proteins (108).
`
`
In myristoylation, after removal of the initiating Met, a 14-carbon saturated fatty acid,
called myristic acid, is attached to the N-terminal glycine residue via a covalent bond
(Figure 3I) (109). This attachment is often observed in the Met-Gly-X-X-X-Ser/Thr motif
and is catalyzed by an N-myristoyltransferase (NMT); there are at least two NMT enzymes,
NMT1 and NMT2, in humans (109, 110). Myristoylation occurs more frequently on Gly and
less frequently on Lys residues (6).
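The N-terminal motif described above can be expressed as an anchored pattern. This is a sketch under stated assumptions: the function name and the toy sequences are hypothetical, and the check looks only at the first six residues of the unprocessed sequence (initiating Met still present).

```python
import re

# Met-Gly-X-X-X-Ser/Thr at the very start of the sequence.
NMYR_MOTIF = re.compile(r"MG[A-Z]{3}[ST]")

def has_n_myristoylation_motif(sequence: str) -> bool:
    """True if the sequence starts with the Met-Gly-X-X-X-Ser/Thr motif."""
    # re.match anchors at position 0, i.e. the N-terminus.
    return NMYR_MOTIF.match(sequence.upper()) is not None

print(has_n_myristoylation_motif("MGAAASKK"))   # motif at the N-terminus
print(has_n_myristoylation_motif("AMGAAASKK"))  # same motif, but not N-terminal
```

Anchoring at the start reflects the statement that the fatty acid is attached to the N-terminal glycine after Met removal.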
`Proteins that undergo this PTM play critical roles in reg-
`ulating the cellular structure and many biological processes
`such as stabilizing the protein structure maturation, signal-
`ing, extracellular communication, metabolism and regula-
`tion of the catalytic activity of the enzymes (109, 110). The
`role of myristoylation has been proved in the development
`and progression of various diseases such as cancer, epilepsy,
`Alzheimer’s disease, Noonan-like syndrome, and viral and
`bacterial infections (111).
`
`Prenylation
The first study on prenylation was done in 1978 by Yuji Kamiya et al. in yeast (33).
Prenylation is another important lipid-based PTM, which occurs after translation as an
irreversible covalent linkage, mainly in the cytosol (112). This reaction occurs on
cysteine residues near the carboxyl-terminal end of the substrate protein (113).
Prenylation has two main forms: farnesylation and geranylgeranylation (114). These two
forms involve the addition of two different isoprenoid groups to cysteine residues,
donated by farnesyl pyrophosphate (15-carbon) and geranylgeranyl pyrophosphate
(20-carbon), respectively. Prenylated proteins carry a consensus motif at the C-terminus,
CAAX, where C is cysteine, A is an aliphatic amino acid and X is any amino acid (115).
This process is catalyzed by three prenyltransferase enzymes: farnesyltransferase (FT)
and two geranylgeranyltransferases (GT1 and GT2) (Figure 3J) (48).
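The C-terminal CAAX motif can likewise be written as a pattern anchored at the end of the sequence. The sketch below is illustrative only: the function name and toy sequences are hypothetical, and "aliphatic" is assumed to mean A, V, L or I, which is one common convention rather than something the text specifies.

```python
import re

# Cys, two aliphatic residues (assumed: A, V, L or I), any residue, end of sequence.
CAAX_MOTIF = re.compile(r"C[AVLI]{2}[A-Z]$")

def caax_cysteine_position(sequence: str):
    """Return the 1-based position of the CAAX cysteine, or None if absent."""
    match = CAAX_MOTIF.search(sequence.upper())
    return match.start() + 1 if match else None

print(caax_cysteine_position("MTEYKLVVVGCVLS"))  # CVLS at the C-terminus
print(caax_cysteine_position("MTEYKCVLSAA"))     # CVLS present but not C-terminal
```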
`The prenylation is known as a crucial physiological
`process for facilitating many cellular processes such as
`protein–protein interactions, endocytosis regulation, cell
`growth, differentiation, proliferation and protein traffick-
`ing (115–117). Observations showed that disruption in
`this modification plays crucial roles in the pathogenesis
`of cancer (114), cardiovascular and cerebrovascular dis-
`orders, bone diseases, progeria, metabolic diseases and
`neurodegenerative diseases (118, 119).
`
`Sulfation
Sulfation was first discovered by Bruno Bettelheim in bovine fibrinopeptide B in 1954
(120). Tyr, Cys and Ser have been identified as target residues in sulfated proteins (6).
The most common target residue of this PTM is tyrosine, and tyrosine sulfation takes
place in the trans-Golgi network. N-sulfation or O-sulfation involves the addition of a
negatively charged sulfate group, through nitrogen or oxygen, to an exposed tyrosine
residue on the target protein (121, 122). Currently, protein tyrosine sulfation (PTS) is
observed mainly in secreted and transmembrane proteins in multicellular eukaryotes and
has not yet been observed in nuclear and cytoplasmic proteins (121). This reaction is
catalyzed by two transmembrane enzymes, tyrosylprotein sulfotransferases 1 and 2 (TPST1
and TPST2) (30). TPSTs catalyze the transfer of an activated sulfate from
3′-phosphoadenosine 5′-phosphosulfate to tyrosine residues within acidic motifs of
polypeptides (Figure 3K) (121).
`Recently, it has been observed that PTS has vital roles
`in many biological processes like protein–protein interac-
`tions, leukocyte rolling on endothelial cells, visual functions
and viral entry into cells (123). This PTM is involved in many diseases such as
autoimmune diseases, HIV infection, lung diseases and multiple sclerosis (12).
`
`Involvement of PTMs in diseases and
`biological processes
PTMs have a vital role in almost all biological processes and fine-tune numerous
molecular functions. Therefore, the footprints of disruption in PTMs can be seen in many
diseases. Figure 4A shows a tripartite network of PTM involvement in diseases and
biological processes for the 10 PTMs described above. This network contains 97 diseases
and 153 biological processes. Panels B and C in Figure 4 show the biological processes
with degree ≥3 (those biological processes that are linked to at least three different
PTMs) and the diseases with degree ≥2, respectively.
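The degree filter used for these panels can be illustrated on a small hand-picked subset of the links mentioned in the preceding subsections. This is a sketch only: the dictionaries below are a toy excerpt, not the full network of 97 diseases and 153 processes, and the function name is hypothetical.

```python
from collections import Counter

# Toy excerpt of PTM -> biological process and PTM -> disease links.
ptm_to_processes = {
    "phosphorylation": {"apoptosis", "replication"},
    "ubiquitylation": {"apoptosis", "DNA repair", "signal transduction"},
    "S-palmitoylation": {"apoptosis", "signal transduction"},
    "SUMOylation": {"signal transduction"},
}
ptm_to_diseases = {
    "phosphorylation": {"cancer", "Alzheimer's disease"},
    "ubiquitylation": {"cancer", "type 2 diabetes"},
    "S-palmitoylation": {"cancer", "Alzheimer's disease"},
    "SUMOylation": {"cancer"},
}

def node_degrees(ptm_links):
    """Degree of each process or disease node = number of distinct PTMs linked to it."""
    degrees = Counter()
    for targets in ptm_links.values():
        degrees.update(targets)  # targets is a set, so each PTM counts once per node
    return degrees

process_degree = node_degrees(ptm_to_processes)
disease_degree = node_degrees(ptm_to_diseases)

high_degree_processes = sorted(p for p, d in process_degree.items() if d >= 3)
high_degree_diseases = sorted(d for d, k in disease_degree.items() if k >= 2)
print(high_degree_processes)
print(high_degree_diseases)
```

In this tripartite layout, PTMs are the middle layer, so the degree of a process or disease node counts only its PTM neighbours.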
As shown in Figure 4C, neurodegenerative diseases (Alzheimer’s disease, Parkinson’s
disease and Huntington’s disease) form the major group of diseases affected by disruption
in PTMs. Cancer is also one of the most affected diseases. Consistent with this
observation, the biological processes related to cancer are among the high-degree nodes
(signaling, DNA repair, control of replication and apoptosis). Processes related to
apoptosis, protein–protein interaction, signaling, cell cycle control, chromatin
assembly, organization and stability, DNA repair, protein degradation, protein
trafficking and targeting, regulation of gene expression and transcription control are
the other high-degree biological processes. Moreover, ubiquitylation, prenylation,
glycosylation, S-palmitoylation and SUMOylation have the most involvement in diseases. On
the other hand, the PTMs with the highest number of links to biological processes are
phosphorylation, ubiquitylation, methylation, acetylation and SUMOylation. Taken
together, we can
`
Figure 4. Involvement of PTMs in diseases and biological processes. (A) Tripartite network of PTM involvement in diseases and biological processes
`for the 10 major PTMs. (B) The degree of the biological processes with degree ≥3 in the tripartite network. (C) The degree of the diseases with
`degree ≥2 in the tripartite network. (D) Involvement of PTMs in disease and biological processes.
`
conclude that the disruption in the pathways of these five PTMs has a great impact on the
normal functioning of the cell and, as a result, on the organism.
`
`Main PTM databases
`Due to the considerable cost and difficulties of experimental
`methods for identifying PTMs, recently many computa-
`tional methods have been developed for predicting PTMs
`(124). Almost all of these methods need a set of experimen-
`tally validated PTMs to build a prediction model. There-
`fore, the availability of valid public databases of PTMs
`is the first step toward this end. There are a variety of
`
`such public databases that could be utilized easily by the
`scientific community for developing computational meth-
`ods (17, 124).
`According to the scope and diversity of the covered
`PTMs, these databanks can be classified into two main
`groups: general databases and specific databases. The gen-
`eral databases contain different types of PTMs, regardless
`of target residue and organisms. These databases provide
`a broad scope of information for various PTMs. On the
`other hand, specific databases have been created based
`on some certain types of PTMs, certain characteristics of
`PTMs and/or specific target residues.
`
`
`Figure 5. Bubble chart for PTM databases. The chart was drawn based on three parameters for the databases: the number of stored modified proteins,
`the number of modified sites and the number of covered PTM types.
`
The current public PTM databases differ greatly in the number of stored modified
proteins, the number of modified sites and the number of covered PTM types. Figure 5
shows a bubble chart of the main PTM databases according to these three parameters. As is
evident from the figure, because of the extensive number of studies on phosphorylation,
the specific databases are mainly focused on phosphorylation. From this point of view,
glycosylation is the second most studied PTM. In the following, the five largest
databases are described briefly. Also, Table 1 summarizes the current main public PTM
databases.
`The EPSD (Eukaryotic Phosphorylation Site Database)
`contains the largest number of PTM sites. EPSD contains
`more than 1 600 000 experimental phosphorylation sites in
`more than 209 000 phosphoproteins across 68 eukaryotes,
`including 18 animals, 7 protists, 24 plants and 19 fungi
`(125).
dbPTM (Database Post-translational modification) is a comprehensive database that has
collected experimental PTM data from 30 public databases and 92 648 research
`articles. dbPTM contains ∼908 000 experimentally verified
`sites for more than 130 types of PTMs from different organ-
`isms (6). This database is the largest database in terms of
`the number of recorded proteins and also in terms of the
`number of stored PTM types (Figure 5).
`
BioGRID (The Biological General Repository for Interaction Datasets) is another major
open access PTM database. In addition to protein and genetic interactions, it also holds
data on ∼726 000 phosphorylation sites in ∼72 000 proteins, which were extracted from
4742 publications for 71 major model organisms (126).
PSP (PhosphoSitePlus) is an online resource for studying experimentally observed PTMs
such as phosphorylation, ubiquitylation and acetylation. PSP comprises ∼484 000 PTM
sites for more than 7 PTM types from 26 species. However, most of its data are extracted
from human, mouse and rat (127).
The qPTM database contains ∼296 900 sites for 10 types of PTMs in more than 19 600
proteins, collected and integrated from 661 conditions (128).
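The figures quoted for these five databases can be put side by side in a few lines. This is only a bookkeeping sketch: the counts are the approximate or lower-bound values from the text, the variable names are illustrative, and the BioGRID count refers to phosphorylation sites only.

```python
# (modified sites, modified proteins) as quoted above; None where the text
# gives no protein count. Values are approximate or lower bounds.
databases = {
    "EPSD": (1_600_000, 209_000),
    "dbPTM": (908_000, None),
    "BioGRID": (726_000, 72_000),
    "PSP": (484_000, None),
    "qPTM": (296_900, 19_600),
}

# Rank databases by number of recorded sites, largest first.
ranked_by_sites = sorted(databases, key=lambda name: databases[name][0], reverse=True)

# Average number of recorded sites per protein, where both counts are available.
sites_per_protein = {
    name: round(sites / proteins, 1)
    for name, (sites, proteins) in databases.items()
    if proteins is not None
}

print(ranked_by_sites)
print(sites_per_protein)
```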
`
`Computational methods for predicting PTMs
Generally speaking, any computational method for predicting a specific type of PTM has
four main steps: data gathering, feature extraction, learning the predictor and
performance assessment. These steps are shown schematically in Figure 6. In the
following, these steps are described in detail, and the related challenges and problems
in each step are discussed.
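The four steps can be sketched end to end on toy data. Everything below is an assumption made for illustration: the sequence windows and labels are invented, amino acid composition stands in for feature extraction, a nearest-centroid rule stands in for the learned predictor, and leave-one-out cross-validation with sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC) stands in for performance assessment. It is not any specific published method.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Step 1, data gathering: toy 7-residue windows centred on a candidate Ser.
# Sequences and labels are invented for illustration only.
POSITIVES = ["RRASLPE", "KRPSVDE", "RKLSEPA", "RRQSPEE"]
NEGATIVES = ["GAGSGAG", "GLVSALG", "AVGSLGA", "GLASVVG"]

def composition(window):
    """Step 2, feature extraction: amino acid composition of a window."""
    return [window.count(aa) / len(window) for aa in AMINO_ACIDS]

def centroid(windows):
    """Mean feature vector of a set of windows."""
    vectors = [composition(w) for w in windows]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train(positives, negatives):
    """Step 3, learning: a nearest-centroid model (a stand-in classifier)."""
    return centroid(positives), centroid(negatives)

def predict(model, window):
    """Predict True (modified) if the window is closer to the positive centroid."""
    pos_centroid, neg_centroid = model
    x = composition(window)
    return math.dist(x, pos_centroid) < math.dist(x, neg_centroid)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def leave_one_out(positives, negatives):
    """Step 4, performance assessment by leave-one-out cross-validation."""
    tp = tn = fp = fn = 0
    for i, window in enumerate(positives):
        model = train(positives[:i] + positives[i + 1:], negatives)
        if predict(model, window):
            tp += 1
        else:
            fn += 1
    for i, window in enumerate(negatives):
        model = train(positives, negatives[:i] + negatives[i + 1:])
        if predict(model, window):
            fp += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "mcc": mcc(tp, tn, fp, fn),
    }

metrics = leave_one_out(POSITIVES, NEGATIVES)
print(metrics)
```

Holding each window out of training before predicting it keeps the assessment step separate from the learning step, which is the point of treating them as distinct stages.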
`