`
`Proteases: a primer
`
`Nigel M. Hooper1
`Proteolysis Research Group, School of Biochemistry and Molecular
`Biology, University of Leeds, Leeds LS2 9JT, U.K.
`
`Abstract
`
`A protease can be defined as an enzyme that hydrolyses peptide bonds.
`Proteases can be divided into endopeptidases, which cleave internal peptide
`bonds in substrates, and exopeptidases, which cleave the terminal peptide
`bonds. Exopeptidases can be further subdivided into aminopeptidases and
`carboxypeptidases. The Schechter and Berger nomenclature provides a model
`for describing the interactions between the peptide substrate and the active site
`of a protease. Proteases can also be classified as aspartic proteases, cysteine
`proteases, metalloproteases, serine proteases and threonine proteases,
`depending on the nature of the active site. Different inhibitors can be used
experimentally to distinguish between these classes of protease. The MEROPS
`database groups proteases into families on the basis of similarities in sequence
`and structure. Protease activity can be regulated in vivo by endogenous
`inhibitors, by the activation of zymogens and by altering the rate of their
`synthesis and degradation.
`
`Introduction
`
Many researchers regard proteases as unwanted biological pests and do their
utmost to inactivate them in order to prevent the breakdown of their
particular protein of interest. For others,
`proteases are tools to be used to selectively destroy or chop up a protein prior
`to its further analysis, e.g. the digestion of a protein with trypsin prior to
`sequencing the smaller tryptic fragments. However, more and more
`
`1E-mail: n.m.hooper@leeds.ac.uk
`
`Copyright 2002 Biochemical Society
`
`Essays in Biochemistry volume 38 2002
`
`researchers are recognizing that proteases are often key players in a wide range
`of biological processes; for example, in regulating the cell cycle, cell growth
`and differentiation, antigen processing and angiogenesis. In addition, it is
`becoming apparent that the aberrant functioning of certain proteases may be
involved in several disease states, including Alzheimer’s disease, cancer
metastasis and inflammation. An understanding of the role played by
`proteases in these processes may provide the opportunity for therapeutic
`intervention, and inhibitors of certain proteases have already proved to be
`effective therapeutic agents in hypertension and heart failure, some forms of
`cancer, and against certain viruses. The aim of this volume is to highlight some
`of the more recent developments in this area and to provide an insight into the
`future of protease research. When one considers that almost 2% of the human
`genome codes for proteases [1], it is clear that there is a lot still to be learned.
`The remainder of this chapter is devoted to a brief introduction to the basic
`terminology used in protease biology.
`
`Definition of proteases
`
`A protease is defined as an enzyme that hydrolyses one or more peptide bonds
`(Figure 1) in a protein or peptide [2]. Thus, proteases can, potentially, degrade
`anything containing a peptide bond, from a dipeptide up to a large protein
`containing thousands of amino acids. However, many proteases have a preference
`for protein substrates, while others will only cleave short peptides or even just
`dipeptides. As these enzymes hydrolyse peptide bonds, some have argued that
`they all should be termed ‘peptidases’, and that the term ‘protease’ be restricted to
`those peptidases that hydrolyse proteins. Other commonly found terms in the
`literature include ‘proteinase’ and ‘proteolytic enzyme’.
`
`Cleavage-site specificity
`
`The terminology used in describing the cleavage-site specificity of proteases is
`based on a model proposed by Schechter and Berger [3]. In this model, the
`catalytic site is considered to be flanked on one or both sides by specificity
`subsites, each of which is able to accommodate the side chain of a single amino
`acid residue (Figure 2). By convention, the substrate amino-acid residues are
`called P (for peptide) and the subsites on the protease that interact with the
`substrate are called S (for subsite). The subsites are numbered outwards from
`the catalytic site, S1, S2, S3, etc. towards the N-terminus of the substrate, and
S1′, S2′, S3′, etc. towards the C-terminus (Figure 2). The side chains of the
amino-acid residues in the substrate that these sites accommodate are
numbered P1, P2, etc. and P1′, P2′, etc., outwards from the scissile peptide
`bond (see Figure 2). The residues are usually not numbered beyond P6 on
`either side of the scissile bond. Different proteases have different requirements
`for subsite interactions to determine the specificity of cleavage. For example,
`the S1 subsite of trypsin has a marked preference for the binding of basic
`
`
[Figure 1 reaction scheme: a peptide bond (–CO–NH–) between the residues bearing side chains R1 and R2 reacts with H2O to give a free carboxylate (–COO⁻) and a free amino group (–NH3⁺).]
Figure 1. Peptide bond hydrolysis by a protease
`
`amino acid residues (arginine and lysine), while interactions with several of the
`subsites further away from the scissile bond are critical for substrate binding to
`renin, the protease involved in the renin–angiotensin system (see Chapter 10),
`and to the caspases, proteases involved in apoptosis (see Chapter 2).
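
The numbering convention described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the prime is written as an ASCII apostrophe, and the trypsin-like rule is reduced to "P1 is arginine or lysine", which ignores any further subsite requirements. The example peptide is the one shown in Figure 3.

```python
def label_positions(peptide, bond_index, max_pos=6):
    """Label residues around a scissile bond with Schechter-Berger positions.

    `peptide` lists residue names from the N- to the C-terminus.
    `bond_index` is the 0-based index of the residue on the N-terminal
    side of the scissile bond (that residue is P1).
    """
    if not 0 <= bond_index < len(peptide) - 1:
        raise ValueError("the scissile bond must lie between two residues")
    labels = {}
    # Non-primed side: P1, P2, ... numbered outwards towards the N-terminus.
    for n, i in enumerate(range(bond_index, -1, -1), start=1):
        if n > max_pos:  # residues are usually not numbered beyond P6
            break
        labels[f"P{n}"] = peptide[i]
    # Primed side: P1', P2', ... numbered outwards towards the C-terminus.
    for n, i in enumerate(range(bond_index + 1, len(peptide)), start=1):
        if n > max_pos:
            break
        labels[f"P{n}'"] = peptide[i]
    return labels


def trypsin_like_sites(peptide):
    """Return bond indices whose P1 residue is basic (Arg or Lys)."""
    return [i for i, res in enumerate(peptide[:-1]) if res in ("Arg", "Lys")]


peptide = ["Met", "Ala", "Glu", "Phe", "Tyr", "Lys", "Phe", "Leu"]
sites = trypsin_like_sites(peptide)
labels = label_positions(peptide, sites[0])
```

With this peptide the only candidate bond is the one after Lys, so Lys is P1 and the following Phe is P1′.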
`
`Classification of proteases
`
`Why classify proteases? First, classification aids researchers and students in
`understanding the terminology in this large, and often confusing, field of
`research. Secondly, the grouping together of enzymes in families on the basis
`of sequence and structural information aids in the elucidation of common
`catalytic, biosynthetic processing and regulatory mechanisms. Finally, such
`classification is invaluable in elucidating the function of newly identified
`proteases. This is particularly relevant in the context of proteases that are
`
[Figure 2 schematic: subsites on the protease shown above the corresponding substrate side chains; the scissile bond lies between P1 and P1′.]
Subsites:    S3    S2    S1    S1′   S2′   S3′
Side chains: P3    P2    P1    P1′   P2′   P3′
–NH–CH–CO–NH–CH–CO–NH–CH–CO–NH–CH–CO–NH–CH–CO–NH–CH–CO–

Figure 2. The Schechter and Berger [3] nomenclature for binding of a peptide
substrate to a protease
The protease is represented as the blue shaded area. P1, P1′, etc. are the side chains of the
six amino acids surrounding the peptide bond to be cleaved (indicated by the arrow) in the
substrate. S1, S1′, etc. are the corresponding subsites on the protease.
`
`
`initially identified on the basis of sequence similarity from screening genome
`databases [1], rather than in the more traditional way of isolating an activity
`that cleaves a particular protein or peptide substrate, followed by its
`purification and experimental characterization. Mining of genome databases
`for novel proteases is dealt with in more detail in Chapter 14.

Proteases can be classified on the basis of the position within a peptide of
the peptide bond that is cleaved. Thus, endopeptidases cleave internal peptide
bonds, while exopeptidases cleave the terminal bonds (Figure 3). Exopeptidases
can be further subdivided into aminopeptidases or carboxypeptidases,
depending on whether they cleave the N-terminal or C-terminal peptide bond
respectively (Figure 3). Proteases are also classified on the basis of the catalytic
mechanism, that is, the nature of the amino acid residue or cofactor at the active
site that is involved in the hydrolytic reaction. Thus, aspartic proteases, such as the
HIV protease (see Chapter 9) and renin (see Chapter 10), have a critical
aspartate residue that is involved in catalysis. Metalloproteases have a bivalent metal
ion, usually zinc but sometimes cobalt, iron or manganese, at the active site.
Examples of metalloproteases include the matrix metalloproteases (see Chapter
3), methionine aminopeptidases (see Chapter 6), angiotensin-converting
enzyme and neprilysin (see Chapter 10), and the ADAMs (a disintegrin and
metalloproteinase domain) family of proteases (see Chapter 11). In the aspartic
and metalloproteases, the nucleophile that attacks the peptide bond of the
substrate is an activated water molecule, whereas in the other protease groups the
nucleophile is part of an amino acid at the catalytic site of the protease.
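
The position-based classification can be expressed as a small function. The sketch below assumes a linear peptide whose bonds are indexed from the N-terminus (bond 0 joins residues 1 and 2); the function name and return strings are illustrative only.

```python
def classify_by_position(n_residues, bond_index):
    """Classify a cleavage by where the hydrolysed bond sits in the peptide.

    bond_index 0 is the N-terminal peptide bond; the last bond
    (index n_residues - 2) is the C-terminal peptide bond.
    """
    n_bonds = n_residues - 1
    if n_residues < 2 or not 0 <= bond_index < n_bonds:
        raise ValueError("no such peptide bond")
    if bond_index == 0:
        # In a dipeptide the single bond is both N- and C-terminal;
        # this sketch reports it as an aminopeptidase-type cleavage.
        return "exopeptidase (aminopeptidase)"
    if bond_index == n_bonds - 1:
        return "exopeptidase (carboxypeptidase)"
    return "endopeptidase"


# The eight-residue peptide of Figure 3 has seven peptide bonds.
n_terminal = classify_by_position(8, 0)
internal = classify_by_position(8, 3)
c_terminal = classify_by_position(8, 6)
```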

Those proteases in which the nucleophile is the sulphydryl group of a
cysteine residue are termed cysteine proteases, typified by the caspases that are
involved in programmed cell death (see Chapter 2). In serine proteases the
catalytic mechanism depends upon the hydroxy group of a serine residue acting as
the nucleophile that attacks the peptide bond. Examples of serine proteases
include chymotrypsin and trypsin (digestive enzymes of the intestine), and the
proteases involved in the blood clotting cascade (see Chapter 8). In a small
number of proteases, the so-called threonine proteases, the catalytic mechanism
depends on the hydroxy group of a threonine residue. These are exemplified by
`
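[Figure 3 schematic: in the example peptide below, exopeptidases act on the terminal peptide bonds (an aminopeptidase at the N-terminal end, a carboxypeptidase at the C-terminal end), while an endopeptidase acts on an internal bond.]
Met – Ala – Glu – Phe – Tyr – Lys – Phe – Leu

Figure 3. Cleavage site specificity of proteases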
`
`
`the catalytic subunit of the proteasome (see Chapter 5). At present, 3% of the
`proteases in the human genome have been identified as aspartic proteases, 23%
`as cysteine proteases, 36% as metalloproteases and 32% as serine proteases [1].
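
The mechanism-based classes and the genome figures quoted above can be collected into a simple lookup. This is a sketch for orientation only; the dictionary and function names are hypothetical, and the percentages are those cited from [1]. Summing them shows that the four classes account for 94% of the proteases counted, so 6% fall outside these four figures.

```python
WATER = "activated water molecule"

# Attacking nucleophile for each catalytic type, as described in the text.
NUCLEOPHILE = {
    "aspartic": WATER,
    "metallo": WATER,
    "cysteine": "sulphydryl group of a cysteine residue",
    "serine": "hydroxy group of a serine residue",
    "threonine": "hydroxy group of a threonine residue",
}

# Share of human proteases assigned to each class, as quoted from [1].
HUMAN_SHARE_PERCENT = {"aspartic": 3, "cysteine": 23, "metallo": 36, "serine": 32}


def uses_water_nucleophile(catalytic_type):
    """True if the nucleophile is activated water rather than a side chain."""
    return NUCLEOPHILE[catalytic_type] == WATER


assigned = sum(HUMAN_SHARE_PERCENT.values())
not_in_four_classes = 100 - assigned
```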
`Over the past decade, Alan Barrett and his colleagues in Cambridge, U.K.,
`have developed a more detailed classification system for proteases, the
`MEROPS database [4–6]. This is available online (www.merops.ac.uk) or in
hard copy as the Handbook of Proteolytic Enzymes [7]. In the MEROPS
database, proteases are classified by structural similarities in the parts of the
molecules that are responsible for their enzymic activity. They are grouped into
families on the basis of amino-acid sequence homology, and the families are
assembled into clans based on evidence, usually similarities in tertiary
structure, that they share a common ancestry (Figure 4). This classification forms a
framework around which a wealth of supplementary information about the
proteases is organized, including images of three-dimensional structures,
amino-acid sequence alignments, comments on biomedical relevance, and
literature references. A set of online searches provides access to information about
the location of proteases on human chromosomes and their substrate
specificity. As the MEROPS database is updated regularly, it provides an extremely
valuable resource for protease researchers (see also Chapter 14).
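
The clan, family and protease hierarchy can be modelled as nested dictionaries. The sketch below holds only the five serine proteases shown in Figure 4. The identifier pattern (one letter, two digits, a dot, three digits) is an assumption inferred from those five examples and may not cover every MEROPS identifier; the function names are hypothetical and nothing here queries the database itself.

```python
import re

# clan -> family -> {identifier: protease}, restricted to Figure 4(b).
MEROPS_EXAMPLE = {
    "SA": {
        "S1": {"S01.151": "Trypsin", "S01.152": "Chymotrypsin"},
        "S29": {"S29.001": "Hepatitis C virus polyprotein peptidase"},
    },
    "SB": {
        "S8": {"S08.071": "Furin", "S08.001": "Subtilisin"},
    },
}

# Assumed identifier shape, e.g. "S01.151" (inferred from Figure 4 only).
_IDENTIFIER = re.compile(r"^([A-Z])(\d{2})\.(\d{3})$")


def family_of(identifier):
    """Derive the family name (e.g. 'S1') from an identifier like 'S01.151'."""
    match = _IDENTIFIER.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    letter, family_number, _ = match.groups()
    return f"{letter}{int(family_number)}"


def clan_of(identifier, table=MEROPS_EXAMPLE):
    """Look up the clan that contains the identifier's family, if listed."""
    family = family_of(identifier)
    for clan, families in table.items():
        if family in families:
            return clan
    return None
```

For example, `family_of("S08.071")` gives the family of furin and `clan_of("S08.071")` its clan, matching the layout of Figure 4.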
`
`Inhibition of proteases
`
`The four major classes of proteases (aspartic, cysteine, metallo and serine) can
`be distinguished experimentally using class-specific inhibitors (Table 1). For
`example, chelators such as EDTA or 1,10-phenanthroline remove the critical
`metal ion from the catalytic site of metalloproteases, thereby inactivating them.
`
[Figure 4 schematic, reconstructed from the panel labels]
(a) Catalytic type > Clan > Family > Protease (unique identifier no.)
(b) Serine protease
      Clan SA
        Family S1: Trypsin (S01.151); Chymotrypsin (S01.152)
        Family S29: Hepatitis C virus polyprotein peptidase (S29.001)
      Clan SB
        Family S8: Furin (S08.071); Subtilisin (S08.001)

Figure 4. Overview of the MEROPS protease classification system for proteases
(a) Summary showing the relationship between catalytic type (aspartic, cysteine, metallo or
serine), clan, family and individual protease within the MEROPS database. (b) Schematic showing the
relationship of five serine proteases within the MEROPS database. For more information see [7].
`
`
`On the other hand, di-isopropyl fluorophosphate binds irreversibly to the
`serine residue at the catalytic site of serine proteases, thus permanently
`inactivating the enzyme. However, not every protease is susceptible to
`inhibition by one of these more general class-specific inhibitors. For example,
the recently discovered aspartic protease involved in the β-secretase cleavage of
`the Alzheimer’s disease amyloid precursor protein is not inhibited by
`pepstatin [8] (see Chapter 4).
`These class-specific inhibitors (Table 1) can be used experimentally to
identify the catalytic type that a particular protease belongs to, and are
particularly useful in the absence of amino-acid sequence information. In addition,
knowledge of the different protease classes and their respective inhibitors is
often useful in designing a strategy to block the breakdown of a particular
protein in a sample by the inhibition of either as many proteases as possible or a
discrete subgroup of proteases. To this end, inhibitor cocktails, consisting of
mixtures of the compounds in Table 1, are available from a number of
commercial suppliers.
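
The use of class-specific inhibitors to assign a catalytic type can be sketched as a lookup over Table 1. The residual activities below are invented purely for illustration, and the 50% threshold and function name are assumptions. In practice the result needs interpreting with care: leupeptin, for instance, also inhibits some serine proteases, and some proteases resist the general inhibitors of their class.

```python
# Inhibitor -> class of protease inhibited, taken from Table 1.
INHIBITOR_CLASS = {
    "pepstatin A": "aspartic",
    "EDTA": "metallo",
    "1,10-phenanthroline": "metallo",
    "bestatin": "metallo",
    "phosphoramidon": "metallo",
    "E-64": "cysteine",
    "iodoacetamide": "cysteine",
    "leupeptin": "cysteine",
    "aprotinin": "serine",
    "AEBSF": "serine",
    "DIPF": "serine",
    "PMSF": "serine",
}


def candidate_classes(residual_activity, threshold=0.5):
    """Group the inhibitors that reduced activity below `threshold` by class.

    `residual_activity` maps inhibitor name to activity relative to an
    uninhibited control (1.0 = no inhibition).
    """
    hits = {}
    for inhibitor, activity in residual_activity.items():
        if activity < threshold:
            hits.setdefault(INHIBITOR_CLASS[inhibitor], []).append(inhibitor)
    return hits


# Hypothetical measurements for an uncharacterized protease.
example = {
    "EDTA": 0.05,
    "1,10-phenanthroline": 0.10,
    "PMSF": 0.95,
    "E-64": 1.00,
    "pepstatin A": 0.98,
}
result = candidate_classes(example)
```

Here only the two chelators inhibit, which would point to a metalloprotease.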
`
`Regulation of protease activity
`
In addition to the inhibition of proteases in vitro, there are several ways in
which the activities of proteases can be regulated in vivo. One is inhibition
by endogenous inhibitors, which are often proteins themselves. Examples include
the antitrypsin inhibitor that binds to prematurely activated trypsin in the
pancreas; the serine protease inhibitors and Kunitz-type protease inhibitors
that inhibit, amongst others, some of the serine proteases involved in blood
clotting (see Chapter 8); the tissue inhibitors of metalloproteases that inhibit
the matrix metalloproteases (see Chapter 3); and the inhibitor of apoptosis
proteins that inhibit the caspases (see Chapter 2).
Protease activity can also be regulated in other ways. The rate of synthesis
and/or the rate of degradation will determine the amount of a particular
protease present at any one time. Such mechanisms are often used to restrict the
expression and activity of a protease to a particular tissue or stage of
development. Many proteases are first synthesized in an inactive pro-form, often
termed a zymogen, which is itself proteolytically cleaved to the active protease,
e.g. trypsinogen is activated to form the digestive enzyme trypsin. Other
examples include the matrix metalloproteases (see Chapter 3), the ADAMs
family (see Chapter 11) and the serine proteases involved in blood clotting (see
Chapter 8). The last case is an excellent example of a series of zymogen
activations finely regulating a biological process. Several zymogens are activated
by the removal of their prodomain by serine proteases of the
furin/pro-hormone convertase family that cleave at pairs of basic residues (see Chapter 7).
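
As a rough illustration of prodomain removal at paired basic residues, the sketch below scans a one-letter sequence for adjacent Lys/Arg residues and splits it after the first pair. The sequence is made up, the function names are hypothetical, and cutting on the C-terminal side of the pair is an assumption of this sketch. Real convertases have more stringent sequence requirements than any two adjacent basic residues.

```python
BASIC = frozenset("KR")  # one-letter codes for lysine and arginine


def paired_basic_sites(sequence):
    """Return 0-based indices i where residues i and i+1 are both basic."""
    seq = sequence.upper()
    return [
        i for i in range(len(seq) - 1)
        if seq[i] in BASIC and seq[i + 1] in BASIC
    ]


def split_at_first_pair(sequence):
    """Split a zymogen after its first basic pair: (prodomain, remainder)."""
    sites = paired_basic_sites(sequence)
    if not sites:
        return sequence, ""
    cut = sites[0] + 2  # cut after the second residue of the pair
    return sequence[:cut], sequence[cut:]


# Invented sequence with a single Lys-Arg pair at positions 6 and 7.
zymogen = "MAGSLPKRDVTE"
sites = paired_basic_sites(zymogen)
prodomain, mature = split_at_first_pair(zymogen)
```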
`
`
Table 1. Class-specific protease inhibitors
The mode of inhibition is indicated as either reversible (R) or irreversible (I). More information on individual inhibitors can be found in [9] or the suppliers’ catalogues.

Class of protease inhibited | Inhibitor | Mode | Effective concentration | Comments
Acidic (aspartic) | Pepstatin A | R | 1 μM |
Metallo | EDTA | R | 1–10 mM |
Metallo | 1,10-Phenanthroline | R | 1–10 mM | Particularly effective against zinc metalloproteases.
Metallo | Bestatin | R | 1–10 μM | Mainly selective for aminopeptidases.
Metallo | Phosphoramidon | R | 1–10 μM | Inhibits thermolysin- and neprilysin-like proteases.
Cysteine | trans-Epoxysuccinyl-L-leucylamido-(4-guanidino)butane (E-64) | I | 1–10 μM |
Cysteine | Iodoacetamide | I | 10–100 μM | Can react with non-active-site cysteine residues.
Cysteine | Leupeptin | R | 10–100 μM | Also inhibits some serine proteases.
Serine | Aprotinin | R | 2–10 μg/ml | Is itself a protein.
Serine | 4-(2-Aminoethyl)benzenesulphonyl fluoride (AEBSF) | I | 0.1–1 mM | More stable in aqueous solution than PMSF.
Serine | Di-isopropyl fluorophosphate (DIPF) | I | 0.1 mM | Extremely toxic. Half-life in aqueous solution of 1 h at pH 7.5.
Serine | Phenylmethylsulphonylfluoride (PMSF) | I | 0.1–1 mM | Half-life in aqueous solution of 1 h at pH 7.5.
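
The half-lives quoted in Table 1 have a practical consequence that a short calculation makes concrete. The sketch below assumes simple first-order decay, which the table does not state, and the function name is hypothetical. Under that assumption, an inhibitor with a 1 h half-life added at 1 mM would be down to about 0.125 mM after 3 h, close to the lower end of the effective range given for PMSF.

```python
def fraction_remaining(elapsed_h, half_life_h=1.0):
    """Fraction of inhibitor left after `elapsed_h` hours.

    Assumes first-order decay: the amount halves every `half_life_h` hours.
    """
    if elapsed_h < 0 or half_life_h <= 0:
        raise ValueError("elapsed time must be >= 0 and half-life > 0")
    return 0.5 ** (elapsed_h / half_life_h)


initial_mM = 1.0  # upper end of the effective range quoted for PMSF
after_3h_mM = initial_mM * fraction_remaining(3.0)
after_4h_mM = initial_mM * fraction_remaining(4.0)
```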
`
`
Summary

• A protease is an enzyme that hydrolyses peptide bonds.
• The Schechter and Berger nomenclature [3] provides a model for
describing the interactions between the peptide substrate and the active
site of a protease.
• Proteases can be subdivided into endopeptidases and exopeptidases,
depending on the position in the substrate of the peptide bond that is
hydrolysed.
• Proteases are classified depending on the nature of the active site
into aspartic proteases, cysteine proteases, metalloproteases and serine
proteases.
• Different inhibitors can be used experimentally to distinguish between
these four classes of protease.
• The MEROPS database groups proteases into families on the basis of
similarities in sequence and structure.
• Protease activity can be regulated in vivo by endogenous inhibitors,
activation of zymogens, and by altering the rate of their synthesis and
degradation.
`
References
1. Southan, C. (2001) A genomic perspective on human proteases. FEBS Lett. 498, 214–218
2. Smith, A.D., Datta, S.P., Howard Smith, G., Campbell, P.N., Bentley, R. & McKenzie, H.A. (1997) Oxford Dictionary of Biochemistry and Molecular Biology, Oxford University Press, Oxford
3. Schechter, I. & Berger, A. (1967) On the size of the active site in proteases. I. Papain. Biochem. Biophys. Res. Commun. 27, 157–162
4. Rawlings, N.D. & Barrett, A.J. (1993) Evolutionary families of peptidases. Biochem. J. 290, 205–218
5. Rawlings, N.D. & Barrett, A.J. (1999) MEROPS: the peptidase database. Nucleic Acids Res. 27, 1–7
6. Barrett, A.J., Rawlings, N.D. & O’Brien, E.A. (2001) The MEROPS database as a protease information system. J. Struct. Biol. 134, 95–102
7. Barrett, A.J., Rawlings, N.D. & Woessner, J.F. (1998) Handbook of Proteolytic Enzymes, Academic Press, San Diego, CA
8. Vassar, R., Bennett, B.D., Babu-Khan, S., Kahn, S., Mendiaz, E.A., Denis, P., Teplow, D.B., Ross, S., Amarante, P., Loeloff, R. et al. (1999) β-Secretase cleavage of Alzheimer’s amyloid precursor protein by the transmembrane aspartic protease BACE. Science 286, 735–741
9. Beynon, R.J. & Bond, J.S. (1989) Proteolytic Enzymes: A Practical Approach, IRL Press, Oxford
`