`
`FIFTH EDITION
`
`Jeremy M. Berg
`
`John L. Tymoczko
`
`Lubert Stryer
`
`
`Regeneron Exhibit 2010
`Page 01 of 22
`
`
`
`BIOCHEMISTRY
`
FIFTH EDITION
`
`Jeremy M. Berg
`Johns Hopkins University School of Medicine
`
`John L. Tymoczko
`Carleton College
`
`Lubert Stryer
`Stanford University
`
`Web content by
`Neil D. Clarke
`Johns Hopkins University School of Medicine
`
`W. H. Freeman and Company
`New York
`
`
`
`TO OUR TEACHERS AND OUR STUDENTS
`
`About the cover: The back cover shows a complex between an aminoacyl-transfer RNA
`molecule and the elongation factor EF-Tu.
`
`PUBLISHER: Michelle Julet
`DEVELOPMENT EDITOR: Susan Moran
`NEW MEDIA AND SUPPLEMENTS EDITOR: Mark Santee
`NEW MEDIA DEVELOPMENT EDITOR: Sonia DiVittorio
`MEDIA DEVELOPERS: CADRE design; molvisions.com-
`3D molecular visualization
`MARKETING DIRECTOR: John Britch
`MARKETING MANAGER: Carol Coffey
`PROJECT EDITOR: Georgia Lee Hadler
`MANUSCRIPT EDITOR: Patricia Zimmerman
`COVER AND TEXT DESIGN: Victoria Tomaselli
`COVER ILLUSTRATION: Tomo Narashima
`ILLUSTRATION COORDINATOR: Cecilia Varas
`ILLUSTRATIONS: Jeremy Berg with Network Graphics
`PHOTO EDITOR: Vikii Wong
`PHOTO RESEARCHER: Dena Betz
`PRODUCTION COORDINATOR: Julia DeRosa
`COMPOSITION: TechBooks
`MANUFACTURING: RR Donnelley & Sons Company
`
`Library of Congress Cataloguing-in-Publication Data
`Berg, Jeremy Mark
`Biochemistry / Jeremy Berg, John Tymoczko, Lubert Stryer. —5th ed.
`p. cm.
`Fourth ed. by Lubert Stryer.
`Includes bibliographical references and index.
`ISBN 0-7167-4955-6 (CH32-34 only)
`ISBN 0-7167-3051-0 (CH1-34)
`ISBN 0-7167-4684-0 (CH1-34, International edition)
ISBN 0-7167-4954-8 (CH1-31 only)
`1. Biochemistry. I. Tymoczko, John L. II. Stryer, Lubert. III. Title.
`
`QP514.2 .S66 2001
`572—dc21
`
`2001051259
`
© 2002 by W. H. Freeman and Company; © 1975, 1981, 1988,
1995 by Lubert Stryer. All rights reserved.
`
`No part of this book may be reproduced by any mechanical, photographic, or
`electronic process, or in the form of a phonographic recording, nor may it be stored in
`a retrieval system, transmitted, or otherwise copied for public or private use, without
`written permission from the publisher.
`
`Printed in the United States of America
`
`First printing 2001
`
`W. H. Freeman and Company
`41 Madison Avenue, New York, New York 10010
`Houndmills, Basingstoke RG21 6XS, England
`
`
`
`
`CHAPTER 2 • Biochemical Evolution
`3. Selective advantage. Suppose that a replicating RNA mole-
`cule has a mutation (genotypic change) and the phenotypic re-
`sult is that it binds nucleotide monomers more tightly than do
`other RNA molecules in its population. What might the selec-
`tive advantage of this mutation be? Under what conditions
`would you expect this selective advantage to be most important?
`
`4. Opposite of randomness. Ion gradients prevent osmotic
`crises, but they require energy to be produced. Why does the
`formation of a gradient require an energy input?
`
`5. Coupled gradients. How could a proton gradient with a
`higher concentration of protons inside a cell be used to pump
`ions out of a cell?
`
`6. Proton counting. Consider the reactions that take place across
`a photosynthetic membrane. On one side of the membrane, the
`following reaction takes place:
`
4 e- + 4 A- + 4 H2O → 4 AH + 4 OH-

whereas, on the other side of the membrane, the reaction is:

2 H2O → O2 + 4 e- + 4 H+

How many protons are made available to drive ATP synthesis
for each reaction cycle?

Need extra help? Purchase chapters of the Student Companion
with complete solutions online at www.whfreeman.com/biochem5.
`
`7. An alternative pathway. To respond to the availability of sug-
`ars such as arabinose, a cell must have at least two types of pro-
`teins: a transport protein to allow the arabinose to enter the cell
`and a gene-control protein, which binds the arabinose and mod-
`ifies gene expression. To respond to the availability of some very
`hydrophobic molecules, a cell requires only one protein. Which
`one and why?
`8. How many divisions? In the development pathway of
C. elegans, cell division is initially synchronous; that is, all cells
`divide at the same rate. Later in development, some cells divide
`more frequently than do others. How many times does each cell
`divide in the synchronous period? Refer to Figure 2.26.
`
`Protein Structure and Function
`
`
`Crystals of human insulin. Insulin is a protein hormone, crucial for
`maintaining blood sugar at appropriate levels. (Below) Chains of amino
`acids in a specific sequence (the primary structure) define a protein
`like insulin. These chains fold into well-defined structures (the tertiary
`structure)—in this case a single insulin molecule. Such structures
`assemble with other chains to form arrays such as the complex of six
insulin molecules shown at the far right (the quaternary structure).
`These arrays can often be induced to form well-defined crystals (photo
`at left), which allows determination of these structures in detail.
`[(Left) Alfred Pasieka/Peter Arnold.]
`
[Chapter-opening illustration: the levels of protein structure, labeled
primary structure, secondary structure, tertiary structure, and quaternary
structure.]

OUTLINE

3.1 Proteins Are Built from a Repertoire of 20 Amino Acids
3.2 Primary Structure: Amino Acids Are Linked by Peptide Bonds to Form
Polypeptide Chains
3.3 Secondary Structure: Polypeptide Chains Can Fold into Regular Structures
Such as the Alpha Helix, the Beta Sheet, and Turns and Loops
3.4 Tertiary Structure: Water-Soluble Proteins Fold into Compact Structures
with Nonpolar Cores
3.5 Quaternary Structure: Polypeptide Chains Can Assemble into Multisubunit
Structures
3.6 The Amino Acid Sequence of a Protein Determines Its Three-Dimensional
Structure

Proteins are the most versatile macromolecules in living systems and serve
crucial functions in essentially all biological processes. They function as
catalysts, they transport and store other molecules such as oxygen, they
provide mechanical support and immune protection, they generate movement,
they transmit nerve impulses, and they control growth and differentiation.
Indeed, much of this text will focus on understanding what proteins do and how
they perform these functions.
Several key properties enable proteins to participate in such a wide range
of functions.

1. Proteins are linear polymers built of monomer units called amino acids. The
construction of a vast array of macromolecules from a limited number of
monomer building blocks is a recurring theme in biochemistry. Does protein
function depend on the linear sequence of amino acids? The function of a
protein is directly dependent on its three-dimensional structure (Figure 3.1).
Remarkably, proteins spontaneously fold up into three-dimensional structures
that are determined by the sequence of amino acids in the protein polymer.
Thus, proteins are the embodiment of the transition from the one-dimensional
world of sequences to the three-dimensional world of molecules capable of
diverse activities.
2. Proteins contain a wide range of functional groups. These functional groups
include alcohols, thiols, thioethers, carboxylic acids, carboxamides, and a
variety of basic groups. When combined in various sequences, this array of
functional groups accounts for the broad spectrum of protein function. For
instance, the chemical reactivity associated with these groups is essential to
the function of enzymes, the proteins that catalyze specific chemical
reactions in biological systems (see Chapters 8-10).
3. Proteins can interact with one another and with other biological
macromolecules to form complex assemblies. The proteins within these
assemblies can act synergistically to generate capabilities not afforded by
the individual component proteins (Figure 3.2). These assemblies include
macromolecular machines that carry out the accurate replication of DNA, the
transmission of signals within cells, and many other essential processes.
4. Some proteins are quite rigid, whereas others display limited flexibility.
Rigid units can function as structural elements in the cytoskeleton (the
internal scaffolding within cells) or in connective tissue. Parts of proteins
with limited flexibility may act as hinges, springs, and levers that are
crucial to protein function, to the assembly of proteins with one another and
with other molecules into complex units, and to the transmission of
information within and between cells (Figure 3.3).

FIGURE 3.1 Structure dictates function. A protein component of the DNA
replication machinery surrounds a section of DNA double helix. The structure
of the protein allows large segments of DNA to be copied without the
replication machinery dissociating from the DNA.

FIGURE 3.2 A complex protein assembly. An electron micrograph of insect
flight tissue in cross section shows a hexagonal array of two kinds of protein
filaments. [Courtesy of Dr. Michael Reedy.]

FIGURE 3.3 Flexibility and function. Upon binding iron, the protein
lactoferrin undergoes conformational changes that allow other molecules to
distinguish between the iron-free and the iron-bound forms.

3.1 PROTEINS ARE BUILT FROM A REPERTOIRE
OF 20 AMINO ACIDS

Amino acids are the building blocks of proteins. An α-amino acid consists
of a central carbon atom, called the α carbon, linked to an amino group, a
carboxylic acid group, a hydrogen atom, and a distinctive R group. The R
group is often referred to as the side chain. With four different groups
connected to the tetrahedral α-carbon atom, α-amino acids are chiral; the two
mirror-image forms are called the L isomer and the D isomer (Figure 3.4).

FIGURE 3.4 The L and D isomers of amino acids. R refers to the side chain.
The L and D isomers are mirror images of each other.

Notation for distinguishing stereoisomers: The four different substituents of
an asymmetric carbon atom are assigned a priority according to atomic number.
The lowest-priority substituent, often hydrogen, is pointed away from the
viewer. The configuration about the carbon is called S, from the Latin
sinister for "left," if the progression from the highest to the lowest
priority is counterclockwise. The configuration is called R, from the Latin
rectus for "right," if the progression is clockwise.
`
Only L amino acids are constituents of proteins. For almost all amino acids,
the L isomer has S (rather than R) absolute configuration (Figure 3.5).
Although considerable effort has gone into understanding why amino acids in
proteins have this absolute configuration, no satisfactory explanation has
been arrived at. It seems plausible that the selection of L over D was
arbitrary but, once made, was fixed early in evolutionary history.

FIGURE 3.5 Only L amino acids are found in proteins. Almost all L amino
acids have an S absolute configuration (from the Latin sinister meaning
"left"). The counterclockwise direction of the arrow from highest- to
lowest-priority substituents indicates that the chiral center is of the S
configuration.

FIGURE 3.6 Ionization state as a function of pH. The ionization state of
amino acids is altered by a change in pH. The zwitterionic form predominates
near physiological pH.
[Plot: horizontal axis, pH (0 to 14); curves labeled "Both groups
protonated," "Zwitterionic form," and "Both groups deprotonated."]

FIGURE 3.7 Structures of glycine and alanine. (Top) Ball-and-stick models
show the arrangement of atoms and bonds in space. (Middle) Stereochemically
realistic formulas show the geometrical arrangement of bonds around atoms (see
Chapter 1 Appendix). (Bottom) Fischer projections show all bonds as being
perpendicular for a simplified representation (see Chapter 1 Appendix).
[Structures shown: glycine (Gly, G) and alanine (Ala, A).]

Amino acids in solution at neutral pH exist predominantly as dipolar
ions (also called zwitterions). In the dipolar form, the amino group is
protonated (-NH3+) and the carboxyl group is deprotonated (-COO-). The
ionization state of an amino acid varies with pH (Figure 3.6). In acid
solution (e.g., pH 1), the amino group is protonated (-NH3+) and the carboxyl
group is not dissociated (-COOH). As the pH is raised, the carboxylic acid is
the first group to give up a proton, inasmuch as its pKa is near 2. The dipolar
form persists until the pH approaches 9, when the protonated amino group
loses a proton. For a review of acid-base concepts and pH, see the appendix
`to this chapter.
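
The pH dependence just described can be sketched numerically. The example below is only an illustration under stated assumptions: it treats the carboxyl and amino groups as independent, uses the Henderson-Hasselbalch relation (which this passage does not derive), and takes a pKa of 2 for the carboxyl group ("near 2" in the text) and an assumed pKa of 9 for the amino group, chosen because the dipolar form is said to persist until the pH approaches 9. The function names are hypothetical.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a group that still holds its proton at a given pH
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def species_fractions(ph: float, pka_carboxyl: float = 2.0,
                      pka_amino: float = 9.0) -> dict:
    """Approximate fractions of each ionization state of a simple amino acid.

    Assumes the two groups ionize independently; the default pKa values are
    illustrative, not measured constants for any particular amino acid.
    """
    cooh = fraction_protonated(ph, pka_carboxyl)  # -COOH rather than -COO-
    nh3 = fraction_protonated(ph, pka_amino)      # -NH3+ rather than -NH2
    return {
        "both_protonated": cooh * nh3,                 # +H3N ... COOH
        "zwitterion": (1.0 - cooh) * nh3,              # +H3N ... COO-
        "both_deprotonated": (1.0 - cooh) * (1.0 - nh3),  # H2N ... COO-
        "uncharged": cooh * (1.0 - nh3),               # H2N ... COOH (rare)
    }


if __name__ == "__main__":
    for ph in (1.0, 7.0, 12.0):
        fractions = species_fractions(ph)
        dominant = max(fractions, key=fractions.get)
        print(f"pH {ph:4.1f}: dominant form = {dominant}")
```

Under these assumptions the fully protonated form dominates at pH 1, the zwitterion near pH 7, and the fully deprotonated form at pH 12, in line with Figure 3.6.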
`Twenty kinds of side chains varying in size, shape, charge, hydrogen-
`bonding capacity, hydrophobic character, and chemical reactivity are com-
`monly found in proteins. Indeed, all proteins in all species—bacterial, ar-
`chaeal, and eukaryotic—are constructed from the same set of 20 amino
`acids. This fundamental alphabet of proteins is several billion years old. The
`remarkable range of functions mediated by proteins results from the diver-
`sity and versatility of these 20 building blocks. Understanding how this al-
`phabet is used to create the intricate three-dimensional structures that en-
`able proteins to carry out so many biological processes is an exciting area of
`biochemistry and one that we will return to in Section 3.6.
`Let us look at this set of amino acids. The simplest one is glycine, which
`has just a hydrogen atom as its side chain. With two hydrogen atoms
bonded to the α-carbon atom, glycine is unique in being achiral. Alanine,
`the next simplest amino acid, has a methyl group (-CH3) as its side chain
`(Figure 3.7).
Larger hydrocarbon side chains are found in valine, leucine, and
isoleucine (Figure 3.8). Methionine contains a largely aliphatic side chain that
`includes a thioether (-S-) group. The side chain of isoleucine includes an
`additional chiral center; only the isomer shown in Figure 3.8 is found in
`proteins. The larger aliphatic side chains are hydrophobic—that is, they tend
`to cluster together rather than contact water. The three-dimensional struc-
`tures of water-soluble proteins are stabilized by this tendency of hy-
`drophobic groups to come together, called the hydrophobic effect (see Sec-
`tion 1.3.4). The different sizes and shapes of these hydrocarbon side chains
`enable them to pack together to form compact structures with few holes.
`Proline also has an aliphatic side chain, but it differs from other members
`of the set of 20 in that its side chain is bonded to both the nitrogen and the
α-carbon atoms (Figure 3.9). Proline markedly influences protein architecture
because its ring structure makes it more conformationally restricted
`than the other amino acids.
`
FIGURE 3.8 Amino acids with aliphatic side chains. The additional chiral center
of isoleucine is indicated by an asterisk.
[Structures shown: valine (Val, V), leucine (Leu, L), isoleucine (Ile, I),
and methionine (Met, M).]

FIGURE 3.9 Cyclic structure of proline. The side chain is joined to both the
α carbon and the amino group.
[Structure shown: proline (Pro, P).]
`
`Three amino acids with relatively simple aromatic side chains are part of
`the fundamental repertoire (Figure 3.10). Phenylalanine, as its name indi-
`cates, contains a phenyl ring attached in place of one of the hydrogens of
`alanine. The aromatic ring of tyrosine contains a hydroxyl group. This hy-
`droxyl group is reactive, in contrast with the rather inert side chains of the
`other amino acids discussed thus far. Tryptophan has an indole ring joined
`to a methylene (-CH2-) group; the indole group comprises two fused rings
`and an NH group. Phenylalanine is purely hydrophobic, whereas tyrosine
`and tryptophan are less so because of their hydroxyl and NH groups. The
aromatic rings of tryptophan and tyrosine contain delocalized π electrons
`that strongly absorb ultraviolet light (Figure 3.11).
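
As a rough illustration of how ultraviolet absorbance relates to concentration, the sketch below applies Beer's law, A = εcl, where ε is the extinction coefficient, c the molar concentration, and l the path length in centimeters. It uses the extinction coefficients quoted in this section for tryptophan (3400 M⁻¹ cm⁻¹) and tyrosine (1400 M⁻¹ cm⁻¹). Treating a protein's extinction coefficient near 280 nm as the simple sum of its tryptophan and tyrosine contributions is an assumption made here for illustration, and the function names and the sample residue counts are hypothetical.

```python
# Extinction coefficients (M^-1 cm^-1) as quoted in the text.
EPSILON_TRP = 3400.0
EPSILON_TYR = 1400.0


def protein_extinction(n_trp: int, n_tyr: int) -> float:
    """Approximate extinction coefficient of a protein near 280 nm.

    Assumption: tryptophan and tyrosine contributions are additive and
    unaffected by the residues' environment in the folded protein.
    """
    return n_trp * EPSILON_TRP + n_tyr * EPSILON_TYR


def absorbance(epsilon: float, conc_molar: float, path_cm: float = 1.0) -> float:
    """Beer's law: A = epsilon * c * l."""
    return epsilon * conc_molar * path_cm


def concentration(absorb: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer's law rearranged for concentration: c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return absorb / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical protein with 2 tryptophan and 4 tyrosine residues.
    eps = protein_extinction(n_trp=2, n_tyr=4)
    conc = concentration(absorb=0.62, epsilon=eps)
    print(f"epsilon = {eps:.0f} M^-1 cm^-1, concentration = {conc:.2e} M")
```

The estimate is only as good as the additivity assumption and the residue counts supplied.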
`
A compound's extinction coefficient indicates its ability to absorb light.
Beer's law gives the absorbance (A) of light at a given wavelength:

A = εcl        (Beer's law)

where ε is the extinction coefficient [in units that are the reciprocals of
molarity and distance in centimeters (M⁻¹ cm⁻¹)], c is the concentration of
the absorbing species (in units of molarity, M), and l is the length through
which the light passes (in units of centimeters). For tryptophan, absorption
is maximum at 280 nm and the extinction coefficient is 3400 M⁻¹ cm⁻¹ whereas,
for tyrosine, absorption is maximum at 276 nm and the extinction coefficient
is a less-intense 1400 M⁻¹ cm⁻¹. Phenylalanine absorbs light less strongly
and at shorter wavelengths. The absorption of light at 280 nm can be used to
estimate the concentration of a protein in solution if the number of
tryptophan and tyrosine residues in the protein is known.

FIGURE 3.10 Amino acids with aromatic side chains. Phenylalanine,
tyrosine, and tryptophan have hydrophobic character. Tyrosine and tryptophan
also have hydrophilic properties because of their -OH and -NH- groups,
respectively.
[Structures shown: phenylalanine (Phe, F), tyrosine (Tyr, Y), and
tryptophan (Trp, W).]

FIGURE 3.11 Absorption spectra of the aromatic amino acids tryptophan (red)
and tyrosine (blue). Only these amino acids absorb strongly near 280 nm.
[Courtesy of Greg Gatto.]
[Plot: extinction coefficient (M⁻¹ cm⁻¹), 0 to 10,000, versus wavelength (nm),
220 to 320; curves labeled Trp and Tyr.]

Two amino acids, serine and threonine, contain aliphatic hydroxyl groups
(Figure 3.12). Serine can be thought of as a hydroxylated version of alanine,
whereas threonine resembles valine with a hydroxyl group in place of one
of the valine methyl groups. The hydroxyl groups on serine and threonine
make them much more hydrophilic (water loving) and reactive than alanine
and valine. Threonine, like isoleucine, contains an additional asymmetric
center; again only one isomer is present in proteins.

FIGURE 3.12 Amino acids containing aliphatic hydroxyl groups. Serine and
threonine contain hydroxyl groups that render them hydrophilic. The additional
chiral center in threonine is indicated by an asterisk.
[Structures shown: serine (Ser, S) and threonine (Thr, T).]

Cysteine is structurally similar to serine but contains a sulfhydryl, or thiol
(-SH), group in place of the hydroxyl (-OH) group (Figure 3.13). The
sulfhydryl group is much more reactive. Pairs of sulfhydryl groups may
come together to form disulfide bonds, which are particularly important in
stabilizing some proteins, as will be discussed shortly.

FIGURE 3.13 Structure of cysteine.
[Structure shown: cysteine (Cys, C).]

We turn now to amino acids with very polar side chains that render them
highly hydrophilic. Lysine and arginine have relatively long side chains that
terminate with groups that are positively charged at neutral pH. Lysine is
capped by a primary amino group and arginine by a guanidinium group.
Histidine contains an imidazole group, an aromatic ring that also can be
positively charged (Figure 3.14).
`
`
`
FIGURE 3.14 The basic amino acids lysine, arginine, and histidine.
[Structures shown: lysine (Lys, K), arginine (Arg, R), and histidine (His, H);
the guanidinium group of arginine and the imidazole group of histidine are
labeled.]

FIGURE 3.15 Histidine ionization. Histidine can bind or release protons near
physiological pH.
`
With a pKa value near 6, the imidazole group can be uncharged or positively
charged near neutral pH, depending on its local environment (Figure 3.15).
Indeed, histidine is often found in the active sites of enzymes, where
the imidazole ring can bind and release protons in the course of enzymatic
reactions.

FIGURE 3.16 Amino acids with side-chain carboxylates and carboxamides.
[Structures shown: aspartate (Asp, D), glutamate (Glu, E), asparagine (Asn, N),
and glutamine (Gln, Q).]

The set of amino acids also contains two with acidic side chains: aspartic
acid and glutamic acid (Figure 3.16). These amino acids are often called
aspartate and glutamate to emphasize that their side chains are usually
negatively charged at physiological pH. Nonetheless, in some proteins these
side chains do accept protons, and this ability is often functionally
important. In addition, the set includes uncharged derivatives of aspartate
and glutamate (asparagine and glutamine), each of which contains a terminal
carboxamide in place of a carboxylic acid (Figure 3.16).
`Seven of the 20 amino acids have readily ionizable side chains. These
`7 amino acids are able to donate or accept protons to facilitate reactions
`as well as to form ionic bonds. Table 3.1 gives equilibria and typical pKa
`values for ionization of the side chains of tyrosine, cysteine, arginine, ly-
`sine, histidine, and aspartic and glutamic acids in proteins. Two other
groups in proteins (the terminal α-amino group and the terminal
α-carboxyl group) can be ionized, and typical pKa values are also included
`in Table 3.1.
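
To make the use of these typical pKa values concrete, the following sketch estimates the net charge of a short peptide at a chosen pH. It is a simplified illustration, not a method given in the text: it assumes every ionizable group titrates independently with the typical pKa from Table 3.1 (the table's own footnote warns that real values shift with the microenvironment), it applies the Henderson-Hasselbalch relation, and the function names and example sequences are hypothetical. Residues are written with their one-letter symbols.

```python
# Typical pKa values from Table 3.1.
PKA_N_TERMINUS = 8.0   # terminal alpha-amino group (+1 when protonated)
PKA_C_TERMINUS = 3.1   # terminal alpha-carboxyl group (-1 when deprotonated)
# Side chains that are positively charged when protonated.
PKA_BASIC = {"H": 6.0, "K": 10.8, "R": 12.5}
# Side chains that are negatively charged when deprotonated.
PKA_ACIDIC = {"D": 4.1, "E": 4.1, "C": 8.3, "Y": 10.9}


def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a group carrying its proton (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a peptide (one-letter symbols) at a pH.

    Residues without an entry in the pKa tables are treated as
    non-ionizable. All groups are assumed to titrate independently.
    """
    charge = fraction_protonated(ph, PKA_N_TERMINUS)
    charge -= 1.0 - fraction_protonated(ph, PKA_C_TERMINUS)
    for residue in sequence.upper():
        if residue in PKA_BASIC:
            charge += fraction_protonated(ph, PKA_BASIC[residue])
        elif residue in PKA_ACIDIC:
            charge -= 1.0 - fraction_protonated(ph, PKA_ACIDIC[residue])
    return charge


if __name__ == "__main__":
    for peptide in ("GG", "KK", "DE"):  # hypothetical dipeptides
        print(peptide, round(net_charge(peptide, 7.0), 2))
```

At pH 7 the lysine dipeptide comes out clearly positive and the aspartate-glutamate dipeptide clearly negative, as expected from the side-chain pKa values.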
`Amino acids are often designated by either a three-letter abbreviation or
`a one-letter symbol (Table 3.2). The abbreviations for amino acids are the
`first three letters of their names, except for asparagine (Asn), glutamine
(Gln), isoleucine (Ile), and tryptophan (Trp). The symbols for many amino
`acids are the first letters of their names (e.g., G for glycine and L for leucine);
`the other symbols have been agreed on by convention. These abbreviations
`and symbols are an integral part of the vocabulary of biochemists.
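
The correspondence between the two notations is mechanical, as the short sketch below suggests. The mapping is transcribed from Table 3.2 for the 20 standard amino acids (the ambiguity codes Asx and Glx are left out), the function names are hypothetical, and the hyphen-separated input format is an assumption that follows the way peptides are written in Section 3.2.

```python
# Three-letter abbreviation -> one-letter symbol, from Table 3.2.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(peptide: str) -> str:
    """Convert 'Tyr-Gly-Gly' style notation to 'YGG' style notation."""
    return "".join(THREE_TO_ONE[name] for name in peptide.split("-"))


def to_three_letter(sequence: str) -> str:
    """Convert 'YGG' style notation to 'Tyr-Gly-Gly' style notation."""
    return "-".join(ONE_TO_THREE[symbol] for symbol in sequence)


if __name__ == "__main__":
    print(to_one_letter("Tyr-Gly-Gly-Phe-Leu"))
    print(to_three_letter("YGGFL"))
```

An unrecognized abbreviation raises a `KeyError`, which is a deliberate choice here so that typing errors are not silently passed through.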
`
TABLE 3.1 Typical pKa values of ionizable groups in proteins

Group                           Acid                    Base               Typical pKa*
Terminal α-carboxyl group       -COOH                   -COO-              3.1
Aspartic acid, glutamic acid    -COOH                   -COO-              4.1
Histidine                       imidazolium ring (+)    imidazole ring     6.0
Terminal α-amino group          -NH3+                   -NH2               8.0
Cysteine                        -SH                     -S-                8.3
Tyrosine                        phenol -OH              phenolate -O-      10.9
Lysine                          -NH3+                   -NH2               10.8
Arginine                        guanidinium group (+)   guanidino group    12.5

*pKa values depend on temperature, ionic strength, and the microenvironment
of the ionizable group.

TABLE 3.2 Abbreviations for amino acids

Amino acid                     Three-letter    One-letter
                               abbreviation    abbreviation
Alanine                        Ala             A
Arginine                       Arg             R
Asparagine                     Asn             N
Aspartic acid                  Asp             D
Cysteine                       Cys             C
Glutamine                      Gln             Q
Glutamic acid                  Glu             E
Glycine                        Gly             G
Histidine                      His             H
Isoleucine                     Ile             I
Leucine                        Leu             L
Lysine                         Lys             K
Methionine                     Met             M
Phenylalanine                  Phe             F
Proline                        Pro             P
Serine                         Ser             S
Threonine                      Thr             T
Tryptophan                     Trp             W
Tyrosine                       Tyr             Y
Valine                         Val             V
Asparagine or aspartic acid    Asx             B
Glutamine or glutamic acid     Glx             Z

How did this particular set of amino acids become the building blocks
of proteins? First, as a set, they are diverse; their structural and chemical
properties span a wide range, endowing proteins with the versatility to
assume many functional roles. Second, as noted in Section 2.1.1, many of
these amino acids were probably available from prebiotic reactions. Finally,
excessive intrinsic reactivity may have eliminated other possible amino
acids. For example, amino acids such as homoserine and homocysteine tend
to form five-membered cyclic forms that limit their use in proteins; the
alternative amino acids that are found in proteins (serine and cysteine) do
not readily cyclize, because the rings in their cyclic forms are too small
(Figure 3.17).

FIGURE 3.17 Undesirable reactivity in amino acids. Some amino acids are
unsuitable for proteins because of undesirable cyclization. Homoserine can
cyclize to form a stable, five-membered ring, potentially resulting in
peptide-bond cleavage. Cyclization of serine would form a strained,
four-membered ring and thus is unfavored. X can be an amino group from a
neighboring amino acid or another potential leaving group.
`
`3.2 PRIMARY STRUCTURE: AMINO ACIDS ARE LINKED
`BY PEPTIDE BONDS TO FORM POLYPEPTIDE CHAINS
`
Proteins are linear polymers formed by linking the α-carboxyl group of one
amino acid to the α-amino group of another amino acid with a peptide bond
`(also called an amide bond). The formation of a dipeptide from two amino acids
`is accompanied by the loss of a water molecule (Figure 3.18). The equilibrium
`of this reaction lies on the side of hydrolysis rather than synthesis. Hence,
`the biosynthesis of peptide bonds requires an input of free energy. Nonethe-
`less, peptide bonds are quite stable kinetically; the lifetime of a peptide bond
`in aqueous solution in the absence of a catalyst approaches 1000 years.
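
The loss of water on peptide-bond formation can be followed as simple atom bookkeeping. The sketch below is an illustration only: the molecular formulas of free glycine (C2H5NO2) and alanine (C3H7NO2) are standard values that are not stated in this passage, and the function and variable names are hypothetical. Each peptide bond formed removes one H2O from the combined formula.

```python
from collections import Counter

WATER = Counter({"H": 2, "O": 1})

# Molecular formulas of the free amino acids (standard values, supplied
# here as an assumption; extend the table as needed).
AMINO_ACID_FORMULAS = {
    "Gly": Counter({"C": 2, "H": 5, "N": 1, "O": 2}),
    "Ala": Counter({"C": 3, "H": 7, "N": 1, "O": 2}),
}


def peptide_formula(residues: list) -> dict:
    """Molecular formula of a linear peptide built from the given residues.

    Sums the formulas of the free amino acids, then removes one water
    molecule for each of the (n - 1) peptide bonds formed.
    """
    total = Counter()
    for name in residues:
        total += AMINO_ACID_FORMULAS[name]
    peptide_bonds = max(len(residues) - 1, 0)
    for _ in range(peptide_bonds):
        total -= WATER
    return dict(total)


if __name__ == "__main__":
    print(peptide_formula(["Gly", "Ala"]))
```

For the dipeptide Gly-Ala the result is C5H10N2O3: the two amino acids together contribute C5H12N2O4, and one H2O is lost.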
`
FIGURE 3.18 Peptide-bond formation. The linking of two amino acids is
accompanied by the loss of a molecule of water.
[Reaction scheme: two amino acids, with side chains R1 and R2, condense to
form a dipeptide joined by a peptide bond, plus H2O.]
`
`A series of amino acids joined by peptide bonds form a polypeptide chain,
`and each amino acid unit in a polypeptide is called a residue. A polypeptide
chain has polarity because its ends are different, with an α-amino group at
one end and an α-carboxyl group at the other. By convention, the amino end
`is taken to be the beginning of a polypeptide chain, and so the sequence of
`amino acids in a polypeptide chain is written starting with the amino-
`terminal residue. Thus, in the pentapeptide Tyr-Gly-Gly-Phe-Leu (YGGFL),
`tyrosine is the amino-terminal (N-terminal) residue and leucine is the car-
`boxyl-terminal (C-terminal) residue (Figure 3.19). Leu-Phe-Gly-Gly-Tyr
`(LFGGY) is a different pentapeptide, with different chemical properties.
A polypeptide chain consists of a regularly repeating part, called the main
chain or backbone, and a variable part, comprising the distinctive side chains
(Figure 3.20). The polypeptide backbone is rich in hydrogen-bonding
potential. Each residue contains a carbonyl group, which is a good
hydrogen-bond acceptor and, with the exception of proline, an NH group, which
is a good hydrogen-bond donor. These groups interact with each other and with
functional groups from side chains to stabilize particular structures, a [...]

[Figure 3.19: structure of the pentapeptide Tyr-Gly-Gly-Phe-Leu, with Tyr
labeled as the amino-terminal residue and Leu as the carboxyl-terminal
residue.]

[Figure 3.20: a polypeptide backbone with side chains labeled R1 to R5.]

[...] called cystine. Extracellular proteins often have several disulfide
bonds, whereas intracellular proteins usually lack them. Rarely, nondisulfide
cross-links derived from other side chains are present in some proteins. For
example, collagen fibers in connective tissue are strengthened in this way, as
are fibrin blood clots.

3.2.1 Proteins Have Unique Amino Acid Sequences
That Are Specified by Genes
In 1953, Frederick Sanger determined the amino acid sequence of insulin,
a protein hormone (Figure 3.22). This work is a landmark in biochemistry
because it showed for the first time that a protein has a precisely defined
amino acid sequence. Moreover, it demonstrated that insulin consists only of L
amino acids linked by peptide bonds between α-amino and α-carboxyl
groups. This accomplishment stimulated other scientists to carry out
sequence studies of a wide variety of proteins. Indeed, the complete amino
acid sequences of more than 100,000 proteins are now known. The striking
fact is that each protein has a unique, precisely defined amino acid sequence.
The amino acid sequence of a protein is often referred to as its primary
structure.
A series of incisive studies in the late 1950s and early 1960s revealed that
the amino acid sequences of proteins are genetically determined. The
sequence of nucleotides in DNA, the molecule of heredity, specifies a
complementary sequence of nucleotides in RNA, which in turn specifies the
amino acid sequence of a protein. In particular, each of the 20 amino acids
of the repertoire is encoded by one or more specific sequences of three
nucleotides (Section 5.5).
Knowing amino acid sequences is important for several reasons. First,
knowledge of the sequence of a protein is usually essential to elucidating its
mechanism of action (e.g., the catalytic mechanism of an enzyme). Moreover,
proteins with novel properties can be generated by varying the sequence of
known proteins. Second, amino acid sequences determine the
three-dimensional structures of proteins. Amino acid sequence is the link
between the genetic message in DNA and the three-dimensional structure
that performs a protein's biological function. Analyses of relations between
amino acid sequences and three-dimensional structures of proteins are
uncovering the rules that govern the folding of polypeptide chains. Third,
sequence determination is a component of molecular pathology, a rapidly
growing area of medicine. Alterations in amino acid sequence can produce
abnormal function and disease. Severe and sometimes fatal diseases, such
as sickle-cell anemia and cystic fibrosis, can result from a change in a
single amino acid within a protein. Fourth, the sequence of a protein reveals
much about its evolutionary history (see Chapter 7). Proteins resemble one
another in amino acid sequence only if they have a common ancestor.
Consequently, molecular events in evolution can be traced from amino acid
sequences; molecular paleontology is a flourishing area of research.

FIGURE 3.22 Amino acid sequence of bovine insulin.

A chain (residues 1-21):
Gly-Ile-Val-Glu-Gln-Cys-Cys-Ala-Ser-Val-Cys-Ser-Leu-Tyr-Gln-Leu-Glu-Asn-Tyr-Cys-Asn

B chain (residues 1-30):
Phe-Val-Asn-Gln-His-Leu-Cys-Gly-Ser-His-Leu-Val-Glu-Ala-Leu-Tyr-Leu-Val-Cys-Gly-Glu-Arg-Gly-Phe-Phe-Tyr-Thr-Pro-Lys-Ala

[Disulfide bonds (S-S) are drawn within the A chain and between the A and
B chains.]