`Lodish H, Berk A, Zipursky SL, et al. Molecular Cell Biology. 4th edition. New York: W. H. Freeman; 2000.
`
`Section 3.1 Hierarchical Structure of Proteins
`
`Proteins are designed to bind every conceivable molecule—from simple ions to large complex
`molecules like fats, sugars, nucleic acids, and other proteins. They catalyze an extraordinary
`range of chemical reactions, provide structural rigidity to the cell, control flow of material
`through membranes, regulate the concentrations of metabolites, act as sensors and switches,
`cause motion, and control gene function. The three-dimensional structures of proteins have
`evolved to carry out these functions efficiently and under precise control. The spatial
`organization of proteins, their shape in three dimensions, is a key to understanding how they
`work.
`
`One of the major areas of biological research today is how proteins, constructed from only 20
`different amino acids, carry out the incredible array of diverse tasks that they do. Unlike the
`intricate branched structure of carbohydrates, proteins are single, unbranched chains of amino
`acid monomers. The unique shape of proteins arises from noncovalent interactions between
`regions in the linear sequence of amino acids. Only when a protein is in its correct three-
`dimensional structure, or conformation, is it able to function efficiently. A key concept in
`understanding how proteins work is that function is derived from three-dimensional structure,
`and three-dimensional structure is specified by amino acid sequence.
`
`The Amino Acids Composing Proteins Differ Only in Their Side Chains
Amino acids are the monomeric building blocks of proteins. The α carbon atom (Cα) of amino
acids, which is adjacent to the carboxyl group, is bonded to four different chemical groups: an
amino (NH₂) group, a carboxyl (COOH) group, a hydrogen (H) atom, and one variable group,
called a side chain or R group (Figure 3-1). All 20 different amino acids have this same general
structure, but their side-chain groups vary in size, shape, charge, hydrophobicity, and reactivity.
`
`Figure 3-1
`
`Amino acids, the monomeric units that link together to form
`proteins, have a common structure. The α carbon atom
`(green) of each amino acid is bonded to four different
`chemical groups and thus is asymmetric. The side chain, or R
`group (red), is (more...)
`
`The amino acids can be considered the alphabet in which linear proteins are “written.” Students
`of biology must be familiar with the special properties of each letter of this alphabet, which are
`determined by the side chain. Amino acids can be classified into a few distinct categories based
`primarily on their solubility in water, which is influenced by the polarity of their side chains
`(Figure 3-2). Amino acids with polar side groups tend to be on the surface of proteins; by
`interacting with water, they make proteins soluble in aqueous solutions. In contrast, amino acids
with nonpolar side groups avoid water and aggregate to form the water-insoluble core of proteins.
The polarity of amino acid side chains thus is one of the forces responsible for shaping the final
three-dimensional structure of proteins.

MYLAN INST. EXHIBIT 1091 PAGE 1
`
`Figure 3-2
`
`The structures of the 20 common amino acids grouped into
`three categories: hydrophilic, hydrophobic, and special amino
`acids. The side chain determines the characteristic properties
`of each amino acid. Shown are the zwitterion forms, which
`exist at the (more...)
`
`Hydrophilic, or water-soluble, amino acids have ionized or polar side chains. At neutral pH,
`arginine and lysine are positively charged; aspartic acid and glutamic acid are negatively
`charged and exist as aspartate and glutamate. These four amino acids are the prime contributors
to the overall charge of a protein. A fifth amino acid, histidine, has an imidazole side chain,
which has a pKa of 6.8, the pH of the cytoplasm. As a result, small shifts of cellular pH will
change the charge of histidine side chains.
`
`The activities of many proteins are modulated by pH through protonation of histidine side
`chains. Asparagine and glutamine are uncharged but have polar amide groups with extensive
`hydrogen-bonding capacities. Similarly, serine and threonine are uncharged but have polar
`hydroxyl groups, which also participate in hydrogen bonds with other polar molecules. Because
`the charged and polar amino acids are hydrophilic, they are usually found at the surface of a
`water-soluble protein, where they not only contribute to the solubility of the protein in water but
`also form binding sites for charged molecules.
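
The pH sensitivity of histidine described above can be made concrete with the standard Henderson-Hasselbalch relation, which is not stated in this section and is brought in here as an assumption. The sketch below takes the side-chain pKa of 6.8 quoted in the text and estimates the fraction of histidine side chains that carry the extra proton (and thus a positive charge) at a few illustrative pH values. The function name and the pH values are hypothetical choices made for illustration.

```python
def fraction_protonated(ph: float, pka: float = 6.8) -> float:
    """Fraction of side chains in the protonated (charged) form.

    Uses the Henderson-Hasselbalch relation (an assumption brought in
    here): [base]/[acid] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative pH values on either side of the pKa given in the text.
for ph in (6.3, 6.8, 7.3):
    print(f"pH {ph:.1f}: {fraction_protonated(ph):.2f} of histidine side chains protonated")
```

Under these assumptions, a shift of half a pH unit to either side of the pKa moves the protonated fraction from roughly three-quarters to roughly one-quarter, which illustrates why small changes in cellular pH can alter the charge of histidine side chains.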
`
`Hydrophobic amino acids have aliphatic side chains, which are insoluble or only slightly soluble
`in water. The side chains of alanine, valine, leucine, isoleucine, and methionine consist entirely
`of hydrocarbons, except for the sulfur atom in methionine, and all are nonpolar. Phenylalanine,
`tyrosine, and tryptophan have large bulky aromatic side groups. As explained in Chapter 2,
`hydrophobic molecules avoid water by coalescing into an oily or waxy droplet. The same forces
`cause hydrophobic amino acids to pack in the interior of proteins, away from the aqueous
`environment. Later in this chapter, we will see in detail how hydrophobic residues line the
`surface of membrane proteins that reside in the hydrophobic environment of the lipid bilayer.
`
`Lastly, cysteine, glycine, and proline exhibit special roles in proteins because of the unique
`properties of their side chains. The side chain of cysteine contains a reactive sulfhydryl group
`(—SH), which can oxidize to form a disulfide bond (—S—S—) to a second cysteine:
`
`Regions within a protein chain or in separate chains sometimes are cross-linked covalently
`through disulfide bonds. Although disulfide bonds are rare in intracellular proteins, they are
`commonly found in extracellular proteins, where they help maintain the native, folded structure.
`The smallest amino acid, glycine, has a single hydrogen atom as its R group. Its small size allows
it to fit into tight spaces. Unlike any of the other common amino acids, proline has a cyclic ring
that is produced by formation of a covalent bond between its R group and the amino group on
Cα. Proline is very rigid, and its presence creates a fixed kink in a protein chain. Proline and
glycine are sometimes found at points on a protein’s surface where the chain loops back into the
protein.
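
The three categories described in the preceding paragraphs can be written down as a simple lookup table. The sketch below is only a restatement of the grouping given in the text; the three-letter abbreviations are the standard ones, while the set names and the `category` helper are hypothetical.

```python
# Grouping of the 20 common amino acids as described in the text.
HYDROPHILIC = {"Arg", "Lys", "Asp", "Glu", "His", "Asn", "Gln", "Ser", "Thr"}
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Tyr", "Trp"}
SPECIAL = {"Cys", "Gly", "Pro"}


def category(residue: str) -> str:
    """Return the category name for a three-letter amino acid abbreviation."""
    if residue in HYDROPHILIC:
        return "hydrophilic"
    if residue in HYDROPHOBIC:
        return "hydrophobic"
    if residue in SPECIAL:
        return "special"
    raise ValueError(f"not one of the 20 common amino acids: {residue!r}")


print(category("His"), category("Leu"), category("Pro"))
```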
`
`The 6225 known and predicted proteins encoded by the yeast genome have an average molecular
`weight (MW) of 52,728 and contain, on average, 466 amino acid residues. Assuming that these
`average values represent a “typical” eukaryotic protein, then the average molecular weight of
`amino acids is 113, taking their average relative abundance in proteins into account. This is a
`useful number to remember, as we can use it to estimate the number of residues from the
`molecular weight of a protein or vice versa. Some amino acids are more abundant in proteins
`than other amino acids. Cysteine, tryptophan, and methionine are rare amino acids; together they
`constitute approximately 5 percent of the amino acids in a protein. Four amino acids—leucine,
`serine, lysine, and glutamic acid—are the most abundant amino acids, totaling 32 percent of all
`the amino acid residues in a typical protein. However, the amino acid composition of proteins
`can vary widely from these values. For example, as discussed in later sections, proteins that
`reside in the lipid bilayer are enriched in hydrophobic amino acids.
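
The rule of thumb above can be sketched in a few lines. The code below first recomputes the average residue weight from the yeast figures quoted in the text and then uses the rounded value of 113 in both directions. The function names are hypothetical, and the estimates are only as good as the assumption that a protein has a typical amino acid composition.

```python
YEAST_MEAN_MW = 52_728        # average molecular weight quoted in the text
YEAST_MEAN_RESIDUES = 466     # average residue count quoted in the text
AVERAGE_RESIDUE_MW = 113      # rounded average used as the rule of thumb


def estimate_residues(molecular_weight: float) -> int:
    """Approximate residue count for a protein of the given molecular weight."""
    return round(molecular_weight / AVERAGE_RESIDUE_MW)


def estimate_molecular_weight(n_residues: int) -> int:
    """Approximate molecular weight for a chain of the given length."""
    return n_residues * AVERAGE_RESIDUE_MW


print(f"average residue weight: {YEAST_MEAN_MW / YEAST_MEAN_RESIDUES:.1f}")
print(f"residues in a 10,000-MW protein: about {estimate_residues(10_000)}")
print(f"MW of a 300-residue protein: about {estimate_molecular_weight(300)}")
```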
`
`Peptide Bonds Connect Amino Acids into Linear Chains
`Nature has evolved a single chemical linkage, the peptide bond, to connect amino acids into a
`linear, unbranched chain. The peptide bond is formed by a condensation reaction between the
amino group of one amino acid and the carboxyl group of another (Figure 3-3a). The repeated
amide N, Cα, and carbonyl C atoms of each amino acid residue form the backbone of a protein
molecule from which the various side-chain groups project. As a consequence of the peptide
linkage, the backbone has polarity, since all the amino groups lie to the same side of the Cα
atoms. This leaves at opposite ends of the chain a free (unlinked) amino group (the N-terminus)
`and a free carboxyl group (the C-terminus). A protein chain is conventionally depicted with its
`N-terminal amino acid on the left and its C-terminal amino acid on the right (Figure 3-3b).
`
`Figure 3-3
`
`The peptide bond. (a) A condensation reaction between two amino acids forms
`the peptide bond, which links all the adjacent residues in a protein chain. (b)
`Side-chain groups (R) extend from the backbone of a protein chain, in which
`the amino N, α (more...)
`
`Many terms are used to denote the chains formed by polymerization of amino acids. A short
`chain of amino acids linked by peptide bonds and having a defined sequence is a peptide; longer
`peptides are referred to as polypeptides. Peptides generally contain fewer than 20–30 amino acid
`residues, whereas polypeptides contain as many as 4000 residues. We reserve the term protein
for a polypeptide (or a complex of polypeptides) that has a three-dimensional structure. It is
`implied that proteins and peptides represent natural products of a cell.
`
`The size of a protein or a polypeptide is reported as its mass in daltons (a dalton is 1 atomic mass
`unit) or as its molecular weight (a dimensionless number). For example, a 10,000-MW protein
`has a mass of 10,000 daltons (Da), or 10 kilodaltons (kDa). In the last section of this chapter, we
`will discuss different methods for measuring the sizes and other physical characteristics of
`proteins.
`
`Four Levels of Structure Determine the Shape of Proteins
`The structure of proteins commonly is described in terms of four hierarchical levels of
`organization. These levels are illustrated in Figure 3-4, which depicts the structure of
`hemagglutinin, a surface protein on the influenza virus. This protein binds to the surface of
`animal cells, including human cells, and is responsible for the infectivity of the flu virus.
`
`Figure 3-4
`
Four levels of structure in hemagglutinin, which is a long
multimeric molecule whose three identical subunits are each
composed of two chains, HA₁ and HA₂. (a) Primary structure
is illustrated by the amino acid sequence of residues 68–195
(more...)
`
`The primary structure of a protein is the linear arrangement, or sequence, of amino acid residues
`that constitute the polypeptide chain.
`
`Secondary structure refers to the localized organization of parts of a polypeptide chain, which
`can assume several different spatial arrangements. A single polypeptide may exhibit all types of
`secondary structure. Without any stabilizing interactions, a polypeptide assumes a random-coil
`structure. However, when stabilizing hydrogen bonds form between certain residues, the
`backbone folds periodically into one of two geometric arrangements: an α helix, which is a
`spiral, rodlike structure, or a β sheet, a planar structure composed of alignments of two or more β
`strands, which are relatively short, fully extended segments of the backbone. Finally, U-shaped
`four-residue segments stabilized by hydrogen bonds between their arms are called turns. They
`are located at the surfaces of proteins and redirect the polypeptide chain toward the interior.
`(These structures will be discussed in greater detail later.)
`
`Tertiary structure, the next-higher level of structure, refers to the overall conformation of a
polypeptide chain, that is, the three-dimensional arrangement of all the amino acid residues. In
`contrast to secondary structure, which is stabilized by hydrogen bonds, tertiary structure is
`stabilized by hydrophobic interactions between the nonpolar side chains and, in some proteins,
`by disulfide bonds. These stabilizing forces hold the α helices, β strands, turns, and random coils
in a compact internal scaffold. Thus, a protein’s size and shape are dependent not only on its
`sequence but also on the number, size, and arrangement of its secondary structures. For proteins
`that consist of a single polypeptide chain, monomeric proteins, tertiary structure is the highest
`level of organization.
`
`Multimeric proteins contain two or more polypeptide chains, or subunits, held together by
`noncovalent bonds. Quaternary structure describes the number (stoichiometry) and relative
`positions of the subunits in a multimeric protein. Hemagglutinin is a trimer of three identical
`subunits; other multimeric proteins can be composed of any number of identical or different
`subunits.
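
One way to keep the four levels apart is to model them as nested records. The sketch below is a hypothetical data model, not a standard file format: primary structure is a sequence string, secondary structure is a list of labeled segments, and quaternary structure is the list of subunits. Tertiary structure (the three-dimensional coordinates) is deliberately left out, and the sequences are left empty because no real sequence data is assumed here. The hemagglutinin example follows the description in Figure 3-4 of three identical subunits, each with two chains.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SecondaryElement:
    kind: str   # "helix", "strand", "turn", or "coil"
    start: int  # first residue of the element (1-based)
    end: int    # last residue of the element (inclusive)


@dataclass
class Chain:
    name: str
    sequence: str = ""  # primary structure (left empty: no real data assumed)
    elements: List[SecondaryElement] = field(default_factory=list)  # secondary structure


@dataclass
class Subunit:
    chains: List[Chain]


@dataclass
class Protein:
    name: str
    subunits: List[Subunit]  # quaternary structure

    def stoichiometry(self) -> int:
        """Number of subunits in the assembled protein."""
        return len(self.subunits)

    def is_multimeric(self) -> bool:
        return len(self.subunits) > 1


# Hemagglutinin as described in the text: a trimer whose identical
# subunits are each composed of the two chains HA1 and HA2.
hemagglutinin = Protein(
    name="hemagglutinin",
    subunits=[Subunit([Chain("HA1"), Chain("HA2")]) for _ in range(3)],
)
print(hemagglutinin.stoichiometry(), hemagglutinin.is_multimeric())
```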
`
`In a fashion similar to the hierarchy of structures that make up a protein, proteins themselves are
`part of a hierarchy of cellular structures. Proteins can associate into larger structures termed
`macromolecular assemblies. Examples of such macromolecular assemblies include the protein
`coat of a virus, a bundle of actin filaments, the nuclear pore complex, and other large
`submicroscopic objects. Macromolecular assemblies in turn combine with other cell biopolymers
`like lipids, carbohydrates, and nucleic acids to form complex cell organelles.
`
`Graphic Representations of Proteins Highlight Different Features
`Different ways of depicting proteins convey different types of information. The simplest way to
`represent three-dimensional structure is to trace the course of the backbone atoms with a solid
`line (Figure 3-5a); the most complex model shows the location of every atom (Figure 3-5b; see
`also Figure 2-1a). The former shows the overall organization of the polypeptide chain without
`consideration of the amino acid side chains; the latter details the interactions among atoms that
`form the backbone and that stabilize the protein’s conformation. Even though both views are
`useful, the elements of secondary structure are not easily discerned in them.
`
`Figure 3-5
`
`Various graphic representations of the structure of Ras, a
`guanine nucleotide–binding protein. Guanosine diphosphate,
`the substrate that is bound, is shown as a blue space-filling
figure in parts (a)–(d). (a) The Cα trace of Ras, (more...)
`
`Another type of representation uses common shorthand symbols for depicting secondary
`structure, cylinders for α helices, arrows for β strands, and a flexible stringlike form for parts of
`the backbone without any regular structure (Figure 3-5c). This type of representation emphasizes
`the organization of the secondary structure of a protein, and various combinations of secondary
`structures are easily seen.
`
`However, none of these three ways of representing protein structure conveys much information
`about the protein surface, which is of interest because this is where other molecules bind to a
`protein. Computer analysis in which a water molecule is rolled around the surface of a protein
`can identify the atoms that are in contact with the watery environment. On this water-accessible
`surface, regions having a common chemical (hydrophobicity or hydrophilicity) and electrical
`(basic or acidic) character can be mapped. Such models show the texture of the protein surface
and the distribution of charge, both of which are important parameters of binding sites (Figure
3-5d). This view represents a protein as seen by another molecule.
`
`Secondary Structures Are Crucial Elements of Protein Architecture
`In an average protein, 60 percent of the polypeptide chain exists as two regular secondary
`structures, α helices and β sheets; the remainder of the molecule is in random coils and turns.
`Thus, α helices and β sheets are the major internal supportive elements in proteins. In this
`section, we explore the forces that favor formation of secondary structures. In later sections, we
`examine how these structures can pack into larger arrays.
`
`The α Helix
`Polypeptide segments can assume a regular spiral, or helical, conformation, called the α helix. In
`this secondary structure, the carbonyl oxygen of each peptide bond is hydrogen-bonded to the
`amide hydrogen of the amino acid four residues toward the C-terminus. This uniform
`arrangement of bonds confers a polarity on a helix because all the hydrogen-bond donors have
`the same orientation. The peptide backbone twists into a helix having 3.6 amino acids per turn
`(Figure 3-6). The stable arrangement of amino acids in the α helix holds the backbone as a
`rodlike cylinder from which the side chains point outward. The hydrophobic or hydrophilic
`quality of the helix is determined entirely by the side chains, because the polar groups of the
`peptide backbone are already involved in hydrogen bonding in the helix and thus are unable to
`affect its hydrophobicity or hydrophilicity.
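
The hydrogen-bonding pattern just described can be enumerated directly. The sketch below lists, for an idealized helix, which residue's carbonyl oxygen pairs with which residue's amide hydrogen, following the rule of four residues toward the C-terminus. The function name is hypothetical, and residues are numbered from 1 at the N-terminal end.

```python
def helix_hbond_pairs(n_residues: int) -> list:
    """Return (acceptor, donor) residue pairs for an idealized alpha helix.

    The carbonyl oxygen of residue i accepts a hydrogen bond from the
    amide hydrogen of residue i + 4 (1-based residue numbers).
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


for acceptor, donor in helix_hbond_pairs(10):
    print(f"C=O of residue {acceptor}  ...  H-N of residue {donor}")
```

Because every pair points the same way along the chain, the listing also shows why the helix as a whole has a polarity.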
`
`Figure 3-6
`
`Model of the α helix. The polypeptide backbone is folded into
`a spiral that is held in place by hydrogen bonds (black dots)
`between backbone oxygen atoms and hydrogen atoms. Note
`that all the hydrogen bonds have the same polarity. The outer
`surface (more...)
`
`In many α helices hydrophilic side chains extend from one side of the helix and hydrophobic side
`chains from the opposite side, making the overall structure amphipathic. In such helices the
`hydrophobic residues, although apparently randomly arranged, occur in a regular pattern (Figure
`3-7). One way of visualizing this arrangement is to look down the center of an α helix and then
`project the amino acid residues onto the plane of the paper. The residues will appear as a wheel,
`and in the case of an amphipathic helix, the hydrophobic residues all lie on one side of the wheel
`and the hydrophilic ones on the other side.
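
The wheel projection can be sketched numerically. With 3.6 residues per turn, each residue is rotated 100 degrees from the previous one when viewed down the helix axis. The code below computes those angles and a simple score (a hypothetical measure named `wheel_bias` here) that approaches 1 when the hydrophobic residues cluster on one side of the wheel and 0 when they are spread evenly around it. The one-letter codes are the standard ones for the hydrophobic group named earlier; the example peptide is invented for illustration.

```python
import math

DEGREES_PER_RESIDUE = 100.0     # 360 degrees / 3.6 residues per turn
HYDROPHOBIC = set("AVLIMFYW")   # Ala, Val, Leu, Ile, Met, Phe, Tyr, Trp


def wheel_angles(sequence: str) -> list:
    """Angular position of each residue on the helical wheel, in degrees."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]


def wheel_bias(sequence: str) -> float:
    """Length of the mean unit vector of the hydrophobic residues (0 to 1)."""
    angles = [
        math.radians(angle)
        for angle, residue in zip(wheel_angles(sequence), sequence)
        if residue in HYDROPHOBIC
    ]
    if not angles:
        return 0.0
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(x, y)


peptide = "LKKLLKKLLKKL"  # hypothetical sequence, not taken from a real protein
for residue, angle in zip(peptide, wheel_angles(peptide)):
    print(f"{residue} at {angle:5.1f} degrees")
print(f"one-sidedness of hydrophobic residues: {wheel_bias(peptide):.2f}")
```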
`
`Figure 3-7
`
`Regions of an α helix may be amphipathic. The five chains of
`cartilage oligomeric matrix protein associate into a coiled-coil
`fibrous domain through amphipathic α helices. Seen in cross
section through a part of the domain, the hydrophobic
`(more...)
`
`Amphipathic α helices are important structural elements in fibrous proteins found in a watery
`environment. In a coiled-coil region of a protein, the hydrophobic surface of the α helix faces
`inward to form the hydrophobic core, and the hydrophilic surfaces face outward toward the
`surrounding fluid. This same orientation of surfaces is also found in most globular proteins. A
`crucial difference is that the hydrophobic interaction could be with a β strand, random coil, or
`another α helix. As we discuss later, amphipathic β strands line the walls of an ion channel in the
`cell membrane.
`
`The β Sheet
`Another regular secondary structure, the β sheet, consists of laterally packed β strands. Each β
`strand is a short (5–8-residue), nearly fully extended polypeptide chain. Hydrogen bonding
`between backbone atoms in adjacent β strands, within either the same or different polypeptide
`chains, forms a β sheet (Figure 3-8a). Like α helices, β strands have a polarity defined by the
`orientation of the peptide bond. Therefore, in a pleated sheet, adjacent β strands can be oriented
`antiparallel or parallel with respect to each other. In both arrangements of the backbone, the side
`chains project from both faces of the sheet (Figure 3-8b).
`
`Figure 3-8
`
`β SHEETS. (a) A simple two-stranded β sheet with
`antiparallel β strands. A sheet is stabilized by hydrogen bonds
`(black dots) between the β strands. The planarity of the
`peptide bond forces a β sheet to be pleated; (more...)
`
`In some proteins, β sheets form the floor of a binding pocket (Figure 3-8c). In many structural
`proteins, multiple layers of pleated sheets provide toughness. Silk fibers, for example, consist
`almost entirely of stacks of antiparallel β sheets. The fibers are flexible because the stacks of β
`sheets can slip over one another. However, they are also resistant to breakage because the peptide
`backbone is aligned parallel with the fiber axis.
`
`Turns
`Composed of three or four residues, turns are compact, U-shaped secondary structures stabilized
`by a hydrogen bond between their end residues. They are located on the surface of a protein,
`forming a sharp bend that redirects the polypeptide backbone back toward the interior. Glycine
`and proline are commonly present in turns. The lack of a large side chain in the case of glycine
`and the presence of a built-in bend in the case of proline allow the polypeptide backbone to fold
`into a tight U-shaped structure. Without turns, a protein would be large, extended, and loosely
`packed. A polypeptide backbone also may contain long bends, or loops. In contrast to turns,
`which exhibit a few defined structures, loops can be formed in many different ways.
`
`Motifs Are Regular Combinations of Secondary Structures
`Many proteins contain one or more motifs built from particular combinations of secondary
`structures. A motif is defined by a specific combination of secondary structures that has a
`particular topology and is organized into a characteristic three-dimensional structure. Three
`common motifs are depicted in Figure 3-9.
`
`Figure 3-9
`
`Secondary-structure motifs. (a) The coiled-coil motif (left) is
`characterized by two or more helices wound around one
`another. In some DNA-binding proteins, like c-Jun, a two-
`stranded coiled coil is responsible for dimerization (right).
`Each helix in (more...)
`
`The coiled-coil motif comprises two, three, or four amphipathic α helices wrapped around one
`another. In this motif, hydrophobic side chains project like “knobs” from one helix and
`interdigitate into the gaps, or “holes,” between the hydrophobic side chains of the other helix
along the contact surface. The subunits in some multimeric proteins and in rodlike fibers are held
together by coiled-coil interactions. The Ca²⁺-binding helix-loop-helix motif is marked by the
`presence of certain hydrophilic residues at invariant positions in the loop. Oxygen atoms in the
invariant residues bind a calcium ion through ionic interactions. In another common motif, the
`zinc finger, three secondary structures—an α helix and two β strands with an antiparallel
`orientation—form a fingerlike bundle held together by a zinc ion. This motif is most commonly
`found in proteins that bind RNA or DNA.
`
`Additional motifs will be examined in discussions of other proteins. The presence of the same
`motif in different proteins with similar functions clearly indicates that during evolution these
`useful combinations of secondary structures have been conserved.
`
`Structural and Functional Domains Are Modules of Tertiary Structure
`The tertiary structure of large proteins is often subdivided into distinct globular or fibrous
`regions called domains. Structurally, a domain is a compactly folded region of polypeptide. For
`large proteins, domains can be recognized in structures determined by x-ray crystallography or in
`images captured by electron microscopy. These discrete regions are well distinguished or
`physically separated from other parts of the protein, but connected by the polypeptide chain.
`Hemagglutinin, for example, contains a globular domain and a fibrous domain (see Figure 3-4b).
`
`A structural domain consists of 100–200 residues in various combinations of α helices, β sheets,
`turns, and random coils. Often a domain is characterized by some interesting structural feature,
`for example, an unusual abundance of a particular amino acid (a proline-rich domain, an acidic
`domain, a glycine-rich domain), sequences common to (conserved in) many proteins (SH3, or
`Src homology region 3), or a particular secondary-structure motif (zinc-finger motif in kringle
`domain).
`
`Domains sometimes are defined in functional terms based on observations that the activity of a
`protein is localized to a small region along its length. For instance, a particular region or regions
`of a protein may be responsible for its catalytic activity (e.g., a kinase domain) or binding ability
`(e.g., a DNA-binding domain, membrane-binding domain). Functional domains often are
`identified experimentally by whittling down a protein to its smallest active fragment with the aid
`of proteases, enzymes that cleave the polypeptide backbone. Alternatively, the DNA encoding a
`protein can be subjected to mutagenesis, so that segments of the protein’s backbone are removed
`or changed (Chapter 7). The activity of the truncated or altered protein product synthesized from
`the mutated gene is then monitored.
`
`The functional definition of a domain is less rigorous than a structural definition. However, if the
`three-dimensional structure of a protein has not been determined, identification of functional
`domains can provide useful information about the protein. Because the activity of a protein
`usually depends on a proper three-dimensional structure, a functional domain consists of at least
`one and often several structural domains.
`
`The organization of tertiary structure into domains further illustrates the principle that complex
`molecules are built from simpler components. Like secondary-structure motifs, tertiary-structure
`domains are incorporated as modules into different proteins, thereby modifying their functional
`activities. The modular approach to protein architecture is particularly easy to recognize in large
`proteins, which tend to be a mosaic of different domains and thus can perform different functions
`simultaneously.
`
`The epidermal growth factor (EGF) domain is one example of a module that is present in several
`proteins (Figure 3-10). EGF is a small soluble peptide hormone that binds to cells in the skin and
`connective tissue, causing them to divide. It is generated by proteolytic cleavage between
`repeated EGF domains in the EGF precursor protein, which is anchored in the cell membrane by
`a membrane-spanning domain. Six conserved cysteine residues form three pairs of disulfide
`bonds that hold EGF in its native conformation. The EGF domain also occurs in other proteins,
`including tissue plasminogen activator (TPA), a protease that is used to dissolve blood clots in
`heart attack victims; Neu protein, which is involved in embryonic differentiation; and Notch
`protein, a cell-adhesion molecule that glues cells together. Besides the EGF domain, these
`proteins contain additional domains found in other proteins. For example, TPA possesses a
`chymotryptic domain, a common feature in proteins that catalyze proteolysis.
`
`Figure 3-10
`
`Schematic diagrams of various proteins, illustrating their
`modular nature. Epidermal growth factor (EGF) is generated
`by proteolytic cleavage of a precursor protein containing
`multiple EGF domains (orange). The EGF domain also
`occurs in Neu protein and (more...)
`
`Sequence Homology Suggests Functional and Evolutionary
`Relationships between Proteins
`Early evidence supporting the key principle that the amino acid sequence of a protein determines
`its three-dimensional structure was obtained in the 1960s by Max Perutz. On comparing the
`structures of myoglobin and hemoglobin determined from x-ray crystallographic analysis, he
`immediately noted that the subunits of hemoglobin, a tetramer of two α and two β subunits,
`resembled myoglobin, a monomer (Figure 3-11). Although the sequences of the two proteins
`were unknown at the time, Perutz proposed that the similar arrangement of α helices in the two
`proteins is a consequence of their having similar amino acid sequences. Later sequencing of
`myoglobin and hemoglobin revealed that many identical or chemically similar residues occur in
`identical positions throughout the sequences of both proteins. The two proteins also exhibit
`similar functions: myoglobin is the oxygen-carrier protein in muscle, and hemoglobin the
`oxygen-carrier protein in blood. Most of the conserved residues hold the heme group in place or
`are responsible for maintaining the hydrophobic interior of the protein.
`
`Figure 3-11
`
`Models of the tertiary structures of the oxygen-carrier
`proteins myoglobin and hemoglobin based on x-ray
`crystallographic analysis. Note the similarity in the tertiary
`structures of myoglobin and the two α subunits (blue) and
`two β subunits (more...)
`
As data concerning protein sequences and three-dimensional structures accumulated, the
`concept that similar sequences fold into similar secondary and tertiary structures was confirmed.
`The propensity of each amino acid to occur in the various types of secondary structures has been
`calculated from the amino acid sequence of secondary structures extracted from databases of the
`three-dimensional structures of proteins. This tabulation of the folding information inherent in
`the sequence is now being used in attempts to predict the three-dimensional structure of various
`proteins from their amino acid sequences.
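
The text does not give the formula used for these propensities. One common definition, assumed here, is the ratio of how often a given amino acid is found in a secondary-structure state to how often any residue is found in that state, so that values above 1 indicate a preference. The sketch below computes this from paired sequence and structure strings. The state letters (H for helix, E for strand, C for everything else) and the toy record are hypothetical and stand in for data that would come from a structure database.

```python
from collections import Counter


def propensities(records):
    """Compute propensity[(residue, state)] from (sequence, states) pairs.

    propensity = P(state | residue) / P(state), an assumed definition.
    """
    pair_counts, residue_counts, state_counts = Counter(), Counter(), Counter()
    for sequence, states in records:
        if len(sequence) != len(states):
            raise ValueError("sequence and state strings must have equal length")
        for residue, state in zip(sequence, states):
            pair_counts[(residue, state)] += 1
            residue_counts[residue] += 1
            state_counts[state] += 1
    total = sum(residue_counts.values())
    return {
        (residue, state): (pair_counts[(residue, state)] / residue_counts[residue])
        / (state_counts[state] / total)
        for residue in residue_counts
        for state in state_counts
    }


# Toy input for illustration only; not real structural data.
toy_records = [("AAGG", "HHCC")]
for key, value in sorted(propensities(toy_records).items()):
    print(key, round(value, 2))
```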
`
`In the classical taxonomy of the eighteenth and nineteenth centuries, organisms were classified
according to their morphological similarities and differences. In the twentieth century, the molecular
`revolution in biology has given birth to “molecular” taxonomy: the classification of proteins
`based on similarities and differences in their amino acid sequences. This new taxonomy provides
`much information about protein function and evolutionary relationships. If the similarity between
`proteins from different organisms is significant over their entire sequence, then the proteins are
`homologs of one another, and they probably carry out similar functions. Sequence similarity also
`suggests an evolutionary relationship between proteins; that is, they evolved from a common
`ancestor. We can therefore describe homologous proteins as belonging to the same “family” and
`can trace their lineage from comparisons of sequences. Closely related proteins have the most
`similar sequences; distantly related proteins have only faintly similar sequences.
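
A simple way to quantify "most similar" versus "faintly similar" is the percentage of identical residues at aligned positions. The sketch below assumes the two sequences have already been aligned to equal length, with a dash marking gaps; how the alignment is produced is outside its scope. The function name and the short example strings are hypothetical, and real comparisons usually also score chemically similar residues, which this sketch ignores.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free aligned positions at which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented one-letter sequences, for illustration only.
print(percent_identity("ACDEFGHIK", "ACDEFGHIR"))
print(percent_identity("ACDE-GHIK", "ACDEFGHLK"))
```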
`
`The kinship among homologous proteins is most easily visualized from a tree diagram based on
`sequence analyses. For example, the amino acid sequences of hemoglobins from different
`species suggest that they evolved from an ancestral monomeric, oxygen-binding protein (Figure
`3-12). Over time, this ancestral protein slowly changed, giving rise to myoglobin, which
`remained a monomeric protein, and to the α and β subunits, which evolved to associate into the
`tetrameric hemoglobin molecule. As the tree diagram in Figure 3-12 shows, evolution of the
`globin protein family parallels that of the vertebrates.
`
`Figure 3-12
`
`Evolutionary tree showing how the globin protein family
`arose, starting from the most primitive oxygen-binding
`proteins, leghemoglobins, in plants. Sequence comparisons
`have revealed that evolution of the globin proteins parallels
`the evolution of vertebrates. (more...)
`
`The power of such comparative analysis and identification of homologous proteins has expanded
`substantially in recent years by use of the base sequences in an organism’s genome to deduce the
`amino acid sequences of the encoded proteins. As discussed in Chapter 7, this approach permits
`“sequencing” of proteins that are difficult to purify in significant amounts.
`
`SUMMARY
`
• A protein is a linear polymer of amino acids linked together by peptide bonds. Various,
`mostly noncovalent, interactions between amino acids in the linear sequence stabilize a
`specific folded three-dimensional structure (conformation) for each protein.
`
• The 20 different amino acids found in natural proteins are conveniently grouped into three
`categories based on the nature of their side (R) groups: hydrophilic amino acids, with a
`charged or polar and uncharged R group; hydrophobic amino acids, with an aliphatic or
`bulky and aromatic R group; and amino acids with a special group, consisting of cysteine,
`glycine, and proline (see Figure 3-2).
`
• The α helix, β strand and sheet, and turn are the most prevalent elements of protein
`secondary structure, which is stabilized by hydrogen bonds between atoms of the peptide
`backbone. Certain combinations of secondary structures give rise to different motifs,
`which are found in a variety of proteins and often are associated with specific functions
`(see Figure 3-9).
`
• Protein tertiary structure results from hydrophobic interactions and disulfide bonds that
`stabilize folding of the secondary structure into a compact overall arrangement, or
`conformation. Large proteins often contain distinct domains, independently folded