`
`Discrete molecular dynamics studies of the folding of a
`protein-like model
`Nikolay V Dokholyan1, Sergey V Buldyrev1, H Eugene Stanley1 and
`Eugene I Shakhnovich2
`
`Background: Many attempts have been made to resolve in time the folding of
`model proteins in computer simulations. Different computational approaches
`have emerged. Some of these approaches suffer from insensitivity to the
`geometrical properties of the proteins (lattice models), whereas others are
`computationally heavy (traditional molecular dynamics).
`
`Results: We used the recently proposed approach of Zhou and Karplus to
`study the folding of a protein model based on the discrete time molecular
`dynamics algorithm. We show that this algorithm resolves with respect to time
`the folding
`unfolding transition. In addition, we demonstrate the ability to
`study the core of the model protein.
`
`Conclusions: The algorithm along with the model of interresidue interactions
`can serve as a tool for studying the thermodynamics and kinetics of protein
`models.
`
`Addresses: 1Center for Polymer Studies, Physics
`Department, Boston University, Boston, MA 02215,
`USA. 2Department of Chemistry, Harvard University,
`12 Oxford Street, Cambridge, MA 02138, USA.
`
`Correspondence: Nikolay V Dokholyan
`E-mail: dokh@bu.edu
`Key words: Go– model, molecular dynamics,
`protein folding
`
`Received: 18 May 1998
`Revisions requested: 23 June 1998
`Revisions received: 07 October 1998
`Accepted: 02 December 1998
`
`Published: 15 December 1998
`http://biomednet.com/elecref/1359027800300577
`
`Folding & Design 15 December 1998, 3:577–587
`
`© Current Biology Ltd ISSN 1359-0278
`
`Introduction
`The vast dimensionality of the protein conformational
`space [1] makes the folding time too long to be reachable
`by direct computational approaches
`[2–4]. Simplified
`models [5–14] became popular due to their ability to reach
`reasonable time scales and to reproduce the basic thermo-
`dynamic and kinetic properties of real proteins [3,15,16]:
`firstly, a unique native state, that is, there should exist a
`single conformation with the lowest potential energy; sec-
`ondly, a cooperative folding transition (resembling first
`order transition); thirdly, thermodynamic stability of the
`native state; and fourthly, kinetic accessibility, that is, the
`native state should be reachable in a biologically reason-
`able time [12,17].
`
`Monte Carlo (MC) simulations on lattice models (e.g. see
`[4–7] and references therein) appear to be useful for
`studying theoretical aspects of protein folding. The MC
`algorithm is based on a set of rules for the transition from
`one conformation
`to another. These
`transitions are
`weighted by a transition matrix that reflects the phenom-
`ena under study. The simplicity of the algorithm and the
`significantly small conformational space of the protein
`models (due to the lattice constraints) make MC on-lattice
`simulations powerful tools for studying the equilibrium
`dynamics of protein models; however,
`lattice models
`impose strong constraints on the angles between the cova-
`lent bonds, thereby greatly restricting the conformational
`space of the protein-like model. The additional drawback
`
`in the poor capability of these
`lies
`of this restriction
`models to discern the geometrical properties of the pro-
`teins. The time in MC algorithms is estimated as the
`average number of moves (over an ensemble of the
`unfolding transitions) made by a model
`folding
`protein. It was pointed out [18] that MC simulations are
`equivalent to the solution of the master equation for the
`dynamics, so there is a relation between physical time and
`computer time, which is counted as the number of MC
`steps; however, a number of delicate issues — such as the
`dependence of the dynamics on the set of allowed MC
`moves — remain outstanding, so an independent test of
`the dynamics using
`the molecular dynamics
`(MD)
`approach is needed.
`
`to geometrical
`that are sensitive
`To address aspects
`details, it is useful to study off-lattice models of protein
`folding. Thus far, several off-lattice simulations have been
`performed [19–21] that demonstrate the ability of the sim-
`plified models to study protein folding.
`
`Here, we study the three-dimensional molecular dynamics
`of a simplified model of proteins [6,7]. The potential of
`interactions between pairs of residues is modeled by a
`‘square-well’, which allows us to increase the speed of the
`simulations [22,23]. We estimate folding time based on
`the collision event
`list, which, besides
`increasing the
`speed of the simulation, allows for the tracking of ‘realis-
`tic’ (not discretized) time. We show that such an algorithm
`Petitioner Microsoft Corporation - Ex. 1055, p. 577
`
`
`
`+∞
`
`,
`
`−
`
`,
`
`⎧ ⎨⎪⎪ ⎩⎪⎪
`
`578 Folding & Design Vol 3 No 6
`
`FFgure 1
`
`(a)
`
`U(r)
`
`(b)
`
`U(r)
`
`=
`
`Ui, j
`
`≤
`−
`
`r
`r
`i
`j
`(
`)
`Δ ⑀
`−
`sign
`0 (2)
`,
`ij
`
`a
`0
`
`r
`
`j
`
`a
`1
`
`r
`i
`>
`
`j
`
`≤
`
`a
`1
`
` <
`a
`0
`−
`
`r
`
`r
`i
`
`a0
`
`a1
`
`b0
`
`b1
`
`Here a0/2 is a radius of the hard sphere, and a1/2 is the
`radius of the attractive sphere (Figure 1a) and ⑀ sets the
`energy scale. ⏐⏐Δ⏐⏐ is a matrix of contacts with elements:
`
`⎧⎨⎪ ⎩⎪
`
`Δ ij
`
`≡
`
`≤
`−
`S
`S
`
`1 ,
`a
`r
`r
`1 (3),
`−
`−
`>
`a
`r
`r
`
`1 1
`
`jN
`
`jN
`
`S
`
`iN
`
`iN
`
`S
`
`
`
`NS is the position of the ith residue when the
`where ri
`protein is in the native conformation. Note that we penal-
`ize the non-native contacts by imposing Δ
`ij < 0. The para-
`meters are chosen as follows: ⑀ = 1, a0 = 9.8 and a1 = 19.5.
`The covalent bonds are also modeled by a square-well
`potential (Bellemans’ bonds):
`
`Folding & Design
`
`The potential of interaction between (a) specific residues and
`(b) neighboring residues (covalent bond). a0 is the diameter of the
`hard sphere and a1 is the diameter of the attractive sphere. [b0,b1] is
`the interval in which residues that are neighbors on the chain can ] ove
`freely.
`
`can be a useful compromise between computationally
`heavy traditional MD and fast but restrictive MC. We
`demonstrate that the model protein reproduces the princi-
`pal features of folding phenomena described above.
`
`We also address the issue of whether we can study the
`equilibrium properties of the core. The core is a small
`subset of the residues that maintains the backbone of the
`structure at temperatures close to the folding transition
`temperature (here the Θ-temperature, Tθ). We emphasize
`the difference between the core and the nucleus of a
`protein: whereas the core is a persistent part of the struc-
`ture at equilibrium, the nucleus is a fragment of this struc-
`ture that is assembled in the transition state (TS) — the
`folding
`unfolding barrier (see Figure 1 in [4]). Based
`on simple arguments, we estimate Tθ for our model [24]
`and compare it with the value found in the simulations.
`
`The model
`We study a ‘beads-on-a-string’ model of a protein. We
`model the residues as hard spheres of unit mass. The
`potential of interaction between residues is ‘square-well’.
`We follow the Go– model [5–7], where the attractive poten-
`tial between residues is assigned to the pairs that are in
`contact (Δ
`ij, defined below) in the native state and repul-
`sive potential
`is assigned to the pairs that are not
`in
`contact in the native state. Thus, the potential energy is:
`
` (1)
`
`=∑
`
`
`
`Ui j,
`1
`
`
`
`i j,
`
`N
`
`1 2
`
`E=
`
`where i and j denote residues i and j. Ui,j is the matrix of
`pair interactions:
`
`r
`i
`≤
`
`+
`1
`b
`0
`
`r
`i
`r
`i
`
`<
`,
`b
`0
`+∞ −
`,
`r
`i
`
`⎧⎨⎪ ⎩⎪
`
`=
`
`V
`,
`i i
`
`+
`1
`
`+
`1
`
`−
`
`0 or (4)
`
`<
`
`b
`1
`
`,
`
`−
`
`r
`i
`
`r
`i
`
`+
`1
`
`≥
`
`b
`1
`
`The values of b0 = 9.9 and b1 = 10.1 are chosen so that
`average covalent bond length is equal to 10 (Figure 1b).
`The original configuration of the protein (N = 65 residues)
`was designed by collapse of a homopolymer at low temper-
`ature [20,25,26]. It contains n* = 328 native contacts, so
`NS = –328. The 65 × 65 matrix of contacts of the globule
`E
`in the native state is shown in Figure 2a. Note that the
`large number of native contacts (328/65 ≈ 5 contacts per
`≈ 2a0 —
`residue) is due to the choice of the parameter: a1
`so that residues are able to establish contacts with the
`residues in the second neighboring shell. The radius of
`≈ 22.7. A
`gyration of the globule in the native state is RG
`snapshot of the globule in the native state is shown in
`Figure 2b.
`
`The program employs the discrete MD algorithm, which
`is based on the collision list and is similar to one recently
`used by Zhou et al. [22] to study equilibrium thermody-
`namics of homopolymers and by Zhou and Karplus [23] to
`study equilibrium thermodynamics of folding of a model
`of Staphylococcus aureus protein A. A detailed description of
`the algorithm can be found in [27–30]. To control the tem-
`perature of the protein, we introduce 935 particles that do
`not interact with the protein or with each other in any way
`but via regular collisions, serving as a heat bath. Thus, by
`changing the kinetic energy of those ‘ghost’ particles, we
`are able to control the temperature of the environment.
`The ghost particles are hard spheres of the same radii as
`the chain residues and have unit mass. Temperature is
`measured in units of ⑀/kB. The time unit (tu) is estimated
`
`Petitioner Microsoft Corporation - Ex. 1055, p. 578
`
`
`
`
`Research Paper Discrete molecular dynamics studies of the folding of a protein-like model Dokholyan et al. 579
`
`FiguFe 2
`
`(a) 65 × 65 contact matrix of the model protein in the native state.
`Black boxes indicate the matrix elements of those residue pairs that
`have a contact (their relative distance is between a0 and a1). (b) A
`snapshot of the protein of 65 residues in the native state obtained at
`temperature T= 0.1.
`
`from the shortest time between two consequent collisions
`in the system between any two particles.
`
`Results
`In order to study the thermodynamics, we performed MD
`simulations of the chain at various temperatures. We start
`with the globule in the native state at temperature T = 0.1
`and then raise the temperature of the heat bath to the
`desired one. Then we allow the system to equilibrate. At
`the final temperature, we let the protein relax for 106 time
`units. The typical behavior of the energy E and the radius
`of gyration RG as functions of time is shown in Figure 3 for
`three different temperatures.
`
`In the present model, the non-native contacts (NNCs) are
`penalized, that is, the pairwise interaction between NNCs
`is repulsive (this corresponds to g = 2 in [23]), so their
`number increases as the temperature increases. At high
`temperatures (above Tθ), however, the number of NNCs
`varies only due
`to
`the
`random
`temperatures. The
`maximum number of NNCs occurs at Tθ and does not
`exceed 35, which is roughly 10% of the total number of
`native contacts (NCs).
`
`The simulations reveal that the protein undergoes a
`folding
`unfolding transition as we increase the tem-
`perature to the proximity of the Θ-temperature Tθ, which
`≈ 1.46. At Tθ, the distribution of
`in this model is Tθ≡ Tf
`energy has three peaks (Figure 4a). The left peak corre-
`sponds to the folded state, the right peak corresponds to
`the unfolded state and the middle one corresponds to the
`partially folded state (PFS), with a 19-residue unfolded
`tail. This trimodality of the energy distribution is also seen
`in Figure 3b. The energy profile at temperature T = 1.42
`(close to Tθ) also reflects these three states. Since T < Tθ,
`only two states are mostly present in Figure 3b. Thus, the
`energy distribution has only two peaks (Figure 5), corre-
`sponding to the folded state and the PFS. Above Tθ, the
`globule starts to explore energetic wells other than the
`native well (see Figure 13 in [31]).
`
`To show that the PFS is the cause of the middle peak in
`energy distribution (Figure 4a), we eliminate the 19-
`residue tail and plot the energy distribution for the 46-mer
`at Θ-temperature Tθ
`∗= 1.44 (Figure 6). We expect to see
`only two states — folded and unfolded — because the 19-
`residue tail, which is the cause of the PFS, is eliminated.
`Figure 6 confirms our expectations.
`
`unfolding transition is further quantified
`The folding
`in Figure 7. The energy and the radius of gyration increase
`most rapidly near Tf = Tθ resembling the order parameter
`jump in a phase transition (see discussion below). This
`rapid increase of E and Rg reaches its maximum at the Θ-
`point, where the potential of interaction is compensated
`by the thermal motion of the particles. Above Tθ, interac-
`tions between residues do not hold them together any
`more and the chain becomes unfolded (see Figure 8a).
`Note that as all the attractive interactions are specific, the
`transition is described by one temperature, Tf .
`
`The presence of the PFS is observed in the temperature
`range, 1.40 to 1.48, in which the collapse transition occurs.
`Thus, in this particular region, Tf and Tθare indistinguish-
`able within the accuracy of their definitions.
`
`Remarkably, a simple Flory-type model of an excluded
`volume chain predicts Tθ within 20%. To demonstrate
`this, let us write the probability that the end-to-end type
`distance of the chain is R [24]:
`
`
`
`P R( )
`
`∝
`
`−
`p R( )exp(
`
`
`2
`N v
`32
`R
`
`−
`
`E
`
`R( )
`T
`
`)
`
` (5)
`
`where v = (4π/3(a0/2)3 is the volume of the monomer and
`p(R) ∝ R2 exp(–3R2/(2N(a0/2)2)) is the probability that the
`end-to-end distance of the chain is R for the random walk
`model. For T = Tf, the repulsive excluded volume term
`term –E(R)/T > 0.
`–(N2v)/(2R3) balances
`the attractive
`Thus:
`Petitioner Microsoft Corporation - Ex. 1055, p. 579
`
`
`
`The dependence on time of (a) energy E and
`(b) radius of gyration RG. The globule is
`maintained at three different temperatures
`T= 0.78 < Tf,T= 1.42, and T= 1.63 > Tf for
`106 tu. For T= 0.78, the fluctuations of both
`energy E and RGare small, that is, the globule
`is found in one folded configuration. At high
`temperatures (T= 1.63) the fluctuations of E
`and RGare large; the globule is mostly found
`in the unfolded state. At the temperature
`T= 1.42, which is close to Tf, the globule is
`mostly present in two states. The lower
`energy configuration corresponds to the
`folded state: the globule is compact – see (b).
`The other configuration has large fluctuations:
`the globule is in the PFS. There is an
`additional state: the unfolded state – see (b).
`At T= 1.42 the protein model is rarely present
`in the unfolded state. Thus, the behavior of the
`globule at temperatures close to Tf indicates
`the presence of three distinct states: folded,
`unfolded and PFS.
`
`T=1.63
`
`T=1.42
`
`T==.78
`
`580 Folding & Design Vol 3 No 6
`
`Figure 3
`
`(a)
`
`–70
`
`–120
`
`–170
`
`–220
`
`–270
`
`Energy
`
`–320
`
`=
`
`1
`
`2
`
`3
`
`4
`
`5
`t/105
`
`6
`
`7
`
`8
`
`9
`
`10
`
`(b)
`
`100
`
`G
`
` R
`
`70
`
`40
`
`10
`100
`
`70
`
`40
`
`10
`
`100
`
`70
`
`40
`
`10
`
`T=1.63
`
`T=1.42
`
`T==.78
`
`0
`
`1
`
`2
`
`3
`
`4
`
`6
`
`7
`
`8
`
`9
`
`5
`t/105
`
`10
`Folding & Design
`
`1.7 (6)
`
`≈
`
`E
`
`v
`
`3 2
`
`2
`
`R N
`
`T
`
`f =
`
`where E ⯝ –130 and R ⯝ 24 are taken for a certain config-
`uration at the Θ-point.
`
`We also compute the heat capacity CV from the relation
`[32]:
`
`at a fixed temperature. The dependence of the heat
`capacity on temperature is shown in Figure 7b. There is a
`pronounced peak of CV (T) for T = Tf .
`
`We note that below the folding temperature Tf , the
`globule (Figure 8b) spends time in a state structurally
`similar to the native state (Figure 2b); however, one can
`see that even though the globule maintains approxi-
`mately the same structure, that is, the same set of NCs,
`the distances between residues are much larger than in
`the native state. Due to the fact that the potential of
`interaction between like residues is a square-well, there
`is no penalty for these residues to be maximally sepa-
`rated, yet they remain within the range of attractive
`Petitioner Microsoft Corporation - Ex. 1055, p. 580
`
`(
`
`)
`δE 2
`
`2
`
`V =
`C
`
` (7)
`
`T
`where 〈(δE)2〉 ≡ 〈E2〉 – 〈E〉2 and 〈...〉 denotes a time average.
`The time average is computed over 106 tu of equilibration
`
`
`
`Research Paper Discrete molecular dynamics studies of the folding of a protein-like model Dokholyan et al. 581
`
`Figure 4
`
`The probability distribution of (a) the energy
`states E and (b) the radius of gyration RGof
`the globule maintained at Tf= 1.46 for 106 tu.
`The trimodal distributions indicate the
`presence of three states: the folded state, the
`PFS, and the unfolded state.
`
`(a)
`
`0.015
`
`(b)
`
`0.10
`
`0.05
`
`0.01
`
`0.005
`
`Probability
`
`0
`–300
`
`–200
`Energy
`
`–100
`
`0.00
`20 30 40 50 60 70 80 90
`RG
`
`Folding & Design
`
`interaction. This allows the globule to have more NNC
`and, thus, still maintain its similarity to native structure,
`yet to have energy larger than the energy of the native
`state. This structure can be identified as the highest in
`
`energy that still maintains its core. As the temperature
`
`increases, the ratio ⏐RG – RGNS⏐/RG
`NS increases until the
`temperature reaches T = Tf , where the ratio becomes
`roughly 0.87.
`
`Tθ* = 1.44
`
`Figure 6
`
`0.015
`
`0.01
`
`0.005
`
`Probability
`
`T=1.25
`T=1.42
`T=1.73
`
`Figure 5
`
`0.040
`
`0.030
`
`0.020
`
`0.010
`
`Probability
`
`0.000
`–300
`
`–250
`
`–150
`–200
`Energy
`
`–50
`–100
`Folding & Design
`
`0
`=200
`
`=150
`
`=100
`Energy
`
`=50
`
`0
`Folding & Design
`
`The probability distribution of the energy states E of the globule
`maintained at three different temperatures: T= 1.25, 1.42 and 1.73.
`Note that at T= 1.42 ≈ Tf the distribution has two expressed peaks.
`The right peak of this (T= 1.42) distribution corresponds to the PFS
`whereas the left Feak corresFonds to the energetic well of the native
`state.
`
`The Frobability distribution of the energy states E of the 46-residue
`globule maintained at Tf* = 1.44 during 106 tu. The bimodal
`distributions of energy indicate that the 19-residue tail is responsible
`for the PFS of the 65-residue globule: after eliminating the 19-residue
`tail, the trimodal energy distribution of the 65-residue globule becomes
`a bimodal energy distribution of the 46-residue globule.
`
`Petitioner Microsoft Corporation - Ex. 1055, p. 581
`
`
`
`Figure 8
`
`582 Folding & Design Vol 3 No 6
`
`Figure 7
`
`(a)
`
`–78
`
`–128
`
`–178
`
`–228
`
`–278
`
`Energy
`
`–328
`0.3 0.5 0.7 0.9 1.1 1.3 1.5 1.7 1.9 2.1
`T
`
`A snapshot of the protein in (a) the unfolded state, obtained at high
`temperature T= 1.8; and (b) the transition state, obtained at folding
`transition temperature Tf= 1.46 (green), overlapped with the globule at
`low temperature T= 0.4 (red). Note that the TS globule has a close
`visual similarity to those maintained at low temperature and in the
`native state (see also Figure 2b). It is more dispersed, however, which
`makes all the NCs easily breakable. To compare the globule at the TS
`with the one maintained at temperature T= 0.4, we perform the
`transformation proposed by Kabsch [41] to minimize the relative
`distance between the residues in the TS and the state at T= 0.4. The
``cold' residues (grey spheres) denote residues whose rms
`displacement is smaller than a1.
`
`the core, we calculate
`the presence of
`To confirm
`f ≡ NNC/NC at temperatures below T = Tf. The attractive
`–E/T dominates
`interresidue
`interaction
`term
`the
`excluded
`volume
`repulsion
`term
`–N2v/(2R3)
`(see
`Equation 5), so:
`
`0 (8)
`
`−
`
`2
`N v
`32
`R
`
`>
`
`E T
`
`f
`
`−
`
`The total energy E has contributions from both NC and
`NNC contacts, so:
`
`E= −
`
`(
`
`⑀
`
`N
`
`NC
`
`−
`
`N
`
`NNC
`
`= −
`
`)
`
`[
`
`2
`
`f
`
`−
`
`]
`1
`
`⑀
`
`N
`
`
`
`C
`
`( )
`9
`
`≈ 0.3, the
`⏐Tf
`At a temperature slightly below Tf ,⏐T – Tf
`residues are maximally separated within their potential
`wells, yet they still maintain contacts. Therefore, the
`v~
`volume
`spanned
`by
`one
`residue
`is
`roughly
`v~ ≈ (4π/3)(a1/2)3 = 8v. NC is the product of the probability
`v~/R3 of having a bond (NC or NNC) and the total number
`Petitioner Microsoft Corporation - Ex. 1055, p. 582
`
`(b)
`
`1000
`
`800
`
`600
`
`400
`
`200
`
`CV
`
`0
`0.3
`
`0.5
`
`0.7
`
`0.9
`
`1.1
`
`(c)
`
`70
`
`Tf
`
`1.5
`
`1.7
`
`1.9
`
`2.1
`
`1.3
`T
`
`60
`
`50
`
`40
`
`30
`
`RG
`
`20
`0.3
`
`0.5
`
`0.7
`
`0.9
`
`1.1
`
`1.3
`T
`
`1.5
`
`1.7
`
`2.1
`1.9
`Folding & Design
`
`The dependence on temperature of (a) the energy E, (b) the heat
`capacity CVand (c) the radius of gyration RG. The error bars are the
`standard deviation of fluctuations. The rapid increase of energy as well
`as the sharp peak in heat capacity at T= Tf indicates a first-order
`phase transition.
`
`
`
`Research Paper Discrete molecular dynamics studies of the folding of a protein-like model Dokholyan et al. 583
`
`centers of mass of these configurations at the same point
`in space.
`is a rotation matrix that minimizes the rela-
`tive distance between the residues of two configurations
`(for details, see [41–44]). The σ
`in Equa-
`i(T) values
`tion 12 are the root-mean-square (rms) displacements for
`each individual residue.
`
`tR
`
`The plot of 〈σ
`i(T)〉 is presented in Figure 9a — from the
`roughness of the ‘landscape’, we can select a group of
`residues whose rms displacements are significantly smaller
`than the rms displacements of the other group of residues.
`We denote the former group by ‘cold’ residues and the
`latter group by
`‘hot’ residues. The rms displacement
`strongly depends on the temperature near the folding
`transition and grows slowly below Tf . Note that the
`average numbers of NC of the residues correlate with the
`average rms displacement of these residues, that is, the
`peak on the NNC,i isothermal lines of Figure 9b correspond
`to the ‘cold’ residues.
`
`Next we calculate the rms displacement σ
`C(T) for the
`selected 25% coldest residues (the core) and σ
`O(T) for the
`rest of the residues. Figure 10 shows their dependence on
`temperature, as well as the dependence of the rms dis-
`placement for all residues σ(T). There is a pronounced dif-
`ference in the behavior of the rms displacement of the
`core residues and the rest of the residues below Tf . At Tf
`their behavior is the same, due to the fact that all the
`attractive interactions are balanced by the repulsion of the
`excluded volume. Above Tf ,
`the difference between
`(T) and σ
`O(T) is due only to the fact that the core
`residues have most of the NCs and, therefore, are more
`likely to spend time together even at T > Tf .
`
`σC
`
`To study the behavior of the globule at Tf , we subdivide
`the probability distribution of the energy states E of the
`globule maintained at Tf = 1.46 during 106 tu into five
`regions: A, B, C, D, and E (Figure 11a). Region A corre-
`sponds to the folded state; region B corresponds to the
`transitional state between the folded state and the PFS;
`region C corresponds to the PFS; region D corresponds to
`the transitional state between the PFS and the com-
`pletely unfolded state (region E). Next we plot the rms
`displacement for each residue for each of the above
`regions (Figure 11a). Note that in region A, all residues
`stay in contact; in region C, both N- and C-terminal tails
`break away forming a PFS; in region D, there are only a
`few core residues that still stay intact and in region E
`none of the residues is in contact. In region B, we observe
`that some of the C-terminal tail residues are not
`in
`contact, indicating the formation of a PFS. Next, we plot
`the dependence of the selected 11 core residues (see
`legend to Figure 9) on the average energy of the window
`of the corresponding region (Figure 11c). We observe that
`core residues remain close to one another even in the
`
`Petitioner Microsoft Corporation - Ex. 1055, p. 583
`
`of possible arrangements of the pair contacts between N
`residues, N(N – 1)/2 ≈ N2/2. Thus:
`
`~
` (10)
`v
`
`2 3
`N R
`
`2
`
`N
`
`C =
`
`From Equations 8–10 we can estimate f, the fraction of NC
`at the temperature T ≈ 1.42 < Tf :
`
`0.68 (11)
`
`≈
`
`v v
`
`T
`~ ⑀
`
`1 2
`
`> +
`
`f
`
`Due to the fact that the globule maintains roughly the
`same volume at temperatures slightly below Θ-point,
`Equation 11 implies that approximately 70% of all native
`contacts stay intact in the folded phase (see Figure 5).
`This result is supported by the simulations: at T ≈ 1.42 the
`≈ 28, and the energy E
`number of NNCs is roughly NNNC
`≈ 234,
`is E = –206. Therefore, the number of NCs is NNC
`and the fraction of NCs is f ≈ 0.89, which is even higher
`than the lower limit set by Equation 11. Note that at a
`temperature higher than Tf , the fraction of native contacts
`becomes small due to the fact that in this regime the inter-
`actions are dominated by the excluded volume repulsion.
`This change in the number of NCs from 70% to close to
`zero indicates the presence of the core structure main-
`tained by these 70% of NCs (see Figure 8b and discussion
`below). Above the Θ-point, the globule
`is completely
`unfolded (Figure 8a).
`
`The formation of a specific nucleus during the folding
`transition was
`suggested
`by many
`theoretical
`[2,4,11,13,33–36] and experimental works [37–40]. The
`presence of the core at Tf may arise from a nucleation
`process driving the system from the unfolded state to the
`native state. We find indications of a first order transition.
`We also offer theoretical reasoning for the presence of a
`core (Equation 11) that might indicate the presence of a
`nucleus. Next, we identify the core.
`
`We calculate the mean square displacement σ(T) of the
`globule at a certain temperature from a globule at the
`native state, that is:
`
`=
`
` (12)
`
`1/2
`
`⎤ ⎦⎥⎥
`
`)
`r ttr
`(
`)
`τ
`−
`S
`R r
`r
`i
`
`iN
`
`2
`
`(
`
`N
`
`∑
`
`=
`
`1
`
`i
`
`1/2
`
`⎤ ⎦⎥⎥
`
`(
`
`T
`
`)
`
`σ
`
`2
`i
`
`N
`
`∑
`
`=
`
`1
`
`i
`
`1
`N
`
`⎡ ⎣⎢⎢
`
`1
`N
`
`⎡ ⎣⎢⎢
`
`(
`
`σ
`
`T
`
`)
`
`≡
`
`
`
`where r→i and r→
`
`NS are the coordinates of the residues of
`i
`the globules at two conformations: at some conformation
`at the temperature T and at the native conformation,
`respectively.
`is a translation matrix, which sets the
`
`tT
`
`
`
`584 Folding & Design Vol 3 No 6
`
`Figure 9
`
`(a)
`
`100
`
`10
`
`σi(T)
`
`1
`
`0
`
`10
`
`20
`
`40
`30
`Residue i
`
`50
`
`60
`
`(b)
`
`20
`
`15
`
`10
`
`NC
`
`5
`
`0
`
`0
`
`Core
`Other
`Total
`
`Figure 10
`
`50
`
`40
`
`30
`
`20
`
`10
`
`σC(T), σ0(T), σ(T)
`
`0
`0.1 0.3 0.5 0.7 0.9 1.1 1.3 1.5 1.7 1.9 2.1
`T
`
`Folding & Design
`
`The dependence of the rms displacement of the core residues σC(T),
`the rest of the residues σO(T) and all the residues σ(T) on temperature.
`The above quantities are averaged over 106 tu. Note that for the ideal
`first-order phase transition, one would expect σC(T) to be a step
`function; however, as we consider a transition that would be first order
`in the limit of the infinite size, σC(T) exhibits only step-function-like
`behavior. The difference between core residues and other residues is
`that at Tf the average rms displacement of the core residues is smaller
`than 15, which indicates that they are in contact (see the legend to
`Figure 9). On the contrary, the average rms displacement of the non-
`core residues is greater than 15, indicating that these residues are not
`in contact.
`
`control is governed by the ghost particles that are present
`in the system. We find that if the target temperature is
`above 1.1, the globule always reaches the state corre-
`sponding to the native state; however, if the target tem-
`perature
`is 0.96,
`the globule
`reaches
`the
`state
`corresponding to the native state in only ≈ 70% of cases,
`in the time interval of 105 time units. As an example, we
`demonstrate
`in Figure 12 the cooling of the model
`protein from the high temperature state T = 3.0 to the
`low temperature state T = 0.1. The model protein col-
`lapses after 1200 tu.
`
`What is particularly remarkable about Figure 12 is that we
`can follow the kinetics of the collapse. First, the globule
`gets trapped in some misfolded conformation, where it
`stays for about 1000 tu (see Figure 12a) and then it col-
`lapses to the native state. The time behavior of the
`energy, however, can look a bit puzzling. After the rms
`displacement drops to close to 0, indicating the native
`state, the energy is still higher than that of the native state
`for about 104 tu (see Figure 12b). The key to resolving
`this puzzle is the fact that after the collapse of the model
`protein, its potential energy transforms to kinetic energy,
`
`Petitioner Microsoft Corporation - Ex. 1055, p. 584
`
`2.0
`
`T
`
`T
`
`0.1
`
`0.1
`
`2.0
`
`10
`
`20
`
`40
`30
`Residue i
`
`50
`
`60
`Folding & Design
`
`(a) The contour plot of the rms displacement σi(T) for each residue
`i= 0, 1, º , 64 at temperatures T= 0.3, 0.97, 1.34, 1.46 (bold line)
`and 1.54, averaged over 106 tu. Note that there is a distinct difference
`between the `cold' (small values of σi(T)) and `hot' (large values of
`σi(T)) residues. The horizontal line indicates the breaking point of the
`NCs, that is, when σi(T) is of the size of the average relative position
`between pairs of residues, that is, σi(T) = (a0 + a1)/2 ≈ 15. The bold
`lines – in both (a) and (b) – indicate the folding transition temperature
`line Tf. It is worth noting that 11 residues are still in contact – marked
`by circles on (a): 16, 23, 24, 25, 26, 27, 28, 29, 37, 38 and 39. (b) An
`analogous plot to (a) of the average number of NCs for each residue.
`Note that the number of NCs correlates strongly with therms: the local
`minima of the 〈σi(T)〉 plots correspond to the local maxima of the
`number of NCs.
`
`second transitional state D between the PFS and the
`completely unfolded state.
`
`We also study the system by cooling it from the high
`temperature state. This technique corresponds to simu-
`lated annealing, due to the fact that the temperature
`
`
`
`Research Paper Discrete molecular dynamics studies of the folding of a protein-like model Dokholyan et al. 585
`
`(a)
`
`0.010
`
`0.008
`
`0.006
`
`0.004
`
`0.002
`
`Probability
`
`A
`
`B
`
`C
`
`D
`
`E
`
`0.000
`–300
`
`–250
`
`–200
`
`–150
`Energy
`
`–100
`
`–50
`
`0
`
`(b)
`
`60
`
`FiguFe 11
`
`(a) The probability distribution of the energy states E of the globule
`maintained at Tf= 1.46 during 106 tu. The probability distribution is
`divided into five regions: A, B, C, D and E. Region A corresponds to
`the folded state; region B corresponds to the transitional state
`between the folded state and the PFS; region C corresFonds to the
`PFS; region D corresFonds to the transitional state between the PFS
`and the comFletely unfolded state (region E). (b) The Flot of the rms
`disFlacement σi(T) for each residue i= 0, 1, º , 64 for regions A, B, C,
`D and E from the plot in (a) averaged over 106 tu. Note that in region
`A, all residues stay in contact; in region C, both N- and C-terminal tails
`break away, forming a PFS; in region D, there are only a few of the
`core residues that are still intact; and in region E, none of the residues
`is in contact. (c) The deFendence of the rms disFlacement of the core
`residues (circles) and the other residues (squares) on the average
`energy E of the window of the corresFonding region. Note that core
`residues stay intact even in the second transitional state D between
`the PFS and the comFletely unfolded state. The horizontal lines in (b,c)
`indicate the breaking point of the NCs (see Figure 9).
`
`Tf is larger then 70%, consistent with the presence of the
`core. The nucleus forms in the unstable transition state.
`From the transition state, the globule jumps either to the
`completely unfolded conformation or to the folded con-
`formation.
`
`Our simulations are in agreement with the recent work
`of Zhou and Karplus [23]. They performed discrete MD
`simulations of S. aureus protein A, the
`interresidue
`interactions of which were modeled based on the Go–
`model [5–7]. The pair residues of the model protein,
`which form native contacts, had ‘square-well’ potential
`of interaction with the depth of the well equal to BN⑀,
`whereas all other pair residues had ‘square-well’ poten-
`tial of interaction with the depth of the well equal to
`BO⑀. They characterized the difference between NCs
`and NNCs by the ‘bias gap’, g: g = 1 – BO/BN. Zhou and
`Karplus found that when g = 1.3, that is, when the inter-
`action between NCs is of the opposite sign to the inter-
`action between NNCs, there is a strong first-order-like
`transition from the random coil to the ordered globule.
`The case with our globule corresponds to g = 2, where,
`according to the work of Zhou and Karplus, there should
`exist a strong first-order-like transition from the random
`coil to the ordered globule without intermediate.
`
`We also select the core residues and show that their rms
`displacement behaves significantly differently to the
`behavior of the rms displacement of the rest of the
`residues and exhibits step-function-like behavior upon
`the change of temperature. Our findings are in agree-
`ment with the recent experimental study of the equilib-
`rium hydrogen exchange behavior of cytochrome c of Bai
`et al. [39], who investigated the exposure of the amide
`hydrogens (NH) in cytochrome c to solvent (due to local
`and global unfolding fluctuations). The experiments
`were based on the properties of the amide hydrogens
`Petitioner Microsoft Corporation - Ex. 1055, p. 585
`
`A (E<214)
`B (–196>E>–214)
`C (–140>E>–196)
`D (–1>0>E>–140)
`E (–E>–1>0)
`
`40
`
`20
`
`0
`
`40
`
`0 5 10 15 20 25 30 35 40 45 50 55 60 65
`Residue
`
`>0
`
`20
`
`10
`
`0
`–250
`
`Core
`Other
`
`–200
`
`–150
`
`Energy
`
`–100
`Folding & Design
`
`which slowly decreases by thermal equilib