`
`Gunnar Henriksson,
`Ann-Kristin Englund*
`Gunnar Johansson
`Per Lundahl
`
`Department of Biochemistry,
`Biomedical Center, Uppsala
`University, Uppsala, Sweden
`
`Calculation of isoelectric points
`Calculation of the isoelectric points of native proteins
`with spreading of pKa values
`
`
The isoelectric points (pI) of native proteins are important in several
separation techniques. For estimating pI values, the net charge of several
proteins was calculated versus pH by use of the Henderson-Hasselbalch
equation. Amino acid composition, pKa values for amino acid side chains and
for the N- and C-terminal groups, and the presence of other charged groups
were taken into account. A set of pKa values was chosen for amino acid
residues with ionizable side chains. Each particular type of ionizable group
was assumed to have pKa values distributed around the chosen value, thereby
simulating the situation in proteins and polypeptides. The calculated pI
values showed reasonably good agreement with experimental ones for most of
16 native proteins over a wide pH range (3.4-11) when charge contributions
of heme groups, sialic acid residues, etc., were taken into account. The
calculated pI for the human red cell glucose transporter (Glut1) with one
sialic acid residue was decreased from 8.8 to 8.5 by introducing pKa value
spreading and became consistent with the experimental pI value of
8.4 ± 0.05 at 15°C determined in the presence of 6 M urea. The pI of the
native Glut1 was lower, 8.0 ± 0.1, at 22°C. In general, the pI values for
native proteins are affected by the three-dimensional structure of the
proteins, which causes greater differences between calculated and
experimental pI values than in the case of polypeptides for which pI values
are determined in the presence of urea.
`
`1 Introduction
`
The isoelectric point (pI) of an amphoteric molecule is defined as the pH
at which the net charge is zero. The variation of net charge with pH is of
importance in charge-dependent separation methods like electrophoresis,
isoelectric focusing, chromatofocusing and ion-exchange chromatography.
Calculation procedures for estimating pI values for proteins or
polypeptides have been described earlier [1-8]. The present study describes
a simple calculation procedure with spreading of pKa values around the
chosen values. The procedure was applicable to native proteins with good
results.
`
Correspondence: Dr. Per Lundahl, Department of Biochemistry,
Biomedical Center, Uppsala University, Box 576, S-751 23 Uppsala,
Sweden (Tel: +46 18 17 44 59; Fax: +46 18 55 21 39)

Nonstandard abbreviation: Glut1, human red cell glucose transporter

2 Theory
For a polypeptide of known amino acid composition an approximate pI value
can be calculated by use of the ionization constants (pKa) of the amino acid
side-chain groups and of any other types of ionizable groups that may occur.
The charge of each such group at any given pH was calculated by use of the
Henderson-Hasselbalch equation,

$$x_b = \frac{1}{10^{(\mathrm{p}K_a - \mathrm{pH})} + 1} \qquad (1)$$

where x_b is the molar fraction of the base form of the ionizable group, by
taking into account whether the charge of the base form is zero (as for
-NH2) or -1 (as for -COO-). The total net charge at each given pH is
obtained by summing up the charge for each type of ionizable group times the
number of groups. In the present study, suitable average pKa values were
selected for the ionizable amino acid side chains and for the terminal
groups. The individual ionizable side chains of each type of amino acid were
assumed to have pKa values distributed around the selected pKa value,
thereby simulating the situation in polypeptides and proteins, where a given
type of ionizable amino acid side chain often appears in several positions
in the amino acid sequence and with various individual ionization constants,
depending both on the adjacent side chains and on the three-dimensional
environment in the protein. By assuming a distribution of pKa values, the
calculated titration curves will be smoothed out (Fig. 1). For the
calculations reported here, the average pKa = pKa,i was used for one third
of the groups of a given type i, pKa,i + 1 for another third, and
pKa,i - 1 for the last third. The calculations can be performed manually or
by use of a computer program. In the present approach calculations were
performed by a program written in GWBASIC on a PC-compatible computer.
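
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' GWBASIC program: the pKa values are set C of Table 1 as we read the table, the group labels, function names and the bisection search for the zero of the net charge are our own choices, and the example composition is hypothetical.

```python
# pKa values: set C of Table 1 (as read from the table; treat as an assumption).
PKA_SET_C = {
    "C_term": 3.0, "D": 3.2, "E": 4.1, "H": 6.0, "N_term": 8.0,
    "C": 9.0, "Y": 10.0, "K": 10.0, "R": 12.0,
}
# Charge of the base (deprotonated) form: -1 for acids, 0 for amines.
BASE_CHARGE = {
    "C_term": -1, "D": -1, "E": -1, "C": -1, "Y": -1,
    "H": 0, "N_term": 0, "K": 0, "R": 0,
}


def base_fraction(pka, ph):
    """Eq. (1): molar fraction of the base form of an ionizable group."""
    return 1.0 / (10.0 ** (pka - ph) + 1.0)


def group_charge(group, ph, spread=True):
    """Mean charge of one group; with spreading, one third of the groups
    is taken at pKa - 1, one third at pKa and one third at pKa + 1."""
    offsets = (-1.0, 0.0, 1.0) if spread else (0.0,)
    charges = []
    for offset in offsets:
        x_b = base_fraction(PKA_SET_C[group] + offset, ph)
        # The base form carries BASE_CHARGE; the acid form carries one unit more.
        charges.append(BASE_CHARGE[group] + (1.0 - x_b))
    return sum(charges) / len(charges)


def net_charge(counts, ph, spread=True):
    """Sum of (number of groups) x (mean charge per group) over all group types."""
    return sum(n * group_charge(g, ph, spread) for g, n in counts.items())


def isoelectric_point(counts, spread=True, tol=1e-4):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    lo, hi = 0.0, 14.0
    if net_charge(counts, lo, spread) <= 0 or net_charge(counts, hi, spread) >= 0:
        raise ValueError("net charge does not change sign between pH 0 and 14")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid, spread) > 0:
            lo = mid  # still positively charged: the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical composition, for illustration only.
    composition = {"N_term": 1, "C_term": 1, "D": 2, "E": 3, "H": 1, "K": 4, "R": 2, "Y": 1}
    print("pI without spreading:", round(isoelectric_point(composition, spread=False), 2))
    print("pI with spreading:   ", round(isoelectric_point(composition, spread=True), 2))
```

Averaging the three shifted pKa values is equivalent to assigning exactly one third of the groups to each value, so the sketch also handles group counts that are not divisible by three. Other charged groups (heme, sialic acid) could be handled by adding entries to the two dictionaries.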
`
`3 Results
`3.1 Sets of pK, values
`
Three sets of pKa values were considered (A-C in Table 1). A series of
calculations was performed for five proteins for which experimentally
determined pI values were available. For several proteins (Table 2), the
best agreement between calculated and experimental pI values was obtained
with pKa value set C. The pKa values for His, Cys, Tyr, Lys, and Arg in
this set of values are the same as those used by Bjellqvist et al. [7],
whereas the
`
Keywords: Isoelectric point / Native proteins / pKa value / Titration curve

* Present address: Pharmacia AB, Biopharmaceuticals, S-112 87
Stockholm, Sweden

© VCH Verlagsgesellschaft mbH, 69451 Weinheim, 1995
`
`0173-0835/95/0808-1377 $5.00+.25/0
`
`MYLAN INST. EXHIBIT 1101 PAGE 1
`
`
`
`
G. Henriksson et al.
`
`Electrophoresis 1995, 16, 1377-1380
`
`
[Figure 1: plot with pH (2-13) on the abscissa and an ordinate running
from -50 to 50]

Figure 1. Titration curves for the human red cell glucose transporter,
Glut1, with two sialic acid residues (from fresh red cells), with (solid
line) and without (hatched line) spreading of the pKa values in
column C of Table 1 below. The calculated pI values were 8.29 and
8.67, respectively.
`
Table 1. pKa values used for calculation of pI values

Amino acid residue
or terminal group     A a)     B b)     C c)
α-COOH                3.6      3.6      3
Asp                   4.47     4        3.2
Glu                   4.47     4.5      4.1
His                   6.68     6.4      6
α-NH2                 7.6      8        8
Cys                   9.5      9        9
Tyr                   10       10       10
Lys                   10       10.4     10
Arg                   11.9     12       12

a) From [9], used for the pI calculations in [4]
b) From [10]
c) For Asp and Glu, the pKa values were obtained from an NMR
determination of all ionization constants for carboxyl groups in
ribonuclease HI [11]. For α-COOH, His and Lys, the pKa values
from [9] were modified, whereby a better fit between experimental
and calculated pI values was obtained.
`
[Figure 2: scatter plot with "Experimental Isoelectric Point" (2-12) on
the abscissa and an ordinate running from 2 to 12]

Figure 2. Comparison between calculated pI values for the native proteins
listed in Table 2, using the B (△) and C (○) sets of pKa values
(Table 1), without spreading. The middle line represents the ideal
correlation.
`
over a wide pI range. The linear regression was improved by the pKa
value-spreading procedure (from r = 0.943 to r = 0.983, Fig. 3) and 12 out
of 16 calculated pI values showed better agreement with the experimental
value after the spreading procedure (Table 3). An even better fit might be
achieved by using slightly higher pKa values for the alkaline side chains.
The experimental pI values for the few urea-denatured proteins of high pI
that were included for comparison were consistent with the calculated
values (Table 3). This was expected, since hidden charges become exposed
and modifications of pKa values of individual ionizable groups due to their
three-dimensional environment will largely be eliminated when the
polypeptide unfolds.
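
The count of improved predictions can be checked directly from the native-protein rows of Table 3. The short sketch below is ours, not part of the original study; the tuples are transcribed from Table 3 as we read it, and the function names are illustrative.

```python
# (experimental pI, calculated without spreading, calculated with spreading),
# transcribed from the native-protein rows of Table 3.
TABLE3_NATIVE = [
    (3.42, 3.38, 3.34), (3.9, 3.79, 3.89), (4.5, 3.97, 4.19), (4.9, 5.1, 5.4),
    (5.5, 4.12, 4.40), (5.9, 4.06, 4.34), (6.6, 6.59, 6.88), (6.7, 7.79, 7.53),
    (7.4, 6.97, 7.31), (7.7, 6.88, 7.22), (8.0, 8.81, 8.46), (8.4, 7.55, 7.69),
    (8.5, 8.77, 8.37), (9.05, 8.35, 8.00), (9.2, 9.55, 9.25), (11.0, 10.46, 10.52),
]


def count_improved(rows):
    """Number of proteins whose calculated pI is closer to the
    experimental value with spreading than without."""
    return sum(1 for exp, without, with_ in rows
               if abs(with_ - exp) < abs(without - exp))


def mean_abs_deviation(rows, column):
    """Mean |calculated - experimental|; column 1 = without, 2 = with spreading."""
    return sum(abs(row[column] - row[0]) for row in rows) / len(rows)


if __name__ == "__main__":
    print("improved by spreading:", count_improved(TABLE3_NATIVE), "of", len(TABLE3_NATIVE))
    print("mean |dev| without:", round(mean_abs_deviation(TABLE3_NATIVE, 1), 2))
    print("mean |dev| with:   ", round(mean_abs_deviation(TABLE3_NATIVE, 2), 2))
```

With the values as transcribed, the count comes out as 12 of 16, in line with the statement above.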
`
`4 Discussion
`
The transmembrane protein Glut1 and two cellobiohydrolases were studied in
more detail. Membrane proteins may be difficult to handle in isoelectric
focusing experiments due to their tendency to self-associate even in the
presence of detergent. The pI of Glut1 in the presence of urea was
determined to be 8.4 [4, 24]. Micropreparative free-zone isoelectric
focusing of the native protein in complex with n-dodecyl octaoxyethylene
(C12E8), in the absence of urea, showed a pI of 8.0 [15]. The calculated pI
value of 8.46 for Glut1 closely corresponds to the experimental value
obtained in the presence of urea, but deviates moderately from the value
for the native protein. The two exocellulases cellobiohydrolase CBH 58
(denoted CBH 1 in Ref. 27) and CBH 62 have pI values of 3.85 and 4.85,
respectively [27]. Therefore the cDNA sequence [28] that gave a calculated
pI value of 4.73 probably corresponds to CBH 62.
`
Table 2. Comparison between the sets of pKa values in Table 1

                                                     pI value a)
Protein              Source                        A b)          B b)          C c)          Exp
Heme domain (CDH)    Phanerochaete chrysosporium   4.21 (4.31)   3.94 (4.03)   3.34 (3.38)   3.42 [12]
Cellobiohydrolase I  Trichoderma reesei            4.75 (4.63)   4.51 (4.39)   3.89 (3.79)   3.9 [13]
β2-Microglobulin     Guinea pig                    7.35 (7.15)   7.22 (6.97)   6.88 (6.59)   6.6 [14]
Glut1                Human                         8.58 (8.77)   8.59 (8.90)   8.46 (8.81)   8.0 [15]
Phospholipase A2     Common viper                  9.24 (9.55)   9.43 (9.72)   9.25 (9.55)   9.2 [16]

a) Calculated and experimental pI values are given. The pKa value
spreading was used, except for values given in parentheses, which
were calculated without spreading. For experimental pI values
reference numbers are given within brackets.
b), c) See footnote in Table 1
`
pKa values for the C-terminal carboxylate group and the Asp and Glu
side-chain carboxylate groups are lower than the corresponding values used
by these authors and by Matthew [10]. The B set of pKa values gave good
results for prediction of pI values (Fig. 2), but we chose to use the C set
for further calculations.
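
One simple way to compare the three pKa sets is the mean absolute deviation between calculated and experimental pI values over the five proteins of Table 2. The sketch below is our own illustration; the study does not state that this particular criterion was used, and the numbers are the with-spreading values transcribed from Table 2 as we read it.

```python
# protein: (experimental pI, set A, set B, set C), all calculated with spreading.
TABLE2 = {
    "Heme domain (CDH)":   (3.42, 4.21, 3.94, 3.34),
    "Cellobiohydrolase I": (3.9, 4.75, 4.51, 3.89),
    "beta2-Microglobulin": (6.6, 7.35, 7.22, 6.88),
    "Glut1":               (8.0, 8.58, 8.59, 8.46),
    "Phospholipase A2":    (9.2, 9.24, 9.43, 9.25),
}
SET_COLUMNS = {"A": 1, "B": 2, "C": 3}


def mean_abs_deviation(table, column):
    """Mean |calculated - experimental| pI for one pKa set."""
    return sum(abs(row[column] - row[0]) for row in table.values()) / len(table)


def best_set(table):
    """Name of the pKa set with the smallest mean absolute deviation."""
    return min(SET_COLUMNS, key=lambda name: mean_abs_deviation(table, SET_COLUMNS[name]))


if __name__ == "__main__":
    for name, column in SET_COLUMNS.items():
        print(name, round(mean_abs_deviation(TABLE2, column), 2))
    print("best set:", best_set(TABLE2))
```

By this measure set C gives the closest agreement for these five proteins, which matches the choice made in the text.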
`
3.2 Spreading of pKa values
`
`The native proteins studied showed reasonably good
`agreement between calculated and experimental values
`
[Figure 3: two scatter plots, panels A and B, each with "Experimental
Isoelectric Point" (2-12) on the abscissa and an ordinate running from
2 to 12]

Figure 3. Calculated pI values versus experimental pI values for the
native proteins listed in Table 3. (A) Without spreading of pKa values.
(B) With spreading of pKa values. The pKa values in column C of Table 1
were used in both panels. Linear regression lines (lower ones) and ideal
lines (upper ones) are shown.

Table 3. Experimental pI values (with reference numbers within brackets) for (A) native proteins and (B) polypeptides obtained by denaturation
of proteins with urea, and corresponding calculated pI values without and with spreading of the pKa values in column C of Table 1

                                                                       Experimental   Calculated pI value
Protein                          Source            Code                pI value       Without spreading   With spreading

A. Native proteins
Heme domain (CDH) a),b),c)       P. chrysosporium                      3.42 [12]      3.38                3.34
Cellobiohydrolase I b),c)        T. reesei                             3.9 [13]       3.79                3.89
Endoglucanase I b),c)            T. reesei                             4.5 [13]       3.97                4.19
Growth hormone c)                Human                                 4.9 [17]       5.1                 5.4
Endoglucanase II b),c),d)        T. reesei                             5.5 [13]       4.12                4.40
Cellobiohydrolase II b),e)       T. reesei                             5.9 [13]       4.06                4.34
β-2-microglobulin c)             Guinea pig        Sw:Bmg-Cavpo        6.6 [14]       6.59                6.88
Pyruvate kinase                  S. cerevisiae     Sw:Kpyk-Yeast       6.7 [18]       7.79                7.53
Myoglobin a)                     Horse heart       Sw:Myg-Horse        7.4 [19]       6.97                7.31
Myoglobin a)                     Chicken           Sw:Myg-Chick        7.7 [20]       6.88                7.22
Glut1 f)                         Human             Sw:Gtr1-Human       8.0 [15]       8.81                8.46
Myoglobin a)                     Sperm whale       Sw:Myg-Phyca        8.4 [19]       7.55                7.69
Myoglobin a)                     Emperor penguin   Sw:Myg-Aptfo        8.5 [21]       8.77                8.37
Colicin E                        E. coli JC411     P:IKEC1             9.05 [22]      8.35                8.00
Phospholipase A2 c)              Common viper      Sw:Pa-Vipbb         9.2 [16]       9.55                9.25
Lysozyme c)                      Human                                 11 [23]        10.46               10.52

B. Polypeptides in urea
Glut1 f)                         Human             Sw:Gtr1-Human       8.40 [24]      8.81                8.46
Cytochrome c                     Horse heart       Sw:Cyc-Horse        9.4 [25]       9.59                9.35
Ribonuclease c)                  Bovine pancreas   P:NRBO              9.58 [26]      9.37                8.81

a) Contains a heme group, pKa values 4.0 and 4.8
b) N-terminus blocked
c) All cysteines form disulfides
d) Denoted endoglucanase III in [13]
e) Two free cysteines
f) From aged red cells, with one sialic acid residue [4], pKa values 2.75

Procedures for calculation of pI values for polypeptides have recently
been published [1-8]. Calculations of titration curves and pI values for
polypeptides of known composition (sequence) are useful for understanding
charge properties of the polypeptides or the native protein with or without
additional charged groups. Our simple calculation procedure in most cases
showed good results for proteins over a wide pI range. A small improvement
was achieved by spreading the ionization constants. We want to emphasize
that effects of the three-dimensional structure (hidden charges and
abnormal pKa values) may have caused discrepancies between calculated and
experimental values, which could not be eliminated by the spreading
procedure.
`
We used relatively low pKa values for the Glu and Asp side-chain carboxyl
groups, obtained by NMR measurements on ribonuclease HI [11]. This protein
contains a large number of Lys and Arg residues and thus the protein will
be positively charged at pH values close to the pKa values of the
carboxylic acids. This will create a local increase in pH that may decrease
the pKa values of some Glu and Asp residues so that they become lower than
the corresponding average values in proteins. Nevertheless, this set of pKa
values gave a better prediction of the pI than the other sets examined
(Table 2).
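
To give a feel for how much the choice of carboxyl pKa matters, Eq. (1) can be evaluated for a single Asp side chain. The pH of 4.0 is an illustrative choice of ours, not a value from the study, and the Asp pKa values are those of sets A, B and C in Table 1 as we read the table:

```latex
\begin{align*}
x_b(\text{set A},\ \mathrm{p}K_a = 4.47) &= \frac{1}{10^{0.47} + 1} \approx 0.25 \\
x_b(\text{set B},\ \mathrm{p}K_a = 4.0)  &= \frac{1}{10^{0} + 1} = 0.50 \\
x_b(\text{set C},\ \mathrm{p}K_a = 3.2)  &= \frac{1}{10^{-0.8} + 1} \approx 0.86
\end{align*}
```

At this pH each Asp residue therefore carries roughly 0.6 more negative charge with set C than with set A, which is consistent with the lower calculated pI values that set C gives for the acidic proteins in Table 2.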
`
Supported by grants to Göran Pettersson from the Swedish Natural Science
Research Council and the Swedish Research Council for Engineering Sciences
and by grants to Per Lundahl from the Swedish Natural Science Research
Council and the O. E. and Edla Johansson Science Foundation. We are
grateful to E. Greijer and E. Brekkan for valuable help and advice.
`
`Received April 26, 1995
`
`
`5 References
`
[1] Cameselle, J. C., Ribeiro, J. M., Sillero, A., Biochem. Educ. 1986,
14, 131-136.
[2] Skoog, B., Wichman, A., Trends Anal. Chem. 1986, 5, 82-83.
[3] Sillero, A., Ribeiro, J. M., Anal. Biochem. 1989, 179, 319-325.
[4] Englund, A.-K., Lundahl, P., Biochim. Biophys. Acta 1991, 1065,
185-194.
[5] Bjellqvist, B., Hughes, G. J., Pasquali, C., Paquet, N., Ravier, F.,
Sanchez, J.-C., Frutiger, S., Hochstrasser, D., Electrophoresis 1993,
14, 1023-1031.
[6] Mosher, R. A., Gebauer, P., Thormann, W., J. Chromatogr. 1993,
638, 155-164.
[7] Bjellqvist, B., Basse, B., Olsen, E., Celis, J. E., Electrophoresis
1994, 15, 529-539.
[8] Watts, N. R. M., Singh, R. P., Electrophoresis 1995, 16, 22-27.
[9] Tanford, C., Adv. Protein Chem. 1962, 17, 69-165.
[10] Matthew, J. B., Annu. Rev. Biophys. Biophys. Chem. 1985, 14,
387-417.
[11] Oda, Y., Yamazaki, T., Nagayama, K., Kanaya, S., Kuroda, Y.,
Nakamura, H., Biochemistry 1994, 33, 5275-5284.
[12] Henriksson, G., Thesis, Uppsala University, Uppsala 1995.
[13] Ståhlberg, J., Thesis, Uppsala University, Uppsala 1991.
[14] Cigén, R., Ziffer, J. A., Berggård, B., Cunningham, B. A.,
Berggård, I., Biochemistry 1978, 17, 947-955.
[15] Englund, A.-K., Lundahl, P., Elenbring, K., Ericson, C., Hjertén,
S., J. Chromatogr. 1995, in press.
[16] Boffa, G. A., Boffa, M.-C., Winchenne, J.-J., Biochim. Biophys.
Acta 1976, 429, 828-838.
[17] Li, C. H., Mol. Cell. Biochem. 1982, 46, 31-41.
[18] Aust, A. E., Suelter, C. H., J. Biol. Chem. 1978, 253, 7508-7512.
[19] Ojteg, G., Nygren, K., Wolgast, M., Acta Physiol. Scand. 1987, 129,
277-286.
[20] Itoh, T., Satoh, H., Adachi, S., Comp. Biochem. Physiol. 1976, 55B,
559-561.
[21] Weber, R. E., Hemmingsen, E. A., Johansen, K., Comp. Biochem.
Physiol. 1974, 49B, 197-214.
[22] Schwartz, S. A., Helinski, D. R., J. Biol. Chem. 1971, 246,
6318-6327.
[23] Stryer, L., Biochemistry, W. H. Freeman, New York 1981.
[24] Englund, A.-K., Lundahl, P., Electrophoresis 1993, 14, 1307-1311.
[25] Heaney, A., Weller, D. L., J. Chem. Educ. 1970, 47, 724-726.
[26] Ui, N., Biochim. Biophys. Acta 1971, 229, 567-581.
[27] Uzcategui, E., Ruiz, A., Montesino, R., Johansson, G., Pettersson,
G., J. Biotechnol. 1991, 19, 271-286.
[28] Sims, P., James, C., Broda, P., Gene 1988, 74, 411-422.
`
`
`