`
`BioMed Central
`
`Open Access
`Research article
`Antibody-protein interactions: benchmark datasets and prediction
`tools evaluation
`Julia V Ponomarenko*1,2 and Philip E Bourne1,2
`
`Address: 1San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA and 2Skaggs
`School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
`
`Email: Julia V Ponomarenko* - jpon@sdsc.edu; Philip E Bourne - bourne@sdsc.edu
`* Corresponding author
`
`Published: 2 October 2007
`
`BMC Structural Biology 2007, 7:64
`
`doi:10.1186/1472-6807-7-64
`
`This article is available from: http://www.biomedcentral.com/1472-6807/7/64
`
`Received: 9 April 2007
`Accepted: 2 October 2007
`
© 2007 Ponomarenko and Bourne; licensee BioMed Central Ltd.
`This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
`which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
`
`Abstract
Background: The ability to predict antibody binding sites (also known as
antigenic determinants or B-cell epitopes) for a given protein is a precursor
to new vaccine design and diagnostics. Among the various methods of B-cell
epitope identification, X-ray crystallography is one of the most reliable.
Computational methods for B-cell epitope prediction that use these
experimental data exist. As the number of structures of antibody-protein
complexes grows, further interest in prediction methods using 3D structure is
anticipated. This work aims to establish a benchmark for 3D structure-based
epitope prediction methods.
`Results: Two B-cell epitope benchmark datasets inferred from the 3D structures of antibody-
`protein complexes were defined. The first is a dataset of 62 representative 3D structures of protein
`antigens with inferred structural epitopes. The second is a dataset of 82 structures of antibody-
protein complexes containing different structural epitopes. Using these
datasets, eight web-servers developed for antibody and protein binding site
prediction were evaluated. No method exceeded 40% precision and 46% recall.
The values of the area under the receiver
`operating characteristic curve for the evaluated methods were about 0.6 for ConSurf, DiscoTope,
`and PPI-PRED methods and above 0.65 but not exceeding 0.70 for protein-protein docking
`methods when the best of the top ten models for the bound docking were considered; the
`remaining methods performed close to random. The benchmark datasets are included as a
`supplement to this paper.
`Conclusion: It may be possible to improve epitope prediction methods through training on
`datasets which include only immune epitopes and through utilizing more features characterizing
`epitopes, for example, the evolutionary conservation score. Notwithstanding, overall poor
`performance may reflect the generality of antigenicity and hence the inability to decipher B-cell
`epitopes as an intrinsic feature of the protein. It is an open question as to whether ultimately
`discriminatory features can be found.
`
`Background
A B-cell epitope is defined as a part of a protein antigen
recognized by either a particular antibody molecule or a
particular B-cell receptor of the immune system [1]. The
main objective of B-cell epitope prediction is to facilitate
the design of a short peptide or other molecule that can be
`
`Lassen - Exhibit 1036, p. 1
`
`
`
`synthesized and used instead of the antigen, which in the
`case of a pathogenic virus or bacteria, may be harmful to
`a researcher or experimental animal [2]. A B-cell epitope
`may be continuous, that is, a short contiguous stretch of
`amino acid residues, or discontinuous, comprising atoms
`from distant residues but close in three-dimensional space
`and on the surface of the protein.
`
Synthetic peptides mimicking epitopes, as well as anti-peptide antibodies,
have many applications in the diagnosis of various human diseases [3-7].
Attempts have also been made to develop peptide-based synthetic prophylactic
vaccines for various infections, as well as therapeutic vaccines for chronic
infections and noninfectious diseases, including autoimmune diseases,
neurological disorders, allergies, and cancers [8-10]. The immunoinformatics
software and databases developed to facilitate vaccine design have previously
been reviewed [11,12].

During the last 25 years B-cell epitope prediction methods have focused
primarily on continuous epitopes. They were mostly sequence-dependent methods
based upon various amino acid properties, such as hydrophilicity [13], solvent
accessibility [14], secondary structure [15-18], and others. Recently, several
methods using machine learning approaches have been introduced that apply
hidden Markov models (HMM) [19], artificial neural networks (ANN) [20],
support vector machines (SVM) [21], and other techniques [22,23]. Recent
assessments of continuous epitope prediction methods demonstrate that
"single-scale amino acid propensity profiles cannot be used to predict epitope
location reliably" [24] and that "the combination of scales and
experimentation with several machine learning algorithms showed little
improvement over single scale-based methods" [25].

As crystallographic studies of antibody-protein complexes have shown, most
B-cell epitopes are discontinuous. In 1984, the first attempts at epitope
prediction based on 3D protein structure were made for a few proteins for
which continuous epitopes were known [26-28]. Subsequently, Thornton and
colleagues [29] proposed a method to locate potential discontinuous epitopes
based on a protrusion of protein regions from the protein's globular surface.
However, until the first X-ray structure of an antibody-protein complex was
solved in 1986 [30], protein structural data were mostly used for prediction
of continuous rather than discontinuous epitopes.

In cases where the three-dimensional structure of the protein or its
homologue is known, a discontinuous epitope can be derived from functional
assays by mapping onto the protein structure residues involved in antibody
recognition [31]. However, an epitope identified using an immunoassay may be
an artefact of measuring cross-reactivity of antibodies due to the presence of
denatured or degraded proteins [32,33], or due to conformational changes in
the protein caused by residue substitutions that may even lead to protein
mis-folding [34]. Therefore, structural methods, particularly X-ray
crystallography of antibody-antigen complexes, generally identify B-cell
epitopes more reliably than functional assays [35].
`
`B-cell epitopes can be thought of in a structural and func-
`tional sense. Structural epitopes (also called antigenic
`determinants) are defined by a set of residues or atoms in
`the protein antigen contacting antibody residues or atoms
`[33,36]. In contrast, a functional epitope consists of anti-
`gen residues that contribute significantly to antibody
`binding [36,37]. Functional epitopes are determined
through functional assays (e.g., alanine scanning mutagenesis)
or calculated theoretically using known structures of
antibody-protein complexes [38,39]. Thus, functional and
structural epitopes are not necessarily the same.
Functional epitopes in proteins are usually smaller
`than structural epitopes; only three to five residues of the
`structural epitope contribute significantly to the antibody-
`antigen binding energy [40]. This work focuses on struc-
`tural epitopes inferred from known 3D structures of anti-
`body-protein complexes available in the Protein Data
`Bank (PDB) [41].
`
`Antibody-protein complexes can be categorized as inter-
`mediate transient non-obligate protein-protein com-
`plexes [40,42]. Non-obligate complexes, implying that
`individual components can be found on their own in vivo,
`are classified as either permanent or transient depending
`on their stability under particular physiological and envi-
`ronmental conditions [43]. For example, many enzyme-
`inhibitor complexes are permanent non-obligate com-
`plexes. Transient non-obligate complexes range from
`weak (e.g., electron transport complexes), to intermediate
`(e.g., signal transduction complexes), and to strong (e.g.,
`bovine G protein forming a stable trimer upon GDP bind-
`ing) [44]. Most antibodies demonstrate intermediate
`affinity for their specific antigens [45]. Based on this clas-
`sification, general methods for the prediction of interme-
`diate transient non-obligate protein-protein interactions
`have been applied to the prediction of structural epitopes
`[40,42]. For example, Jones and Thornton, using their
`method for predicting protein-protein binding sites [46],
`successfully predicted B-cell epitopes on the surface of the
`β-subunit of human chorionic gonadotropin (βhCG)
`[47].
`
`Since the number of available structures of antibody-pro-
`tein complexes remains limited, thus far only a few meth-
`ods, CEP (Conformational Epitope Prediction) [48] and
`DiscoTope [49], for B-cell epitope prediction using a pro-
`tein of a given three-dimensional structure have been
`
`developed. In the near future, with growth in the number
`of available structures of antibody-protein complexes,
`extensive development in this area is expected. Existing
`and new methods for epitope prediction demand a
`benchmark which will set the standard for the future com-
`parison of methods. To facilitate the further development
`of this standard, we have developed B-cell epitope bench-
`mark datasets inferred from existing 3D structures of anti-
`body-protein complexes. Further, using the benchmark
`datasets, we evaluated CEP, DiscoTope, and six recently
`developed publicly available web-servers for generalized
`protein-protein binding site prediction using various
`approaches: protein-protein docking (ClusPro [50], DOT
`[51] and PatchDock [52]); structure-based methods
applying different principles and trained on different
`datasets (PPI-PRED [53], PIER [54] and ProMate [55]),
`and residue conservation (ConSurf [56]).
`
`Results and discussion
`Structural epitope definition
Three definitions of an epitope inferred from the X-ray structures of
antibody-protein complexes were considered: (1) The epitope consists of
protein antigen residues in which any atom of the residue loses more than
1Å² of accessible surface area (ASA) upon antibody binding; ASA was
calculated using the program NACCESS [57]. (2) The epitope consists of
protein antigen residues in which any atom of the epitope residue is
separated from any antibody atom by a distance ≤ 4Å. (3) The epitope
consists of protein antigen residues in which any atom of the epitope
residue is separated from any antibody atom by a distance ≤ 5Å. These three
definitions were used for two reasons. First, the methods evaluated in this
work use one of these three definitions; second, we wished to study how the
epitope definition influenced the results.
`
`Results (not shown) indicated that the structural epitope
`definition did not influence the outcome. Hence, unless
`otherwise specified, results are based on the second
`epitope definition.
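
The distance-based definitions (2) and (3) above can be illustrated with a minimal sketch. It is not the authors' code: the function name, the atom representation and the toy coordinates are assumptions made for illustration, and a real analysis would read atoms from PDB files.

```python
import math

def epitope_residues(antigen_atoms, antibody_atoms, cutoff=4.0):
    """Return antigen residue IDs having any atom within `cutoff` Å of any antibody atom.

    antigen_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical format)
    antibody_atoms: iterable of (x, y, z) tuples
    """
    antibody_atoms = list(antibody_atoms)
    epitope = set()
    for residue_id, coord in antigen_atoms:
        if residue_id in epitope:
            continue  # residue already qualifies through another atom
        if any(math.dist(coord, ab) <= cutoff for ab in antibody_atoms):
            epitope.add(residue_id)
    return epitope

# Toy coordinates (invented): residue A:10 lies 3.5 Å from an antibody atom,
# residue A:11 lies 4.5 Å away, residue A:12 is far from the antibody.
antigen = [
    ("A:10", (0.0, 0.0, 0.0)),
    ("A:10", (1.5, 0.0, 0.0)),
    ("A:11", (3.5, 4.5, 0.0)),
    ("A:12", (0.0, 9.0, 0.0)),
]
antibody = [(3.5, 0.0, 0.0), (0.0, 0.0, 20.0)]

epitope_4A = epitope_residues(antigen, antibody, cutoff=4.0)  # definition (2)
epitope_5A = epitope_residues(antigen, antibody, cutoff=5.0)  # definition (3)
print(sorted(epitope_4A), sorted(epitope_5A))
```

In this toy case the 5Å definition adds one residue that the 4Å definition leaves out; the ASA-based definition (1) is not covered because it requires a surface-area program such as NACCESS.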
`
`Construction of the benchmark datasets
`Two benchmark datasets were derived from the 3D struc-
`tures of antibody-protein complexes available from the
`PDB [41]:
`
• Dataset #1 – Representative 3D structures of protein
antigens with structural epitopes inferred from 3D structures
of antibody-protein complexes. This dataset is
intended for the study of the antigenic properties of proteins
as well as for development and evaluation of the
methods based on protein structure alone, or protein-protein
unbound docking methods, that is, if the structure of
the antibody is known or can be modeled. Here this dataset
was used for the evaluation of scale-based methods
(DiscoTope, PIER, ProMate and ConSurf). The dataset
contains 62 antigens, 52 of which are one-chain antigen
proteins.
`
• Dataset #2 – Representative 3D structures of antibody-protein
complexes presenting different epitopes. This
dataset is useful for the study of the properties of individual
epitopes as well as for development and evaluation of
protein-protein bound docking methods. Since the current
work attempts to compare the methods of different
types, including protein-protein docking methods, this
dataset was used to compare the performance of all methods
to each other. The dataset contains 70 structures of
proteins in complexes with two-chain antibodies and 12
structures of proteins in complexes with one-chain antibodies.
`
`The flowchart describing the construction of the bench-
`mark datasets is shown in Figure 1. Steps from 1 to 4 relate
`to dataset #1; steps 1–6 relate to dataset #2.
`
`Step 1 – crystal structures of protein antigens of length ≥30
`amino acids at a resolution ≤ 4Å in complex with anti-
`body fragments containing variable regions (Fab, VHH,
`Fv, or scFv fragments) were collected from the Protein
`Data Bank (PDB) [41]. Structures in which the antibody
`binds antigen but involves no CDR residues have been
`excluded from the analysis; there were four such structures
`[PDB: 1MHH, 1HEZ, 1DEE, 1IGC]. If a structure con-
`tained several complexes in one asymmetric unit and
`there was no structural difference observed between these
`complexes, only one complex was selected. In this way
`166 structures containing 187 antibody-protein com-
`plexes were selected: 24 complexes were formed by one-
`chain antibody fragments and 163 complexes by two-
`chain antibody fragments.
`
`Step 2 – all antigen protein chains were structurally
`aligned to one another using the CE algorithm [58]. Two
`protein chains were considered similar if all the following
`conditions applied: (i) rmsd ≤3Å, (ii) z-score ≥4.0, (iii)
`number of residue-residue matches relative to the length
`of the longest chain ≥80%, (iv) sequence identity in the
`structural alignment (not considering gaps) ≥80%. The z-
`score takes into account overall structural similarity and
`number of gapped positions. Two protein molecules were
`considered similar if each chain in one protein had a sim-
`ilar chain in another protein. Figure 2 demonstrates how
`the last two parameters, number of matches and sequence
`identity in the structural alignment, are defined.
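
The four chain-similarity conditions of Step 2 can be sketched as a single predicate. This is an illustrative reading of the text, not the authors' implementation: the class and function names are hypothetical, and the rmsd and z-score given to the Figure 2 example are invented, since the figure specifies only the match counts.

```python
from dataclasses import dataclass

@dataclass
class ChainAlignment:
    rmsd: float        # Å, from the structural alignment
    z_score: float     # CE z-score
    n_matches: int     # aligned residue-residue pairs (gaps excluded)
    n_identical: int   # aligned pairs with identical amino acids
    len_a: int         # length of chain A
    len_b: int         # length of chain B

    @property
    def match_fraction(self) -> float:
        # matches relative to the length of the longest chain
        return self.n_matches / max(self.len_a, self.len_b)

    @property
    def sequence_identity(self) -> float:
        # identity within the structural alignment, not counting gaps
        return self.n_identical / self.n_matches if self.n_matches else 0.0

def chains_similar(aln: ChainAlignment) -> bool:
    """All four conditions (i)-(iv) of Step 2 must hold."""
    return (
        aln.rmsd <= 3.0
        and aln.z_score >= 4.0
        and aln.match_fraction >= 0.80
        and aln.sequence_identity >= 0.80
    )

# Figure 2 example: AVCQYWC (7 residues) vs ACYARTYC (8 residues),
# 5 matches of which 4 are identical. rmsd and z_score are made-up values.
fig2 = ChainAlignment(rmsd=1.0, z_score=5.0, n_matches=5, n_identical=4, len_a=7, len_b=8)
print(round(fig2.match_fraction * 100), round(fig2.sequence_identity * 100))
print(chains_similar(fig2))
```

Under these assumptions the Figure 2 pair fails condition (iii), because 5/8 of the longest chain is below the 80% threshold, even though the sequence identity within the alignment reaches 80%.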
`
The structural alignment rather than sequence alignment was used because
protein structure is more conserved than sequence, and proteins can be
expected to contain regions with low sequence similarity that cannot be
aligned by sequence alone. The structural alignment also avoids considering
two proteins as similar if they have similar sequences but different
structures (possible over short regions). The threshold values were chosen
empirically based on previous experience working with the CE algorithm. As a
result, the chosen threshold values separated human and bird lysozymes (61%
sequence identity) and neuraminidases of different influenza virus strains,
H3N2 and H11N9 (47% sequence identity).

Figure 1: Flowchart for building benchmark datasets.
`
Step 3 – 35 proteins were orphans represented by only one
3D structure. For each of the remaining 27 proteins, represented by
more than one 3D structure, the structure with the best
resolution was selected as the representative structure. The
`final representative dataset contained 62 antigens [see
`Additional file 1], 52 of which were one-chain antigen
`proteins.
`
Figure 2: Hypothetical example of the structural alignment of
proteins (A) (sequence AVCQYWC) and (B)
(sequence ACYARTYC). Number of residue-residue
matches = 5, number of residue-residue matches relative to
the length of the longest chain = 63% (5/8), sequence identity =
80% (4/5).
`
`Step 4 – for each protein, epitopes inferred from the 3D
`structures of antibody-protein complexes were mapped
`onto the representative structure of the protein. First,
`epitope residues were calculated for each complex struc-
`ture using one of the aforementioned epitope definitions.
`Second, epitope residues defined for the represented
`structures were mapped onto the representative structure
`based on the structure alignments. For example, the
`hemagglutinin HA1 chain of influenza A virus was repre-
`sented by six 3D structures of the protein in complexes
`with Fab fragments of antibodies HC45 [PDB:1QFU],
`BH151 [PDB:1EO8], HC63 [PDB:1KEN], and HC19
`[PDB:2VIR, 2VIS, 2VIT]. Figure 3 illustrates a representa-
`tive structure [PDB:1EO8] of hemagglutinin HA1 upon
`which epitopes are mapped having been inferred from six
`complex structures. In this way, epitopes inferred from
`187 structures of antibody-protein complexes were
`mapped onto the 62 representative protein structures. The
`resulting dataset is denoted dataset #1. Data on mapped
`epitope residues are available upon request.
`
Step 5 – to study the properties of individual epitopes and their
prediction, a dataset of representative epitopes, dataset #2, derived from 3D
structures of antibody-protein complexes defining different epitopes, was
constructed. An important question to consider is how to define individual
epitopes yet avoid bias caused by over-representation of particular epitopes.
For example (Fig. 3), while HC45 (blue) and BH151 (magenta) epitopes overlap,
neither the HC63 (green) nor the HC19 (red) epitope overlaps with them; they
are separated on the protein surface. Nevertheless, HC45 and BH151 epitopes
share residues (orange in Fig. 3), as do HC63 and HC19 epitopes (yellow in
Fig. 3). Are HC45 and BH151 epitopes similar or different? This question is
answered by considering the degree of overlap.

Figure 3: Two orthogonal views of a representative structure, influenza A
virus hemagglutinin HA1 chain [PDB:1EO8]. Chain A is shown in light gray,
upon which are mapped epitope residues inferred from six protein structures
in complexes with antibody fragments: HC45 Fab [PDB:1QFU] (blue), BH151 Fab
[PDB:1EO8] (magenta), HC63 Fab [PDB:1KEN] (green), HC19 Fab [PDB:2VIR, 2VIS,
2VIT] (red). The hemagglutinin HA2 chain is shown in cyan. Residues common to
HC45 and BH151 epitopes are shown in orange; residues common to HC63 and HC19
epitopes are shown in yellow. Residue Tyr98, which is a part of the HC19
epitope inferred from structure 2VIR but not from the 2VIS and 2VIT
structures, is shown in black. The HC19 epitope residue Thr131, which is
mutated to Ile in the 2VIS structure, is shown in dark red. The HC19 epitope
residue Thr155, which is mutated to Ile in the 2VIT structure, is shown in
violet.
`
Two epitopes are deemed similar if, in addition to the aforementioned
criteria for epitope definition, they belong to similar protein chains and
have >75% of residues in common for both epitopes. A cut-off value of 75% for
epitope similarity was chosen empirically. Thus, the HC45 and BH151 epitopes
on influenza A virus hemagglutinin HA1 (Fig. 3) share 14 residues, which make
up 74% and 93% of the size of HC45 and BH151 epitopes, respectively. A
cut-off on epitope overlap of less than 75% would define HC45 and BH151
epitopes as similar even though they are known to be different: HC45 and
BH151 are antibodies from different germ-lines with variable domains sharing
only 56% sequence similarity, their H3 CDR regions adopt distinct
conformations, and these antibodies are tolerant to different mutations in
hemagglutinin [59]. As another example, X5 and 17B epitopes of gp120 share
75% of their residues, yet X5 and 17B antibodies are from different genes
[60]. A cut-off value for epitope similarity equal to or less than 75% would
erroneously define X5 and 17B epitopes as similar. Conversely, a cut-off
value of 80% would make epitopes inferred from different structures of the
same antibody-protein complex dissimilar. For example, the H57 epitopes of
T cell receptor N15 inferred from two complex structures in a single crystal
asymmetric unit ([PDB:1NFD], complexes (D)-(HG) and (B)-(FE), where the
letters denote protein chain identifiers) would be dissimilar.
`
`Given a 75% empirical cut-off for epitope similarity,
`epitopes inferred from structures of complexes with two-
`chain antibody fragments were divided into 44 singletons
`and 26 groups; epitopes inferred from structures of com-
`plexes with one-chain antibody fragments were divided
`into ten singletons and two groups.
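
The overlap criterion can be sketched as follows. This is a hypothetical illustration rather than the authors' code: the residue numbers are invented, and the epitope sizes (19 and 15 residues) are an assumption chosen to be consistent with the reported 14 shared residues making up 74% and 93% of the HC45 and BH151 epitopes. The additional requirement that the epitopes belong to similar protein chains is omitted here.

```python
def overlap_fractions(epitope_a, epitope_b):
    """Fraction of each epitope's residues that is shared with the other epitope."""
    shared = len(epitope_a & epitope_b)
    return shared / len(epitope_a), shared / len(epitope_b)

def epitopes_similar(epitope_a, epitope_b, cutoff=0.75):
    """Similar only if the shared residues exceed `cutoff` of BOTH epitopes."""
    frac_a, frac_b = overlap_fractions(epitope_a, epitope_b)
    return frac_a > cutoff and frac_b > cutoff

# Invented residue numbers: 19 and 15 residues, 14 of them shared.
hc45_like = set(range(1, 20))             # 19 residues
bh151_like = set(range(6, 20)) | {100}    # 14 shared residues plus 1 unique

frac_hc45, frac_bh151 = overlap_fractions(hc45_like, bh151_like)
print(round(frac_hc45 * 100), round(frac_bh151 * 100))
print(epitopes_similar(hc45_like, bh151_like))               # 75% cut-off
print(epitopes_similar(hc45_like, bh151_like, cutoff=0.70))  # a lower cut-off
```

Requiring the threshold to hold for both epitopes is what keeps a small epitope nested inside a larger one from being merged with it; lowering the cut-off to 70% in this toy case would merge the two epitopes.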
`
`Step 6 – for each group of similar epitopes, the represent-
`ative 3D structure of the antibody-protein complex was
`selected based upon the following preferences. First, the
`structure with no or a minimal number of heteroatoms
`(excluding water) and other protein chains in the interface
`(i.e., separated from any atoms of both antigen and anti-
`body by ≤4Å distance) was preferred. Second, preference
`was given to the structure with the largest epitope, i.e.,
`maximum number of epitope residues. Third, the struc-
`ture with the best resolution ≤2.5Å was preferred. Dataset
`#2 of representative structures of antibody-protein com-
`plexes (representative epitopes) consisted of 70 structures
`of proteins in complexes with two-chain antibody frag-
`ments and 12 structures of proteins in complexes with
`one-chain antibody fragments.
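
One possible reading of the Step 6 preferences is a lexicographic ordering, sketched below. This is an assumption: the text lists the preferences in order but does not state how ties or conflicts were resolved, and the 2.5Å resolution condition is not modeled. The record fields and complex identifiers are hypothetical.

```python
def pick_representative(complexes):
    """Pick one complex from a group of similar epitopes.

    Assumed ordering: fewest heteroatoms/extra chains in the interface,
    then the largest epitope, then the best (numerically lowest) resolution.
    """
    return min(
        complexes,
        key=lambda c: (c["interface_extras"], -c["epitope_size"], c["resolution"]),
    )

# Hypothetical group of three complexes presenting similar epitopes.
group = [
    {"id": "complex_a", "interface_extras": 1, "epitope_size": 20, "resolution": 1.8},
    {"id": "complex_b", "interface_extras": 0, "epitope_size": 17, "resolution": 2.4},
    {"id": "complex_c", "interface_extras": 0, "epitope_size": 19, "resolution": 2.8},
]
chosen = pick_representative(group)
print(chosen["id"])
```

Here complex_a is passed over despite its resolution because it has an extra component in the interface, and complex_c wins over complex_b on epitope size.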
`
`Web-servers performance evaluation
Using the benchmark datasets introduced above we evaluated eight recently
developed and publicly available web-servers. The servers use different
methods yet all have the goal of predicting either B-cell epitopes or, more
generally, protein-protein binding sites. The servers are listed in Table 1.
Any reference in the text to a method actually means the server which
implements that method, e.g., the DOT method running on the ClusPro server is
called ClusPro(DOT).

The methods fall into two categories:

• Scale-based methods – each protein residue is assigned a value reflecting
the probability of that residue being part of the protein interface or
epitope. DiscoTope, PIER, ProMate and ConSurf fall into this category.

• Patch prediction and protein-protein docking methods – each protein residue
is predicted to be part of a surface patch of residues defining the protein
interface or epitope. DiscoTope, ProMate, CEP, PPI-PRED, ClusPro(DOT), and
PatchDock fall into this category.

Two methods, DiscoTope and ProMate, fall into both categories since they
predict patches and assign score values to each protein residue.

The evaluation of the methods was performed as follows. First, the
scale-based methods were analyzed on how well the residue score values
discriminate epitope versus non-epitope residues using dataset #1. Further,
performance of all methods was evaluated on their ability to recognize
representative epitopes from dataset #2. The first step is obviously not
essential; it was performed as an example of the application of dataset #1
that can be used for future methods development and for revealing properties
of epitope residues beyond the fact that epitopes are sites on the protein
surface.

Table 1: Servers evaluated in this work

| Server name | Method type | Training dataset | Reference |
|---|---|---|---|
| CEP (Conformational Epitope Prediction) | Discontinuous epitope prediction based on residue solvent accessibility and spatial distribution. | No training set. | [48] |
| DiscoTope | Discontinuous epitope prediction based on amino acid statistics, residue solvent accessibility and spatial distribution. | 75 structures of antibody-antigen complexes. | [49] |
| ProMate | Protein-protein binding interface prediction based on significant structural and sequence interface properties. | Manually curated; 57 proteins involved in heterodimeric transient interactions (excluding antigen-antibody complexes). | [55] |
| PIER (Protein IntErface Recognition) | Protein-protein binding interface prediction based on local statistical properties of the protein surface derived at the level of atomic groups. | 490 homodimeric, 62 heterodimeric and 196 transient interfaces (excluding antigen-antibody complexes). | [54] |
| PPI-PRED (Protein-Protein Interface Prediction) | Protein-protein binding interface prediction based on significant structural and sequence interface properties. | Manually curated; 180 proteins from 149 complexes both obligate (114) and transient (66). | [53] |
| ConSurf | Mapping of phylogenetic information (sequence conservation grades) on to the surface of proteins with known 3D structure. | No training set. | [56] |
| ClusPro (DOT program) | Rigid-body protein-protein docking based on the Fast-Fourier Transform correlation approach. | No training set. | [50] [51] |
| PatchDock | Rigid-body protein-protein docking based on local shape feature matching. | No training set. | [52] |
`
`Scale-based methods: score value distributions
`DiscoTope, PIER, ProMate and ConSurf assign to each
`protein residue a score reflecting the probability of that
`residue being a part of the protein interface or epitope.
`Details are provided in the Methods section. For the anal-
`ysis of epitope residues versus non-epitope residues we
`used dataset #1, that is, representative antigen structures
`with epitopes mapped onto them. Here an epitope resi-
`due is an antigen residue known to be part of an epitope
`in any complex of this antigen with any antibody. Con-
`versely a non-epitope residue implies an antigen residue
`which is not known to be part of a structural epitope. To
`simplify the calculation proteins with epitopes located on
`more than one protein chain were discarded from the
`analyses (there were 10 such proteins). As a result 52 pro-
`tein antigens were analyzed [see Additional file 1].
`
`The score distributions for epitope, non-epitope and all
`protein residues were calculated for each method and are
`shown in Figures 4, 5, 6, 7. Distributions taking into
`account only surface residues were similar for all methods
`(results not shown). The definition of a surface residue is
`given in the Methods section.
`
DiscoTope, ProMate and ConSurf conservation scores discriminate epitope
versus non-epitope and versus all protein residues, while PIER and ConSurf
confidence scores do not. Thus, as one can see in Figure 4, DiscoTope
discriminates epitope residues (x̄ = -10.2, s = 5.4, number of residues
N = 1,364) from non-epitope residues (x̄ = -13.3, s = 6.3, N = 9,713)
(p < 0.001) and all antigen residues (x̄ = -13.0, s = 6.3, N = 11,077)
(p < 0.001). These distributions are significantly different (p < 0.001)
regardless of the epitope definition used. The ConSurf conservation score
also discriminates epitope residues (x̄ = 0.273, s = 1.050, N = 1,119) versus
non-epitope residues (x̄ = -0.049, s = 0.987, p < 0.001) and versus all
antigen residues (x̄ = -0.007, s = 1.00, N = 8,684, p < 0.001) (Fig. 5). The
same was true for epitope vs. all surface residues. Further, the confidence
level did not change when the definition of surface residues and/or epitope
residues was changed (data not shown). However, if only residues with ConSurf

Figure 4: Distributions of DiscoTope scores for epitope, non-epitope
and all protein residues.

Figure 5: Distribution of ConSurf scores for epitope and all
protein residues. For the definition of confidence score
see the Methods section.

Figure 6: Distribution of ProMate scores for epitope, non-epitope and
all protein residues.
`
`the sensitivity and positive predictive values and measure
`the method performance from the relative number of suc-
`cessful predictions in the test dataset [53].
`
`Approaching the task of evaluation and comparison of
`different methods, we encountered a number of ques-
`tions. How can we compare scale-based methods with
`patch prediction and docking methods? DiscoTope and
`ProMate predict one patch per protein, while other meth-
`ods predict several patches, how can these be compared?
`Using a score value assigned by ProMate, DiscoTope, or
`ConSurf to a residue, all epitopes in the protein are taken
`into account, so can we say that the method predicts one
epitope per protein? Is not the direct comparison of protein
docking methods (ClusPro(DOT), PatchDock) versus patch-based
prediction methods (DiscoTope, ProMate, CEP, PPI-PRED)
questionable since the former
`methods are based on optimization of an interaction
`energy function, while the latter depend on training?
`Finally, docking methods require knowledge of the struc-
`tures of both interacting proteins, antigen and antibody,
`while binding site prediction methods are based on the
`structure of the protein antigen alone and do not require
`knowledge of the antibody structure. Is this a fair compar-
`ison? Being aware of these questions and limitations, we
`applied various evaluation criteria in an attempt to pro-
`vide a thorough and fair comparison of the methods.
`
The evaluation was performed on the dataset of representative
epitopes, assuming that any antigen residue which is not part of
the considered epitope is a non-epitope residue. We did not
discard non-epitope residues known to belong to some other
epitope in the protein, because we assumed that a prediction
program predicts an epitope in an antigen for which it has no
information other than its sequence and structure; this is
how all evaluated methods were constructed. The analysis
`was performed using the representative epitopes from
`dataset #2 that were inferred from structures of one-chain
`(monomer) antigens in complexes with two-chain anti-
`body fragments. There were 59 such epitopes in 48 anti-
`gens (Table 2).
`
`The following parameters were used to evaluate the meth-
`ods:
`
`Sensitivity (recall or true positive rate (TPR)) = TP/(TP +
`FN) – a proportion of correctly predicted epitope residues
`(TP) with respect to the total number of epitope residues
`(TP+FN).
`
`Specificity (or 1 – false positive rate (FPR)) = 1 - FP/(TN +
`FP) – a proportion of correctly predicted non-epitope res-
`idues (TN) with respect to the total number of non-
`epitope residues (TN+FP).
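
The two parameters defined above can be computed per epitope from sets of residue identifiers, as in the minimal sketch below. The function names and the toy residue sets are assumptions for illustration only; they are not taken from the benchmark.

```python
def confusion_counts(predicted, epitope, all_residues):
    """Count TP, FP, FN, TN over the residues of one antigen."""
    tp = len(predicted & epitope)                 # epitope residues predicted as epitope
    fp = len(predicted - epitope)                 # non-epitope residues predicted as epitope
    fn = len(epitope - predicted)                 # epitope residues that were missed
    tn = len(all_residues - predicted - epitope)  # non-epitope residues left unpredicted
    return tp, fp, fn, tn

def sensitivity(tp, fn):
    """Recall, or true positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """1 - false positive rate: 1 - FP / (TN + FP)."""
    return 1 - fp / (tn + fp)

# Toy antigen of 100 residues: a 20-residue epitope and a 30-residue prediction
# that covers half of the epitope.
all_residues = set(range(1, 101))
epitope = set(range(1, 21))
predicted = set(range(11, 41))

tp, fp, fn, tn = confusion_counts(predicted, epitope, all_residues)
print(tp, fp, fn, tn)
print(sensitivity(tp, fn), specificity(tn, fp))
```

Because every residue outside the considered epitope counts as a non-epitope residue, a prediction that hits a different epitope on the same antigen contributes to FP in this scheme.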
`
`
Figure 7: Distribution of PIER scores for epitope, non-epitope and all
protein residues.

confidence score values were considered, no significant difference between
epitope and other protein residues was observed (epitope residues: x̄ = 0.197,
s = 0.539; non-epitope residues: x̄ = 0.194, s = 0.556, p > 0.05). For
ProMate, mean scores for epitope residues (x̄ = 52.8, s = 25.4, N = 1,363)
were significantly higher than for all antigen residues (x̄ = 46.5, s = 28.1,
N = 11,074) or non-epitope residues or all surface residues (p < 0.001)
(Fig. 6). The PIER score does not discriminate epitope versus other antigen
residues (epitope residues: x̄ = 11.9, s = 11.4, N = 1,363; non-epitope
residues: x̄ = 12.6, s = 13.7; N = 8,221, p > 0.05) (Fig. 7).
`
These results suggest that epitope residues are less conserved,
according to the ConSurf evolutionary conservation scores,
than protein surface residues in general at a
99.9% confidence level (p < 0.001). PIER, which is trained
`on 3D structures of all protein-protein complexes availa-
`ble in the PDB, could not distinguish epitopes from the
`rest of the protein surface. One possible explan