VOLUME 31 NUMBER 13 WEB SERVER ISSUE JULY 1, 2003
`
`Nucleic Acids
`Research
`
`
`OXFORD UNIVERSITY PRESS
`
`ISSN 0305 1048 Coden NARHAO
`
`Miltenyi Ex. 1023 Page 1
`
`

`

`Subscriptions
`
Nucleic Acids Research is published twice monthly (24 issues per year).
Subscriptions are entered on a calendar year basis only and are available as a
printed version (including online access) or as online access only at a discount of
10% (+VAT in UK). Prices include postage by surface mail or, for subscribers in
the USA and Canada, by air freight, or in India, Japan, Australia and New Zealand,
by Air Speeded Post. Airmail rates are available on request.
`Annual subscription rate (Volume 31, 2003):
`Institutional
`Print and Online site licence: UK and Europe £1365, Rest of World $2360.
`Personal
Print and Online: UK and Europe £365, Rest of World $621.
`Back volume prices are available on request. Please add sales tax to the prices
`quoted.
Orders. Orders and payments from, or on behalf of, subscribers in the various
geographical areas shown below should be sent to the Press office indicated.
The Americas: Oxford University Press Inc., 2001 Evans Road, Cary, NC 27513,
USA.
Rest of the World: Journals Subscriptions Department, Oxford University Press,
Great Clarendon Street, Oxford OX2 6DP, UK.
Tel: (+44 1865 or 01865) 353907; Fax: (+44 1865) 353485;
Email: jnl.orders@oup.co.uk
`Advertising. To advertise in Nucleic Acids Research contact Oxford University
`Press (US office) in the Americas or Oxford University Press (UK office) in the
`Rest of the World (see addresses above).
`
© Oxford University Press 2003. All rights reserved; no part of this publication
may be reproduced, stored in a retrieval system, or transmitted in any form or by
any means, electronic, mechanical, photocopying, recording, or otherwise without
either prior written permission of the Publishers, or a licence permitting restricted
copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham
Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center,
222 Rosewood Drive, Danvers, MA 01923, USA.
`
Back volumes of this journal are available in 16 mm microfilm, 35 mm microfilm
and 105 mm microfiche from University Microfilms International, 300 North Zeeb
Road, Ann Arbor, MI 48106-1346, USA. Copies of articles published are also
available from UMI.
`
Nucleic Acids Research (ISSN 0305-1048) is published twice monthly by Oxford
University Press, Oxford, UK. Annual subscription price is US$2360. Nucleic
Acids Research is distributed by Mercury International, 365 Blair Road, Avenel,
NJ 07001, USA. Periodicals Postage Paid at Rahway, NJ, USA and additional
entry points.

US POSTMASTER: send address corrections to Nucleic Acids Research, c/o
Mercury Airfreight International Ltd, 365 Blair Road, Avenel, NJ 07001, USA.
`
`Typeset and printed by Information Press Ltd, Oxford, UK on acid-free paper.
`
Cover: 3D-Jury model of the C-terminal RNA methyltransferase module of the large ORF 1ab protein of the coronavirus isolated
from patients suffering from SARS (Severe Acute Respiratory Syndrome). This kind of structure assignment is currently not
feasible using standard sequence comparison methods.
`
`Nucleic Acids Research
`Contents
`
`Volume 31 number 13, July 1, 2003
`
`EDITORIAL
`
`Detection of reliable and unexpected protein fold
`predictions using 3D-Jury
`
`K.Ginalski and L.Rychlewski
`
`DSSPcont: continuous secondary structure
`assignments for proteins
`
`P.Carter, C.A.F.Andersen and B.Rost
`
`PROTINFO: secondary and tertiary protein
`structure prediction
`
`L.-H.Hung and R.Samudrala
`
`The PredictProtein server
`
`B.Rost and J.Liu
`
`GeneSilico protein structure prediction meta-server
`
`M.A.Kurowski and J.M.Bujnicki
`
`META-PP: single interface to crucial prediction
`servers
`
V.A.Eyrich and B.Rost
`
`EVA: evaluation of protein structure prediction
`servers
`
I.Y.Y.Koh, V.A.Eyrich, M.A.Marti-Renom, D.Przybylski,
M.S.Madhusudhan, N.Eswar, O.Graña, F.Pazos, A.Valencia,
A.Sali and B.Rost
`
`VADAR: a web server for quantitative evaluation of
`protein structure quality
`
`L.Willard, A.Ranjan, H.Zhang, H.Monzavi, R.F.Boyko,
`B.D.Sykes and D.S.Wishart
`
`ESPript/ENDscript: extracting and rendering
`sequence and 3D information from atomic
`structures of proteins
`
`WebFEATURE: an interactive web tool for
`identifying and visualizing functional sites on
`macromolecular structures
`
`P.Gouet, X.Robert and E.Courcelle
`
`M.P.Liang, D.R.Banatao, TE.Klein, D.L.Brutlag and
`R.B.Altman
`
`3MATRIX and 3MonF: a protein structure
`visualization system for conserved sequence motifs
`
`S.P.Bennett, L.Lu and D.L.Brutlag
`
`Motif3D: relating protein sequence motifs to 3D
`structure
`
`A.Gaulton and T.K.Attwood
`
`LOC3D: annotate sub-cellular localization for
`protein structures
`
`R.Nair and B.Rost
`
`Annotation in three dimensions. PINTS: Patterns in
`Non-homologous Tertiary Structures
`
`A.Stark and R.B.Russell
`
`NCI: a server to identify non-canonical interactions
`in protein structures
`
`M.M.Babu
`
`MolSurfer: a macromolecular interface navigator
`
`R.R.Gabdoulline, R.C.Wade and D.Walther
`
`CASTp: Computed Atlas of Surface Topography of
`proteins
`
`TA.Binkowski, S.Naghibzadeh and J.Liang
`
`3289
`
`3291-3292
`
`3293-3295
`
`3296-3299
`
`3300-3304
`
`3305-3307
`
`3308-3310
`
`3311-3315
`
`3316-3319
`
`3320-3323
`
`3324-3327
`
`3328-3332
`
`3333-3336
`
`3337-3340
`
`3341-3344
`
`3345-3348
`
`3349-3351
`
`
SEM (Symmetry Equivalent Molecules): a web-based
GUI to generate and visualize the
macromolecules
`
`Servers for sequence-structure relationship analysis
`and prediction
`
`Z.Dosztanyi, C.Magyar, G.E.Tusnady, M.Cserzo, A.Fiser
`and I.Simon
`
`POPS: a fast algorithm for solvent accessible surface
`areas at atomic and residue level
`
L.Cavallo, J.Kleinjung and F.Fraternali
`
`MATRAS: a program for protein 3D structure
`comparison
`
`LGA: a method for finding 3D similarities in
`protein structures
`
`Tools for comparative protein structure modeling
`and analysis
`
SWISS-MODEL: an automated protein
homology-modeling server
`
`STING Millennium: a web-based suite of programs
`for comprehensive and simultaneous analysis of
`protein structure and sequence
`
`Integrated databanks access and sequence/structure
`analysis services at the PBIL
`
`NRSAS: Nuclear Receptor Structure Analysis
`Servers
`
`SSEP: secondary structural elements of proteins
`
`T.Kawabata
`
`A.Zemla
`
`N.Eswar, B.John, N.Mirkovic, A.Fiser, VA.Ilyin, U.Pieper,
`A.C.Stuart, M.A.Marti-Renom, M.S.Madhusudhan,
`B.Yerkovich and A.Sali
`
`T.Schwede, I.Kopp, N.Guex and M.C.Peitsch
`
`G.Neshich, R.C.Togawa, A.L.Mancini, PR.Kuser,
`M.E.B.Yamagishi, G.Pappas Jr, W.VTorres, T.F.eCampos,
`LL.Ferreira, FM.Luna, A.G.Oliveira, R.T.Miura,
`M.K.Inoue, LG.Horita, D.F.de Souza, F.Dominiquini,
`A.Alvaro, CS.Lima, F.O.Ogawa, G.B.Gomes, J.F.Palandrani,
`G.F.dos Santos, E.M.de Freitas, A.R.Mattiuz, LC.Costa,
`CL.de Almeida, S.Souza, C.Baudet and R.H.Higa
`
`G.Perriere, C.Combet, S.Penel, C.Blanchet, J.Thioulouse,
`C.Geourjon, J.Grassot, C.Charavay, M.Gouy, L.Duret and
`G.Deleage
`
E.Bettler, R.Krause, F.Horn and G.Vriend
`
V.Shanthi, P.Selvarani, Ch.K.Kumar, C.S.Mohire and
`K.Sekar
`
`S Mfold web server for nucleic acid folding and
`hybridization prediction
`
`M.Zuker
`
`RNAsoft: a suite of RNA secondary structure
`prediction and design software tools
`
M.Andronescu, R.Aguirre-Hernández, A.Condon and
H.H.Hoos
`
`S
`
`Pfold: RNA secondary structure prediction using
`stochastic context-free grammars
`
B.Knudsen and J.Hein
`
`Vienna RNA secondary structure server
`
PseudoViewer2: visualization of RNA pseudoknots
`of any type
`
I.L.Hofacker
`
`K.Han and Y.Byun
`
`A software tool-box for analysis of regulatory RNA
`elements
`
`P.Bengert and T.Dandekar
`
`GPRM: a genetic programming approach to finding
`common RNA secondary structure elements
`
`Y.-J.Hu
`
`
`A.S.Z.Hussain, Ch.K.Kumar, C.K.Rajesh, S.S.Sheik and
`K.Sekar
`
`3356-3358
`
`3359-3363
`
`3364-3366
`
`3367-3369
`
`3370-3374
`
`3375-3380
`
`3381-3385
`
`3386-3392
`
`3393-3399
`
`3400-3403
`
`3404-3405
`
`3406-3415
`
`3416-3422
`
`3423-3428
`
`3429-3431
`
`3432-3440
`
`3441-3445
`
`3446-3449
`
`
`Tools for the automatic identification and
`classification of RNA base pairs
`
`H.Yang, F.Jossinet, N.Leontis, L.Chen, I.Westbrook,
`H.Berman and E.Westhof
`
`GEPAS: a web-based resource for microarray gene
`expression data analysis
`
`I.Herrero, F.Al-Shahrour, R.Diaz-Uriarte, A.Mateos,
`J.M.Vaquerizas, I.Santoyo and I.Dopazo
`
`INCLUSive: a web portal and service registry for
`microarray and regulatory sequence analysis
`
`B.Coessens, G.Thijs, S.Aerts, K.Marchal, F.De Smet,
`K.Engelen, P.Glenisson, Y.Moreau, I.Mathys and B.De Moor
`
`3450-3460
`
`3461-3467
`
`3468-3470
`
`GenePublisher: automated analysis of DNA
`microarray data
`
`S.Knudsen, C.Workman, T.Sicheritz-Ponten and C.Friis
`
`3471-3476
`
`Express Yourself: a modular platform for processing
`and visualizing microarray data
`
`N.M.Luscombe, TE.Royce, P.Bertone, N.Echols, CE.Horak,
`J.T.Chang, M.Snyder and M.Gerstein
`
ChipInfo: software for extracting gene annotation
`and gene ontology information for microarray
`analysis
`
`REDUCE: an online tool for inferring cis-regulatory
`elements and transcriptional module activities from
`microarray data
`
`Design of oligonucleotides for microarrays and
`perspectives for design of multi-transcriptome
`arrays
`
`S.Zhong, C.Li and WH. Wong
`
`C.Roven and H.J.Bussemaker
`
`3477-3482
`
`3483-3486
`
`3487-3490
`
H.B.Nielsen, R.Wernersson and S.Knudsen
`
`3491-3496
`
`Multiple sequence alignment with the Clustal series
`of programs
`
`R.Chenna, H.Sugawara, T.Koike, R.Lopez, T.J.Gibson,
`D.G.Higgins and J.D.Thompson
`
`CLOURE: Clustal Output Reformatter, a program
`for reformatting ClustalX/ClustalW outputs for SNP
`analysis and molecular systematics
`
`D.K.Kohli and A.K.Bachhawat
`
`Tcoffee@igs: a web server for computing, evaluating
`and combining multiple sequence alignments
`
`O.Poirot, E.O'Toole and C.Notredame
`
`SLAM web server for comparative gene finding and
`alignment
`
`S.Cawley, L.Pachter and M.Alexandersson
`
`Theatre: a software tool for detailed comparative
`analysis and visualization of genomic sequence
`
`Y.J.K.Edwards, T.J.Carver, T.Vavouri, M.Frith, M.J.Bishop
`and G.Elgar
`
`MultiPipMaker and supporting tools: alignments
`and analysis of multiple genomic DNA sequences
`
`S.Schwartz, L.Elnitski, M.Li, M.Weirauch, C.Riemer,
`A.Smit, NISC Comparative Sequencing Program,
`E.D.Green, R.C.Hardison and WMiller
`
`MAVID multiple alignment server
`
`N.Bray and L.Pachter
`
`3497-3500
`
`3501-3502
`
`3503-3506
`
`3507-3509
`
`3510-3517
`
`3518-3524
`
`3525-3526
`
`EnteriX 2003: visualization tools for genome
`alignments of Enterobacteriaceae
`
`L.Florea, M.McClelland, C.Riemer, S.Schwartz and WMiller
`
`3527-3532
`
MGAlignIt: a web service for the alignment of
`mRNA/EST and genomic sequences
`
`B.T.K.Lee, T.WTan and S.Ranganathan
`
`RevTrans: multiple alignment of coding DNA from
`aligned amino acid sequences
`
R.Wernersson and A.G.Pedersen
`
`S
`
`PromH: promoters identification using orthologous
`genomic sequences
`
`V.V.Solovyev and I.A.Shahmuradov
`
`3533-3536
`
`3537-3539
`
`3540-3545
`
`
`S
`
`FIE2: a program for the extraction of genomic DNA
`sequences around the start and translation initiation
`site of human genes
`
`A.Chong, G.Zhang and VB.Bajic
`
`PromoSer: a large-scale mammalian promoter and
`transcription start site identification service
`
`A.S.Halees, D.Leyfer and Z.Weng
`
`S Dragon Gene Start Finder identifies approximate
`locations of the 5' ends of genes
`
`VB.Bajic and S.H.Seah
`
`ETOPE: evolutionary test of predicted exons
`
`A.Nekrutenko, W-Y.Chung and WH.Li
`
`ESEfinder: a web resource to identify exonic
`splicing enhancers
`
`SiteSeer: visualisation and analysis of transcription
`factor binding sites in nucleotide sequences
`
`L.Cartegni, I.Wang, Z.Zhu, M.Q.Zhang and A.R.Krainer
`
`PE.Boardman, S.G.Oliver and SJ.Hubbard
`
`MATCH™: a tool for searching transcription factor
`binding sites in DNA sequences
`
A.Kel, E.Gößling, I.Reuter, E.Cheremushkin,
`O.Kel-Margoulis and E.Wingender
`
`Gibbs Recursive Sampler: finding transcription
`factor binding sites
`
`YMF: a program for discovery of novel
`transcription factor binding sites by statistical
`overrepresentation
`
`Target Explorer: an automated tool for the
`identification of new target genes for a specified set
`of transcription factors
`
`WThompson, E.C.Rouchka and C.E.Lawrence
`
`S.Sinha and M.Tompa
`
`A.Sosinsky, C.P.Bonin, RS.Mann and B.Honig
`
`Regulatory Sequence Analysis Tools
`
`J. van Helden
`
`GeneSeqer@PlantGDB: gene structure prediction in
`plant genomes
`
`GlimmerM, Exonomy and Unveil: three ab initio
`eukaryotic genefinders
`
`S Dragon ERE Finder version 2: a tool for accurate
`detection and analysis of estrogen response elements
`in vertebrate genomes
`
`PatSearch: a program for the detection of patterns
`and structural motifs in nucleotide sequences
`
`PSORT-B: improving protein subcellular localization
`prediction for Gram-negative bacteria
`
`Signal search analysis server
`
`MHCPred: a server for quantitative prediction of
`peptide-MHC binding
`
`S.D.Schlueter, Q.Dong and VBrendel
`
`WH.Majoros, M.Pertea, C.Antonescu and S.L.Salzberg
`
`VB.Bajic, S.L.Tan, A.Chong, S.Tang, A.Strom,
`J.-A.Gustafsson, C.-Y.Lin and E.T.Liu
`
`G.Grillo, F.Licciulli, S.Liuni, E.Sbisa and G.Pesole
`
`J.L.Gardy, C.Spencer, K.Wang, M.Ester, G.E.Tusnady,
`I.Simon, S.Hua, K.deFays, C.Lambert, K.Nakai and
`F.S.L.Brinkman
`
`G.Ambrosini, VPraz, VJagannathan and P.Bucher
`
`P.Guan, I.A.Doytchinova, C.Zygouri and D.R.Flower
`
`3546-3553
`
`3554-3559
`
`3560-3563
`
`3564-3567
`
`3568-3571
`
`3572-3575
`
`3576-3579
`
`3580-3585
`
`3586-3588
`
`3589-3592
`
`3593-3596
`
`3597-3600
`
`3601-3604
`
`3605-3607
`
`3608-3612
`
`3613-3617
`
`3618-3620
`
`3621-3624
`
`
`ELM server: a new resource for investigating short
`functional sites in modular eukaryotic proteins
`
`
P.Puntervoll, R.Linding, C.Gemünd, S.Chabanis-Davidson,
M.Mattingsdal, S.Cameron, D.M.A.Martin, G.Ausiello,
B.Brannetti, A.Costantini, F.Ferre, V.Maselli, A.Via,
G.Cesareni, F.Diella, G.Superti-Furga, L.Wyrwicz, C.Ramu,
C.McGuigan, R.Gudavalli, I.Letunic, P.Bork, L.Rychlewski,
B.Küster, M.Helmer-Citterich, W.N.Hunter, R.Aasland and
T.J.Gibson
`
`3625-3630
`
`Prediction of lipid posttranslational modifications
`and localization signals from protein sequences:
big-Π, NMT and PTS1
`
`F.Eisenhaber, B.Eisenhaber, WKubina, S.Maurer-Stroh,
`G.Neuberger, G.Schneider and M.Wildpaner
`
`3631-3634
`
`Scansite 2.0: proteome-wide prediction of cell
`signaling interactions using short sequence motifs
`
`J.C.Obenauer, LC.Cantley and M.B.Yaffe
`
`Static benchmarking of membrane helix predictions
`
`A.Kemytsky and B.Rost
`
`The web server of IBM's Bioinformatics and Pattern
`Discovery group
`
`I.Huynh, I.Rigoutsos, LParida, D.Platt and I.Shibuya
`
`Identification of patterns in biological sequences at
`the ALGGEN server: PROMO and MALGEN
`
`D.Farre, R.Roset, M.Huerta, J.E.Adsuara, LRosell6,
`M.M.Alba and X.Messeguer
`
`SEARCHPKS: a program for detection and analysis
`of polyketide synthase domains
`
`G.Yadav, R.S.Gokhale and D.Mohanty
`
`S MAK, a computational tool kit for automated
`MITE analysis
`
`G.Yang and TC.Hall
`
`Cluster-Buster: finding dense clusters of motifs in
`DNA sequences
`
`M.C.Frith, M.C.Li and Z.Weng
`
`SIC: a tool to detect short inverted segments in a
`biological sequence
`
`D.Robelin, H.Richard and B.Prum
`
`mreps: efficient and flexible detection of tandem
`repeats in DNA
`
`R.Kolpakov, G.Bana and G.Kucherov
`
`SPA: simple web tool to assess statistical significance
`of DNA patterns
`
`H.Richard and G.Nuel
`
`TRACTS: a program to map
`oligopurine.oligopyrimidine and other binary DNA
`tracts
`
`M.Gal, T.Katz, A.Ovadia and G.Yagil
`
`DNA analysis servers: plot.it, bend.it, model.it and
`IS
`
`K.Vlahovicek, LKajan and S.Pongor
`
`NEBcutter: a program to cleave DNA with
`restriction enzymes
`
`T. Vincze, J.Posfai and R.J.Roberts
`
`3635-3641
`
`3642-3644
`
`3645-3650
`
`3651-3653
`
`3654-3658
`
`3659-3665
`
`3666-3668
`
`3669-3671
`
`3672-3678
`
`3679-3681
`
`3682-3685
`
`3686-3687
`
`3688-3691
`
`SVM-Prot: web-based support vector machine
`software for functional classification of a protein
`from its primary sequence
`
`BPROMPT: a consensus server for membrane
`protein prediction
`
`GlobPlot: exploring protein sequences for
`globularity and disorder
`
`C.Z.Cai, LY.Han, Z.LJi, X.Chen and Y.Z.Chen
`
`3692-3697
`
`PD.Taylor, T.K.Attwood and D.R.Flower
`
`3698-3700
`
R.Linding, R.B.Russell, V.Neduva and T.J.Gibson
`
`3701-3708
`
`
`iSPOT: a web tool to infer the interaction specificity
`of families of protein modules
`
B.Brannetti and M.Helmer-Citterich
`
`Automated Gene Ontology annotation for
`anonymous sequence data
`
`S.Hennig, D.Groth and H.Lehrach
`
`ESTAnnotator: a tool for high throughput EST
`annotation
`
A.Hotz-Wagenblatt, T.Hankeln, P.Ernst, K.-H.Glatting,
E.R.Schmidt and S.Suhai
`
`3709-3711
`
`3712-3715
`
`3716-3719
`
`Phydbac (phylogenomic display of bacterial genes):
`an interactive resource for the annotation of
`bacterial genomes
`
`F.Enault, K.Suhre, O.Poirot, C.Abergel and J.-M.Claverie
`
`3720-3722
`
`AMIGene: Annotation of Microbial Genes
`
`S.Bocs, S.Cruveiller, D.Vallenet, G.Nuel and C.Medigue
`
`AHMII: Agent to Help Microbial Information
`Integration
`
`H.Sugawara and S.Miyazaki
`
`S DNannotator: annotation software tool kit for
`regional genomic sequences
`
`C.Liu, TI.Bonner, I.Nguyen, J.L.Lyons, S.L.Christian and
`E.S.Gershon
`
`Bioverse: functional, structural and contextual
`annotation of proteins and proteomes
`
J.McDermott and R.Samudrala
`
`S
`
`S
`
`FrameD: a flexible program for quality check and
`gene prediction in prokaryotic genomes and noisy
`matured eukaryotic sequences
`
EuGène'Hom: a generic similarity-based gene finder
`using multiple homologous sequences
`
`PROBEmer: a web-based software tool for selecting
`optimal DNA oligos
`
`Primer Design Assistant (PDA): a web-based primer
`design tool
`
`DePIE: Designing Primers for Protein Interaction
`Experiments
`
`OligoDesign: optimal design of LNA (locked nucleic
`acid) oligonucleotide capture probes for gene
`expression profiling
`
`CODEHOP (COnsensus-DEgenerate Hybrid
`Oligonucleotide Primer) PCR primer design
`
`T.Schiex, I.Gouzy, A.Moisan and Y.de Oliveira
`
`S.Foissac, P.Bardou, A.Moisan, M.-J.Cros and T.Schiex
`
`SJ.Emrich, M.Lowe and A.L.Delcher
`
`S.H.Chen, C.Y.Lin, C.S.Cho, C.Z.Lo and C.A.Hsiung
`
G.Lu, M.Hallett, S.Pollock and D.Thomas
`
`N.Tolstrup, PS.Nielsen, JG.Kolberg, AM.Frankel,
`H.Vissing and S.Kauppinen
`
`TM.Rose, J.G.Henikoff and S.Henikoff
`
`S RNA-related tools on the Bielefeld Bioinformatics
`Server
`
A.Sczyrba, J.Krüger, H.Mersch, S.Kurtz and R.Giegerich
`
`SIRW: a web server for the Simple Indexing and
`Retrieval System that combines sequence motif
`searches with keyword searches
`
C.Ramu
`
`Onto-Tools, the toolkit of the modern biologist:
`Onto-Express, Onto-Compare, Onto-Design and
`Onto-Translate
`
`Swiss EMBnet node web server
`
`S.Draghici, P.Khatri, P.Bhavsar, A.Shah, S.A.Krawetz and
`M.A.Tainsky
`
`L.Falquet, L.Bordoli, V.Ioannidis, M.Pagni and
`C. V.Jongeneel
`
`3723-3726
`
`3727-3728
`
`3729-3735
`
`3736-3737
`
`3738-3741
`
`3742-3745
`
`3746-3750
`
`3751-3754
`
`3755-3757
`
`3758-3762
`
`3763-3766
`
`3767-3770
`
`3771-3774
`
`3775-3781
`
`3782-3783
`
`
`ExPASy: the proteomics server for in-depth protein
`knowledge and analysis
`
`E.Gasteiger, A.Gattiker, C.Hoogland, I.Ivanyi, RD.Appel
`and A.Bairoch
`
`UniqueProt: creating representative protein
`sequence sets
`
`S.Mika and B.Rost
`
`3784-3788
`
`3789-3791
`
`BLAST2SRS, a web server for flexible retrieval of
`related protein sequences in the SWISS-PROT and
`SPTrEMBL databases
`
`WU-Blast2 server at the European Bioinformatics
`Institute
`
`K.Bimpikis, A.Budd, R.Linding and T.J.Gibson
`
`3792-3794
`
`R.Lopez, V.Silventoinen, S.Robinson, A.Kibria and WGish
`
`3795-3798
`
`OntoBlast function: from sequence similarities
`directly to potential functional annotations by
`ontology terms
`
`G.Zehetner
`
`ORFeus: detection of distant homology using
`sequence profiles and predicted secondary structure
`
`K.Ginalski, I.Pas, L.S.Wyrwicz, M.von Grotthuss,
`J.M.Bujnicki and L.Rychlewski
`
`PARSESNP: a tool for the analysis of nucleotide
`polymorphisms
`
`N.E.Taylor and E.A.Greene
`
`SIFT: predicting amino acid changes that affect
`protein function
`
`P.C.Ng and S.Henikoff
`
`S ActionMap: a web-based software that automates
`loci assignments to framework maps
`
`G.Albini, M.Falque and J.Joets
`
`WEB-THERMODYN: sequence analysis software
`for profiling DNA helical stability
`
`Y.Huang and D.Kowalski
`
`NEWT, a new taxonomy portal
`
`I.Q.H.Phan, S.F.Pilbout, WFleischmann and A.Bairoch
`
`Comprehensive quantitative analyses of the effects
`of promoter sequence elements on mRNA
`transcription
`
`PipeAlign: a new toolkit for protein family analysis
`
`M.Lapidot and Y.Pilpel
`
`F.Plewniak, L.Bianchetti, Y.Brelivet, A.Carles, F.Chalmel,
`O.Lecompte, T.Mochel, L.Moulinier, A.Muller, I.Muller,
`V.Prigent, R.Ripp, J.-C.Thierry, J.D.Thompson, N.Wicker
`and O.Poch
`
`NORSp: predictions of long regions without regular
`secondary structure
`
`I.Liu and B.Rost
`
`Biological SOAP servers and web services provided
`by the public sequence data bank
`
`H.Sugawara and S.Miyazaki
`
`FootPrinter: a program designed for phylogenetic
`footprinting
`
`M.Blanchette and M. Tompa
`
`GeneFizz: a web tool to compare genetic ( coding/
`non-coding) and physical (helix/coil) segmentations
`of DNA sequences. Gene discovery and evolutionary
`perspectives
`
`E. Yeramian and L.Jones
`
`3799-3803
`
`3804-3807
`
`3808-3811
`
`3812-3814
`
`3815-3818
`
`3819-3821
`
`3822-3823
`
`3824-3828
`
`3829-3832
`
`3833-3835
`
`3836-3839
`
`3840-3842
`
`3843-3849
`
`S Geno2pheno: estimating phenotypic drug resistance
`from HIV-1 genotypes
`
N.Beerenwinkel, M.Däumer, M.Oette, K.Korn, D.Hoffmann,
R.Kaiser, T.Lengauer, J.Selbig and H.Walter
`
`3850-3855
`
`
`Building protein diagrams on the web with the
`residue-based diagram editor RbDe
`
`L.Skrabanek, F.Campagne and H.Weinstein
`
`CRP: Cleavage of Radiolabeled Phosphoproteins
`
`A.I.Mackey, T.A.J.Haystead and WR.Pearson
`
S JVirGel: calculation of virtual two-dimensional
`protein gels
`
`Update on XplorMed: a web server for exploring
`scientific literature
`
`AUTHOR INDEX
`
S, Supplementary Material available at NAR Online
`
K.Hiller, M.Schobert, C.Hundertmark, D.Jahn and R.Münch
`
`C.Perez-Iratxeta, A.I.Perez, P.Bork and M.A.Andrade
`
`3856-3858
`
`3859-3861
`
`3862-3865
`
`3866-3868
`
`Nucleic Acids Research, 2003, Vol. 31, No. 13
`3812-3814
DOI: 10.1093/nar/gkg509
`
`SIFT: predicting amino acid changes that affect
`protein function
`Pauline C. Ng and Steven Henikoff*
`
`Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N A1-162, Seattle, WA 98109, USA
`
`Received January 4, 2003; Revised and Accepted February 28, 2003
`
`ABSTRACT
Single nucleotide polymorphism (SNP) studies and
random mutagenesis projects identify amino acid
substitutions in protein-coding regions. Each
substitution has the potential to affect protein function.
SIFT (Sorting Intolerant From Tolerant) is a program
that predicts whether an amino acid substitution
affects protein function so that users can prioritize
substitutions for further study. We have shown that
SIFT can distinguish between functionally neutral
and deleterious amino acid changes in mutagenesis
studies and on human polymorphisms. SIFT is
available at http://blocks.fhcrc.org/sift/SIFT.html.
`
`INTRODUCTION
`Single nucleotide polymorphisms (SNPs) are used as markers
`in linkage and association studies to detect which regions in
`the human genome may be involved in disease. SNPs in
`coding and regulatory regions may be implicated in disease
`themselves. Non-synonymous SNPs that lead to an amino acid
`change in the protein product are of major interest, because
`amino acid substitutions currently account for approximately
`half of the known gene lesions responsible for human inherited
disease (1). SIFT (Sorting Intolerant From Tolerant) uses
sequence homology to predict whether an amino acid
substitution will affect protein function and hence, potentially
alter phenotype (2,3).
`SIFT has been applied to human variant databases and was
`able to distinguish mutations involved in disease from neutral
`polymorphisms (3). Assuming that disease-causing amino acid
`substitutions are damaging to protein function, we applied SIFT
`to a database of missense substitutions associated with or
`involved in disease (4). SIFT predicted 69% to be damaging.
`When SIFT was applied to the non-synonymous SNPs in
dbSNP (5), a database of putative SNPs, 25% of the variants
were predicted to be deleterious. This was similar to SIFT's
20% false positive error, which suggested that most
non-synonymous SNPs are functionally neutral. Furthermore, a
subset of the variants from dbSNP predicted to affect function
were involved in disease, which confirmed SIFT's sensitivity.
The SIFT algorithm relies solely on sequence for prediction,
yet performs similarly to tools that use structure (3,6-8). An
advantage of not requiring structure is that predictions can be
made for a larger number of substitutions. Of the non-synonymous
`SNPs identified by the SNP Consortium, 74% were sufficiently
`similar to homologs in protein sequence databases for SIFT
`prediction. The number of substitutions that SIFT can predict
`on is expected to increase as more genomes are sequenced and
`more protein sequences become available.
`
`SIFT PREDICTION METHOD
`SIFT presumes that important amino acids will be conserved
`in the protein family, and so changes at well-conserved
`positions tend to be predicted as deleterious. For example, if a
`position in an alignment of a protein family only contains the
`amino acid isoleucine, it is presumed that substitution to any
`other amino acid is selected against and that isoleucine is
`necessary for protein function. Therefore, a change to any
other amino acid will be predicted to be deleterious to protein
function. If a position in an alignment contains the
hydrophobic amino acids isoleucine, valine and leucine, then
SIFT assumes, in effect, that this position can only contain
amino acids with hydrophobic character. At this position,
changes to other hydrophobic amino acids are usually
predicted to be tolerated but changes to other residues (such
as charged or polar) will be predicted to affect protein
function.
`To predict whether an amino acid substitution in a protein
`will affect protein function, SIFT considers the position at
`which the change occurred and the type of amino acid
`change. Given a protein sequence, SIFT chooses related
`proteins and obtains an alignment of these proteins with the
`query. Based on the amino acids appearing at each position
`in the alignment, SIFT calculates the probability that an
`amino acid at a position is tolerated conditional on the most
`frequent amino acid being tolerated. If this normalized value
`is less than a cutoff, the substitution is predicted to be
`deleterious (2). The SIFT algorithm and software have been
`described previously (2,3).
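The scoring rule described in this section can be illustrated with a minimal Python sketch. It is not the SIFT implementation, which is described in (2,3) and is not reproduced here. The sketch assumes that amino acid probabilities are plain observed frequencies in one alignment column, with an optional flat pseudocount. The function names normalized_scores and predict, the toy alignment, the 0-based position index and the labels returned are hypothetical choices made for illustration; the default cutoff of 0.05 is the value quoted for the website output.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def normalized_scores(column, pseudocount=0.0):
    """Return {amino_acid: score} for one alignment column.

    Each score is the estimated probability of the amino acid divided by
    the probability of the most frequent amino acid in the column, so the
    most frequent amino acid always scores 1.0.
    Assumption: probabilities are observed frequencies plus a flat
    pseudocount, which is a simplification of the published method.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        raise ValueError("column contains no amino acids")
    probs = {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
    top = max(probs.values())
    return {aa: p / top for aa, p in probs.items()}


def predict(alignment, position, new_aa, cutoff=0.05):
    """Score a substitution to new_aa at a 0-based alignment position."""
    column = [seq[position] for seq in alignment]
    score = normalized_scores(column)[new_aa]
    label = "DELETERIOUS" if score < cutoff else "TOLERATED"
    return score, label


# Toy alignment: the query followed by three hypothetical homologs.
alignment = ["MIK", "MIR", "MVK", "MLK"]
print(predict(alignment, 1, "V"))  # hydrophobic residue seen at this position
print(predict(alignment, 1, "D"))  # charged residue never seen at this position
```

In this toy column (I, I, V, L), valine scores 0.5 relative to isoleucine and is tolerated, while aspartate scores 0 and falls below the cutoff, mirroring the isoleucine/valine/leucine example given earlier.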
`
*To whom correspondence should be addressed. Tel: +1 206 667 4515; Fax: +1 206 667 5889; Email: steveh@fhcrc.org

Nucleic Acids Research, Vol. 31, No. 13 © Oxford University Press 2003; all rights reserved

SIFT WEBSITE

Substitution at pos 1426 from S to P is predicted to AFFECT PROTEIN FUNCTION with a score of 0.02.
Median sequence conservation: 2.90
Sequences represented at this position:26

Substitution at pos 1432 from H to K is predicted to be TOLERATED with a score of 0.08.
Median sequence conservation: 2.90
Sequences represented at this position:26

Substitution at pos 1445 from D to N is predicted to AFFECT PROTEIN FUNCTION with a score of 0.01.
Median sequence conservation: 3.66
Sequences represented at this position:21
WARNING!! This substitution may have been predicted to affect function just because
the sequences used were not diverse enough. There is LOW CONFIDENCE in this prediction.

Figure 1. An example of SIFT prediction on amino acid changes in a protein. Substitutions with score less than 0.05 are predicted to affect protein function. In the
last prediction, the median conservation of the sequences does not meet the threshold so a warning is issued.

Input
Users can obtain predictions for amino acid changes of interest
at http://www.blocks.fhcrc.org/sift/SIFT.html. From this page,
there are links to three submission pages which allow users
different levels of involvement in order to control the quality of
their predictions.
For minimal involvement, users can simply submit their
protein sequences and amino acid substitutions. In its fully
automated mode, SIFT will search for protein sequences
homologous to the query protein and, based on these
sequences, calculate probabilities for each possible amino acid
change. Users can select from among SWISS-PROT,
SWISS-PROT/TrEMBL, or NCBI's non-redundant protein databases
for SIFT to search (4,9).
Although SIFT can choose sequences automatically, better
prediction results may be obtained when all of the sequences
that are provided are orthologous to the query protein. This is
because inclusion of paralogous sequences confounds prediction
at residues conserved only among the orthologues. If a
user already has sequences that are thought to be functionally
similar to the protein of interest, these sequences can be
directly submitted and SIFT's step for choosing sequences
skipped. Given the query protein and homologous sequences,
SIFT obtains the alignment.
If regions are misaligned, SIFT will not recognize conserved
positions and therefore miss potentially damaging substitutions.
For best prediction quality, a third mode of operation
allows users to submit their own alignments.
`
`Output
`Predictions are given for all 20 possible amino acid changes at
`each position in the protein. The alignment is also returned so
`that users can examine the sequences used for prediction and
`modify them for resubmission. This option is also useful for
`removing uncertain, erroneous and misaligned sequences from
`alignment output generated by SIFT in its automatic mode.
`For amino acid substitutions submitted by the user, a more
detailed synopsis is provided (Fig. 1). The score is the
normalized probability that the amino acid change is tolerated.
SIFT predicts substitutions with scores less than 0.05 as
deleterious. Some SIFT users have found that treating substitutions
with scores less than 0.1 as deleterious provides better sensitivity
for detecting deleterious SNPs (Cornelia Ulrich, personal communication
and 10). The quantitative score allows users to prioritize their
amino acid changes by ranking them from the lowest scores to
the highest.
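As a small illustration of this prioritization, the following Python sketch ranks substitutions from the lowest score to the highest and flags those below a chosen cutoff. The three scores are taken from Figure 1; the function name rank_substitutions and the substitution labels (such as S1426P) are hypothetical conventions introduced here, not SIFT output fields.

```python
def rank_substitutions(scores, cutoff=0.05):
    """Sort substitutions by ascending score and flag those below the cutoff.

    scores maps a substitution label to its normalized probability of being
    tolerated. Returns a list of (label, score, predicted_deleterious).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    return [(label, score, score < cutoff) for label, score in ranked]


# Scores reported in Figure 1.
figure1_scores = {"S1426P": 0.02, "H1432K": 0.08, "D1445N": 0.01}

for label, score, flagged in rank_substitutions(figure1_scores):
    print(label, score, "deleterious" if flagged else "tolerated")

# The more permissive cutoff of 0.1 also flags H1432K.
print(rank_substitutions(figure1_scores, cutoff=0.1))
```

With the default cutoff of 0.05, D1445N and S1426P are flagged and H1432K is not; with a cutoff of 0.1, all three are flagged, which shows how the choice of cutoff trades sensitivity against false positives.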
`Confidence in a substitution predicted to be deleterious
`depends on the diversity of the sequences in the alignment. If
`the sequences used for prediction are closely related, then
`many positions will appear conserved and SIFT will predict
most substitutions to affect protein function. This leads to a
high false positive error where functionally neutral substitutions
are predicted to be deleterious.
To alert the user to these situations, SIFT calculates the
median conservation value, which measures the diversity of the
sequences in the alignment. Conservation, as measured by
information content (11), is calculated for each position in the
alignment and the median of these values is obtained.
Conservation ranges from log2 20 (= 4.32), when a position
is completely conserved and only one amino acid is observed,
to zero, when all 20 amino acids are observed at a position with
equal frequency. By default, SIFT builds alignments with a median
conservation value of 3.0. Predictions based on sequence alignments
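The median conservation value can be sketched in Python as follows. This is an illustrative calculation, not SIFT's code: it assumes that information content at a position is log2 20 minus the Shannon entropy of the observed amino acid frequencies, with gaps and unknown characters ignored and no small-sample correction. The function names information_content and median_conservation and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from statistics import median

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def information_content(column):
    """Information content (bits) of one alignment column.

    Assumption: log2(20) minus the Shannon entropy of the observed
    amino acid frequencies; characters outside the 20 amino acids
    (for example gaps) are ignored.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("column contains no amino acids")
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(AMINO_ACIDS)) - entropy


def median_conservation(alignment):
    """Median information content over all columns of the alignment.

    Assumes all sequences have the same aligned length.
    """
    return median(information_content(col) for col in zip(*alignment))


print(round(information_content("IIII"), 2))       # fully conserved column
print(round(information_content(AMINO_ACIDS), 2))  # all 20 amino acids, equally frequent
print(round(median_conservation(["MIK", "MIR", "MVK", "MLK"]), 2))
```

A fully conserved column gives the maximum of log2 20, about 4.32 bits, and a column with all 20 amino acids at equal frequency gives zero, matching the range stated above.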
