`
`
`
`Visit Wfisg lournais mane
`-.—--\ ww irrtamclmaa.wibg.cm
`
`$1 WILEY~INTERSCIENCE
`
`l‘aQN l I I Eilvfit'rfi i
`
`
`
`
`
`Petitioner Microsoft Corporation - Ex. 1012, Cover 1
`Petitioner Microsoft Corporation - EX. 1012, Cover 1
`
`
`
`emisl;
`
`ORGANIC l INORGANIC l PHYSICAL l' BIOLOGICAL
`
`Editors:
`Editorial Advisory Board:
` Norman L. Allinger
`
`Enrico Clementi
`Paul G. Mezey
`Leo Radom
`Department of Chemistry
`Department of Chemistry
`Department of Chemistry
`Computational Center for
`Research School of Chemistry
`Universily L. Pasteur, 3
`University of Saskatchewan
`The Australian National
`Molecular Structure and
`rue de l'Universite
`Saskalnon, Canada SYN 0W0
`Design
`University
`67084 Strasbourg, France
`Canberra, ACT. 0201]
`University of Georgia
`Australia
`Athens, GA 30602—2526
`
`Warren J. Hehre
`Wavefnnction, 1840] Von
`Karman. Suite 370
`Irvine, CA 92715
`
`Keiji Moroltuma
`Department of Chemistry
`Emory Universily
`Atlanta, GA 30322
`
`Paul mm H. Schlever
`Computer Chemistry Center
`lnstitnt lilr Organische Chemie
`Universitat ErlangeirNiirnberg
`Henkestrasse 42, 13-91054
`Erlangen, Germany
`
`Associate. Editor:
`
`Gernot Frenking
`Fachhereich Cheniie
`Philipps~Universitiit Marinlrg
`H ans—Meerwein-Strasse
`D-35032 Marburg, Germany
`
`Assistant. Editors:
`
`Martin Feigel
`lnstitut fiir Organische Chemie
`Universitat Bochuin
`Universitatsstr. 150
`13-44780 Bochum, Germany
`Roger S. Grev
`Department of Chemistry
`University of Kentucky
`Lexington, KY 40506
`
`William L. Jorgensen
`Department of Chemistry
`Yale University
`PO Box 208017
`New Haven, CT 06520-8107
`
`Martin Karplus
`Department of Chemistry
`'12 Oxford Street
`
`Harvard University
`Cambridge, MA 02138
`
`Peter A. Kollman
`Department of Pharmaceutical
`Chemistry
`School of Pharmacy
`University of California
`San Francisco, CA 94143
`
`Eiji Osawa
`Department of Knowledge—
`Based Information
`Engineering
`Toyohashi University of
`Technology
`Tempaku—chn, Toyuhashi
`441, Japan
`
`John A. Pople
`Department of Chemistry
`Nortlnvestern University
`2145 Sheridan Road
`Wilmette, l L 60208
`
`Peter Pulay
`Department of Chemistry
`University of Arkansas
`Fayettevilie, AR 7270]
`
`Harold A. Scheraga
`Ba ker La horatnry of Chemistry
`Cornell University
`Ithaca, NY I-I853—130l
`
`Andrew Streitwieser, Jr.
`Department of Chemistry
`University of California
`Berkeley
`Berkeley, CA ”4120
`
`Kenneth B. Wiberg
`Department of Chemistry
`Yale University
`225 Prospect Street
`New Haven, CT ӣ5520
`
`Michael C. Zerner
`Department of Chemistry
`University of Florida
`Iainesvilie, Fl- 3201 l
`
`Editorial Production. John one; Marilee Croft Olson
`
`The Journal cl Computational Chemistry [1550: 0192-0551} is published 16 times a year. monthly and
`cessing a change at address. For subsoil
`ption inquiries. please call 212-350-6045: E-niaii:
`seml‘monlhly in January. April. July. and November. one volume per year by John Wiley 8. Sons. Inc:
`SUBlNFDGfljwllcycom
`605 Third Avenue. New York. NY. 10l58.
`Postmaster: Send address changes to Journal of Computational Chemistry. Cnrolinu Rolhuui.
`this publication
`rights reserved. No part at
`inc. All
`Copyright
`=E- 1900 John ‘Nlley i5 Sons.
`Director. Subscription Fuihilment and Dislrihuiion. John L'lilay ti Solis.
`Inc__ 605 Third Avenue.
`may be reproduced in any lorm or by any means. except as permitted under scclion 10? or
`New York. NY 10158. For subscription inquiries. please call customer
`service at 212-350-
`103 ct
`Ihe This United States Copyright Act. without either
`the prior written pcnnisslon of
`6645 or write to the above address.
`the publisher. or authorization Ihrough the Copyright Clearance Center. 222 Rosewood Drive.
`lonvardcd
`DE
`should
`advertising
`Advertising:
`Inquiries
`concerning
`Leroy.
`Susan
`lianvors. MA 01923:
`tel: 500-?50-3400.
`lax: 508-?50-rldi'0. Periodicals postage paid at
`lieu.-
`inc.
`Advenising Sales. John Wiley 8. Sons,
`Yorh. lit. and at additional mailing offices.
`ifltbd. 212-
`i-i‘l
`605 Third Avenue. New York.
`0503832. Advcrlising Sales. European Contacts:
`indicate the copyright holders consent
`The code and the copyright notice in the journal
`llob Kern or ilrclry Douglas. John they 8. Sons. Ltd.
`that copies mayr be made tor personal or internal
`Collins Lane. Chichcsier. West Sussex P019 iUD. England. Tel: «14 $243 Till SEC-36?; Fax: rid i243
`use. or for
`the personal or
`internal use cl
`FTD 432: cvmoil: adsaios@rriley.co.ult
`specific clients. on lire condilion that
`the copier
`pay tor copying beyond that purmiticd by
`haprlnts: Reprint sales and inquiries should ho dircclcd to the customer service department
`Sections
`in? and 100 of
`the United States Copyright Law. The per-copy inc lot each item is
`to be paid through the Copyright Clearance Center. inc.
`John Wiley d Sons. Inc. 605 Third Ave” llovr York. NY 10158: 1121: 212-850-0??6.
`Manuscripts should be submitted to one oi
`the editors: Dr.
`lion'nan L Alllngcr, Journal or
`such as copying for general
`copying.
`ol
`This consent does not
`cartoon in olher
`triads
`Computational Choruislry Department at Chemistry. The Llnivcrsily at Georgia. Adidas. CA
`[or creating new collective works. or
`distribution.
`for advcdising or promotional purposes.
`30502;
`Prof. Dr.
`Paul
`von
`ll. Schicycr.
`resale, Such permission requests and other permission inquiries
`should he addressed to
`[or
`liir Drganischo CheJournal of Computational Chemistry. Computer
`the Permissions Dept.
`Chemistry Center,
`lnstilul
`mic. Universiiat Erlongcn-Niimhcrg. Hanhestr.
`:12.
`0-91054. Erlangcn. Comany: or Prol. Carnot Freaking. Journal of Computational Chemistry.
`subscription price {Volume 19. 1998): 31.19000 in US. 31.35000 in Canada and diction.
`51.43.00 outside North America. $150.00 US AC3 member. 5246.00 outside us nos member, Per-
`Fachbcrcich Chornio. Phifipps-Univcrsitril Martmrg. Hans-Meenyeln-Stracse. 0—35032 thallium, Germany.
`information for couln'oulurs appears in lira first and last issue oi each volume. All other noncompliance
`sonal ralc: $150.00 in US, $246.00 outside Noah Amcn'ra. subscriptions at the personal rate are avail-
`should be addressed to Journal cl Computational Chemistry. Publisher, Interscicnce Division. Profes-
`able only In individuals. All subscriptions outside US will he sent by air. Payment musl he made in US
`sional. Holerence, and Trade Group. John Wiley o Sons inc.. 605 Third Avenue. New York. NY 10153.
`dollars drawn on a US bank. Claims tor undelivered copies Iuiil he accepted only alter the following issue
`The contents 01 this Iournnl are indexed in the Ioilovring: Chemical Abstracts. Chemical
`idles.
`has been received. Prozac enclose a copy of the mailing label. Missing copies will be supplied whcn
`Current CoplanislPhyslcal. Cherrricat 8. Earth Sciences. Mathematical Holden-s, Reference Update.
`losses have been suslalncd in transll and where reserve stock permits. Pleasc allow tour weeks [or pro—
`Research Alon (iSll. Science Citation index 050. and SClSEAflCH Database 030.
`9': This paper meets lhc requirements of AilCllNISD 230210-1092 (Pennanence of paper}.
`
`Io
`
`Petitioner Microsoft Corporation - Ex. 1012, Cover 2
`Petitioner Microsoft Corporation - Ex. 1012, Cover 2
`
`
`
`Journal of
`
`Computational
`Chemistry
`
`Volume 19lNumber flimsy 1998
`
`CONTENTS
`
`Development of a Parallel Molecular Dynamics Code on SIMD Computers:
`
`685
`
`Algorithm for Use of Fair List Criterion
`
`D. Rocmtnnn, R. Bizzarrl, G. Clnllenn', N. Senna, and A. Di Nola
`
`Multivariate Analysis of a Data Matrix Containing A—DNA and B-DNA
`
`695
`
`Dinucleoside Monophosphate Steps: Multidimensional Ramachandran
`Plots for Nucleic Acids
`
`M. L. M. Backers and L. M. C. Bnydens
`
`Ab lnilio Prediction of 15N-NMR Chemical Shift in d-Boron Nitride
`
`Based on an Analysis of Connectivities
`Marcus Gnsn‘eicn and Christel M. Marian
`
`The Flying Ice Cube: Velocity Rescaling in Molecular Dynamics
`
`Leads to Violation of Energy Equipartition
`
`Stephen C. Harvey, Robert K.-Z. Tan, and Tltonms E. Clientlnnn Ill
`
`Systematic Prediction of the Products and Intermediates of Isot0pic
`
`Labeling in Reaction Pathway Studies
`
`Andrew V. Zeignrnilc and anil E. Millie’s-Perez
`
`Electrostatic Model for Infrared Intensities in a Spectroscopically
`Determined Molecular Mechanics Force Field.
`
`Kim Palm) and Samuel Krinnn
`
`715
`
`726
`
`741
`
`754
`
`Solvation Free Energies Calculated Using the GB/ SA Model: Sensitivity
`
`769
`
`of Results on Charge Sets, Protocols, and Force Fields
`
`M. Rmni Reality, Mark D. Brion, Alnl Agnrwnl, Vt‘llrn‘kntl N. lf'iswnnndlmn
`
`D. Quentin McDonald, and W. Clark Still
`
`Volume 19. Number 7 was mailed llieiveel; of April [3.
`
`I‘JUS
`
`rCtmtinncn‘ on next page)
`
`Visit Wiley Journals Onllne
`'E‘WILEY
`
`Inter8o>ience:'
`
`www.lnlersclence.wlleg.com
`
`Petitioner Microsoft Corporation - Ex. 1012, ToC i
`Petitioner Microsoft Corporation - Ex. 1012, ToC i
`
`
`
`(Ctmtmmni)
`
`Method of Calculating Band Shape for Molecular Electronic Spectra
`(311137 M. Pearl, Mr C. Zrmcr, Andcrs Bron, and [aim A/lclx’t‘lvqi;
`
`I
`
`Neighbor-List Reduction: Optimization for Cmnputation of Molecular
`van der Waals and Solvent-Accessible Surface Areas
`
`[iii-g Wefsw', Armin A. Weiss}; Prter S. Shenkiu, mm‘ W. Clark Still
`
`Erratum
`
`731
`
`79‘;
`
`809
`
`Petitioner Microsoft Corporation - Ex. 1012, ToC ii
`Petitioner Microsoft Corporation - Ex. 1012, ToC ii
`______________________________—_
`
`
`
`fi_fl__mmr_wm
`
`Development of a Parallel Molecular
`Dynamics Code on SIMD COmputers:
`Algorithm for Use of Pair List Criterion
`
`—_——“_'_‘—'—“—-—~—-—~—~—-—._..___
`
`D. ROCCATANO,| R. BIZZARHLZ'3 G. CHILLEMIE N. SANNA,2
`A. DI NOLA1
`Jli).",iriri'iii.=ii'iifo iii Cliiiiiica, tlnitr'i‘sih‘i ”Li? Siillii'iiiil,“ Pie Aldo More '5, 001'85 Rome, Italy
`JCz-lSPLiR Consortia per lr i’lppiiriijoni iii Siiprrcrilcolo per Lliiioersilr‘r e Riceror, r/o Um'errsih‘i “ Ln
`fiopii'iira," Rona: ital};
`’Dipiirtiiiieiilo di Fisica, Linierrsitii ”in Sapiriim,” Rome. Holy
`
`Received 24 April 1996; accepted 24 October 1‘99?
`
`
`
`,_.-
`
`r
`
`,...
`'
`'
`
`'
`
`J,
`
`.
`'1
`
`{it
`ABSTRACT: In recenl years several
`implementations of molecular dynamics
`_
`t'MDl codes have been reported on multiple instruction multiple data (NHMQJ‘Q? 3,“.
`machines. However, very few implementations of MD codes on single instruction?-
`multiple data (SIMD) machines have been reported. The difficulty in using pair
`lists of nonbonded interactions is the maior problem with MD codes for SIMD
`machines, such that, generally, the full connectivity computation has been used.
`We present an algorithm, the global cut-off algorithm (GCA), which permits the
`use of pair lists on SIMD machines. GCA is based on a probabilistic approach
`and requires the cut-off condition to be simultaneously verified on all nodes of
`the machine. The MD code used was taken from the GROMOS package: only
`the routines involved in the pair lists and in the computation of nonbonded
`interactions were rewritten for a parallel architecture. The remaining calculations
`Were performed on the host computer. The algorithm has been tested on
`Quadrics computers for configurations of 32, 128, and 512 processors and for
`systems of 4000, 8000, 15,000, and 30,000 particles. Quadrics was developed by
`lstituto Nazionale di Fisica Nucleare (INFN) and marketed by Alenia Spazio.
`(03: 1998101111 Wiley 8: Sons, Inc.
`I Comput Chem 19: 685—694, 1998
`
`Keywords: molecular dynamics; domain decomposition algorithm; parallel
`con-iputers,‘ pair list for molecular dynamics code on SIMD machines; array
`processor elaborator {APE}
`
`Cortes;minivan“ to: A. Di Nola
`C(‘Inl'l'act/grant sponsors: Elite per le Nuove 'l'ecnologie,
`l’Energia e |'.-3inihiente; Alenia Spazio
`
`Journal of Computational Chemistry. Vol. 19, No. 7, 685—694 (1998}
`'9. 1993 John Wiley 3, Sons. Inc.
`-
`_
`
`000 0192-8651 198i070685~10
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 685
`Petitioner Microsoft Corporation - EX. 1012, p. 685
`
`
`
`HOCCATANO ET AL.
`
`int reduction
`
`C lassical molecular dynamics (MD) is used to
`
`study the properties of liquids, solids, and
`molecules.L2 The Net-yton equation of motion for
`each particle of the system is solved by numerical
`integration and its trajectory is obtained. From this
`microscopic point of view, many microscopic and
`macroscopic properties can be obtained. The need
`for numerical
`integration limits the time step to
`the femtosecond scale and makes MD simulation a
`very time consuming task. Therefore, considerable
`efforts have been concentrated on optimizing MD
`codes on parallel computers of different architecw
`tures.
`
`Parallel computers are frequently described as
`belonging to one of two types: single instruction
`stream multiple data stream (SIMD), or multiple
`instruction stream multiple data stream (MIMD).
`in general, SIMD machines have a simpler archi-
`tecture, but
`they have hardware.
`limitations be—
`cause the same instruction is executed in parallel
`on every SIMD processor and, furthermore, some
`‘51th machines do not have local addressing; that
`is, the processors are not able to access their own
`memory using different local addresses. In recent
`years, several MD codes have been implemented
`on MIMD architectures with a few dozen of
`processors“; and, more recently, also on 100- to
`lUOO—processor MIMD machines.""S Parallel imple~
`mentations of biological MD programs such as
`CHARMM“ and GROMOSI" on MIMD machines
`have been discussed in the lite1'ature.”'”"3
`Less work has been done using SIMD
`systems.”"7 In general, they make use of the full
`connectivity computation;
`that
`is, all atom pair
`interactions are calculated, and are useful
`for
`long-range force systems. This is due to the diffi-
`culty of using pair lists of nonbonded atoms on
`SIMD machines with no local addressing.
`In the present study we propose an algorithm
`that permits the use of pair lists in a MD code for a
`SIMD machine with no local addressing. The algo—
`rithm requires simultaneous use of multiple time
`step”) and geometric decomposition” methods. In
`addition,
`the systolic loop‘“ method is used to
`further reduce computation time.
`The method was tested on Quadrics com-
`puters,'”'31 a class of SIMD machines developed
`by INFN and Alenia Spazio, for configurations of
`32, 128, and 512 processors. Quadrics is the only
`
`massive parallel computer dueloiiaed with fully
`[European technology.
`..-\s
`the Europorlls'I’AL‘L‘
`project‘1 has shown,
`the scalability for a MD
`code on MIMD architecture, for complex systems
`such as a protein in solution, is generally satisfac-
`tory only up to 12—15 nodes.
`Moreover,
`there are interesting projects being
`undertaken on mixed architecture MlMDl/SlMD
`machines that could supply the computational
`power of a SlMD machine, together with the flexi-
`bility of a MIMD.
`It
`is therefore wortln-vhile to
`determine whether these machines are able to per-
`form such calculations.
`The following molecular systems have been
`used as tests:
`
`.
`
`.
`
`n
`
`a
`
`S‘Iffilt’il'l
`atoms).
`
`'i: Box of 1536 water molecules {4608
`
`System 2: Box with a BPTI (bovine pancreatic
`trypsin inhibitor) molecule and 2712 water
`molecules (8704 atoms).
`
`System 3: Box with a SOD {super-oxide dis-
`mutase) dimer and 42% water molecules
`(15,360 atoms).
`
`Sysfriii at: Box with a SOD (superoxide dis—
`mutase) dimer and 93—16 water molecules
`(30,720 atoms).
`
`It should be noted that system 4 is nearly the same
`as test case 13, used as the industrial benchmark in
`the framework of the EurosportZ/PACC project”
`(system 4 has 9346 water molecules whereas test
`case I3 has 9436 water molecules). The results
`show that the speed—up of the algorithm is compa-
`rable to those obtained with MIMD machines.
`
`
`
`Hardware
`
`We tested the method on a Quadrics machine,
`Alenia Spazio’s supercomputer derived from the
`APElOO (Array Processor Elaboratorl project, de—
`veloped by INFNJQ'll These computers are a fam-
`ily of massively parallel SIMD machines with im—
`plementations from 8
`to 2048 processors. The
`biggest implementation allows a peak computing
`power of 100 GFlops in single precision {32—bit
`processors).
`The processors are arranged in a three—dimen-
`siona]
`l3D) cubic mesh and can exchange data
`with the six neighboring nodes, with periodic
`boundaries. Each processor board contains eight
`processors (floating point units) with their own
`
`686
`
`
`
`
`VOL. 19. NO. 7
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 686
`Petitioner Microsoft Corporation - EX. 1012, p. 686
`———-———__________________________
`
`
`
`memory [4 megabytes). Up to lo boards can be put
`into a crate. Configurations with more than 128
`processors are made up connecting crates of 8 ::<
`—l X 4 ('l28) nodes.
`
`The Quadrics controller board contains one inte-
`ger CPU lZ~CPU), which controls the flux of the
`program and the integer arithmetic. The language
`used is called Tao, a Fortran—like high—level paral~
`lel language, which can be modified and extended
`through a dynamic ZZ parser. The Quadrics ma-
`chine is connected to a host computer (a Sun
`Spare-10 or -20). A host interface based on a HlPPl
`standard, which allows an l/O speed between the
`host and Quadrics of 20 MB/s, has recently been
`developed. The tests on the sequential machine
`have been run on a DEC—alpha 3000/3100 machine.
`Barone et al.22 compared the accuracy of Quadrics
`in the field of molecular dynamics with that of a
`com-‘entional computer to assess the limits of the
`single precision.
`
`—--—--—-—._..._.____,________
`
`Molecular Dynamics
`
`In a molecular dynamics simulation, the classi—
`cal equations of motion for the system of interest
`are integrated numerically by solving Newton's
`equations of motion:
`
`our]
`tin-E3— = —‘Tl.-lr’{ r1, rpm, r”)
`
`The solution gives the atomic positions and veloci-
`ties as a function of time. The knowledge of the
`trajectory of each atom permits study of the dy-
`namic or statistical properties of the system. The
`form of the interaction potential is complex and it
`includes energy terms that represent bonded and
`nonbonded (van der Waals and Coulombic) inter—
`actions:
`
`lfli'],l‘:,...,l‘l\il
`
`1
`1
`,
`1
`= Z —i<,,[ii—ii,_,r+ 2 —K,le—ri..]-
`bonds 2
`angles 2
`'l
`
`'l‘ E EKglf— gull
`improper
`d i hed rals
`
`+ E qul + cosine — 8}]
`cl Elledrals
`
`+ Z
`pairsli', f]
`
`Elsi-Ii
`
`
`”n-
`”a
`s- a
`I'll
`fl
`rt;
`1-
`'
`if
`ii
`' 7T6” in
`
`the bonded poten—
`The first four terms represent
`lial,
`ll,
`l',,, and K}, are the cIEQth'll-ljlll’lt'l
`lenelli, its
`reference value, and the bond stretchiru; force con--
`stant, respectively. H, H“, and l<,, are the actual
`bond angle,
`its
`reference value, and the angle
`bending force constant, respectively; 5, {5d, and K,
`are the actual
`improper dihedral angle,
`its refer-
`ence value, and the improper dihedral angle bend—
`ing force constant,- respectively (c, K,_,
`ll, and Bi
`are the actual dihedral angle, its force constant, the
`multiplicity, and the phase, respectively. The last
`term in the equation includes the nonlmndecl, van
`der Waals, and Coulombic terms. a” and o},- are
`the dispersion well depth and the Lennard—Jones
`distance,
`i},- and ii,- are the electrostatic atomic
`charges, r“-
`is the distance between them, and i:
`is
`the dielectric constant.
`The time step used for the numerical integration
`is in the femtosecond scale. The highest frequency
`of bond Vibrations would require a time step 3 0.5
`is; however, it the simulation is performed with
`constant bond lengths, the time step can be 3‘ 2 is.
`For this reason, many MD codes perform simula-
`tions with constant bond lengths.
`The most frequently used algorithm to perform
`MD simulation at constant bond lengths is
`the
`SHAKE algorithm based
`on
`an
`iterative
`procedure.33
`
`——-—-————___.________
`
`Computational .-\Ig0rillun for
`Nonlmmled interactions
`
`In a MD program, the calculation of the non-
`bonded forces is the most time-consuming task—in
`fact, it takes about 900? of the computational time,
`depending on the protocol used.
`One of the most frequently used techniques to
`reduce the number of nonbonded forces is the
`cut-oil" criterion. With this method the interactions
`between atoms beyond a cut-off distance are ne-
`glected. If the cut-off radius is appropriate the lost
`energy contribution to the global potential is small.
`During a small number of steps the pairs of inter-
`acting atoms are considered to remain the same so
`that it is possible to create a list of these interac-
`tions, the nonboncled pair list, which will be up-v
`dated every ii steps in is generally equal to 10).
`The number of nonbonded interactions is NlN —
`ll/2 (N is the number of atoms), so that
`it
`is
`proportional to N1. The use of the cut-off criterion
`reduces this number to icN (k is a constant}.
`
` JOURNAL OF COMPUTATIONAL CHEMlSTFlY
`687
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 687
`Petitioner Microsoft Corporation - EX. 1012, p. 687
`
`
`
`FiOCCATANO ET AL.
`
`the cut—off criterion is not di-
`Unfortunately,
`rectly applicable to SIMD architecture, as the same
`instruction is executed at each time on each pro-
`cessor and, consequently, it is not possible to have
`a local branch (the cut-off condition) in the pro—
`gram flow. Morem'er, on Quadrics it is not possi-
`ble to have a local pair list on each node because
`local addressing of arrays is not possible. This
`explains the lower level of efficiency of a MD code
`on a SIMD machine with respect to a MIMD one.
`We have recently developed an algorithm,
`the
`global cut—off algorithm (CGA), based on a proba-
`bilistic approach, which allows the use of the cut—
`off condition on a SIMD machine with no local
`addressing.
`Because the calculation of bonded interactions
`and the integration of the trajectory take a small
`amount of the total calculation time, in the present
`version we have chosen to carry them out on the
`front-end computer and to perform only the calcu-
`lations of the pair list and of the nonbonded forces
`using the SiMD machine. It is, of course, possible
`to perform all force calculations and integration in
`the parallel machine using, for instance, the repli-
`cated data method.“
`
`GI'IUMI'I'I‘RIC DECO-.‘tll’llSI'l‘ION
`
`The assignment of the atoms to the nodes is
`obtained by a dynamically geometric decomposi~
`tion” in such a way that
`the same number of
`atoms is assigned to each node. In what folltn-vs,
`We discuss a decomposition for a bidimensional
`case; the extension to a third dimension is straight-
`forward: given the bidimensional box of Figure la
`and a 2D parallel topology of ii = a, X it” proces~
`sors, with it‘. = il,, == 2,
`the box is first divided
`into Jr, boxes along the .r—axis, as shown in Figure
`1b, each containing the same number of atoms.
`Each box is successively divided into ll” boxes
`along the jraxis in such a way that each one of the
`H". X n” boxes contains the same number of atoms
`(Fig. 1c}. When, as in a real case, a third dimension
`exists, a successive division along the :—axis has to
`be performed.
`It is obvious that, before performing any divi—
`sion along a given axis, it is necessary to sort the
`atoms of each box along that axis.
`The density of a molecular system, such as a
`protein,
`is not uniform;
`thus,
`the boxes do not
`have the same axis lengths. However, these differ-
`ences do not significantly reduce the efficiency of
`the GCA described in what follows.
`
`('1
`
`D
`
`FIGURE 1. Domain decomposition of the molecular
`system in boxes with the same number of atoms, for a
`bicllmensional case.
`
`Fit} "I‘D ”(J LOOP lll‘i'l‘l l0!)
`
`Quadrics topology makes it possible to use
`a systolic loop to calculate the nonbonded inter—
`actions between the atoms assigned to the dif-
`ferent nodes. The systolic loop method is one of
`the most efficient algorithms for calculation of
`ti-vo-body
`interactions
`on MIMD and ESIMD
`machines.”- 1'" “'3'; The systolic loop algorithm
`passes the coordinates of all atoms around a ring
`of P processors in [3/2 steps, such that half of the
`coordinates passes every processor exactly once
`(transient atoms). Each node also stores the coordi—
`nates of a group of atoms of the overall system
`(resident atoms}. During the systolic cycle each
`processor evaluates and accumulates the interac—
`tions of the resident atoms with the transient ones.
`Only half of the atoms have to pass in each com-
`putational node as a consequence of the reciprocity
`of the interactions.
`
`The systolic loop path for a 32-node Quadrics
`machine is shown in Figure 2. This machine has
`two nodes along the y and : directions and eight
`along the .r direction.
`The geometric decomposition of the system per—
`mits limitation of the search for nonbonded inter—
`actions only to the neighboring processors nearer
`than the cut-off radius, so that, depending on the
`number of nodes and on the system size,
`it
`is
`generally not necessary to perform the complete
`systolic loop. The computed forces are passed back
`to the m-vning processor to accumulate the full
`force.
`
`liliflllui (IVY-0P1" ALGORI'I‘IHI
`
`On a SlMD machine, all nodes simultaneously
`evaluate an interaction, but the atom pairs in each
`node are different. Figure 3 shows, as an example,
`
` 638
`VOL. 19, NO. 7
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 688
`Petitioner Microsoft Corporation - EX. 1012, p. 688
`-—-—————_____________________
`
`
`
`
`
`E
`
`
`\ RiGHT (x)
`\\
`robmm
`FIGURE 2- SYStOHC 100i? path for node {Oporol 0* a
`32'”°d9 Wadi?“ maChine- The ‘rflnsjent groups of
`atoms visit only four neighboring y—z pianes, based on
`‘
`Newton's third law,
`
`the case with four nodes: suppose that each node
`is evaluating the interaction in;
`in this case, all a,
`interactions fall within the cut-off radius. When
`the interactions are of the ii, type all the distances
`fall outside the cutoff radius and the interactions
`it, are skipped. in the case of interactions of type. c,-
`the interaction is outside the cutoff radius in nodes
`'1, 2, and 3, but it is inside the cut—off radius in
`
`node i, so that all nodes have to calculate this
`interaction and oni_\' will be sated in the forays
`calculation. If the atoms in each node are ordered
`ranclon'ily,
`the interactions of type c,
`result
`in
`being the most frequent.
`The. basis idea of the global cut-off algorithm
`(GCAJ is to maximize the occurrence of interac-
`tions of type a, and i1,- and, conversely, reduce the
`occurrence of interactions of type t',. To this end, it
`is necessary that the atoms in all nodes are sorted
`with the same criterion. Different types of sorting
`give comparable results. We have chosen the one
`shot-en in Figure 4. After this sorting procedure, a
`list of the interactions of type a, and rI
`is created
`in the integer CPU (Z~CPU) of the SIMD machine.
`This l'st
`is e Liv'al’nt, but not iientical,
`to the
`q I
`L
`L L
`I
`nonbonded pair list used in most MD programs
`and will be referred as the nniibomied pair iist.
`The ordering procedures for the domain decom-
`position and the sorting procedure previiaushr dcw
`scribed are time consuming and have to be per-
`formed on the host serial machine; however, as
`will be shown, they have to be performed every
`100 to 200 steps so that they do not significantly
`affect the global computation time.
`The global cutoff condition is based on a proba—
`bilistic approach, so that the number of pair inter-
`actions to be calculated is larger than the actual
`number of pairs included within the r,ut distance.
`Depending on the molecular system and on the
`number of nodes, the ratio between the number of
`the calculated interactions and the number of in-
`teractions actually included within the cut—oil" dis—
`
`
`
`
`
`
`
`Y
`
`X
`
`
`
`
`FIGURE 3. Different types of interactions in a case of
`four nodes. 5-,, all the interactions fall within the cutoff
`distance; b,, ail the interactions fall outside the cut-off
`distance; 0,, one interaction falls within the cut-oft.
`whereas three fall outside the cut-off. P.U.= processor
`unit.
`
`FIGURE 4. Sorting of atoms in each node for a
`bidimensionei case. The atoms are represented as full
`circles.
`
`
`
` JOURNAL OF COMPUTATIONAL CHEMISTRY 689
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 689
`Petitioner Microsoft Corporation - EX. 1012, p. 689
`—-———-—_____________________
`
`
`
`ROCCATANO ET AL.
`
`tance varies from the three to five. in this sense,
`the GCA is not vet)" efficient. l—lt'1\\-'E\-‘el‘,
`it must he
`noted that almost all pair interactions within a
`distance rm, < r E, rm, + Ar, with Ar a 0.2 nm,
`are calculated. As an example, given rm,
`=—- 06 run,
`949}- of the interactions in the range 0.6 < r <_: 0.8
`nm are calculated and only 69?. of pairs in this
`range are lost. This suggests'adoption of a cut-
`off value of in“, to be used in the global cut—off
`condition, somewhat shorter than the actual cut—off,
`rm, desired for the simulation.
`in the previous
`example, with the». = 0.6 and rt,th = 0.8 ran,
`the
`number of pairs to be calculated is roughly two-
`to—threefolcl larger than the actual number of pairs
`within the run distance. The remaining 60? of
`interactions in the range 0.6 < r g 0.6 n1n have to
`be calculated separately.
`It
`is well known that nonbonded forces vary
`more slowly than the bonded ones. Moreover, non—
`bonded forces at large distances vary slower than
`nonbonded forces at short distances. This suggests
`updatingr of the forces at different steps, according
`to their nature (bonded and nonbonded) and to the
`distance of the interaction. The short—range in terac—
`tions can be evaluated every step, and long-range
`interactions every in steps. Accordingly,
`the few
`interactions in the range 0.6 c r g 0.8 run that
`i-vere lost using "cca = 0.6 nm can be updated
`every in steps. As these interactions are calculated
`while evaluating the ironlJemimi pair list
`ti.e., up-
`dated every n steps}, we have chosen in = u = 10.
`It must be noted that there are now two shells:
`an inner shell (r g 0.6 um) and an outer shell
`(0.6 < r s 0.8 um). All interactions are evaluated
`every in steps, whereas only those interactions
`corresponding to the inner shell are evaluated ev—
`ery step. It is therefore not necessary to have a skin
`distance and to store a list of atom pairs outside
`the outer shell.
`
`t)» seen that most illit’l'm'.
`in the present study it
`tions in the range 0.h to 0.8 nm are a.:-valuatcd every
`step and only a few of them are evaluah-irl every in
`steps. According to all of the MTS algorithms, this
`choice does not affect the numerical accuracy; in
`fact, the same accuracy is obtained when an inter—
`action, within the outer shell,
`is evaluated every
`short time step or e\-'er_v long time step. i-iou’ever,
`as every long time step all interactions within the
`outer shell are evaluated, it is possible to perform
`the MTS according to the classical procedure; that
`is, by collection all
`the interactions within the
`outer shell at every long time step and collecting
`only the interactions within the inner shell at every
`short
`time step. Among several algorithms pro—
`posed for the multiple time step (NtTSlLi5‘3l‘ we
`have chosen the one developed by Scully and
`I-lermans.“
`
`it must be noted that all nonbonded interactions
`(van der Waals and Coulombic) between bonded
`and nonbonded atoms are calculated in this step,
`but the interactions between bonded atoms are not
`saved. This is obtained by attaching to each atom a
`list of the atoms bonded to itself. This procedure is
`certainly not efficient, but
`the time required to
`perform it is negligible. In the following tests, the
`values of foot and rm, are fixed at 0.6 and 0.8 nm,
`respectively.
`
`Results
`
`Table I shows the number of interactions within
`the cut-off radius rm, compared with the number
`of interactions to be evaluated with the GCA. It
`can be noted that the number of interactions for
`calculation is two to three times the actual number
`of interactions within the cut-off radius. The time
`
`TABLE I.
` Number of Actual Interactions, N, within a Cut-0ft Radius (rcul) = 0.8 nm Compared with the Number of
`Interactions Calculated Using GCA on 32-, 128- and 512-Node Ouadrios Machines.
`
`N
`32 nodes
`128 nodes
`512 nodes
`
`System i
`873,984
`840,320
`r06.944
`488,847
`(4608 atoms)
`System 2
`2,070,016
`1.810.944
`1 1312.864
`891.252
`(8704 atoms)
`System 3
`4,452,864
`3,768,576
`3,153,344
`1.48?,054
`{15.360 atoms)
`System 4
`9,938,944
`8,249,218
`8,754,240
`3,129,972
`{30,720 atoms)
`
`
`
`690
`VOL. 19, NO. 7
`
`Petitioner Microsoft Corporation - Ex. 1012, p. 690
`Petitioner Microsoft Corporation - EX. 1012, p. 690
`
`
`
`required for performing the MD simulation with
`the present code is the sum of the followingr steps:
`
`[per—
`
`1. Ordering procedure (performed everv it
`steps by the host computer}.
`2. Calculation of the “outmoded pair list
`formed everv H steps by Quadrics).
`3. Calculation of the nonbonded forces {per—
`formed every step by Quadricsl.
`3a. Calculation of the bonded forces by the host
`computer while performing step 3.
`4. I/O host H Quadrics.
`31'!
`SHAKE and integration {performed by the
`host).
`
`The ordering procedure is a time-consuming
`task and, due to the diffusion of the system, it has
`to be periodically repeated every It steps. If no
`reordering is performed the iroirlioiitled pair list will
`include an increasing number of interactions, thus
`increasing the computational time. Figure 5 shows
`
`0
`
`20
`
`R N
`
`umberofinteractions:4T0" 53'
`
`-
`40
`
`—~
`50
`
`.
`
`-
`100
`
`so
`
`120
`
`140
`
`—~
`10