I, Rachel J. Watters, am a librarian, and the Director of Wisconsin TechSearch ("WTS"), located at 728 State Street, Madison, Wisconsin, 53706. WTS is an interlibrary loan department at the University of Wisconsin-Madison. I have worked as a librarian at the University of Wisconsin library system since 1998. I have been employed at WTS since 2002, first as a librarian and, beginning in 2011, as the Director. Through the course of my employment, I have become well informed about the operations of the University of Wisconsin library system, which follows standard library practices.
This Declaration relates to the dates of receipt and availability of the following:

Roccatano, D., Bizzarri, R., Chillemi, G., Sanna, N., & Di Nola, A. (1998). Development of a parallel molecular dynamics code on SIMD computers: Algorithm for use of pair list criterion. Journal of Computational Chemistry, 19(7), 685-694.
Standard operating procedures for materials at the University of Wisconsin-Madison Libraries. When a volume was received by the Library, it would be checked in, stamped with the date of receipt, added to library holdings records, and made available to readers as soon after its arrival as possible. The procedure normally took a few days or at most 2 to 3 weeks.
Exhibit A to this Declaration is a true and accurate copy of the journal cover with library date stamp of Journal of Computational Chemistry (1998), from the University of Wisconsin-Madison Library collection. Exhibit A also includes an excerpt of pages 685 to 694 of that volume, showing the article entitled Development of a parallel molecular dynamics code on SIMD computers: Algorithm for use of pair list criterion (1998).

Petitioner Microsoft Corporation - Ex. 1067, p. 1

Declaration of Rachel J. Watters on Authentication of Publication
Based on this information, the date stamp on the journal cover page indicates that Development of a parallel molecular dynamics code on SIMD computers: Algorithm for use of pair list criterion (1998) was received by University of Wisconsin-Madison Libraries on April 27, 1998.

Based on the information in Exhibit A, it is clear that the volume was received by the library on or before April 27, 1998, and was catalogued and available to library patrons within a few days or at most 2 to 3 weeks after April 27, 1998.
I declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code.

Date: August 7, 2018
Rachel J. Watters
Wisconsin TechSearch
Director
Memorial Library
728 State Street
Madison, Wisconsin 53706
Petitioner Microsoft Corporation - Ex. 1067, p. 2
EXHIBIT A
Journal of Computational Chemistry

ORGANIC / INORGANIC / PHYSICAL / BIOLOGICAL

Editors:
Norman L. Allinger
Paul von R. Schleyer

Petitioner Microsoft Corporation - Ex. 1067, p. 3
Journal of Computational Chemistry

ORGANIC / INORGANIC / PHYSICAL / BIOLOGICAL

Editors:

Norman L. Allinger
Department of Chemistry
Computational Center for Molecular Structure and Design
University of Georgia
Athens, GA 30602-2526

Paul von R. Schleyer
Computer Chemistry Center
Institut für Organische Chemie
Universität Erlangen-Nürnberg
Henkestrasse 42, D-91054 Erlangen, Germany

Associate Editor:

Gernot Frenking
Fachbereich Chemie
Philipps-Universität Marburg
Hans-Meerwein-Strasse
D-35032 Marburg, Germany

Assistant Editors:

Martin Feigel
Institut für Organische Chemie
Universität Bochum
Universitätsstr. 150
D-44780 Bochum, Germany

Enrico Clementi
Department of Chemistry
University L. Pasteur, 3 rue de l'Université
67084 Strasbourg, France

Editorial Advisory Board:

Roger S. Grev
Department of Chemistry
University of Kentucky
Lexington, KY 40506

Warren J. Hehre
Wavefunction, 18401 Von Karman, Suite 370
Irvine, CA 92715

William L. Jorgensen
Department of Chemistry
Yale University
P.O. Box 208107
New Haven, CT 06520-8107

Martin Karplus
Department of Chemistry
Harvard University
12 Oxford Street
Cambridge, MA 02138

Peter A. Kollman
Department of Pharmaceutical Chemistry
School of Pharmacy
University of California
San Francisco, CA 94143

Paul G. Mezey
Department of Chemistry
University of Saskatchewan
Saskatoon, Canada S7N 0W0

Keiji Morokuma
Department of Chemistry
Emory University
Atlanta, GA 30322

Eiji Osawa
Department of Knowledge-Based Information Engineering
Toyohashi University of Technology
Tempaku-cho, Toyohashi 441, Japan

John A. Pople
Department of Chemistry
Northwestern University
2145 Sheridan Road
Wilmette, IL 60208

Peter Pulay
Department of Chemistry
University of Arkansas
Fayetteville, AR 72701

Leo Radom
Research School of Chemistry
The Australian National University
Canberra, A.C.T. 0200, Australia

Harold A. Scheraga
Baker Laboratory of Chemistry
Cornell University
Ithaca, NY 14853-1301

Andrew Streitwieser, Jr.
Department of Chemistry
University of California, Berkeley
Berkeley, CA 94720

Kenneth B. Wiberg
Department of Chemistry
Yale University
225 Prospect Street
New Haven, CT 06520

Michael C. Zerner
Department of Chemistry
University of Florida
Gainesville, FL 32611
The Journal of Computational Chemistry (ISSN 0192-8651) is published 16 times a year, monthly and semi-monthly in January, April, July, and November, one volume per year, by John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158.

Copyright © 1998 John Wiley & Sons, Inc. All rights reserved. No part of this publication may be reproduced in any form or by any means, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the publisher or authorization through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923; tel: 508-750-8400, fax: 508-750-4470. The code and the copyright notice in the journal indicate the copyright holder's consent that copies may be made for personal or internal use, or for the personal or internal use of specific clients, on the condition that the copier pay for copying beyond that permitted by Sections 107 and 108 of the United States Copyright Law. The per-copy fee for each item is to be paid through the Copyright Clearance Center, Inc. This consent does not extend to other kinds of copying, such as copying for general distribution, for advertising or promotional purposes, for creating new collective works, or for resale. Such permission requests and other permission inquiries should be addressed to the Permissions Dept.

Periodicals postage paid at New York, NY, and at additional mailing offices. Postmaster: Send address changes to Journal of Computational Chemistry, Subscription Fulfillment and Distribution, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158. For subscription inquiries, please call customer service at 212-850-6645 or write to the above address.

Subscription price (Volume 19, 1998): $1,190.00 in US, $1,350.00 in Canada and Mexico, $1,505.00 outside North America; $150.00 US ACS member, $246.00 outside US ACS member. Personal rate: $150.00 US, $246.00 outside North America. Subscriptions at the personal rate are available only to individuals. All subscriptions outside US will be sent by air. Payment must be made in US dollars drawn on a US bank. Claims for undelivered copies will be accepted only after the following issue has been received. Please enclose a copy of the mailing label. Missing copies will be supplied when losses have been sustained in transit and where reserve stock permits. Please allow four weeks for processing a change of address.

Inquiries concerning advertising should be forwarded to Advertising Sales, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158. Advertising Sales, European Contacts: Bob Kern or Nicky Douglas, John Wiley & Sons Ltd., Baffins Lane, Chichester, West Sussex PO19 1UD, England; email: adsales@wiley.co.uk. Reprint sales and inquiries should be directed to the customer service department, John Wiley & Sons, Inc., 605 Third Ave., New York, NY 10158.

Manuscripts should be submitted to one of the editors: Dr. Norman L. Allinger, Journal of Computational Chemistry, Department of Chemistry, The University of Georgia, Athens, GA 30602; Prof. Dr. Paul von R. Schleyer, Journal of Computational Chemistry, Computer Chemistry Center, Institut für Organische Chemie, Universität Erlangen-Nürnberg, Henkestr. 42, D-91054 Erlangen, Germany; or Prof. Gernot Frenking, Journal of Computational Chemistry, Fachbereich Chemie, Philipps-Universität Marburg, Hans-Meerwein-Strasse, D-35032 Marburg, Germany. Information for contributors appears in the first and last issue of each volume. All other correspondence should be addressed to Journal of Computational Chemistry, Publisher, Interscience Division, Professional, Reference, and Trade Group, John Wiley & Sons, Inc., 605 Third Avenue, New York, NY 10158.

The contents of this journal are indexed in the following: Chemical Abstracts, Chemical Titles, Current Contents/Physical, Chemical, & Earth Sciences, Mathematical Reviews, Reference Update, Research Alert (ISI), Science Citation Index (ISI), and SCISEARCH Database (ISI).

This paper meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).

Petitioner Microsoft Corporation - Ex. 1067, p. 4
Development of a Parallel Molecular Dynamics Code on SIMD Computers: Algorithm for Use of Pair List Criterion

D. ROCCATANO,1 R. BIZZARRI,1,3 G. CHILLEMI,2 N. SANNA,2 A. DI NOLA1

1Dipartimento di Chimica, Università "La Sapienza," P.le Aldo Moro 5, 00185 Rome, Italy
2CASPUR, Consorzio per le Applicazioni di Supercalcolo per Università e Ricerca, c/o Università "La Sapienza," Rome, Italy
3Dipartimento di Fisica, Università "La Sapienza," Rome, Italy

Received 24 April 1996; accepted 24 October 1997
ABSTRACT: In recent years several implementations of molecular dynamics (MD) codes have been reported on multiple instruction multiple data (MIMD) machines. However, very few implementations of MD codes on single instruction multiple data (SIMD) machines have been reported. The difficulty in using pair lists of nonbonded interactions is the major problem with MD codes for SIMD machines, such that, generally, the full connectivity computation has been used. We present an algorithm, the global cut-off algorithm (GCA), which permits the use of pair lists on SIMD machines. GCA is based on a probabilistic approach and requires the cut-off condition to be simultaneously verified on all nodes of the machine. The MD code used was taken from the GROMOS package; only the routines involved in the pair lists and in the computation of nonbonded interactions were rewritten for a parallel architecture. The remaining calculations were performed on the host computer. The algorithm has been tested on Quadrics computers for configurations of 32, 128, and 512 processors and for systems of 4000, 8000, 15,000, and 30,000 particles. Quadrics was developed by Istituto Nazionale di Fisica Nucleare (INFN) and marketed by Alenia Spazio. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 685-694, 1998

Keywords: molecular dynamics; domain decomposition algorithm; parallel computers; pair list for molecular dynamics code on SIMD machines; array processor elaborator (APE)
Correspondence to: A. Di Nola
Contract/grant sponsors: Ente per le Nuove Tecnologie, l'Energia e l'Ambiente; Alenia Spazio

Journal of Computational Chemistry, Vol. 19, No. 7, 685-694 (1998)
© 1998 John Wiley & Sons, Inc. CCC 0192-8651/98/070685-10

Petitioner Microsoft Corporation - Ex. 1067, p. 5

ROCCATANO ET AL.
Introduction

Classical molecular dynamics (MD) is used to study the properties of liquids, solids, and molecules.1,2 The Newton equation of motion for each particle of the system is solved by numerical integration and its trajectory is obtained. From this microscopic point of view, many microscopic and macroscopic properties can be obtained. The need for numerical integration limits the time step to the femtosecond scale and makes MD simulation a very time-consuming task. Therefore, considerable efforts have been concentrated on optimizing MD codes on parallel computers of different architectures.
Parallel computers are frequently described as belonging to one of two types: single instruction stream multiple data stream (SIMD), or multiple instruction stream multiple data stream (MIMD). In general, SIMD machines have a simpler architecture, but they have hardware limitations because the same instruction is executed in parallel on every SIMD processor and, furthermore, some SIMD machines do not have local addressing; that is, the processors are not able to access their own memory using different local addresses. In recent years, several MD codes have been implemented on MIMD architectures with a few dozen processors and, more recently, also on 100- to 1000-processor MIMD machines. Parallel implementations of biological MD programs such as CHARMM9 and GROMOS10 on MIMD machines have been discussed in the literature.

Less work has been done using SIMD systems. In general, they make use of the full connectivity computation; that is, all atom pair interactions are calculated, and they are useful for long-range force systems. This is due to the difficulty of using pair lists of nonbonded atoms on SIMD machines with no local addressing.

In the present study we propose an algorithm that permits the use of pair lists in a MD code for a SIMD machine with no local addressing. The algorithm requires simultaneous use of multiple time step and geometric decomposition methods. In addition, the systolic loop method is used to further reduce computation time.

The method was tested on Quadrics computers,19-21 a class of SIMD machines developed by INFN and Alenia Spazio, for configurations of 32, 128, and 512 processors. Quadrics is the only massively parallel computer developed with fully European technology. As the Europort2/PACC project has shown, the scalability for a MD code on MIMD architecture, for complex systems such as a protein in solution, is generally satisfactory only up to 12-16 nodes.

Moreover, there are interesting projects being undertaken on mixed-architecture MIMD/SIMD machines that could supply the computational power of a SIMD machine, together with the flexibility of a MIMD. It is therefore worthwhile to determine whether these machines are able to perform such calculations.
The following molecular systems have been used as tests:

- System 1: Box of 1536 water molecules (4608 atoms).
- System 2: Box with a BPTI (bovine pancreatic trypsin inhibitor) molecule and 2712 water molecules (8704 atoms).
- System 3: Box with a SOD (superoxide dismutase) dimer and 4226 water molecules (15,360 atoms).
- System 4: Box with a SOD (superoxide dismutase) dimer and 9346 water molecules (30,720 atoms).
It should be noted that system 4 is nearly the same as test case I3, used as the industrial benchmark in the framework of the Europort2/PACC project (system 4 has 9346 water molecules whereas test case I3 has 9436 water molecules). The results show that the speed-up of the algorithm is comparable to those obtained with MIMD machines.
Hardware

We tested the method on a Quadrics machine, Alenia Spazio's supercomputer derived from the APE100 (Array Processor Elaborator) project, developed by INFN.19-21 These computers are a family of massively parallel SIMD machines with implementations from 8 to 2048 processors. The biggest implementation allows a peak computing power of 100 GFlops in single precision (32-bit processors).

The processors are arranged in a three-dimensional (3D) cubic mesh and can exchange data with the six neighboring nodes, with periodic boundaries. Each processor board contains eight processors (floating point units) with their own

Petitioner Microsoft Corporation - Ex. 1067, p. 6
memory (4 megabytes). Up to 16 boards can be put into a crate. Configurations with more than 128 processors are made up connecting crates of 8 x 4 x 4 (128) nodes.

The Quadrics controller board contains one integer CPU (Z-CPU), which controls the flux of the program and the integer arithmetic. The language used is called Tao, a Fortran-like high-level parallel language, which can be modified and extended through a dynamic ZZ parser. The Quadrics machine is connected to a host computer (a Sun Sparc-10 or -20). A host interface based on a HIPPI standard, which allows an I/O speed between the host and Quadrics of 20 MB/s, has recently been developed. The tests on the sequential machine have been run on a DEC-alpha 3000/500 machine. Barone et al.22 compared the accuracy of Quadrics in the field of molecular dynamics with that of a conventional computer to assess the limits of the single precision.
Molecular Dynamics

In a molecular dynamics simulation, the classical equations of motion for the system of interest are integrated numerically by solving Newton's equations of motion:

    m_i (d^2 r_i / dt^2) = -∇_i V(r_1, r_2, ..., r_N)
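As an illustration of this numerical integration, one velocity-Verlet step for a single one-dimensional particle can be sketched as follows; the integrator, function name, and units are chosen for the example only (this excerpt does not say which integrator the GROMOS code uses):

```python
def velocity_verlet_step(x, v, f, m, dt, force):
    """Advance position and velocity by one velocity-Verlet step.
    force(x) returns the force at a trial position; x, v, f are the
    current position, velocity, and force of a single 1D particle."""
    a = f / m
    x_new = x + v * dt + 0.5 * a * dt * dt   # drift using the current force
    f_new = force(x_new)                     # force at the new position
    v_new = v + 0.5 * (a + f_new / m) * dt   # kick with the averaged force
    return x_new, v_new, f_new
```

For a smooth force this step conserves energy to second order in dt, which is why femtosecond-scale steps suffice in practice.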
The solution gives the atomic positions and velocities as a function of time. The knowledge of the trajectory of each atom permits study of the dynamic or statistical properties of the system. The form of the interaction potential is complex and it includes energy terms that represent bonded and nonbonded (van der Waals and Coulombic) interactions:

    V(r_1, r_2, ..., r_N)
        = Σ_bonds (1/2) K_b (b - b_0)^2
        + Σ_angles (1/2) K_θ (θ - θ_0)^2
        + Σ_improper dihedrals (1/2) K_ξ (ξ - ξ_0)^2
        + Σ_dihedrals K_φ [1 + cos(nφ - δ)]
        + Σ_pairs(i,j) { 4 ε_ij [ (σ_ij / r_ij)^12 - (σ_ij / r_ij)^6 ] + q_i q_j / (4 π ε r_ij) }

PARALLEL MOLECULAR DYNAMICS CODE
The first four terms represent the bonded potential. b, b_0, and K_b are the actual bond length, its reference value, and the bond stretching force constant, respectively. θ, θ_0, and K_θ are the actual bond angle, its reference value, and the angle bending force constant, respectively. ξ, ξ_0, and K_ξ are the actual improper dihedral angle, its reference value, and the improper dihedral angle bending force constant, respectively. φ, K_φ, n, and δ are the actual dihedral angle, its force constant, the multiplicity, and the phase, respectively. The last term in the equation includes the nonbonded, van der Waals, and Coulombic terms. ε_ij and σ_ij are the dispersion well depth and the Lennard-Jones distance, q_i and q_j are the electrostatic atomic charges, r_ij is the distance between them, and ε is the dielectric constant.
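The last (nonbonded) term can be evaluated for a single pair as below; the function name and the reduced units are illustrative, not taken from GROMOS:

```python
import math

def nonbonded_pair_energy(r_ij, eps_ij, sigma_ij, q_i, q_j, eps_diel=1.0):
    """Lennard-Jones plus Coulomb energy for one atom pair, i.e. the
    pairs(i,j) term of the potential above:
    4*eps_ij*[(sigma/r)^12 - (sigma/r)^6] + q_i*q_j / (4*pi*eps*r)."""
    sr6 = (sigma_ij / r_ij) ** 6
    lj = 4.0 * eps_ij * (sr6 * sr6 - sr6)
    coulomb = q_i * q_j / (4.0 * math.pi * eps_diel * r_ij)
    return lj + coulomb
```

At r_ij = sigma_ij the Lennard-Jones part vanishes, and for neutral pairs the minimum of -eps_ij lies at r_ij = 2^(1/6) sigma_ij.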
The time step used for the numerical integration is in the femtosecond scale. The highest frequency of bond vibrations would require a time step ≤ 0.5 fs; however, if the simulation is performed with constant bond lengths, the time step can be ≈ 2 fs. For this reason, many MD codes perform simulations with constant bond lengths.

The most frequently used algorithm to perform MD simulation at constant bond lengths is the SHAKE algorithm, based on an iterative procedure.23
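For a single bond, one form of this iteration can be sketched as follows; this is a minimal SHAKE-style illustration with invented names and tolerances, not the GROMOS routine:

```python
def shake_bond(r1, r2, d0, invm1=1.0, invm2=1.0, tol=1e-8, max_iter=50):
    """Iteratively displace two atoms along their bond vector until
    |r1 - r2| equals the reference bond length d0."""
    r1, r2 = list(r1), list(r2)
    for _ in range(max_iter):
        d = [a - b for a, b in zip(r1, r2)]
        dist2 = sum(x * x for x in d)
        diff = dist2 - d0 * d0
        if abs(diff) < tol:
            break
        # Correction shared between the atoms according to inverse mass
        g = diff / (2.0 * dist2 * (invm1 + invm2))
        r1 = [a - g * invm1 * x for a, x in zip(r1, d)]
        r2 = [b + g * invm2 * x for b, x in zip(r2, d)]
    return r1, r2
```

In a real simulation the loop runs over all bond constraints repeatedly until every one converges within the tolerance.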
Computational Algorithm for Nonbonded Interactions

In a MD program, the calculation of the nonbonded forces is the most time-consuming task; in fact, it takes about 90% of the computational time, depending on the protocol used.

One of the most frequently used techniques to reduce the number of nonbonded forces is the cut-off criterion. With this method the interactions between atoms beyond a cut-off distance are neglected. If the cut-off radius is appropriate, the lost energy contribution to the global potential is small. During a small number of steps the pairs of interacting atoms are considered to remain the same, so that it is possible to create a list of these interactions, the nonbonded pair list, which will be updated every n steps (n is generally equal to 10). The number of nonbonded interactions is N(N - 1)/2 (N is the number of atoms), so that it is proportional to N^2. The use of the cut-off criterion reduces this number to kN (k is a constant).
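A serial sketch of this pair-list construction (an O(N^2) scan kept deliberately simple; the names are illustrative):

```python
def build_pair_list(coords, r_cut):
    """Return all atom pairs (i, j), i < j, separated by less than r_cut.
    In an MD code this list is rebuilt only every n steps (n ~ 10),
    so the O(N^2) distance scan is amortized over many force evaluations."""
    rc2 = r_cut * r_cut
    pairs = []
    for i in range(len(coords)):
        xi, yi, zi = coords[i]
        for j in range(i + 1, len(coords)):
            dx = xi - coords[j][0]
            dy = yi - coords[j][1]
            dz = zi - coords[j][2]
            if dx * dx + dy * dy + dz * dz < rc2:
                pairs.append((i, j))
    return pairs
```

Comparing squared distances against rc2 avoids a square root per pair, a common micro-optimization in pair-list code.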
JOURNAL OF COMPUTATIONAL CHEMISTRY

Petitioner Microsoft Corporation - Ex. 1067, p. 7
Unfortunately, the cut-off criterion is not directly applicable to SIMD architecture, as the same instruction is executed at each time on each processor and, consequently, it is not possible to have a local branch (the cut-off condition) in the program flow. Moreover, on Quadrics it is not possible to have a local pair list on each node because local addressing of arrays is not possible. This explains the lower level of efficiency of a MD code on a SIMD machine with respect to a MIMD one. We have recently developed an algorithm, the global cut-off algorithm (GCA), based on a probabilistic approach, which allows the use of the cut-off condition on a SIMD machine with no local addressing.

Because the calculation of bonded interactions and the integration of the trajectory take a small amount of the total calculation time, in the present version we have chosen to carry them out on the front-end computer and to perform only the calculations of the pair list and of the nonbonded forces using the SIMD machine. It is, of course, possible to perform all force calculations and integration in the parallel machine using, for instance, the replicated data method.
GEOMETRIC DECOMPOSITION

The assignment of the atoms to the nodes is obtained by a dynamically geometric decomposition13 in such a way that the same number of atoms is assigned to each node. In what follows, we discuss a decomposition for a bidimensional case; the extension to a third dimension is straightforward: given the bidimensional box of Figure 1a and a 2D parallel topology of n = n_x × n_y processors, with n_x = n_y = 2, the box is first divided into n_x boxes along the x-axis, as shown in Figure 1b, each containing the same number of atoms. Each box is successively divided into n_y boxes along the y-axis in such a way that each one of the n_x × n_y boxes contains the same number of atoms (Fig. 1c). When, as in a real case, a third dimension exists, a successive division along the z-axis has to be performed.

It is obvious that, before performing any division along a given axis, it is necessary to sort the atoms of each box along that axis.

The density of a molecular system, such as a protein, is not uniform; thus, the boxes do not have the same axis lengths. However, these differences do not significantly reduce the efficiency of the GCA described in what follows.
FIGURE 1. Domain decomposition of the molecular system in boxes with the same number of atoms, for a bidimensional case.
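A two-dimensional sketch of this equal-population division (assuming the atom count divides evenly by nx*ny; function and variable names are illustrative):

```python
def decompose(atoms, nx, ny):
    """Assign the same number of atoms to each of nx*ny nodes:
    sort along x, cut into nx strips of equal population, then sort
    each strip along y and cut it into ny equally populated boxes."""
    atoms = sorted(atoms, key=lambda p: p[0])        # sort all atoms along x
    per_strip = len(atoms) // nx
    boxes = []
    for i in range(nx):
        strip = atoms[i * per_strip:(i + 1) * per_strip]
        strip.sort(key=lambda p: p[1])               # sort the strip along y
        per_box = len(strip) // ny
        for j in range(ny):
            boxes.append(strip[j * per_box:(j + 1) * per_box])
    return boxes
```

The 3D case adds one more sort-and-cut pass along z; because the cuts follow the atom distribution, the boxes have equal populations but not equal volumes.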
SYSTOLIC LOOP METHOD

Quadrics topology makes it possible to use a systolic loop to calculate the nonbonded interactions between the atoms assigned to the different nodes. The systolic loop method is one of the most efficient algorithms for calculation of two-body interactions on MIMD and SIMD machines. The systolic loop algorithm passes the coordinates of all atoms around a ring of P processors in P/2 steps, such that half of the coordinates passes every processor exactly once (transient atoms). Each node also stores the coordinates of a group of atoms of the overall system (resident atoms). During the systolic cycle each processor evaluates and accumulates the interactions of the resident atoms with the transient ones. Only half of the atoms have to pass in each computational node as a consequence of the reciprocity of the interactions.

The systolic loop path for a 32-node Quadrics machine is shown in Figure 2. This machine has two nodes along the y and z directions and eight along the x direction.

The geometric decomposition of the system permits limitation of the search for nonbonded interactions only to the neighboring processors nearer than the cut-off radius, so that, depending on the number of nodes and on the system size, it is generally not necessary to perform the complete systolic loop. The computed forces are passed back to the owning processor to accumulate the full force.
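The ring schedule can be illustrated by tracking which node pairs meet; this schematic models the ring as a simple 1D cycle (the actual Quadrics path winds through the 3D mesh, as Figure 2 shows), and the names are illustrative:

```python
def systolic_pairs(P):
    """Shift a travelling copy of each node's atoms around a ring of P
    nodes P/2 times and record which (resident, transient) node pairs
    meet; every unordered pair of distinct nodes meets at least once."""
    met = set()
    travelling = list(range(P))      # travelling[k]: whose atoms sit at node k
    for _ in range(P // 2):
        travelling = travelling[-1:] + travelling[:-1]   # one ring shift
        for node in range(P):
            met.add(tuple(sorted((node, travelling[node]))))
    return met
```

On the final half-shift each pair appears on two nodes at once; the reciprocity of Newton's third law lets half of them skip the duplicate work.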
GLOBAL CUT-OFF ALGORITHM

On a SIMD machine, all nodes simultaneously evaluate an interaction, but the atom pairs in each node are different. Figure 3 shows, as an example,
Petitioner Microsoft Corporation - Ex. 1067, p. 8
FIGURE 2. Systolic loop path for node (0, 0, 0) of a 32-node Quadrics machine (axis labels: RIGHT = x, FRONT = y). The transient groups of atoms visit only four neighboring y-z planes, based on Newton's third law.
the case with four nodes: suppose that each node is evaluating the interaction a_i; in this case, all a_i interactions fall within the cut-off radius. When the interactions are of the b_i type, all the distances fall outside the cut-off radius and the interactions b_i are skipped. In the case of interactions of type c_i, the interaction is outside the cut-off radius in nodes 1, 2, and 3, but it is inside the cut-off radius in
FIGURE 3. Different types of interactions in a case of four nodes: a_i, all the interactions fall within the cut-off distance; b_i, all the interactions fall outside the cut-off distance; c_i, one interaction falls within the cut-off, whereas three fall outside the cut-off. P.U. = processor unit.
node 4, so that all nodes have to calculate this interaction and only one result will be saved in the forces calculation. If the atoms in each node are ordered randomly, the interactions of type c_i result in being the most frequent.

The basic idea of the global cut-off algorithm (GCA) is to maximize the occurrence of interactions of type a_i and b_i and, conversely, reduce the occurrence of interactions of type c_i. To this end, it is necessary that the atoms in all nodes are sorted with the same criterion. Different types of sorting give comparable results. We have chosen the one shown in Figure 4. After this sorting procedure, a list of the interactions of type a_i and c_i is created in the integer CPU (Z-CPU) of the SIMD machine. This list is equivalent, but not identical, to the nonbonded pair list used in most MD programs and will be referred to as the nonbonded pair list.

The ordering procedures for the domain decomposition and the sorting procedure previously described are time-consuming and have to be performed on the host serial machine; however, as will be shown, they have to be performed every 100 to 200 steps so that they do not significantly affect the global computation time.
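The global condition itself can be sketched serially as below, where each inner list holds the squared pair distances seen by one node for each slot of the list; the names are illustrative:

```python
def gca_slots_to_compute(dist2_per_node, rc2):
    """Global cut-off condition: a list slot is executed by ALL nodes
    unless every node sees a squared distance beyond rc2 (the b_i case);
    slots where at least one node is inside the cut-off (a_i or c_i)
    are computed everywhere, and unwanted results are discarded."""
    n_slots = len(dist2_per_node[0])
    return [k for k in range(n_slots)
            if any(node[k] < rc2 for node in dist2_per_node)]
```

Sorting the atoms identically on every node makes slots tend to be all-inside or all-outside, which is exactly what keeps the skipped b_i fraction large.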
FIGURE 4. Sorting of atoms in each node for a bidimensional case. The atoms are represented as full circles.

Petitioner Microsoft Corporation - Ex. 1067, p. 9

The global cut-off condition is based on a probabilistic approach, so that the number of pair interactions to be calculated is larger than the actual number of pairs included within the cut-off distance. Depending on the molecular system and on the number of nodes, the ratio between the number of the calculated interactions and the number of interactions actually included within the cut-off distance varies from three to five. In this sense, the GCA is not very efficient. However, it must be noted that almost all pair interactions within a distance r_GCA < r ≤ r_GCA + Δr, with Δr ≈ 0.2 nm, are calculated. As an example, given r_GCA = 0.6 nm, 94% of the interactions in the range 0.6 < r ≤ 0.8 nm are calculated and only 6% of pairs in this range are lost. This suggests adoption of a cut-off value r_GCA, to be used in the global cut-off condition, somewhat shorter than the actual cut-off, r_cut, desired for the simulation. In the previous example, with r_GCA = 0.6 and r_cut = 0.8 nm, the number of pairs to be calculated is roughly two- to threefold larger than the actual number of pairs within the r_cut distance. The remaining 6% of interactions in the range 0.6 < r ≤ 0.8 nm have to be calculated separately.
It is well known that nonbonded forces vary more slowly than the bonded ones. Moreover, nonbonded forces at large distances vary more slowly than nonbonded forces at short distances. This suggests updating of the forces at different steps, according to their nature (bonded and nonbonded) and to the distance of the interaction. The short-range interactions can be evaluated every step, and long-range interactions every m steps. Accordingly, the few interactions in the range 0.6 < r ≤ 0.8 nm that were lost using r_GCA = 0.6 nm can be updated every m steps. As these interactions are calculated while evaluating the nonbonded pair list (i.e., updated every n steps), we have chosen m = n = 10.

It must be noted that there are now two shells: an inner shell (r ≤ 0.6 nm) and an outer shell (0.6 < r ≤ 0.8 nm). All interactions are evaluated every m steps, whereas only those interactions corresponding to the inner shell are evaluated every step. It is therefore not necessary to have a skin distance and to store a list of atom pairs outside the outer shell.

In the present study it is seen that most interactions in the range 0.6 to 0.8 nm are evaluated every step and only a few of them are evaluated every m steps. According to all of the MTS algorithms, this choice does not affect the numerical accuracy; in fact, the same accuracy is obtained when an interaction, within the outer shell, is evaluated every short time step or every long time step. However, as every long time step all interactions within the outer shell are evaluated, it is possible to perform the MTS according to the classical procedure; that is, by collecting all the interactions within the outer shell at every long time step and collecting only the interactions within the inner shell at every short time step. Among several algorithms proposed for the multiple time step (MTS), we have chosen the one developed by Scully and Hermans.13
It must be noted that all nonbonded interactions (van der Waals and Coulombic) between bonded and nonbonded atoms are calculated in this step, but the interactions between bonded atoms are not saved. This is obtained by attaching to each atom a list of the atoms bonded to itself. This procedure is certainly not efficient, but the time required to perform it is negligible. In the following tests, the values of r_GCA and r_cut are fixed at 0.6 and 0.8 nm, respectively.
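The twin-range bookkeeping can be sketched as follows, in the spirit of the Scully-Hermans scheme; the callables stand in for the real inner- and outer-shell force routines and are purely illustrative:

```python
def mts_forces(n_steps, inner_force, outer_force, m=10):
    """Multiple-time-step loop: recompute the inner-shell (r <= r_GCA)
    force every step, but refresh the outer-shell (r_GCA < r <= r_cut)
    force only every m steps, holding it constant in between."""
    f_outer = 0.0
    totals = []
    for step in range(n_steps):
        if step % m == 0:        # pair-list update step: refresh outer shell
            f_outer = outer_force(step)
        totals.append(inner_force(step) + f_outer)
    return totals
```

Holding the slowly varying outer-shell contribution fixed between updates is what makes the scheme cheap without degrading accuracy.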
Results

Table I shows the number of interactions within the cut-off radius r_cut compared with the number of interactions to be evaluated with the GCA. It can be noted that the number of interactions for calculation is two to three times the actual number of interactions within the cut-off radius. The time

TABLE I.
Number of Actual Interactions, N, within a Cut-Off Radius (r_cut = 0.8 nm) Compared with the Number of Interactions Calculated Using GCA on 32-, 128-, and 512-Node Quadrics Machines.

System                       N           32 nodes     128 nodes    512 nodes
System 1 (4608 atoms)        488,847     706,944      840,320      873,984
System 2 (8704 atoms)        891,252     1,612,864    1,810,944    2,070,016
System 3 (15,360 atoms)      1,487,054   3,153,344    3,768,576    4,452,864
System 4 (30,720 atoms)      3,129,972   6,754,240    8,249,216    9,938,944

Petitioner Microsoft Corporation - Ex. 1067, p. 10
required for performing the MD simulation with the present code is the sum of the following steps:

1. Ordering procedure (performed by the host computer every 100 to 200 steps).
2. Calculation of the nonbonded pair list (performed every n steps by Quadrics).
3. Calculation of the nonbonded fo