`
`Petitioner Microsoft Corporation - Ex. 1042, p. 269
`
`
`
`dedicated to the scanning of GSDB. This algorithm
`runs on a new kind of massively parallel and low cost
`(ie the price of a. workstation.) computer now avail—
`able; the reconfigurable hardware systems. It is based
`on a. bit level operation model that enables the imple—
`ment-or to fit the hardware to the problem rather than
`distorting the problem to fit the computer.
`The main idea that has guided our work is : if we sig-
`nificantly increase the speed of an algorithm for molec-
`ular genetics we also increase the quality of the re
`sults produced by this algorithm.
`Indeed decreasing
`the computation time needed by any algorithm by two
`or three orders of magnitudes enables the statistical
`validation by Monte Carlo like methods, for instance.
`
`This paper is organized as follows. The first section
`introduces the types of algorithms used for scanning
`the GSDBS, then proposes a generic pattern including
`those from FAST, BLAST and FLASH.
`ln the second
`section. after a brief introduction of the reconfigurable
`hardware concept, we show how an exhaustive search
`based on the pattern previously described can be effi-
`ciently implemented. Finally, comparative results be-
`tween other molecular genetic dedicated systems and
`our own are presented and discussed.
`
`Scanning Algorithmic
`A great variety of algorithms have been written to com—
`pare or align genomic sequences. However, a small
`number of mechanisms are at the heart of most of these
`
`algorithms.
`We have restricted our interest to the search for in-
`
`formation in the genetic data base. To simplify, this
`comes to comparing a sequence or part of a sequence
`to all the sequences of the data base.
`We have examined how to implement the common
`basis of these algorithmics on a reconfigurable hard-
`ware. As the biologists have associated a semantic to
`their current methods, one of the goals of our proposal
`is to speed up their methods without breaking up the
`semantics.
`
`The two principles of sequence comparison are the
`search for consensus patterns and the alignment by dy-
`namic programming. These methods consist in ofl'ering
`aligned sequences in such a way that the anchor points
`overlap each other. The anchoring points are set either
`by the biologist or by the pattern matching algorithm.
`
`Dynamic Programming
`Dynamic programming dedicated to molecular genet—
`ics was rediscovered at the begining of the Seventies
`by Needlman and W'unsh, then improved by Sellers,
`Gotoli, Smith and Waterman. (Needlmann 85 Wunsch
`1970; Sellers 1974; Gotoh 1982; 'Watermann 1984)
`Dynamic programming computation is extremely
`time consuming,
`therefore many dedicated machine
`projects have arisen. (Gokhale of at.
`199]; Chow et
`at 1991; White et at. 1991}
`
`270
`
`[SM 3—94
`
`We have evaluated an implementation of the N eedle
`man and Wunseh algorithm on our hardware. We find
`that a. scanning speed of only 3 million nucleotides per
`second is poSsible. Moreover, the computation only
`takes place on a band stretching along the main diag—
`onal of the matrix. This adds up to an under optimal
`computation heuristic on the whole of the matrix.
`On its own, dynamic programming remains too com—
`puting power consuming to search efficiently in large
`data bases, even with a speed up from specialized sys
`terns. This is even truer as several rounds are necessary
`to validate the results. Therefore. another kind of al-
`gorithm that deals better with the trade ofl‘s between
`scanning speed, flexibility and cost (in development
`time and finance) must be used for a specialized ma-
`chine. The pattern matching algorithms fulfills these
`requirements.
`
`Pattern Matching
`
`FAST, BLAST and FLASH are three examples of pattern
`matching algorithms, they are significant because of
`the extent to which they have been distributed. The al-
`gorithmic choice on which they were based were guided
`by computing time considerations, in terms of length
`of the data base, but also the nature of patterns, their
`number and length. They illustrate the state of the art
`on how to obtain the best performances on sequential
`machines. They establish a hierarchy in the structure
`of the words searched for. The differences between al—
`
`gorithms lie mainly in the way they look for a fragment,
`as well as the law of acceptance or rejection.
`These three algorithms are based upon the splitting
`up of the pattern searched for into small segments. In
`the case of FLASH this is even done in the base.
`To summarize the characteristics of these three al-
`
`gorithms they allow:
`
`I A comparison of a sequence against a base of se—
`quences.
`
`I A splitting up of the sequence searched for into
`smaller pieces. They are continuous, of size two for
`FAST, four for BLAST and discontinuous for FLASH.
`
`I A search based upon lookup tables.
`
`I rl'he filtering by the means of a minimum score of
`the sequences possessing the most fragments.
`
`Califano and Rigoutsos‘s paper (Califano Se Rigout-
`505 1993) provides an excellent statistical evaluation of
`the BLAST and FAST pattern definition, compared to
`their own in FLASH. One of the interesting points dis-
`played by Califano and Rigoutsos is that the increase
`in sensitivity of the method is related to the nature of
`the fragments, this respects our intuition.
`Besides.
`it is better to use discontinuous segments
`to characterize the information that is contained in the
`
`sequence and thus optimize the search.
`The term of segment has become inappropriate be-
`cause inserting spaces in the fragments has made them
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 270
`Petitioner Microsoft Corporation - EX. 1042, p. 270
`
`
`
`become more complicated, therefore we prefer to speak
`about a motif. The motifs are still very simple in the
`case of FLASH but we intend to complexify them.
`
`The Generic Motif
`
`In what follows we summarize the most important
`characteristics of a pattern matching algorithm.
`
`I The scanning of several hundreds or even thousands
`of motifs. Moreover, a degree of flexibility in the
`description of the motif is required to stretch the
`algorithm’s research field. This implies the motifs
`must not only be words but rather signatures of their
`belonging to a class of words.
`
`I Sufficient velocity. This implies a few seconds to scan
`current bases, as in the near future the sequencing
`aims at entering whole genomes that would increase
`the volume to several billions of nucleotides (3.5 bil-
`lion of nucleotides for the entire human genome).
`
`The motif isn’t described any more from an alpha-
`bet of nucleotides or amino acids even extended with
`
`joker characters, but from a list of authorized letters at
`a position. A motif becomes an array of lists of letters.
`The motifs must also be rather short, but this isn’t
`critical. Indeed, Califano and Rigontsos have demon—
`strated that the motif must have a length around 10
`for the search on DNA and 5 for the proteins, to obtain
`a scanning with a good sensitivity.
`In order to take into account these considerations.
`the motifs are looked for with vectors of binary sub~
`stitution. The motifs are formed by a set of Binary
`Substitution Vectors (BSVs). A st is simply the list
`of authorized letters at the position of the vector. A
`motif of length N on an alphabet of L letters is made
`up by a list of N BSV ofL bits.
`
` Alphabet
`
`l
`0
`0
`
`(l
`1
`0
`
`l
`l
`l
`
`1
`0
`0
`
`0
`0
`0
`
`0
`0
`1
`
`T
`G
`C
`
`l
`0
`0
`1
`
`0
`0
`1
`0
`
`This motif of length 8 on the DNA alphabet can recognize
`words like AGTGCTAA or CGTGATAC and so on...
`
`This motif structure has several advantages. On one
`hand it encompasses the FAST, BLAST and FLASH mo~
`tifs and at the same time it maintains a certain sim—
`
`plicity and a direct access to the underlying seman-
`tic. This makes the production of motifs by a program
`quite easy. On the other hand, these motifs can be
`used as anchor points for multiple alignment (Gracy,
`Chiche, & Sallantin 1992), searching by profile, or for
`producing examples for learning systems. The latter
`exploits the possibilities of systematic exploring with
`motifs of the size of a data base to find consensus fields
`
`in a set of sequences. (Mephu N guifo & Sallantin 1992)
`To adress the problem of sensitivity of the search, let
`us first notice that it depends only of the discriminant
`power of the motif. We have set the maximum length
`
`Indeed the most specific
`of a motif to be 16 sts.
`motif (ie with only one letter at each position) 1will be
`found, in average, once in a random sequence of length
`416 x 4.109, which is greater than the size ofthe human
`genome. Thus the selectivity of the motif can be tuned
`from a probability of one (ie the motif match with any
`sequence in the data base) to a probability less than
`(data base length)‘1. Furthermore the selectivity can
`be tuned independentely of the efficiency of the search.
`
`AER? Algorithm and Its Implementation
`The A2132 algorithm aims at searching exhaustively
`on the whole of a GSDE, for the situation of words or
`short sequences. The motifs looked for, described in
`the previous section, can be established either by the
`user, or automatically by a. software.
`
`Introduction to Reconfigurable Hardware
`
`Switch Matrix
`
`e-IZIEIIIIIZIEEIZ
`
`
`
`IOBfl--s-I:l|:ll:ll:|l__ll_|l__lmf“'ll_ll:
`
`Figure I: View of part of Xilinx FPGA architecture
`
`The reconfigurable hardware concept, as well as its
`implementation with Field Programmable Gate Ar—
`rays (FPGAs) was introduced by different teams at the
`end of the Eighties. Vuillernin, Lopresti and Kean de—
`scribed various systems based on this concept. (Bertin,
`Roncin,
`Sc Vuillemin 1989; Gokhale et at.
`1991;
`Gray 5:, Kean 1989) In fact, this concept had been
`latent in the literature since the Seventies. One of
`
`the first to have expressed it was Schafiner (Schaffner
`1978) in 1972. However, it is only with the arrival
`of the SHAM based FPGA in 1985 by Xilinx that it
`became possible to implement this concept. The char-
`acteristics of reconfigurable hardware are, a great de-
`velopment facility, together with a flexibility that is
`only encountered in programmable systems. The level
`of performances reached by a reconfigurable hardware
`are those of a dedicated system. The FPGA is mainly
`used because of its aptitude for rapid prototyping.
`
`Lemoine
`
`271
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 271
`
`
`
`ACK
`
`
`
`match-mismatch lines
`
`pipeline bus : DNA 2bits, Protein 4 or 5 bits
`
`Figure 2: Scheme of a single motif detector
`
`Basically a FPGA is a grid of small memories 16x1
`bits interconnected by a network across the whole grid.
`These small memories are lookup tables that can im—
`plement any logic functions of four variables.
`In the
`FPGA that we used,
`the lookup tables are grouped
`by two with two additional 1 bit registers and form a
`CLB (Configurable Logic Block) the basic element of
`the Xilinx FPGA. The reader must refer to (Fagin &
`Watt 1992) for a comparison between supercomputer,
`ASIC2 and FPGA based hardware.
`
`The Experimental Platform
`
`The experimental platform is constitued by a DEC
`workstation and a co—processing card attached to it.
`This card contains 22 Xilinx FPGAs and 4 MBytcs of
`SRAM. 16 FPGAS are connected in a QD array with 10—
`cal interconnection and memory buses. Therefore 5120
`3 binary functions, equivalent to 1 bit arithmetic and
`logic unit, can be evaluated‘ at each cycle. in the ar-
`ray. The maximum bandwidth between this array and
`the memory is 320 MBytes/s with an access time from
`SRAM to FPGAs of Wm. The hardware is connected
`
`to a Decstation 5000/2-10 by a bus TurboC-hannel at
`100 MBytes/s.
`On this UNIX workstation the design is developed
`in C++ and translated through Xilinx tools in a bit-
`stream configuration ready to be loaded on the FP-
`GAs. The bitstream configuration can be considered
`as a unique nanoinstruction of 1.4 Mcgabits width that
`loads itselfin less than 50 ms on the card.(Touati 1992')
`
`The implementation on a Reconfigurable
`Hardware
`
`Let us now examine how the detection of a given motif
`is translated into the hardware. The key point is to fit
`a Binary Substitution Vector with a lookup table. Then
`
`2Application Specific integrated Circuit
`316 x 320 : 16 FPGA, 320 CLBs per FPGA
`
`272
`
`ISM 3—94
`
`a 4x1 (20):!) bit lookup table implements a nucleotide
`[amino acid) BSV. This is due to the number of bits
`necessary to code the nucleotide or amino acid alpha—
`bet. As we have already pointed out one GLB contains
`two 16x1 bit lookup tables. One CLB can implement
`two nucleotide BSV or one amino acid as" by bringing
`together each of their lookup tables of 16 bits. A motif
`is made up of a certain number of RSV therefore a mo-
`tif detector is made up of several CLBs whose outputs
`are combined by an AND gate. See figure 2.
`In fact.
`many optimisations can be made in order to increase
`either the speed or the number of motif detectors. But
`we don't intend to bother the reader with too many
`electronics details of the implementation.
`The data base is injected nucleotide by nucleotide
`sequentially and is pipelincd in the FPGA in such a
`way that a new word can be presented at each cycle
`to the motif detector. The pipeline can be seen as a
`window that moves along the sequence. Once a motif
`has been recognized. a signal is sent to the best. station.
`The number associated with the motifand its position
`in the data base are the relevant informations sent to
`the host station.
`
`The whole design can be considered as a sieve in
`which each hole fits a motif. The number of motifs that
`
`can be looked for simultaneously on the card is 512. In
`the case of amino acids the motifs are up to 8 BSV longl
`whereas the nucleotide motifs are up to 16 35v long.
`The maximum Speed in this case is constraint by the
`FPGA’s intrinsic performance and is around 50ns per
`cycle.
`One must point out that the motifs are coded in hard
`in the FPGA design. This is not a. handicap since only
`the lookup tables must be changed. From afunctional
`point of view this comes to breaking up the whole mo-
`tif detector into elementary functions. For a given set
`of motifs that need to be matched. the only changes
`involved are those of the functions that code for the
`
`substitution vectors. The overall function decompo—
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 272
`Petitioner Microsoft Corporation - EX. 1042, p. 272
`
`
`
`sition is not changed. This can be done by writing
`straight into the bitstream configuration of the FP—
`GAs as the motifs are generated. The figure 3 presents
`the two phases of this process as it can be seen by the
`end user.
`
`If the search does not require more than 256 mo-
`tifs the design can run at the highest speed supported
`by the 1/0. A new character from the GSDE can be
`injected every 40 us into it. A flow of 25 million nu—
`cleotides per second is achieved. The parallelism in
`this case is massive, 4k comparisons per cycle which
`represents 100 Giga. operations—1. To compare these
`results with a sequential machine, a program obtain-
`ing exactly the same results has been implemented on
`a DEC station. The most favorable situation for the
`sequential machine has been considered, that is the
`comparison where there is no matching at any posi-
`tion. This reduces the parallelism really useful to 250
`comparaisons per cycles. Eight instructions are car-
`ried out on the workstation to make a comparison and
`2000 instructions are necessary to execute what the
`hardware does in one cycle. Therefore, under the opti—
`mistic assumption of one instruction per cycle, a min-
`imum speed up of 2000 times is reached.
`In fact, on
`the basis of realistic data (ie not only mismatches) the
`speed of the reconfigurable hardware is more than 5000
`times that of the workstation.
`
`
`
`Figure 3: The two phases of a data base scanning
`
`Related Study and Discussion
`
`We present our dedicated hardware and software as
`well as four other specialized systems. Two were re-
`alized with full custom A310 and the two others with
`FPGA. For a fair comparison the reader has to take
`into account that the SPLASH 2 project uses F PGA
`of a more recent technology than the 3 others and than
`our hardware.
`In fact, we cannot compare these dif-
`ferent systems because each of them uses a particular
`algorithm and there is no common quality benchmark
`available for these 4 systems. The four systems imple—
`ment different algorithm with the same goal; search—
`
`ing for a sequence in a data base.
`
`I BISP : The BISP system consists of a Motorola
`68020, SRAM and EEPROM memories and 48 BISP
`chips in the full configuration. The BISP chips inte—
`grate 400000 transistors, 75% is implemented with
`full-custom logic, and runs at 12.5 Mhs for an 1 pm
`CMOS process. It implements 16 cells of systolic ar-
`ray conceived for dynamic programing. (Chow ct al.
`1991)
`
`I BioSCAN : The BioSCAN chips. 650000 transis-
`tors, integrates 800 cells of a systolic array. It runs
`at 32 Mhz but needs 16 cycles to execute one step
`of the systolic algorithm. A 1.2 pm CMOS process
`is used for the design technology. Only one chip is
`required for the scanning. (White et at 1991)
`
`I SPLASH l : is a ring of 32 X03090 interconnected
`with a hoststation through two FIFOs. 744 cells are
`implemented in a one dimensional mesh that runs at
`1 Mhz. (Gokhale et at. 1991)
`I SPLASH 2 : This new version of SPLASH looks
`like an SIMD supercomputer and is based on the
`new generation of FPGA from Xilinx. The perfor-
`mance indicated in the table 1 are for one board; 17
`FPGAs interconnected to their nearest neighbour in
`a ring fashion or through a crossbar. The algorithms
`are the same as SPLASH l; a kind of dynamic pro-
`grammation. (Hoang 1993)
`
`In order to take into account these differences we
`
`compared the five systems on the same basis. That is
`the time it. would take each system to search for a 500
`nucleotide DNA sequence in a data base. In table 1 one
`can see the number of chips required to implement the
`heart of the algorithms, the scanning speed of a data
`base in mega nucleotides per seconds, and the speedup
`over a workstation claimed by the authors.
`From this table it stands out that the reconfigurable
`hardware concept associated with our generic motif
`reaches and even over—takes the performances of a cus-
`tom ASIC. Moreover the comparison between [1ng
`and SPLASH shmvs that our implementation makes a
`better deal with the trade off between area efficiency
`and speed.
`
`Let us now examine the performances of FLASH in
`terms of scanning speed. Califano and Rigoutsos (Cal-
`ifano 35 Rigoutsos 1993) report they require 24 seconds
`to match a 100 nucleotides DNA sequence against the
`Genbank. For a similar Search with BLAST they report
`a time of 5 minutes. Our system does the same job in
`only 4 seconds.
`To compare our result to FLASH we have to take
`into consideration another parameter, the additional
`cost of a dedicated hardware to the cost of a worksta-
`tion versus the cost of a. workstation alone. However
`
`FLASH compiles the data base in a huge lookup table.
`For example in Genbank, 106 nucleotides are compiled
`
`Lemoine
`
`273
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 273
`Petitioner Microsoft Corporation - EX. 1042, p. 273
`
`
`
`
`
`_Systemname
`
`N b chips
`required
`
`Scanning
`speed
`
`Speed up
`over a. WS
`
`
`
`32 XCSOQO
`
`l
`
`10000 (vi)
`
`A2 R2
`250 motifs
`
`A2 R2
`500 motifs
`
`16 xcaeso
`
`5mm [vi]
`
`16 XC3090
`
`20M one
`
`.3“
`
`i) SparcStation I: 12.5 Mips. ii] Sun 4/310: 8 Mips.
`iii) Sparc lofsoox: so Mips vi) DECstation 5000,1240: 4t}
`Mips
`
`Table 1: Performances of Systems Dedicated to Molec-
`ular Genetics
`
`in a 9 Gigabytes lookup table that does not fit in the
`main memory of a workstation. The user of FLASH
`then needs an extra 9 Gigabytes of external storage
`and would require even more in the future. This is
`because the lookup table for the whole human genome
`will require around 0.5 ’I‘erabytes of disk storage. The
`price of that external storage is higher than the price
`for our hardware (equivalent to the price of a worksta~
`tion). Moreover the performances ofintegrated circuits
`continue to improve dramatically and the performance
`of the 1/0 subsystem has not kept pace. (Patt 1994}
`
`Algorithmic versus Dedicated Hardware One
`question remains : how far can hardware solutions go
`without new advances in the computational complex-
`ity of algorithms '3'
`In (Vuillernin 1993) Jean Vuillemin introduced the
`main equations one should bear in mind.
`
`P = G x F where G is the number of gates which
`one can effectively fit. within a unit area by the
`end of the year y, F is the frequency at which one
`can reliably operate such gates and P the comput—
`ing power of the resultant system. By the end of
`the next technology year yr : y + l, the comput—
`ing power increases according to G" : 03 G and
`F' = or! F. The last 25 years the overall-increase
`ing factor of the computing power has remained
`equal to two and by choosing t1 : (is, = (1;. a good
`approximation, we find 0' = {3/5.
`
`Let us now apply the above equations to :13}?2 and
`its implementation. The FPGA used are in 1.2;“?! 1990
`CMOS technology and the board was realized and lin—
`ished in 1992. On this experimental platform A”?
`operates at 25Mhz and uses 5000 time. So by the year
`
`2’74
`
`lSMB—94
`
`2000 with a 1998 FPGA technology we will implement
`A2182 on a or?” x G x 2.105 CLBs and operate at
`as x F m 150 Mhz.
`
`But all of the hardwares will not completely take
`advantage of this technology increase in performances.
`Jean Vuillemin demonstrated that reconfigurable hard—
`ware based system effectively grow by a factor two each
`year whereas general purpose microprocessors grow
`only by a factor of two each three years. This is because
`the. peak performance of a microprocessor is ultimately
`limitated by the frequency of the clock no matter the
`amount of gates4 that were used to make the micro—
`processor.
`
`The question is now whether current algorithms can
`exploit the whole hardware’s increase in computing
`power.
`
`If we focus on data base scanning we can see two
`kinds of parallelism (Miller, N adkarni, & Carriero
`1991). The first one is obvious, since the data base is
`composed of a huge amount of independent5 sequences
`we can divide the initial data base in many smaller data
`bases and search independantly in each of these data
`base. The other kind of parallelism can be found in the
`pattern matching algorithm itself. AER2 is an exam-
`ple of highly parallelized pattern matching algorithm.
`If we add the flexibility of reconfigurable hardware we
`can balance between two ways of using the increase in
`gates by the year 2000 :
`- Increasin
`only the number of motifs scanned :
`250 X 0:1 x 10000 motifs at a scanning speed of
`150Mnuc.s_1.
`
`I Increasing only the scanning speed of the data base
`for the actual numbers of motifs [250). Therefore the
`data base is split in ale m 40 data bases resulting
`in a 40 times 150 l't’lnuc.‘1 (ie 0 GIGAnucs—1 or
`scanning the entire human genome in half a second)
`To conclude, under the asumption that no break-
`through in integrate circuit technology will take place,
`a. compu ting power improvement of two orders of mag
`nitudc must be gained by the year 2000, thanks to the
`reconfigurable hardware. Whereas only an optimistic
`'5 times improvement should be gained by general pur-
`pose workstation in the saute time.
`
`Conclusion and Future Work
`
`We demonstrate that the couple formed by A2112 and
`reconfigurable hard ware system achieves very high per—
`formances. 1ndeed, a genomic Sequence data base of 1
`billion nucleotides is scanned in less than a minute.
`
`The following statements summarize the advantages
`of both 212R2 and reconfigurable system:
`I Efficiency, due to the massive parallelism achiev-
`able without any restriction on the quality of the
`results produced.
`
`the number of gates only control the ratio (average 3'
`peak) performances.
`5for the pattern matching algorithm
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 274
`Petitioner Microsoft Corporation - EX. 1042, p. 274
`
`
`
`I Generality, because the motifs structures include
`patterns previously used by FAST, BLAST and FLASH
`algorithms.
`
`I Simplicity, thanks to clear semantics, the motif can
`be generated by hand or automatically by a pro—
`gram.
`
`I Scalability, the increase by a factor two each three
`year in speed and a factor of four each three years
`in density of the new generation of FPGAs gives one
`matter to think about the future use of reconfig-
`urable hardware.
`
`I Flexibility, the motif detector can be seen as a.
`building block for more sophisticated pattern match-
`ing algorithms.
`
`One restriction remains, the quality of the result is
`directly connected to the motifs given to :1sz and
`therefore brought back to the user the responsability
`of the quality of the search.
`We think that the Speed sustained by our systems
`opens a new way for high level genomic analysis com-
`putation. Indeed, Artificial Intelligence algorithms be-
`cause of their nature need a huge amount of computing
`power. Usually the main drawback of such algorithms
`is the latency of data. base request. With A2 R2 we offer
`a solution to this problem as well as the possibility of
`exhaustive exploration of the current and future data
`bases. Moreover, the acceleration of the time consum-
`ing subroutine of pattern matching methods allows a
`better control of the result. Statistical methods, like
`Monte Carlo, can be used to validate knowledges ob-
`tained by learning methods in AI systems.
`We are now concentrating our efforts on speeding
`up the main part of AI algorithms produced by our
`team. The first candidate is the Galois lattice algo-
`rithms used in the LEGAL system. (Mephu N guifo Sc
`Sallantin 1992) And also implement in hardware qual-
`ity analyse of results produces by :4ng and other al—
`gorithms.
`
`Acknowledgements
`
`A great thanks to P. Bertin, C. Boegner, P. Boucard,
`II. Touati and J. Vuillemin for helping us along the
`way, and to the anonymous referees for their valuable
`suggestions.
`
`References
`
`Bertin, P.; Boneln, D.; and Vuillemin, J. 1989. Introduc-
`tion to programmable active memories. In Systolic Array
`Processors. J. McCanny et a]. 300—309.
`Califano, A., and Rigoutsos, I. 1993. FLASH: A fast look-
`up algorithm for string homology.
`In Pmcc. of the First
`Int. Conf. on Intelligent Systems for Molecular Biology.
`56—64. AAAI.
`
`Chow, E. T.; Hunkapiller, T.; Peterson, .1. Cu, and Zim-
`merman, B. A. 1991. Biological information signal proces-
`sor. In Proceedings of the Int. Conf. on Aplication Specific
`Array Processor, 144—160.
`
`Fagin, B., and Watt, .1. G. 1992. FPGA and rapid proto-
`typing technology use in a. special purpose computer for
`molecular genetics.
`In Proee. of the Int. Conf. on Com-
`puter Design, 496—501. IEEE.
`1985. Align—
`Feng, D., Johnson, M.; and Doolittle, R.
`ing amino acid sequences: comparison of commonly used
`methods. J. of Molecular Evolution 21:112—125.
`Gokhale, M.; Holmes, W.; Kasper, 21.; Lucas, 8.; Minnich,
`R.; Sweely, D.; and Lopresti, D.
`1991. Building and
`using a highly parallel programmable logic array.
`IEEE
`Computer 24(1]:81—89.
`Gotoh, 0. 1982. An improved algorithm for matching
`biological sequences. J. of Molecular Biology 162:705—708.
`Gracy, J.; Chiche, L.; and Sallantin, J. 1992. A modular
`learning environnment for protein modeling. In Procc. of
`the First Int. Conf. on Intelligent Systems for Molecular
`Biology, 145—153. AAAI.
`Gray, J. P.. and Kean, T. A. 1989. Configurable hardware:
`A new paradigm for computation.
`In Proceedings of the
`1989 Deeenniol Cultech Con}. C. L. Seits. 279—295.
`Hoang, D. T. 1993. Searching genetic database on splash
`2. In Proc. of the lVorkshop on FPGAJ as Custom Com-
`puting Machines, 185—191. IEEE.
`Lemoine, E. 1993. Reconfigurable hardware for molecular
`biology computing systems.
`In Procc. of Int. Conf. on
`Application-Specific Array Processors, 184—187.
`Mephu Nguifo, E., and Sallantin,
`.l. 1992. Prediction of
`primate splice junction gene sequences with a cooperative
`knowledge acquisition system. In Procc. of the Firsl Int.
`Conf. on Intelligent Systems for Molecular Biology. 292—
`300. AAAI.
`
`Miller, P. L.; Nadkarni, P. M.; and Carriero, N. M. 1991.
`Parallel computation and PASTA: confronting the prob-
`lem of parallel database search for a fast sequence com-
`parison algorithm. CABIOS T(5):71-—73.
`Needlmann, S., and Wunsch, C. 1970. A general method
`applicable to the search for sirnillarities in the amino acid
`sequence of two proteins. J. of Molecular Biology 48:443—
`453.
`
`Patt, Y. N. 1994. The I/O subsystem : a candidate for
`improvements. IEEE Computer 27(3):]5—16.
`Schaifner. M.
`1978. Processing by data and program
`blocks.
`IEEE Transactions on Computers 27(11):“115—
`1028.
`
`Sellers, D. 197-1. 0n the theory an computation of evo-
`lutionnary distances. SLAJI. J. Appl. Math. 26(4):?87—
`793.
`
`Touati. H. 1992. PerlelDC: a C++ library for the simu-
`lation and generation of DECPeRLe-l designs. Technical
`Report TN4, Paris Research Laboratory.
`Vuillemin. J. E. 1993. On computing power. Lecture Notes
`on Computer Sciences 782:69—86.
`Watermann, M. 1984. General methods of sequence com—
`parison. Bull. .Moth. Biol. 46:473-500.
`White, C. T.; Singh, R. K.; Reintjes, P. B.; Lampe, J.; Er-
`ickson, B. W.; Dettloif, W. D.; Chi, V. L.; and Altschnl,
`S. F. 1991. BioSCAN: A VLSI—based system for biase—
`quence analysis. In Proceedings of the IEEE Int. Conf. on
`Computer Design.
`
`Imoine
`
`275
`
`Petitioner Microsoft Corporation - Ex. 1042, p. 275
`Petitioner Microsoft Corporation - EX. 1042, p. 275
`
`