US 11,384,394 B2 (US011384394B2)
Bartha et al.

(10) Patent No.: US 11,384,394 B2
(45) Date of Patent: Jul. 12, 2022

(54) METHODS AND SYSTEMS FOR GENETIC ANALYSIS

(71) Applicant: Personalis, Inc., Menlo Park, CA (US)

(72) Inventors: Gabor T. Bartha, Los Altos, CA (US); Gemma Chandratillake, Cambridge (GB); Richard Chen, Burlingame, CA (US); Sarah Garcia, Palo Alto, CA (US); Hugo Yu Kor Lam, Sunnyvale, CA (US); Shujun Luo, Castro Valley, CA (US); Mark R. Pratt, Roseburg, OR (US); John West, Cupertino, CA (US)

(73) Assignee: Personalis, Inc., Menlo Park, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 17/548,379

(22) Filed: Dec. 10, 2021

(65) Prior Publication Data
US 2022/0098662 A1    Mar. 31, 2022

Related U.S. Application Data

(60) Continuation of application No. 17/078,857, filed on Oct. 23, 2020, which is a continuation of application No. 16/816,135, filed on Mar. 11, 2020, now abandoned, which is a continuation of application No. 16/526,928, filed on Jul. 30, 2019, now abandoned, which is a continuation of application No. 15/996,215, filed on Jun. 1, 2018, now Pat. No. 10,415,091, which is a continuation of application No. 14/810,337, filed on Jul. 27, 2015, now Pat. No. 10,266,890, which is a division of application No. 14/141,990, filed on Dec. 27, 2013, now Pat. No. 9,128,861.

(60) Provisional application No. 61/753,828, filed on Jan. 17, 2013.

(51) Int. Cl.
C12Q 1/6874    (2018.01)
G16B 20/00     (2019.01)
G16B 30/00     (2019.01)
G16B 99/00     (2019.01)
G16B 20/10     (2019.01)
G16B 20/20     (2019.01)
G16B 35/10     (2019.01)
C12Q 1/6806    (2018.01)
C12Q 1/6869    (2018.01)
G16B 35/00     (2019.01)
G16C 20/60     (2019.01)

(52) U.S. Cl.
CPC: C12Q 1/6874 (2013.01); C12Q 1/6806 (2013.01); G16B 20/00 (2019.02); G16B 20/10 (2019.02); G16B 20/20 (2019.02); G16B 30/00 (2019.02); G16B 35/10 (2019.02); G16B 99/00 (2019.02); C12Q 1/6869 (2013.01); G16B 35/00 (2019.02); G16C 20/60 (2019.02)

(58) Field of Classification Search
None
See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS
9,128,861 B2       9/2015    Bartha et al.
9,745,626 B2       8/2017    Bartha et al.
10,266,890 B2      4/2019    Bartha et al.
10,415,091 B2      9/2019    Bartha et al.
11,155,867 B2      10/2021   Bartha et al.
2002/0006615 A1    1/2002    Goldsborough et al.
2005/0042668 A1    2/2005    Perlin
2006/0184489 A1    8/2006    Weiner et al.
2006/0278241 A1    12/2006   Ruano
2010/0042438 A1    2/2010    Moore et al.
2011/0009296 A1    1/2011    Kain et al.
(Continued)

FOREIGN PATENT DOCUMENTS
EP    0281927 A2           9/1988
WO    WO-2011160063 A2 *   12/2011    A61P 35/00
(Continued)

OTHER PUBLICATIONS
Clark et al., "Performance comparison of exome DNA sequencing technologies," Nat. Biotechnol. 2011, 29:908-914, with 2 pages of supplementary "Online Methods". (Year: 2011).*
Boers et al., "High-Throughput Multilocus Sequence Typing: Bringing Molecular Typing to the Next Level," PLOS ONE 2012; 7(7):e39630.
(Continued)

Primary Examiner: Kaijiang Zhang
(74) Attorney, Agent, or Firm: Orrick, Herrington & Sutcliffe LLP

(57) ABSTRACT

This disclosure provides systems and methods for sample processing and data analysis. Sample processing may include nucleic acid sample processing and subsequent sequencing. Some or all of a nucleic acid sample may be sequenced to provide sequence information, which may be stored or otherwise maintained in an electronic storage location. The sequence information may be analyzed with the aid of a computer processor, and the analyzed sequence information may be stored in an electronic storage location that may include a pool or collection of sequence information and analyzed sequence information generated from the nucleic acid sample. Methods and systems of the present disclosure can be used, for example, for the analysis of a nucleic acid sample, for producing one or more libraries, and for producing biomedical reports. Methods and systems of the disclosure can aid in the diagnosis, monitoring, treatment, and prevention of one or more diseases and conditions.

19 Claims, 15 Drawing Sheets
`
`Foresight EX1001-p. 1
`Foresight v Personalis
`
`
`
`
(56) References Cited

U.S. PATENT DOCUMENTS
2012/0077682 A1    3/2012     Bowcock et al.
2012/0270206 A1    10/2012    Ginns et al.
2018/0051338 A1    2/2018     West et al.
2021/0062258 A1    3/2021     Bartha et al.
2021/0238677 A1    8/2021     John et al.

FOREIGN PATENT DOCUMENTS
WO    WO-2011160206 A1 *   12/2011    C07K 14/705
WO    WO-2014113204 A1     7/2014
`
OTHER PUBLICATIONS
Clark, et al. Performance comparison of exome DNA sequencing technologies. Nat Biotechnol. Sep. 25, 2011; 29(10):908-14. doi: 10.1038/nbt.1975.
Co-pending U.S. Appl. No. 16/526,928, inventors Barthagabor; T. et al., filed Jul. 30, 2019.
Co-pending U.S. Appl. No. 16/816,135, inventors Barthagabor; T. et al., filed Mar. 11, 2020.
Co-pending U.S. Appl. No. 17/507,578, inventors Barthagabor; T. et al., filed Oct. 21, 2021.
Craig, et al. Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods. Oct. 2008; 5(10):887-93. Epub Sep. 14, 2008.
European search report and opinion dated Aug. 4, 2016 for EP Application No. 13871784.
Gottlieb, et al. The DiGeorge syndrome minimal critical region contains a goosecoid-like (GSCL) homeobox gene that is expressed early in human development. Am J Hum Genet. May 1997; 60(5):1194-201.
Hong, et al., Tracking the origins and drivers of subclonal metastatic expansion in prostate cancer. Nature Communications, Apr. 1, 2015; vol. 6, No. 1: pp. 1-12. XP055501144.
Human Genome Overview GRCh37. Genome Reference Consortium, Feb. 27, 2009. 1 Page.
Human Genome Overview GRCh37.p13. Genome Reference Consortium, Jun. 28, 2013. 2 Pages.
Human Genome Overview GRCh38.p12. Genome Reference Consortium, Dec. 21, 2017. 2 Pages.
International search report and written opinion dated Apr. 23, 2014 for PCT/US2013/078123.
Li, et al. Novel computational methods for increasing PCR primer design effectiveness in directed sequencing. BMC Bioinformatics. Apr. 11, 2008; 9:191. doi: 10.1186/1471-2105-9-191.
Market, et al., V(D)J Recombination and the Evolution of the Adaptive Immune System. PLOS Biol. 2003; 1(1):e16. https://doi.org/10.1371/journal.pbio.0000016.
Notice of allowance dated Jun. 3, 2015 for U.S. Appl. No. 14/141,990.
Notice of Allowance dated Jun. 9, 2017 for U.S. Appl. No. 15/222,875.
Office action dated Feb. 6, 2015 for U.S. Appl. No. 14/141,990.
Office Action dated Feb. 27, 2017 for U.S. Appl. No. 15/222,875.
Office action dated Jun. 5, 2014 for U.S. Appl. No. 14/141,990.
Ralph, et al., Consistency of VDJ Rearrangement and Substitution Parameters Enables Accurate B Cell Receptor Sequence Annotation. PLOS computational biology, 2016; 12(1):e1004409. https://doi.org/10.1371/journal.pcbi.1004409.
U.S. Appl. No. 14/810,337 Notice of Allowance dated Feb. 28, 2019.
U.S. Appl. No. 14/810,337 Notice of Allowance dated Jan. 18, 2019.
U.S. Appl. No. 14/810,337 Office Action dated Apr. 9, 2018.
U.S. Appl. No. 15/996,215 Notice of Allowance dated May 15, 2019.
U.S. Appl. No. 15/996,215 Office Action dated Dec. 31, 2018.
U.S. Appl. No. 17/078,857 Office Action dated Apr. 1, 2021.
U.S. Appl. No. 17/078,857 Office Action dated Aug. 19, 2021.
U.S. Appl. No. 17/078,857 Office Action dated Jul. 15, 2021.
U.S. Appl. No. 17/078,857 Office Action dated Nov. 12, 2021.
U.S. Appl. No. 17/080,474 Notice of Allowance dated Jul. 19, 2021.
U.S. Appl. No. 17/080,474 Office Action dated Mar. 26, 2021.
U.S. Appl. No. 17/235,776 Office Action dated Aug. 17, 2021.
Pritchard et al., "ColoSeq Provides Comprehensive Lynch and Polyposis Syndrome Mutational Analysis Using Massively Parallel Sequencing", Journal of Molecular Diagnostics, vol. 14, No. 4, Jul. 2012.
`
`* cited by examiner
`
`
`
`
FIG. 1 (Sheet 1 of 15): drawing with reference numerals 101, 105, 110, 115, 120, 125, 130, and 135.
`
`
`
FIGS. 2A-2D (Sheet 2 of 15): four workflow diagrams. Box labels: Sample; DNA or Nucleic Acid; Prep 1 and Prep 2; Protocol 1 through Protocol 4; Assay 1 through Assay 3; Analysis 1 and Analysis 2; Output.
`
`
`
FIG. 3 (Sheet 3 of 15): overview chart with three groups of labels.
Assay Approach: Conventional Sequencing (low cost and easy; mostly good quality; most of genome; high throughput); Supplement A, Supplement B, and Supplement C, with the descriptions Alternative Sequencing Technologies (costly, low volume; address specific problem categories), Alternative Sample Prep/Chemistries (customized; addresses specific problem categories), and Non-Sequencing Technologies (orthogonal technology; limited scope; useful for redundancy, calibration).
Target Content: Conventional; Specialized Content A; Specialized Content B.
Bioinformatics Pipeline: custom preprocessing; integration of best of breed analytic tools; integration of data types for optimal statistics.
`
`
`
FIG. 4 (Sheet 4 of 15): overview chart with three groups of labels.
Content: Conventional Exome (TruSeq, SureSelect, Nimblegen); Gene Supplement (PharmGKB, Mendelian, Clin. Panels, Cancer, etc.); Regulatory Elements (Regulome 1, Regulome 2); Mendelian Variants (OMIM, HGMD, MendelDB, Varimed, ClinVar, GeneReview, etc.); Non-coding Elements (UTR, splice sites); Key SNP sets.
Assay: Conventional (low cost and easy; mostly good quality; most of genome; high throughput); High GC Content (customized chemistry; improved coverage; improved quality); Long Read (costly, low volume; MiSeq, PacBio, 454, CE; improved mapping; phasing; spanning long STRs and variants); Hybridization Array (orthogonal technology; high quality; SNPs; common CNVs).
Bioinformatics: Personalis Pipeline (integration of best of breed analytic tools; integration of data types for optimal statistics).
`
`
`
FIG. 5 (Sheet 5 of 15): electropherogram overlay. Vertical axis [FU], 0 to 250; horizontal axis [bp], 35 to 10380. Trace legend: 1: 150bp-375s; 2: 200bp-175s; 3: 300bp-80s; 4: 400bp-50s; 5: 500bp-32s; 6: 800bp-25s.
`
`
`
FIG. 6 (Sheet 6 of 15): electropherogram overlay. Vertical axis [FU], 0 to 500; horizontal axis [bp], 35 to 10380. Trace legend: 1: 0.8X; 2: 0.7X; 3: 0.6X; 4: 0.5X; 5: 0.4X.
`
`
`
FIG. 7 (Sheet 7 of 15): electropherogram overlay. Vertical axis [FU], 0 to 140; horizontal axis [bp], 50 to 10380. Trace legend: 1: 25s-lig-up; 2: 32s-lig-up; 3: 25s-lig-mid; 4: 32s-lig-mid; 5: 25s-lig-low; 6: 32s-lig-low.
`
`
`
FIG. 8 (Sheet 8 of 15): library preparation schematic. Step labels include DNA, shear, End repair, Spri, ligation, and PCR, with fragment ends marked P, OH, and A. Annotation: "High GC region will form 2nd structure and prohibit normal End Repair."
`
`
`
FIG. 9 (Sheet 9 of 15): three workflow diagrams, each starting from a Nucleic Acid Sample. Box labels: Normal Prep; Normal Protocol; High GC Protocol; Long Mol. Protocol; Aliquot, Differential Amplification, Pool; Size Fractionation; HiSeq; MiSeq; PacBio RS; Merge Reads; Merge Calls; Analysis 1; Analysis 2; Output. Annotations: "Preps Involve Separate Steps but Libraries are Pooled for Sequencing", and a note that size fractionation also enriches for long molecules, after which the preps may be identical but the sequencing is not.
`
`
`
FIG. 10 (Sheet 10 of 15): flowchart for compiling target content.
Inputs: Target Gene Sets, Clinical Panels, Lists; Variant & Mutation Databases; Regulatory Databases (Regulome); Gene Definitions; Problematic Region Definitions.
Compile steps: Compile Target Gene Regions; Compile Exon, UTR and Splice Site Lists; Compile Gene Regulatory Regions; Compile Variant Set; Compile Alternate Sequences; Merge Regions; Merge Variants; Personalis Net Content.
Classify & Prioritize (OK / Not OK) into: Easy & Not Targeted; Standard Exome; High GC Content; Low Complexity, STR, Expanding Repeats; Mapping Problems; Alternate Sequences; Phasing Required.
Outputs: Standard Exome Targets; Supplementary Probes & Protocol; High GC Probes & Protocol; Long Read Probes & Protocol; Phasing or Reassembly Targets.
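FIG. 10 names a "Classify & Prioritize" step with a "High GC Content" category, and the Background singles out regions with GC content above 70% as poorly covered by standard sequencing. The following is a minimal sketch of how such a classification could be expressed, not the patentee's implementation. The names `gc_fraction`, `TargetRegion`, `classify_region`, and the bin labels are hypothetical, and using 0.70 as the cut-off is an assumption borrowed from the Background's ">70%" figure.

```python
from dataclasses import dataclass

# Assumption: threshold mirrors the ">70%" figure quoted in the Background.
HIGH_GC_THRESHOLD = 0.70


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        # No unambiguous bases (e.g. all N): report 0.0 rather than divide by zero.
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


@dataclass(frozen=True)
class TargetRegion:
    """Hypothetical container for a named target region and its sequence."""
    name: str
    sequence: str


def classify_region(region: TargetRegion, threshold: float = HIGH_GC_THRESHOLD) -> str:
    """Assign a region to a hypothetical 'high_gc' or 'standard' bin."""
    return "high_gc" if gc_fraction(region.sequence) > threshold else "standard"


regions = [
    TargetRegion("region_a", "GCGCGGCCGCGGGCCC"),  # all G/C
    TargetRegion("region_b", "ATGCATTAGCATTAAT"),  # 4 of 16 bases are G/C
]
bins = {region.name: classify_region(region) for region in regions}
print(bins)
```

In this sketch, regions in the `high_gc` bin would be the candidates for a specialized protocol, while the rest stay with the standard one. A real pipeline would likely compute GC content over windows of a reference genome rather than over short literal strings.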
`
`
`
FIG. 11 (Sheet 11 of 15): flowchart. Box labels: Sample; DNA Extraction; Fragmentation & End Repair; Aliquot; Standard Protocol & Exome + Supplement Pullout; High GC Protocol & Pullout; 2x100bp HiSeq Sequencing; Align, Analyze and/or Assemble, Call Variants; Data Pool; Biomedical Databases; Specified Annotation & Interpretation; Biomedical Reports.
`
`
`
FIG. 12 (Sheet 12 of 15): flowchart with two parallel branches. Box labels: Sample; DNA Extraction; Aliquot; Std Fragmentation & End Repair; Std. Protocol & Exome and Supplement Pullout; High GC Fragmentation & End Repair; High GC Protocol & Pullout; 2x100bp HiSeq Sequencing (one per branch); Align, Analyze, and/or Assemble, Call Variants (one per branch); Merge Variant Calls; Data Pool; Biomedical Databases; Specified Annotation & Interpretation; Biomedical Reports.
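FIG. 12 shows two branches that each end in variant calling, followed by a "Merge Variant Calls" step. A minimal sketch of one way such a merge could work is given below. It is an illustration under stated assumptions, not the method the patent claims: the `Variant` tuple layout, the function `merge_variant_calls`, the source labels, and the example calls are all hypothetical, and the merge shown is a plain union that records which branch reported each call.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def merge_variant_calls(call_sets: Dict[str, Iterable[Variant]]) -> Dict[Variant, Set[str]]:
    """Union several call sets, recording which source(s) reported each variant."""
    merged: Dict[Variant, Set[str]] = {}
    for source, calls in call_sets.items():
        for variant in calls:
            merged.setdefault(variant, set()).add(source)
    return merged


# Made-up example calls from the two branches.
standard_calls = [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")]
high_gc_calls = [("chr1", 1000, "A", "G"), ("chr7", 42, "G", "C")]

merged = merge_variant_calls({"standard": standard_calls, "high_gc": high_gc_calls})
for variant in sorted(merged):
    print(variant, sorted(merged[variant]))
```

Keeping the source labels alongside each call lets a downstream step see which variants were found only by the specialized branch. How conflicting genotypes at the same site should be resolved is left open here, since the figure does not specify it.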
`
`
`
FIG. 13 (Sheet 13 of 15): flowchart with two parallel branches. Box labels: Sample; DNA Extraction; Aliquot; Std Fragmentation & End Repair; Standard Protocol & Exome + Supplement Pullout; High GC Fragmentation & End Repair; High GC Protocol & Pullout; 2x100bp HiSeq Sequencing (one per branch); Combine Reads; Align, Analyze and/or Assemble, Call Variants; Data Pool; Biomedical Databases; Specified Annotation & Interpretation; Biomedical Reports.
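FIG. 13 differs from FIG. 12 in that the reads from the two branches are pooled at a "Combine Reads" step before a single alignment and variant-calling pass. The sketch below illustrates that ordering only. The `Read` type, the `library` tag, and the `combine_reads` function are hypothetical names introduced for illustration, and the example reads are made up.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical read record tagged with the library it came from."""
    name: str
    bases: str
    library: str


def combine_reads(
    standard: Iterable[Tuple[str, str]],
    high_gc: Iterable[Tuple[str, str]],
) -> List[Read]:
    """Pool reads from both libraries into one list, preserving their origin."""
    pooled = [Read(name, bases, "standard") for name, bases in standard]
    pooled += [Read(name, bases, "high_gc") for name, bases in high_gc]
    return pooled


# Made-up (name, bases) pairs standing in for sequencer output.
standard_reads = [("read_1", "ATGCCA"), ("read_2", "TTGACA")]
high_gc_reads = [("read_3", "GCGCGG")]

pooled = combine_reads(standard_reads, high_gc_reads)
print(len(pooled), [read.library for read in pooled])
```

Tagging each read with its library is a design choice of this sketch: it keeps the pooled set traceable, which is roughly what read-group labels do in common alignment formats.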
`
`
`
FIG. 14 (Sheet 14 of 15): flowchart. Box labels: Sample; DNA Extraction; Fragmentation & End Repair; Aliquot; Size Selection (Short / Long); Standard Protocol & Exome + Supplement Pullout; High GC Protocol & Pullout; 2x100bp HiSeq Sequencing; Long Read Prep & Long Read Pullout; 2x250bp MiSeq or PacBio RS 5kbp Sequencing; Combine Reads; Align, Analyze, and/or Assemble, Call Variants; Data Pool; Biomedical Databases; Specified Annotation & Interpretation; Biomedical Reports.
`
`
`
FIG. 15 (Sheet 15 of 15): flowchart. Box labels: Sample; DNA Extraction; Fragmentation and End Repair; Aliquot; Size Selection (Short / Long); Standard Protocol & Exome + Supplement Pullout; High GC Protocol and Pullout; 2x100bp HiSeq Sequencing; Long Read Prep and Long Read Pullout; 2x250bp MiSeq or PacBio RS 5 kbp Sequencing; Align, Analyze and/or Assemble, Call Variants (one per sequencing branch); Merge Variant Calls; Data Pool; Biomedical Databases; Specified Annotation & Interpretation; Biomedical Reports.
`
`
`
METHODS AND SYSTEMS FOR GENETIC ANALYSIS

CROSS-REFERENCE

This application is a continuation of U.S. patent application Ser. No. 17/078,857, filed Oct. 23, 2020, which is a continuation of U.S. patent application Ser. No. 16/816,135, filed Mar. 11, 2020, which is a continuation of U.S. patent application Ser. No. 16/526,928, filed Jul. 30, 2019, which is a continuation of U.S. patent application Ser. No. 15/996,215, filed Jun. 1, 2018, now U.S. Pat. No. 10,415,091, which is a continuation of U.S. patent application Ser. No. 14/810,337, filed Jul. 27, 2015, now U.S. Pat. No. 10,266,890, which is a divisional of U.S. patent application Ser. No. 14/141,990, filed Dec. 27, 2013, now U.S. Pat. No. 9,128,861, which claims priority to U.S. Provisional Application No. 61/753,828, filed Jan. 17, 2013, each of which is incorporated herein by reference in its entirety.

BACKGROUND

Current methods for whole genome and/or exome sequencing may be costly and fail to capture many biomedically important variants. For example, commercially available exome enrichment kits (e.g., Illumina's TruSeq exome enrichment and Agilent's SureSelect exome enrichment) may fail to target biomedically interesting non-exomic and exomic regions. Often, whole genome and/or exome sequencing using standard sequencing methods performs poorly in content regions having very high GC content (>70%). Furthermore, whole genome and/or exome sequencing also fails to provide adequate and/or cost-effective sequencing of repetitive elements in the genome.

The methods disclosed herein provide specialized sequencing protocols or technologies to address these issues.

SUMMARY

Provided herein is a method for analyzing a nucleic acid sample, comprising (a) producing two or more subsets of nucleic acid molecules from a nucleic acid sample, wherein (i) the two or more subsets comprise a first subset of nucleic acid molecules and a second subset of nucleic acid molecules, and (ii) the first subset of nucleic acid molecules differs from the second subset of nucleic acid molecules by one or more features selected from genomic regions, mean GC content, mean molecular size, subset preparation method, or combination thereof; (b) conducting one or more assays on at least two of the two or more subsets of nucleic acid molecules, wherein (i) a first assay, comprising a first sequencing reaction, is conducted on the first subset of the two or more subsets to produce a first result, and (ii) a second assay is conducted on the second subset of the two or more subsets to produce a second result; and (c) combining, with the aid of a computer processor, the first result and second result, thereby analyzing the nucleic acid sample.

Also provided herein is a method for analyzing a nucleic acid sample, comprising (a) producing two or more subsets of nucleic acid molecules from a nucleic acid sample, wherein the two or more subsets differ by one or more features selected from genomic regions, mean GC content, mean molecular size, subset preparation method, or combination thereof; (b) combining at least two of the two or more subsets of nucleic acid molecules to produce a first combined pool of nucleic acid molecules; and (c) conducting one or more assays on the first combined pool of nucleic acid molecules, wherein at least one of the one or more assays comprises a sequencing reaction.

Disclosed herein is a method for analyzing a nucleic acid sample, comprising (a) producing two or more nucleic acid molecule subsets from a nucleic acid sample, wherein producing the two or more nucleic acid molecule subsets comprises enriching the two or more subsets of nucleic acid molecules for two or more different genomic regions; (b) conducting a first assay on a first subset of nucleic acid molecules among the two or more subsets of nucleic acid molecules to produce a first result, wherein the first assay comprises a first sequencing reaction; (c) conducting a second assay on at least a second subset of nucleic acid molecules among the two or more subsets of nucleic acid molecules to produce a second result; and (d) combining, with the aid of a computer processor, the first result with the second result, thereby analyzing the nucleic acid sample.

Further provided herein is a method for analyzing a nucleic acid sample, comprising (a) preparing at least a first subset of nucleic acid molecules and a second subset of nucleic acid molecules from a nucleic acid sample, wherein the first subset of nucleic acid molecules differs from the second subset of nucleic acid molecules; (b) conducting a first assay on the first subset of nucleic acid molecules and a second assay on the second subset of nucleic acid molecules, wherein the first assay comprises a nucleic acid sequencing reaction that produces a first result, comprising nucleic acid sequence information for the first subset, and wherein the second assay produces a second result; (c) analyzing, with the aid of a computer processor, the first result to provide a first analyzed result and analyzing the second result to provide a second analyzed result; and (d) combining, with the aid of a computer processor, the first and second analyzed results, thereby analyzing the nucleic acid sample.

Provided herein is a method for analyzing a nucleic acid, comprising (a) producing one or more subsets of nucleic acid molecules from a nucleic acid sample, wherein producing the one or more subsets of nucleic acid molecules comprises conducting a first assay in the presence of one or more antioxidants to produce a first subset of nucleic acid molecules; and (b) conducting a sequencing reaction on the one or more subsets of nucleic acid molecules, thereby analyzing the nucleic acid sample.

Also disclosed herein is a method for analyzing a nucleic acid sample, comprising (a) producing, with the aid of a computer processor, one or more capture probes, wherein the one or more capture probes hybridize to one or more polymorphisms, wherein the one or more polymorphisms are based on or extracted from one or more databases of polymorphisms, observed in a population of one or more samples, or a combination thereof; (b) contacting a nucleic acid sample with the one or more capture probes to produce one or more capture probe hybridized nucleic acid molecules; and (c) conducting a first assay on the one or more capture probe hybridized nucleic acid molecules, thereby analyzing the nucleic acid sample, wherein the first assay comprises a sequencing reaction.

Further disclosed herein is a method for developing complementary nucleic acid libraries, comprising (a) producing two or more subsets of nucleic acid molecules from a sample, wherein (i) the two or more subsets of nucleic acid molecules comprise a first subset of nucleic acid molecules and a second subset of nucleic acid molecules, (ii) the first subset of nucleic acid molecules comprises nucleic acid molecules of a first mean size, (iii) the second subset of
`
`
nucleic acid molecules comprises nucleic acid molecules of a second mean size, and (iv) the first mean size of the first subset of nucleic acid molecules is greater than the second mean size of the second subset of nucleic acid molecules by about 200 or more residues; (b) producing two or more nucleic acid libraries, wherein (i) the two or more libraries comprise a first nucleic acid molecules library and second nucleic acid molecules library, (ii) the first nucleic acid molecules library comprises the one or more nucleic acid molecules from the first subset of nucleic acid molecules, (iii) the second nucleic acid molecules library comprises the one or more nucleic acid molecules from the second subset of nucleic acid molecules, and (iv) the content of the first nucleic acid molecules library is at least partially complementary to the content of the second nucleic acid molecules library.

Provided herein is a method for developing complementary nucleic acid libraries, comprising (a) producing two or more subsets of nucleic acid molecules from a sample of nucleic acid molecules, wherein the two or more subsets of nucleic acid molecules comprise a first subset of nucleic acid molecules and a second subset of nucleic acid molecules; (b) conducting two or more assays on the two or more subsets of nucleic acid molecules, wherein (i) the two or more assays comprise a first assay and a second assay, (ii) the first assay comprises conducting a first amplification reaction on the first subset of nucleic acid molecules to produce one or more first amplified nucleic acid molecules with a first mean GC content, (iii) the second assay comprises conducting a second amplification reaction on the second subset of nucleic acid molecules to produce one or more second amplified nucleic acid molecules with a second mean GC content, and (iv) the first mean GC content of the first subset of nucleic acid molecules differs from the second mean GC content of the second subset of nucleic acid molecules; and (b) producing two or more nucleic acid libraries, wherein (i) the two or more libraries comprise a first nucleic acid molecules library and second nucleic acid molecules library, (ii) the first nucleic acid molecules library comprises the one or more first amplified nucleic acid molecules, (iii) the second nucleic acid molecules library comprises the one or more second amplified nucleic acid molecules, and (iv) the content of the first nucleic acid molecules library is at least partially complementary to the content of the second nucleic acid molecules library.

Also provided herein is a method for developing complementary nucleic acid libraries, comprising (a) producing two or more subsets of nucleic acid molecules from a sample of nucleic acid molecules, wherein (i) the two or more subsets of nucleic acid molecules comprise a first subset of nucleic acid molecules and a second subset of nucleic acid molecules, and (ii) the two or more subsets of nucleic acid molecules differ by one or more features selected from genomic regions, mean GC content, mean molecular size, subset preparation method, or combination thereof; and (b) producing two or more nucleic acid libraries, wherein (i) the two or more libraries comprise a first nucleic acid molecules library and second nucleic acid molecules library, (ii) the first nucleic acid molecules library comprises the one or more nucleic acid molecules from the first subset of nucleic acid molecules, (iii) the second nucleic acid molecules library comprises the one or more nucleic acid molecules from the second subset of nucleic acid molecules, and (iv) the content of the first nucleic acid molecules library is at least partially complementary to the content of the second

Disclosed herein is a method for sequencing, comprising (a) contacting a nucleic acid sample with one or more capture probe libraries to produce one or more capture probe hybridized nucleic acid molecules; and (b) conducting one or more sequencing reactions on the one or more capture probe hybridized nucleic acid molecules to produce one or more sequence reads, wherein (i) the sensitivity of the sequencing reaction is improved by at least about 4% as compared to current sequencing methods; (ii) the sensitivity of the sequencing reaction for a genomic region comprising a RefSeq is at least about 85%, (iii) the sensitivity of the sequencing reaction for a genomic region comprising an interpretable genome is at least about 88%, (iv) the sensitivity of the sequencing reaction for an interpretable variant is at least about 90%, or (v) a combination of (i)-(ii).

At least one of the one or more capture probe libraries may comprise one or more capture probes to one or more genomic regions.

The methods and systems disclosed herein may further comprise conducting one or more sequencing reactions on one or more capture probe free nucleic acid molecules.

The percent error of the one or more sequencing reactions may be similar to current sequencing methods. The percent error rate of the one or more sequencing reactions may be within about 0.001%, 0.002%, 0.003%, 0.004%, 0.005%, 0.006%, 0.007%, 0.008%, 0.009%, 0.01%, 0.02%, 0.03%, 0.04%, 0.05%, 0.06%, 0.07%, 0.08%, 0.09%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, or 2% of the current sequencing methods. The percent error of the one or more sequencing reactions is less than the error rate of current sequencing methods. The percent error of the sequencing reaction may be less than about 1.5%, 1%, 0.75%, 0., 0.25%, 0.10%, 0.075%, 0.050%, 0.025%, or 0.001%.

The accuracy of the one or more sequencing reactions may be similar to current sequencing methods. The accuracy of the one or more sequencing reactions is better than current sequencing methods.

The nucleic acid molecules may be DNA. The nucleic acid molecules may be RNA.

The methods and systems may comprise a second subset of nucleic acid molecules. The first subset and the second subset of nucleic acid molecules may differ by one or more features selected from genomic regions, mean GC content, mean molecular size, subset preparation method, or combination thereof.

The one or more genomic regions may be selected from the group comprising high GC content, low GC content, low complexity, low mappability, known single nucleotide variations (SNVs), known inDels, known alternative sequences, entire genome, entire exome, set of genes, set of regulatory elements, and methylation state.

The set of genes may be selected from a group comprising set of genes with known Mendelian traits, set of genes with known disease traits, set of genes with known drug traits, and set of genes with known biomedically interpretable variants.

The known alternative sequences may be selected from the group comprising one or more small insertions, small deletions, structural variant junctions, variable length tandem repeats, and flanking sequences.

The subsets of nucleic acid molecules may differ by mean molecular size. The difference in mean molecular size between at least two of the subsets of nucleic acid molecules is at least 100 nucleotides. The difference in mean molecular
`