(12) Patent Application Publication    (10) Pub. No.: US 2005/0221341 A1
Shimkets et al.                        (43) Pub. Date: Oct. 6, 2005

(54) SEQUENCE-BASED KARYOTYPING

(76) Inventors: Richard A. Shimkets, Guilford, CT (US);
Michael S. Braverman, New Haven, CT (US)

Correspondence Address:
MINTZ LEVIN COHN FERRIS GLOVSKY & POPEO
666 THIRD AVENUE
NEW YORK, NY 10017 (US)

(21) Appl. No.: 10/971,614

(22) Filed: Oct. 22, 2004

Related U.S. Application Data

(60) Provisional application No. 60/513,691, filed on Oct.
22, 2003. Provisional application No. 60/513,319,
filed on Oct. 23, 2003.

Publication Classification

(51) Int. Cl.7 C12Q 1/68; G06F 19/00; G01N 33/48; G01N 33/50
(52) U.S. Cl. 435/6; 702/20

(57) ABSTRACT

A new method for genomic analysis, termed “Sequence-Based
Karyotyping,” is described. Sequence-Based Karyotyping
methods for the detection of genomic abnormalities,
for diagnosis of hereditary disease, or for diagnosis of
spontaneous genomic mutations are also described.
`
`Petitioner Sequenom - EX. 1004, p. 1
`
Figure 1 (Sheet 1 of 23). High Correlation of Digital Karyotyping and
Sequence Based Karyotyping "Chromosome Content" Estimates.
[Scatter plot not reproduced. Y-axis: Digital Karyotyping
"Chromosome Content". X-axis: Sequence Based Karyotyping
(DiFi/GM12911 Ratio * 2), ticks from 1.25 to 3.75. Legend entries
recovered: Intermediate Content; Loss of Chromosome.]
`
`
`
Figure 2 (Sheet 2 of 23). Sequence-Based Karyotyping Profile.
[Plot not reproduced; the panel title and axis labels are largely
illegible in this copy. Recoverable: Y-axis in copies per haploid
genome; X-axis in MB.]
`
`
`
Figure 3 (Sheet 3 of 23). Sequence-Based Karyotyping Profile.
[Plot not reproduced; the panel title and axis labels are largely
illegible in this copy. Recoverable: Y-axis in copies per haploid
genome; X-axis in MB, ticks from 0 to 200.]
`
`
`
FIG. 4A (Sheet 4 of 23). [Schematic not reproduced. Labels: Whole
Genome Sequencing; Sequence-Based Karyotyping; Sequence-Based
Expression; Genome-Wide Methylation; Cell Population Sequencing;
Complex Sample Sequencing.]

FIG. 4B (Sheet 4 of 23). [Schematic not reproduced. Application
labels: INFECTIOUS DISEASE; INFLAMMATION; DIAGNOSTICS; ONCOLOGY.
Method labels: Whole-Genome Sequencing; Genome-Wide Methylation;
Complex Sample Sequencing; Sequence-Based Karyotyping;
Sequence-Based Expression; Cell Population Sequencing.]
`
`
`
[Sheet 5 of 23. Schematics not reproduced; figure numbers are not
legible on this sheet. Legible labels include: Genome; Diagnostics;
Public Health; Virus and Bacteria; Human. The remaining labels,
including a bulleted list of sample-preparation and loading steps,
are illegible in this copy.]
`
`
`
[Sheet 6 of 23. Upper schematic not reproduced; figure number not
legible. Labels: Million wells available per plate; Reactants
diffuse in; Products diffuse out; Photons Generated and Detected.]

FIG. 8 (Sheet 6 of 23). [Schematic not reproduced. Labels: Sequence
overlapping fragments from each genome; assemble into sequence of
whole genome. Identify similar genes as key intervention points for
broad-based antibiotics; identify genes contributing to drug
resistance; identify pathways by conservation of sets of genes.]
`
`
`
FIG. 9 (Sheet 7 of 23). [Schematic not reproduced. Labels: Genome B
(normal reference); Genome A (disease). Sequence fragments from each
genome and locate individual fragments on map of human chromosomes.
Compare to identify regions that are amplified (potential oncogenes
and targets) and regions that are lost (potential tumor suppressor
genes). Identify other defects in chromosome composition.]

FIG. 10 (Sheet 7 of 23). [Schematic not reproduced. Labels: Tissue B
(normal reference); Tissue A (disease). Sequence fragment of each
RNA (cDNA); count percentage (or number) of the time each gene is
found. Compare among samples to determine significant differences in
gene expression or gene splicing.]
`
`
`
[Sheet 8 of 23. Schematic not reproduced; figure number not legible.
Labels: Genome A (Treated); Genome B (Untreated). Treat sample with
sodium bisulfite [one word illegible] methyl-cytosine bases.
Non-methylated cytosines become uracils. Compare to reference
samples to determine sites of methylation in response to ageing,
disease progression, drug treatment or other factors.]
`
`
`
FIG. 13A and FIG. 13B (Sheet 9 of 23). [Schematics not reproduced.
Legible labels: Enrich for Beads that Have Both Amplified Bait and
Prey Genes; Cells with genes of interest.]
`
`
`
[Sheet 10 of 23. FIG. 14 panels, universal adaptor design. The page
was scanned rotated and its text is not recoverable from this copy.]
`
`
`
FIGS. 15A-15D (Sheet 11 of 23). [Schematics not reproduced. Each
panel shows a gDNA fragment (>200 bp) flanked by Universal Adaptor A
(44 bp) and Universal Adaptor B (44 bp), with a 5' biotin and with
Nick 1 and Nick 2 marked. Panel captions: 15A, nicked
double-stranded DNA, addition of Bst DNA Polymerase; 15B, Bst DNA
Polymerase binds single-stranded gaps, strand displaces nicked
strand and extends fragment; 15C, result is non-nicked
double-stranded DNA fragment; 15D, no caption legible.]
`
`
`
FIGS. 16A and 16B (Sheet 12 of 23). [Images not reproduced; no text
recovered.]
`
`
`
FIG. 17 (Sheet 13 of 23). Schematic Process Flow for Bead
Separation. [Schematic not reproduced.]

FIG. 18 (Sheet 13 of 23). [Schematic not reproduced; no text
recovered.]

FIG. 19 (Sheet 13 of 23). Primer Candidates by Tm. 8x19x19x19x9
tetrads (493,848 total possibilities). Tm = 2*(A+T) + 4*(G+C).
[Bar chart not reproduced. X-axis bins: 60 to 62, 62 to 64, 64 to
66, 66 to 68, 68 to 70, 70 to 72. Y-axis ticks up to 150000.]
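The formula legible in FIG. 19 appears to be the simple rule Tm = 2*(A+T) + 4*(G+C), with candidates grouped into 2-degree bins. The following is a minimal sketch of that calculation only. The function names `wallace_tm` and `bin_by_tm` are hypothetical, and how the figure's candidate tetrads are composed is not stated in the recovered text, so no attempt is made to reproduce its counts.

```python
from collections import Counter


def wallace_tm(seq):
    """Estimate melting temperature as 2*(A+T) + 4*(G+C), in degrees C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def bin_by_tm(candidates, width=2):
    """Group candidate primers into half-open Tm bins [lo, lo + width)."""
    bins = Counter()
    for seq in candidates:
        tm = wallace_tm(seq)
        lo = tm - tm % width
        bins[(lo, lo + width)] += 1
    return bins
```

A 20-mer with ten A/T and ten G/C bases scores 2*10 + 4*10 = 60 and falls in the (60, 62) bin.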
`
`
`
[Sheet 14 of 23. Image not reproduced; no figure label or text
recovered.]
`
`
`
Figure 21 and Figure 21 (con't) (Sheets 15 and 16 of 23).
[Schematics not reproduced. Legible labels include: NHS-Activated
Sepharose Bead; Template DNA; FWD Strand; REV Strand; ddNTP; Biotin;
Streptavidin; 1st (FWD) Strand; 2nd (REV) Strand. Other labels and
the sequence traces are illegible in this copy.]
`
`
`
Figure 22 (Sheet 17 of 23). [Image not reproduced; no text
recovered.]

Figure 23 (Sheet 18 of 23). [Image not reproduced; no text
recovered.]
`
`
`
Figure 24 (Sheet 19 of 23). [Schematic not reproduced. Step labels
are only partly legible; they appear to read: Termination with
ddNTPs/dNTPs (CAP); Deblocking 2nd primer with CIAP; Sequencing 2nd
segment [CONTINUE].]
`
`
`
Figure 25 (Sheet 20 of 23). [Panels A and B not reproduced. Legible
labels include: Staphylococcus; Overlapping regions; Paired Read;
Unpaired Read; Total 2nd Strand; position ticks from 24900 to
25250.]
`
`
`
Figure 26 (Sheet 21 of 23). [Histogram not reproduced. Y-axis:
Number of wells. X-axis: Read Length (bases), ticks from 0 to 140.
An annotation giving the average read length in bases is only
partly legible.]
`
`
`
Figure 27 (Sheet 22 of 23). [Histogram not reproduced. Y-axis:
Number of wells, ticks up to 1800. X-axis: Genome Span, ticks from
40 to 560.]
`
`
`
Figure 28 (Sheet 23 of 23). [Table of output and alignment strings
not reproduced; its text is not recoverable from this copy.]
`
`SEQUENCE-BASED KARYOTYPING
`
`RELATED APPLICATIONS
`
`[0001] This application claims the benefit of priority from
U.S. Application Nos. 60/513,691 and 60/513,319, both
`filed Oct. 22, 2003. All patents and patent applications
`referenced in this specification are hereby incorporated by
`reference herein in their entireties.
`
`FIELD OF THE INVENTION
`
`[0002] The invention relates to the field of genetics. In
`particular, it relates to the determination of karyotypes of
genomes of individual cells and organisms.
`
`BACKGROUND OF THE INVENTION
`
`[0003] Structural rearrangements of chromosomes have
`played a decisive role in the development of abnormalities
`in animals. It is also known that inversions, translocations,
`fusions, fissions, heterochromatin variations and other chro-
`mosomal changes occur as transient somatic or hereditary
`mutation events in natural populations. In human cancer,
`chromosomal changes, including deletion of tumor suppres-
`sor genes and amplification of oncogenes, are hallmarks of
`neoplasia (1). Single copy changes in specific chromosomes
`or smaller regions can result in a number of developmental
`disorders, including Down, Prader Willi, Angelman, and cri
du chat syndromes (2). Current methods for analysis of
cellular genetic content include comparative genomic
hybridization (CGH) (3), representational difference analysis
(4), spectral karyotyping/M-FISH (5, 6), microarrays
(7-10), and traditional cytogenetics. Such techniques have
aided in the identification of genetic aberrations in human
malignancies and other diseases (11-14). However, methods
employing metaphase chromosomes have a limited mapping
resolution (about 20 Mb) (15) and therefore cannot be used
to detect smaller alterations. Recent implementations of
comparative genomic hybridization on microarrays containing
genomic or transcript DNA sequences provide improved
resolution, but are currently limited by the number of
sequences that can be assessed (16) or by the difficulty of
`detecting certain alterations (9). There is a continuing need
`in the art for methods of analyzing and comparing genomes.
`
[0004] Traditional karyotyping is usually performed on
lymphocytes and amniocytes using labor-intensive methods
such as Giemsa staining (G-banding). Because chromosomes
are visualized on an optical microscope, the ability to
resolve detailed mutations (involving only a small part of a
chromosome) is limited. While more detailed karyotyping
techniques, such as FISH (fluorescent in situ hybridization),
are available, they rely on specific probes and it is not
economically or technically feasible to perform FISH on the
entire chromosome set (i.e., the complete genome).
`
[0005] In recent work, a method was provided for karyotyping
a genome of a test eukaryotic cell by generating a
`population of sequence tags after restriction endonuclease
`digestion from defined portions of the genome of a test cell
`(17). This method is not optimal because a small number of
`areas of the genome are expected to have a lower density of
`restriction endonuclease cleavage sites and could be incom-
`pletely evaluated. The authors estimate these areas to
`encompass 5% of a genome. Furthermore, the resolution of
the method is dependent on the restriction enzyme used and
the method cannot reliably detect very small regions of the
`genome on the order of several thousand base pairs or less.
`
`[0006] Very recently, a new type of human polymorphism
`in genomic DNA has been described, in which small gene
`regions are repeated or deleted (18). These changes, known
`as Copy Number Polymorphisms (CNPs), may account for
`a variety of human disease conditions. New methods of
`analysis will be needed to identify these polymorphisms and
`thereby detect a wide variety of human or animal diseases or
`the traits of any eukaryotic organism including humans,
`non-human animals and plants.
`
`BRIEF SUMMARY OF THE INVENTION
`
`[0007] The current invention provides for a method of
`karyotyping a genome of a test cell (e.g., eukaryotic or
`prokaryotic) by generating a pool of fragments of genomic
`DNA by a random fragmentation method, determining the
`DNA sequence of at least 20 base pairs of each fragment,
`mapping the fragments to the genomic scaffold of the
`organism, and comparing the distribution of the fragments
`relative to a reference genome or relative to the distribution
`expected by chance. The number of a plurality of sequences
`mapping within a given window in the population is com-
`pared to the number of said plurality of sequences expected
`to have been sampled within that window or to the number
`determined to be present in a karyotypically normal genome
`of the species of the cell. A difference in the number of the
`plurality of sequences within the window present in the
`population from the number calculated to be present in the
`genome of the cell indicates a karyotypic abnormality.
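The window-count comparison summarized in paragraph [0007] can be sketched as follows. This is an illustration under stated assumptions, not the applicants' implementation: the function names `window_counts` and `flag_windows`, the depth rescaling, the conversion of a ratio to copies by multiplying by 2, and the tolerance of 0.5 copies are all choices made for the example. Fragment positions are assumed to be already mapped to a single chromosome.

```python
from collections import Counter


def window_counts(positions, window_size):
    """Count mapped fragment positions falling in each fixed-size window."""
    return Counter(pos // window_size for pos in positions)


def flag_windows(test_positions, ref_positions, window_size, tolerance=0.5):
    """Return {window_index: estimated copies} for windows whose estimate
    deviates from the expected 2.0 copies by more than `tolerance`."""
    if not test_positions or not ref_positions:
        raise ValueError("both samples need at least one mapped fragment")
    test = window_counts(test_positions, window_size)
    ref = window_counts(ref_positions, window_size)
    # Rescale so the two samples are compared at equal sequencing depth.
    scale = len(ref_positions) / len(test_positions)
    flagged = {}
    # Windows with no reference fragments are skipped (no baseline).
    for window in sorted(ref):
        ratio = test[window] * scale / ref[window]
        copies = 2.0 * ratio
        if abs(copies - 2.0) > tolerance:
            flagged[window] = copies
    return flagged
```

Comparing against a reference sample rather than a uniform expectation means that regions that map poorly in every sample do not register as losses.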
`
`[0008] Other embodiments, objects, aspects, features, and
`advantages of the invention will be apparent from the
`accompanying description and claims.
`
[0009] The present invention provides for a method of
karyotyping a genome. The genome of the cell is karyotyped
by randomly fragmenting the DNA from a cell and sequencing
at least a portion of each fragment. Optimally, at least 20
base pairs of each fragment are sequenced. For example, the
DNA is fragmented by an enzyme that cleaves DNA. The
enzyme cleaves at specific locations within the DNA.
Alternatively, the enzyme cleaves the DNA randomly, i.e.,
non-specifically. For example, the enzyme is DNase.
Alternatively, the DNA is cleaved by a mechanical method
such as sonication or nebulization. The DNA is sequenced by
methods known in the art.
`
[0010] Preferably, the test cell and the reference cell are
from the same species. The cell is a eukaryotic cell or a
prokaryotic cell. The eukaryotic cell is a mammalian cell. The
mammal is, e.g., a human, non-human primate, mouse, rat,
dog, cat, horse, or cow. The cell is a cancer cell, an
embryonic cell, or a fetal cell. The cell is isolated from
amniotic fluid or is derived from in vitro fertilization.
Optionally, the cell is from a subject with a hereditary
disorder.
`
[0011] The plurality of DNA sequences obtained is
mapped to a genomic scaffold to create a distribution of
mapped sequences to a region of the genome. At least 1000,
10,000, 100,000, 1,000,000 or more sequences are mapped.
The sequences map to one or more regions in the genome.
The regions are on the same chromosome. Alternatively, the
regions are on different chromosomes. The distributions are
within a contiguous region of the genome. Alternatively, the
distributions are within discontiguous regions of the
genome, e.g., on different chromosomes.
`
`[0012] By mapping to a genomic scaffold is meant that the
`sequences are aligned along each chromosome. The test cell
distribution (i.e., chromosomal map density) is defined as
the number of mapped sequences (i.e., fragments) divided by
the number of possible map locations present in a given
chromosome. The number of possible map locations is defined
`by the size of the observation window and the length of the
`chromosome. No particular length is implied by the term
`observation window. For example, the observation window
`is 25 Mb, 10 Mb, 5 Mb, 4 Mb, 2 Mb, 500 kb, 250 kb, 60 kb,
`30 kb, or 10 kb or less in length.
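One plausible reading of the chromosomal map density in paragraph [0012] is sketched below. The specification does not give an explicit formula, so this is an assumption: the number of possible map locations is taken to be the number of observation windows needed to tile the chromosome, and the density is the mapped fragment count divided by that number. The names `possible_locations` and `map_density` are hypothetical.

```python
def possible_locations(chromosome_length, window_size):
    """Number of observation windows needed to tile a chromosome."""
    if chromosome_length <= 0 or window_size <= 0:
        raise ValueError("lengths must be positive")
    # Integer ceiling division; a final partial window still counts.
    return -(-chromosome_length // window_size)


def map_density(n_mapped, chromosome_length, window_size):
    """Mapped fragments per possible map location on one chromosome."""
    return n_mapped / possible_locations(chromosome_length, window_size)
```

For instance, 5,000 fragments mapped to a 100 Mb chromosome observed in 4 Mb windows give 25 locations and a density of 200 fragments per window.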
`
[0013] The test distribution is compared to a reference
distribution from a reference cell and an alteration between
the test distribution and the reference distribution is
identified. The reference distribution can be a database of mapped
sequences from previously tested cells. Identification of an
alteration indicates a karyotypic difference between the test
cell and the reference cell. The alteration is statistically
significant. By statistically significant is meant that the
alteration is greater than what might be expected to happen
by chance alone. Statistical significance is determined by
methods known in the art. An alteration is statistically
significant if the p-value is 0.05 or less. The p-value is a
measure of the probability that a difference between groups
during an experiment happened by chance (P(z ≥ z_observed)).
For example, a p-value of 0.01 means that there is a 1 in 100
chance the result occurred by chance. The lower the p-value,
the more likely it is that the difference between groups is
caused by a karyotypic difference. Preferably, the p-value is
0.04, 0.03, 0.02, 0.01, 0.005, 0.001 or less. Alternatively, the
p-value is 1/24, 1/23 or 1/22 or less.
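A p-value of the form P(z ≥ z_observed) can be computed as sketched below. The specification leaves the statistical method to the art, so the choices here are assumptions made for illustration: fragment counts in a window are treated as approximately Poisson, the z-score uses the normal approximation, and the upper tail of the standard normal distribution is evaluated with `math.erfc`. The names `window_z` and `upper_tail_p` are hypothetical.

```python
import math


def window_z(observed, expected):
    """z-score of an observed window count against its expected count,
    assuming Poisson-like counts (variance equal to the mean)."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) / math.sqrt(expected)


def upper_tail_p(z_observed):
    """One-sided p-value P(z >= z_observed) under a standard normal."""
    return 0.5 * math.erfc(z_observed / math.sqrt(2.0))


def is_significant(observed, expected, alpha=0.05):
    """True when the excess over the expected count has p-value <= alpha."""
    return upper_tail_p(window_z(observed, expected)) <= alpha
```

A window with 130 fragments where 100 were expected has z = 3.0 and a one-sided p-value of roughly 0.0013, which meets the 0.05, 0.01 and 0.005 thresholds listed in the paragraph above. A deficit of fragments would be tested with the lower tail instead.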
`
[0014] The method of the invention is useful in detecting
aneuploidy. For example, aneuploidy is detected when the
ratio of the test distribution to the reference distribution
is greater than 1.5 or less than 0.75. However, if the test
region and reference region are in a sex chromosome and the
cells are from a subject of the opposite sex, aneuploidy is
detected when the ratio of the test distribution to the
reference distribution is greater than 3.0 or less than 1.5.
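The threshold test in paragraph [0014] can be written as a small classifier. The cut-offs (1.5 and 0.75, or 3.0 and 1.5 for a sex chromosome compared across sexes) are taken from the text as stated; the function name `classify_ratio` and the labels "gain", "loss" and "normal" are assumptions added for readability.

```python
def classify_ratio(ratio, opposite_sex_chromosome=False):
    """Classify a test/reference distribution ratio using the cut-offs
    quoted in paragraph [0014]."""
    if opposite_sex_chromosome:
        high, low = 3.0, 1.5
    else:
        high, low = 1.5, 0.75
    if ratio > high:
        return "gain"
    if ratio < low:
        return "loss"
    return "normal"
```

For example, a ratio of 1.6 on an autosome is classified as a gain, whereas a ratio of 2.0 on a sex chromosome compared across sexes falls inside the wider band and is classified as normal.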
`
`[0015] Unless otherwise defined, all technical and scien-
`tific terms used herein have the same meaning as commonly
`understood by one of ordinary skill in the art to which this
`invention belongs. Although methods and materials similar
`or equivalent to those described herein can be used in the
`practice or testing of the present invention, suitable methods
`and materials are described below. All publications, patent
`applications, patents, and other references mentioned herein
`are incorporated by reference in their entirety. In the case of
`conflict, the present specification, including definitions, will
`control. In addition, the materials, methods, and examples
`are illustrative only and not intended to be limiting.
`
`[0016] Other features and advantages of the invention will
`be apparent from the following detailed description and
`claims.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0017] FIG. 1. Chromosome Content computed using
Sequence-Based Karyotyping data is highly correlated with
previously published estimates using the Digital Karyotyping
method. Each point represents a chromosome, with
extreme values representing an extra (>3.0) or the loss (<1.5)
of a whole chromosome.

[0018] FIG. 2. 4 Mb resolution fragment density maps
identifying regions of amplification and deletion.
Amplification on chromosome 7. Center panel represents
Sequence-Based Karyotyping 4 Mb density map as compared to the
approximately 4 Mb published maps (inset, top right).

[0019] FIG. 3. 4 Mb resolution fragment density maps
identifying regions of amplification and deletion.
Chromosomal content across chromosome 2. Center panel
represents Sequence-Based Karyotyping 4 Mb density map as
compared to the approximately 4 Mb published maps (inset,
top right).

[0020] FIG. 4A. Schematic depicting the methods of the
invention and various embodiments for these methods.

[0021] FIG. 4B. Schematic depicting exemplary therapeutic
and diagnostic applications for the disclosed methods,
including infectious disease, oncology, inflammation, and
disease diagnostics.

[0022] FIG. 5. Schematic depicting exemplary fields for
use of the disclosed methods, including agriculture and
industry, drugs and diagnostics, bio-defense and public
health, and academia and government.

[0023] FIG. 6. Schematic depicting an overview of
sample preparation for the disclosed sequencing methods.

[0024] FIG. 7. Schematic depicting an overview of
Parallel Sequencing™.

[0025] FIG. 8. Schematic depicting a comparison method
used for whole-genome sequencing.

[0026] FIG. 9. Schematic depicting an overview of
Sequence-Based Karyotyping.

[0027] FIG. 10. Schematic depicting an overview of
sequence-based gene expression analysis.

[0028] FIG. 11. Schematic depicting an overview of
genome-wide methylation analysis.

[0029] FIG. 12. Schematic depicting an approach for
complex-sample sequencing.

[0030] FIG. 13A. Schematic depicting the first and second
steps for the cell population sequencing method.

[0031] FIG. 13B. Schematic depicting the third through
seventh step for the cell population sequencing method.

[0032] FIG. 14 Schematic representation of the universal
adaptor design according to the present invention. Each
universal adaptor is generated from two complementary
ssDNA oligonucleotides that are designed to contain a 20 bp
nucleotide sequence for PCR priming, a 20 bp nucleotide
sequence for sequence priming and a unique 4 bp discriminating
sequence comprised of a non-repeating nucleotide
sequence (i.e., ACGT, CAGT, etc.). FIG. 14 depicts a
representative universal adaptor sequence pair for use with
the invention. FIG. 14 depicts a schematic representation of
universal adaptor design for use with the invention.
`
[0033] FIG. 15 Depicts the strand displacement and
extension of nicked double-stranded DNA fragments
according to the present invention. Following the ligation of
universal adaptors generated from synthetic oligonucleotides,
double-stranded DNA fragments will be generated
that contain two nicked regions following T4 DNA ligase
treatment (FIG. 15). The addition of a strand displacing
enzyme (i.e., Bst DNA polymerase I) will bind nicks (FIG.
15), strand displace the nicked strand and complete
nucleotide extension of the strand (FIG. 15) to produce
non-nicked double-stranded DNA fragments (FIG. 15).
`
`[0034] FIG. 16 Schematic of one embodiment of a bead
`emulsion amplification process.
`
`[0035] FIG. 17 Schematic of an enrichment process to
`remove beads that do not have any DNA attached thereto.
`
`[0036] FIG. 18 Depicts an insert flanked by PCR primers
`and sequencing primers.
`
`[0037] FIG. 19 Depicts the calculation for primer candi-
`dates based on melting temperature.
`
`[0038] FIG. 20 Depicts the assembly for the nebulizer
used for the methods of the invention. A tube cap was placed
`over the top of the nebulizer (FIG. 20) and the cap was
`secured with a nebulizer clamp assembly (FIG. 20). The
`bottom of the nebulizer was attached to the nitrogen supply
`(FIG. 20) and the entire device was wrapped in parafilm
`(FIG. 20).
`
`[0039] FIGS. 21A-F Depict an exemplary double ended
`sequencing process.
`
`[0040] FIG. 22 Depiction of jig used to hold tubes on the
`stir plate below vertical syringe pump. The jig was modified
`to hold three sets of bead emulsion amplification reaction
`mixtures. The syringe was loaded with the PCR reaction
`mixture and beads.
`
`[0041] FIG. 23 Depiction of beads (see arrows) suspended
`in individual microreactors according to the methods of the
`invention.
`
`[0042] FIG. 24 Depicts a schematic representation of a
`preferred method of double stranded sequencing.
`
`[0043] FIG. 25 Illustrates the results of sequencing a
`Staphylococcus aureus genome.
`
`[0044] FIG. 26 Illustrates the average read lengths in one
`experiment involving double ended sequencing.
`
`[0045] FIG. 27 Illustrates the number of wells for each
`genome span in a double ended sequencing experiment.
`
[0046] FIG. 28 Illustrates a typical output and alignment
string from a double ended sequencing procedure.
Sequences shown in order, from top to bottom: SEQ ID NO:
12-SEQ ID NO:25.
`
`[0047] For FIGS. 1, 2, and 3, graph values on the Y-axis
`indicate genome copies per haploid genome, and values on
`the X-axis represent position along chromosome.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
[0048] The term “karyotype” refers to the genomic
characteristics of an individual cell or cell line of a given species;
e.g., as defined by both the number and morphology of the
chromosomes. Typically, the karyotype is presented as a
systematized array of prophase or metaphase (or otherwise
condensed) chromosomes from a photomicrograph or
computer-generated image. Alternatively, interphase
chromosomes may be examined as histone-depleted DNA fibers
released from interphase cell nuclei. In one embodiment, the
karyotyping methods of this invention are also used to
determine Copy Number Polymorphisms in a test cell or a
test genome. Since the Sequence-Based Karyotyping methods
may be performed on prokaryotic cells, the presence of
chromosomes is not essential for the methods of the
invention.

[0049] As used herein, “chromosomal aberration” or
“chromosome abnormality” refers to a deviation between
the structure of the subject chromosome or karyotype and a
normal (i.e., “non-aberrant”) homologous chromosome or
karyotype. The terms “normal” or “non-aberrant,” when
referring to chromosomes or karyotypes, refer to the
predominant karyotype or banding pattern found in healthy
individuals of a particular species and gender. Chromosome
abnormalities can be numerical or structural in nature, and
include aneuploidy, polyploidy, inversion, translocation,
deletion, duplication, and the like. Chromosome abnormalities
may be correlated with the presence of a pathological
condition (e.g., trisomy 21 in Down syndrome, chromosome
5p deletion in the cri-du-chat syndrome, and a wide variety
of unbalanced chromosomal rearrangements leading to
dysmorphology and mental impairment) or with a predisposition
to developing a pathological condition. Chromosome
abnormality also refers to genomic abnormality for the
purposes of this disclosure where the test organism (e.g.,
prokaryotic cell) may not have a classically defined
chromosome. Furthermore, chromosome abnormality includes
any sort of genetic abnormality including those that are not
normally visible on a traditional karyotype using optical
microscopes, traditional staining, or FISH. One advantage of
the present invention is that chromosomal abnormalities
previously undetectable by optical methods (e.g., abnormalities
involving 4 Mb, 600 kb, 200 kb, 40 kb or smaller) can be
detected.

[0050] As used herein, the term “universal adaptor” refers
to two complementary and annealed oligonucleotides that
are designed to contain a nucleotide sequence for PCR
priming and a nucleotide sequence for sequence priming.
Optionally, the universal adaptor may further include a
unique discriminating key sequence comprised of a
non-repeating nucleotide sequence (i.e., ACGT, CAGT, etc.). A
set of universal adaptors comprises two unique and distinct
double-stranded sequences that can be ligated to the ends of
double-stranded DNA. Therefore, the same universal adaptor
or different universal adaptors can be ligated to either end
of the DNA molecule. When comprised in a larger DNA
molecule that is single stranded or when present as an
oligonucleotide, the universal adaptor may be referred to as
a single stranded universal adaptor.

[0051] “Target DNA” shall mean a DNA whose sequence
is to be determined by the methods and apparatus of the
invention. These include a test genome or a reference
genome.
`
[0052] Binding pair shall mean a pair of molecules that
interact by means of specific non-covalent interactions that
depend on the three-dimensional structures of the molecules
involved. Typical pairs of specific binding partners include
antigen-antibody, hapten-antibody, hormone-receptor,
nucleic acid strand-complementary nucleic acid strand,
substrate-enzyme, substrate analog-enzyme, inhibitor-enzyme,
carbohydrate-lectin, biotin-avidin, and virus-cellular
receptor.
`
[0053] As used herein, the term “discriminating key
sequence” refers to a sequence consisting of at least one of
`each of the four deoxyribonucleotides (i.e., A, C, G, T). The
`same discriminating sequence can be used for an entire
`library of DNA fragments. Alternatively, different discrimi-
`nating key sequences can be used to track libraries of DNA
`fragments derived from different organisms.
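The definition in paragraph [0053] lends itself to a simple check. The sketch below tests whether a candidate sequence consists only of the four deoxyribonucleotides and contains each at least once, and enumerates the non-repeating four-base keys of the kind given as examples elsewhere in the specification (ACGT, CAGT). The function names `is_discriminating_key` and `non_repeating_keys` are hypothetical and are not part of the disclosure.

```python
from itertools import permutations

BASES = "ACGT"


def is_discriminating_key(seq):
    """True if seq uses only A, C, G, T and contains each at least once."""
    return len(seq) > 0 and set(seq.upper()) == set(BASES)


def non_repeating_keys():
    """All four-base keys in which each nucleotide appears exactly once."""
    return ["".join(p) for p in permutations(BASES)]
```

There are 24 such four-base keys, which bounds how many libraries could be told apart with a non-repeating 4 bp key alone.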
`
`[0054] As used herein, the term “plurality of molecules”
`refers to DNA isolated from the same source, whereby
`different organisms may be prepared separately by the same
`method. In one embodiment, the plurality of DNA samples
`is derived from large segments of DNA, whole genome
`DNA, cDNA, viral DNA or from reverse transcripts of viral
`RNA. This DNA may be derived from any source, including
`mammal (i.e., human, nonhuman primate, rodent or canine),
`plant, bird, reptile, fish, fungus, bacteria or virus.
`
`[0055] As used herein, the term “library” refers to a subset
`of smaller sized DNA species generated from a single DNA
`template, either segmented or whole genome.
`
`[0056] As used herein, the term “un