From: John West
To: Richard Chen; Hugo Lam; Mark Pratt; Scott Kirk
Subject: Spreadsheet for 2pm discussion of Accuracy Differentiation
Date: Tuesday, May 22, 2012 1:17:12 PM
Attachments: Accuracy differentiation JW 22May2012.xls

All,

For our meeting this afternoon on accuracy differentiation, I have put together some notes to serve as a starting point. They are attached in Excel. Comments are welcome.

John

Personalis EX2054.1

ACCURACY DIFFERENTIATION
Draft for discussion, JW, 22 May, 2012

ORGANIZATION OF THIS FILE
Company differentiation around accuracy
Better sequencing in the laboratory
Better variant detection & reporting
More accurate databases, for interpretation

COMPANY DIFFERENTIATION AROUND ACCURACY
Focus on accuracy relevant to medical interpretation
Better understanding of the issues than anyone
Best track record of publications on the issue
Unbiased from a platform standpoint & able to combine platforms
More comprehensive data on accuracy than anyone
World's only collection of genomes sequenced on both ILMN and CG platforms, plus arrays and karyotyping
Largest family pedigree sequenced to high coverage
Only genome sequenced on Sanger, ILMN and CGI
Databases more accurate than those publicly available
Able to provide a detailed quantitative view of mechanisms underlying errors
Deep understanding of accuracy issues used to create better results
Not just flagging errors, or filtering out those loci, but fixing the problems
More insightful approaches deliver accuracy affordably

BETTER SEQUENCING IN THE LABORATORY (WHEN MANAGED BY PERSONALIS)

Focus on getting the whole medically-interpretable genome, accurately, even if more expensive
Use insight into error types & medical content to keep this affordable

Combine data from multiple different runs of a single platform
Combine paired-end libraries made with multiple insert lengths
Use longer read lengths (e.g. 2 x 250 bases, when available later in 2012)
More expensive because only MiSeq, but clearly better
Combine with bulk shorter-read data from HiSeq
Substantially more efficient at split-read & junction-sequence SV detection
Key to single-base breakpoint determination

Combine data from multiple platforms
More experience with Illumina / Complete Genomics than anyone
May add Ion Torrent, Oxford Nanopore
Guided by deep understanding of differential error mechanisms in each platform
Not tied to any one platform
Use whatever it takes to get the best possible combination

Combine data from outside next-gen sequencing

Several major areas of medical genetics are not well assayed by next gen sequencing
Example 1 : Diseases caused by STR-expansion (e.g. Huntington's)
Example 2 : Robertsonian translocations

Personalis will combine NGS data with Non-NGS technologies to create a complete assessment
Add karyotyping
Add electrophoresis where appropriate (TBD)
Others

Orthogonal technologies also provide validation of NGS results
Integrate NGS with array (fluorescence for SV's, in addition to genotypes)
Sanger follow-up to findings of specific genomes (option TBD)

Ability to create semi-custom products focused on medically interpretable parts of the genome
Leverages Personalis advantages in content
Custom hybridization array
Custom pullout set
Other assays to fix specific error types

Question : Should there be a "Personalis exome" option ?
More comprehensive / accurate at exome price level ?

What is proprietary about this approach ?
Personalis can focus its efforts based on the world's best content (re medical interpretation)
Personalis will develop proprietary understanding about how best to combine multiple technologies
Personalis does not face the competitive & anti-trust barriers that platform companies do
Personalis' people combine deep experience with both platforms and interpretation & can leverage the two against each other
Personalis can combine work in the lab and in bioinformatics, in a way that pure informatics companies can't

BETTER VARIANT DETECTION & REPORTING

Fewer false positive SNP's due to method of generating laboratory data
Combination of paired-end insert lengths covers more of the genome uniquely

Fewer false negative SNP's due to method of generating laboratory data
More uniform coverage by combining library prep methods & platforms

Orthogonal validation of millions of SNP genotypes by array
Integrated with next gen sequencing data, not just another separate report

Better alignment, due to better reference sequence
SNP major allele ref by ethnicity (in our first product)
InDel major allele ref by ethnicity (later)
Other advances as R&D develops them:
Iterative alignment

# TBD changes in SNP alleles called with Personalis reference vs public standard
Likely more improvement in non-European ethnicities
Include (eventually) changes in InDels called as well

We provide the only support available specifically for admixed genomes
Major-allele non-ethnic reference, or even more advanced options

Focused effort to align in the presence of SNP / InDel clusters, MNP's
May leverage Hugo's BreakSeq approach
May need time in the development plan

Better SNP reporting due to better reference

We report variants when sequence is a homozygous match to the public ref but that's the minor allele
Entirely missed by systems which use the public reference
Example : Factor V Leiden
Rong had a whole paper on all the disease variants in the public ref
> 1M loci where we can be different
We should calculate the average # actual loci / genome, by ethnicity

At het loci, we report both alleles, but we report the minor allele as the variant
Not the allele which is different from the public reference

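The reporting rule above can be illustrated with a minimal sketch. It is not taken from the spreadsheet: the `Locus` type, the function names and the allele frequencies are all hypothetical, chosen only to contrast calling variants against the public reference with calling them against the population major allele.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    public_ref: str      # allele carried by the public reference at this position
    allele_freqs: dict   # allele -> frequency in the matching population (hypothetical values)


def public_ref_variants(genotype, locus):
    """Conventional rule: report every allele that differs from the public reference."""
    return sorted({a for a in genotype if a != locus.public_ref})


def major_allele_variants(genotype, locus):
    """Rule described in the notes: report every allele that is not the major allele."""
    major = max(locus.allele_freqs, key=locus.allele_freqs.get)
    return sorted({a for a in genotype if a != major})


# Hypothetical locus where the public reference happens to carry the minor allele.
locus = Locus(public_ref="A", allele_freqs={"A": 0.03, "G": 0.97})

hom_minor = ("A", "A")   # homozygous match to the public reference
het = ("A", "G")

print(public_ref_variants(hom_minor, locus))    # [] : nothing is reported
print(major_allele_variants(hom_minor, locus))  # ['A'] : the minor allele is reported
print(public_ref_variants(het, locus))          # ['G'] : the common allele is flagged
print(major_allele_variants(het, locus))        # ['A'] : the minor allele is flagged
```

In this sketch a homozygous minor-allele genotype is invisible under the public-reference rule and reported under the major-allele rule, which is the case the notes describe as entirely missed by systems which use the public reference.
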
Better detection of SV's

Better lab data for SV detection:
Longer reads (better for approaches based on split-read & junction-sequences)
MiSeq 2x250 or other platform
Multiple insert lengths
Electrophoretic assay of STR-expansions
Karyotyping for Robertsonian translocations

Orthogonal technologies for validation of SV's:
Fluorescence intensity data from hybridization arrays

We combine the results from five different algorithmic approaches

We test our SV algorithms by Mendelian Inheritance in high coverage whole genome family data sets,
one of which was sequenced with ten different paired-end libraries spanning 200 - 40,000 bases,
and validate them using fluorescent intensities from high density hybridization arrays

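As an illustration of testing SV calls by Mendelian inheritance, the sketch below checks whether a child's genotype at an autosomal SV locus can be explained by the parents' genotypes. The genotype encoding (0, 1 or 2 copies of the SV allele) and the function names are assumptions made for this example, not details from the notes, and de novo events are ignored.

```python
def transmissible(copies):
    """Number of SV alleles a parent with this genotype can pass to a child."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[copies]


def is_mendelian_consistent(mother, father, child):
    """True if the child's SV allele count can arise from one allele per parent."""
    return any(
        m + f == child
        for m in transmissible(mother)
        for f in transmissible(father)
    )


def mendelian_error_rate(trios):
    """Fraction of (mother, father, child) genotype triples that violate inheritance."""
    trios = list(trios)
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(m, f, c) for m, f, c in trios)
    return errors / len(trios)


# Hypothetical calls at four loci in one trio.
calls = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 2, 1)]
print(mendelian_error_rate(calls))  # 0.5 : the second and fourth loci are inconsistent
```

A lower error rate across a pedigree suggests fewer false calls, although consistent calls can still be wrong when the same systematic error affects every family member.
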
We don't treat all SV's as novel - we have the world's best database of known SV's and their junction sequences
Detection is better when you know exactly what you are looking for
We should have a meeting to discuss how we can (easily ?) build this
Start with 1,000 Genomes result Hugo has helped create
Large data set but low coverage may make detection less certain in low MAF SV's
Others will be able to access this eventually, potentially catching up, or claiming to
Augment this with (more confident ?) SV's from:
Full coverage (30-40x) genomes (West, Altman, 40 Koreans, others we can download)
High coverage (>60x) genomes (Snyder, CEPH1463, Venter, others ?)

Better reporting of SV's
We determine the zygosity of deletions and report it
Deletions integrated with SNP report, e.g. "A-" vs "AA" inside a het deletion
We report SV's with their allele frequencies in the ethnicity matching the sample

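The "A-" vs "AA" example can be sketched as a small function that combines the alleles observed at a SNP position with the zygosity of an overlapping deletion. The function name, the input encoding and the "--" notation for a homozygous deletion are assumptions for illustration; only the "A-" and "AA" forms come from the notes.

```python
def integrated_genotype(observed_alleles, deletion_copies):
    """Combine SNP alleles with deletion zygosity (0, 1 or 2 deleted copies)."""
    alleles = sorted(set(observed_alleles))
    if deletion_copies == 2:
        if alleles:
            raise ValueError("no alleles expected inside a homozygous deletion")
        return "--"  # assumed notation, not from the source
    if deletion_copies == 1:
        if len(alleles) != 1:
            raise ValueError("exactly one allele expected inside a het deletion")
        return alleles[0] + "-"
    if deletion_copies == 0:
        if len(alleles) == 1:
            return alleles[0] * 2
        if len(alleles) == 2:
            return "".join(alleles)
        raise ValueError("one or two alleles expected at a diploid locus")
    raise ValueError("deletion_copies must be 0, 1 or 2")


print(integrated_genotype(["A"], 1))       # A-
print(integrated_genotype(["A"], 0))       # AA
print(integrated_genotype(["A", "G"], 0))  # AG
```

Without the deletion call, a single observed allele inside a het deletion would be reported as homozygous "AA", which overstates the number of copies present.
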
Flagging of potential errors
Many subtle error types not recognized by others
Error mechanisms underlying differences when the same person is sequenced twice (it's not just Poisson !)
Error loci determined from deep & multi-platform sequencing of large families
Error loci determined by extensive platform comparison, both NGS/NGS and NGS/Non-NGS
Detailed understanding of compressions, and large unpublished catalog of them

MORE ACCURATE DATABASES, FOR INTERPRETATION

Cleaner databases:
Well financed, systematic manual curation to industrial QC standard
Standardized medical language hierarchy
Extensive cross checking of databases developed independently
VariMed vs HGMD
MendelDB vs OMIM
Personalis PharmGKB vs public PGKB (need to be careful in this positioning)

Databases others will not have:
Regulome
BreakSeq (esp if augmented with private Personalis data)
Compression list (described in publications but not released)
Variant data derived from a broad collection of genomes
Multiple public data sets, some processed in proprietary ways by Personalis
Access to private data sets, sequenced by others
Access to private data sets, sequenced by Personalis