`
`Next-Generation Sequencing: Methodology and Application
Ayman Grada¹ and Kate Weinbrecht²
`Journal of Investigative Dermatology (2013) 133, e11; doi:10.1038/jid.2013.248
`
INTRODUCTION
`Nucleic acid sequencing is a method for determining
`the exact order of nucleotides present in a given DNA or
`RNA molecule. In the past decade, the use of nucleic acid
`sequencing has increased exponentially as the ability to
`sequence has become accessible to research and clini-
`cal labs all over the world. The first major foray into DNA
`sequencing was the Human Genome Project, a $3 billion,
13-year-long endeavor, completed in 2003. The Human
Genome Project was accomplished with first-generation
sequencing, known as Sanger sequencing. Sanger sequencing
(the chain-termination method), published in 1977
by Frederick Sanger and colleagues, was considered the gold
standard for nucleic acid sequencing for the subsequent two
and a half decades (Sanger et al., 1977).
`Since completion of the first human genome sequence,
`demand for cheaper and faster sequencing methods has
`increased greatly. This demand has driven the develop-
`ment of second-generation sequencing methods, or next-
`generation sequencing (NGS). NGS platforms perform
`massively parallel sequencing, during which millions of
`fragments of DNA from a single sample are sequenced in
`unison. Massively parallel sequencing technology facili-
`tates high-throughput sequencing, which allows an entire
`genome to be sequenced in less than one day. In the past
`decade, several NGS platforms have been developed that
`provide low-cost, high-throughput sequencing. Here we
`highlight two of the most commonly used platforms in
research and clinical labs today: the Life Technologies Ion
`Torrent Personal Genome Machine (PGM) and the Illumina
`MiSeq. The creation of these and other NGS platforms has
`made sequencing accessible to more labs, rapidly increas-
`ing the amount of research and clinical diagnostics being
`performed with nucleic acid sequencing.
`
OVERVIEW OF THE METHODOLOGY
Although each NGS platform is unique in how sequencing
is accomplished, the Ion Torrent PGM and the Illumina
MiSeq have a similar base methodology that includes template
preparation, sequencing and imaging, and data analysis
(Metzker, 2010). Within each generalized step, the individual
platforms discussed have unique aspects. An overview of the
sequencing methodologies discussed is provided in Figure 1.

WHAT NGS DOES
• NGS provides a much cheaper, higher-throughput
alternative to traditional Sanger sequencing for
sequencing DNA. Whole small genomes can now be
sequenced in a day.
• High-throughput sequencing of the human
genome facilitates the discovery of genes and
regulatory elements associated with disease.
• Targeted sequencing allows the identification
of disease-causing mutations for diagnosis of
pathological conditions.
• RNA-seq can provide information on the entire
transcriptome of a sample in a single analysis
without requiring previous knowledge of the
genetic sequence of an organism. This technique
offers a strong alternative to the use of microarrays
in gene expression studies.

LIMITATIONS
• NGS, although much less costly in time and money
in comparison to first-generation sequencing, is
still too expensive for many labs. NGS platforms
can cost more than $100,000 in start-up costs, and
individual sequencing reactions can cost upward of
$1,000 per genome.
• Inaccurate sequencing of homopolymer regions
(spans of repeating nucleotides) on certain NGS
platforms, including the Ion Torrent PGM, and
short sequencing read lengths (on average 200–500
nucleotides) can lead to sequence errors.
• Data analysis can be time-consuming and may
require special knowledge of bioinformatics to
garner accurate information from sequence data.
`
¹Department of Dermatology, Boston University School of Medicine, Boston, Massachusetts, USA and ²School of Forensic Sciences, Center for Health Sciences, Oklahoma State University, Tulsa, Oklahoma, USA
Correspondence: Ayman Grada, Department of Dermatology, Boston University School of Medicine, 609 Albany Street, Boston, Massachusetts 02118, USA.
E-mail: grada@bu.edu
`
`© 2013 The Society for Investigative Dermatology
`
`www.jidonline.org
`
Research Techniques Made Simple
`
`Template preparation
`Template preparation consists of building a library of nucleic
`acids (DNA or complementary DNA (cDNA)) and amplify-
`ing that library. Sequencing libraries are constructed by frag-
`menting the DNA (or cDNA) sample and ligating adapter
`sequences (synthetic oligonucleotides of a known sequence)
`onto the ends of the DNA fragments. Once constructed,
`libraries are clonally amplified in preparation for sequencing.
`The PGM utilizes emulsion PCR on the OneTouch system to
`amplify single library fragments onto microbeads, whereas
`the MiSeq utilizes bridge amplification to form template clus-
`ters on a flow cell (Berglund et al., 2011; Quail et al., 2012).
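
The library construction step described above (fragmentation followed by adapter ligation) can be sketched as a toy simulation. This is a minimal illustration only: the function names, the fragment size range, and the adapter strings are hypothetical placeholders, not vendor sequences or part of either platform's actual chemistry, and clonal amplification is not modeled.

```python
import random


def fragment(dna, min_len, max_len, rng):
    """Cut a sequence into consecutive pieces of random length.

    Toy model of fragmentation: sizes are drawn uniformly from
    [min_len, max_len]; the final piece may be shorter than min_len.
    """
    fragments = []
    pos = 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)
        fragments.append(dna[pos:pos + size])
        pos += size
    return fragments


def ligate_adapters(fragments, adapter_5, adapter_3):
    """Attach a known adapter sequence to each end of every fragment."""
    return [adapter_5 + frag + adapter_3 for frag in fragments]


# Hypothetical inputs chosen only for illustration.
rng = random.Random(0)
sample = "ACGT" * 25
pieces = fragment(sample, 10, 30, rng)
library = ligate_adapters(pieces, "AAAACCCC", "GGGGTTTT")
print(len(library), "library molecules")
```

Because the adapters have a known sequence, every library molecule carries the same known ends regardless of the insert, which is what allows a single set of primers to amplify and sequence the whole library.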
`
Sequencing and imaging
`To obtain nucleic acid sequence from the amplified librar-
`ies, the Ion Torrent PGM and the MiSeq both rely on
`sequencing by synthesis. The library fragments act as a
template from which a new DNA fragment is synthesized.
`The sequencing occurs through a cycle of washing and
`flooding the fragments with the known nucleotides in a
`sequential order. As nucleotides incorporate into the grow-
`ing DNA strand, they are digitally recorded as sequence.
`The PGM and the MiSeq each rely on a slightly different
`mechanism for detecting nucleotide sequence information.
`The PGM performs semiconductor sequencing that relies
`on the detection of pH changes induced by the release of
`a hydrogen ion upon the incorporation of a nucleotide into
`a growing strand of DNA (Quail et al., 2012). By contrast,
`the MiSeq relies on the detection of fluorescence generated
`by the incorporation of fluorescently labeled nucleotides
`into the growing strand of DNA (Quail et al., 2012).
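
The cycle of flooding the template with one known nucleotide at a time can be sketched as follows. This is a simplified model of flow-based sequencing by synthesis of the kind the PGM performs. The flow order "TACG", the function names, and the idealized noise-free signal are assumptions made for illustration, and the template is assumed to be written in the order in which the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def simulate_flows(template, flow_order="TACG"):
    """Return a flowgram: one (nucleotide, incorporations) pair per flow.

    Each flow offers a single nucleotide type. Every consecutive template
    position that pairs with it is extended in that same flow, so a
    homopolymer produces one larger signal rather than several signals.
    """
    # The strand being synthesized is complementary to the template.
    target = [COMPLEMENT[base] for base in template]
    pos = 0
    flow = 0
    flowgram = []
    while pos < len(target):
        nucleotide = flow_order[flow % len(flow_order)]
        count = 0
        while pos < len(target) and target[pos] == nucleotide:
            count += 1
            pos += 1
        flowgram.append((nucleotide, count))
        flow += 1
    return flowgram


def call_bases(flowgram):
    """Convert idealized flow signals back into the synthesized sequence."""
    return "".join(nucleotide * count for nucleotide, count in flowgram)


flowgram = simulate_flows("AACGT")
print(flowgram)
print(call_bases(flowgram))  # TTGCA
```

In this idealized model the homopolymer "AA" in the template appears as a single flow with a count of 2. On a real instrument that count has to be inferred from signal magnitude, which is one reason homopolymer lengths are harder to call accurately on flow-based platforms.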
`
Figure 1. Next-generation sequencing methodology.
`
Data analysis
`Once sequencing is complete, raw sequence data must
`undergo several analysis steps. A generalized data analysis
`pipeline for NGS data includes preprocessing the data to
`remove adapter sequences and low-quality reads, mapping
`of the data to a reference genome or de novo alignment of
`the sequence reads, and analysis of the compiled sequence.
`Analysis of the sequence can include a wide variety of bio-
`informatics assessments, including genetic variant calling for
`detection of SNPs or indels (i.e., the insertion or deletion of
`bases), detection of novel genes or regulatory elements, and
`assessment of transcript expression levels. Analysis can also
`include identification of both somatic and germline mutation
`events that may contribute to the diagnosis of a disease or
`genetic condition. Many free online tools and software pack-
`ages exist to perform the bioinformatics necessary to success-
`fully analyze sequence data (Gogol-Döring and Chen, 2012).
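
The preprocessing step of such a pipeline (removing adapter sequence and discarding low-quality reads) might look roughly like the sketch below. It assumes reads are given as (sequence, quality string) pairs with Phred+33 quality encoding, as in common FASTQ files. The adapter string, the thresholds, and the function names are illustrative assumptions, and established trimming tools handle partial and mismatched adapters that this exact-match version does not.

```python
def trim_adapter(seq, qual, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    idx = seq.find(adapter)
    if idx == -1:
        return seq, qual
    return seq[:idx], qual[:idx]


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not qual:
        return 0.0
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def preprocess(reads, adapter, min_mean_q=20, min_len=5):
    """Trim adapters, then keep reads that are long and accurate enough."""
    kept = []
    for seq, qual in reads:
        seq, qual = trim_adapter(seq, qual, adapter)
        if len(seq) >= min_len and mean_phred(qual) >= min_mean_q:
            kept.append((seq, qual))
    return kept


# Hypothetical reads: 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
ADAPTER = "AGATCGGAAG"  # placeholder adapter used only in this example
reads = [
    ("ACGTACGTAGATCGGAAG", "I" * 18),  # good insert followed by adapter
    ("ACGTACGTAC", "#" * 10),          # no adapter, but very low quality
    ("ACGAGATCGGAAG", "I" * 13),       # insert too short after trimming
]
clean = preprocess(reads, ADAPTER)
print(clean)
```

Only the first read survives in this example. The cleaned reads would then be passed to the mapping or de novo assembly step, followed by variant calling or expression analysis.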
`
APPLICATIONS
`The applications of NGS seem almost endless, allowing for
`rapid advances in many fields related to the biological sci-
`ences. Resequencing of the human genome is being per-
`formed to identify genes and regulatory elements involved in
`pathological processes. NGS has also provided a wealth of
`knowledge for comparative biology studies through whole-
`genome sequencing of a wide variety of organisms. NGS
`is applied in the fields of public health and epidemiology
`through the sequencing of bacterial and viral species to facili-
`tate the identification of novel virulence factors. Additionally,
`gene expression studies using RNA-Seq (NGS of RNA) have
`begun to replace the use of microarray analysis, providing
`researchers and clinicians with the ability to visualize RNA
expression in sequence form. These are only some of the
broad applications, and they merely skim the surface of what
NGS can offer the researcher and the clinician. As NGS
continues to grow in popularity, it is inevitable that there will be
additional innovative applications.
`
NGS IN PRACTICE
`Whole-exome sequencing
`Mutation events that occur in gene-coding or control
`regions can give rise to indistinguishable clinical presenta-
`tions, leaving the diagnosing clinician with many possible
`causes for a given condition or disease. With NGS, clini-
`cians are provided a fast, affordable, and thorough way to
`determine the genetic cause of a disease. Although high-
`throughput sequencing of the entire human genome is pos-
`sible, researchers and clinicians are typically interested in
`only the protein-coding regions of the genome, referred to as
`the exome. The exome comprises just over 1% of the genome
`and is therefore much more cost-effective to sequence than
`the entire genome, while providing sequence information for
`protein-coding regions.
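
The cost argument above can be made concrete with the standard expected-depth relation, mean depth = (number of reads × read length) / target size. The sketch below is illustrative only. The genome size of 3 billion bases, the exome rounded to exactly 1% of it, the run yield of 30 million 150-base reads, and the assumption that every read lands on target are all placeholders rather than figures from this article.

```python
def mean_depth(n_reads, read_len, target_size):
    """Expected average number of times each target base is sequenced."""
    return n_reads * read_len / target_size


# Rounded, assumed values used only for this illustration.
GENOME_BP = 3_000_000_000      # approximate size of the human genome
EXOME_BP = GENOME_BP // 100    # exome taken as exactly 1% of the genome
N_READS = 30_000_000           # hypothetical run yield
READ_LEN = 150                 # hypothetical read length in bases

genome_depth = mean_depth(N_READS, READ_LEN, GENOME_BP)
exome_depth = mean_depth(N_READS, READ_LEN, EXOME_BP)
print(f"whole genome: {genome_depth:.1f}x, exome: {exome_depth:.1f}x")
```

Under these assumptions the same run that covers the whole genome at about 1.5x would cover the exome at about 150x, which is the sense in which restricting sequencing to the exome is more cost-effective. Real capture experiments fall short of this ideal because some reads map outside the targeted regions.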
Exome sequencing has been used extensively in the past
several years in gene discovery research. Several gene
discovery studies have resulted in the identification of genes
that are relevant to inherited skin disease (Lai-Cheong and
McGrath, 2011). Exome sequencing can also facilitate the
identification of disease-causing mutations in pathogenic
presentations where the exact genetic cause is not known.
Figure 2 (Cullinane et al., 2011) demonstrates the direct
effect that NGS can have on the correct diagnosis of a
patient. It summarizes the use of homozygosity mapping
followed by whole-exome sequencing to identify two
disease-causing mutations in a patient with oculocutaneous
albinism and congenital neutropenia (Cullinane et al., 2011).
Figures 2a and 2b display the phenotypic traits common to
oculocutaneous albinism type 4 and neutropenia observed
in this patient. Figure 2c is a pedigree of the patient’s family,
showing both the affected and unaffected individuals. The
idiogram (graphic chromosome map) in Figure 2d highlights the
areas of genetic homozygosity. These regions were identified
by single-nucleotide-polymorphism array analysis and
were considered possible locations for the disease-causing
mutation(s). Figures 2e and 2f display chromatograms for the
two disease-causing mutations identified by whole-exome
sequencing. Figure 2e depicts the mutation in SLC45A2, and
Figure 2f depicts the mutation in G6PC3. This case portrays
the valuable role that NGS can play in the correct diagnosis
of an individual patient who displays disparate symptoms
with an unidentified genetic cause.

Figure 2. Clinical application of whole-exome sequencing in the detection
of two disease-causing mutations. Reprinted from Cullinane et al., 2011.

Targeted sequencing
Although whole-genome and whole-exome sequencing are
possible, in many cases where a suspected disease or
condition has been identified, targeted sequencing of specific
genes or genomic regions is preferred. Targeted sequencing
yields much higher coverage of genomic regions of interest
and reduces sequencing cost and time
(Xuan et al., 2012). Researchers have begun to develop
sequencing panels that target hundreds of genomic regions
that are hotspots for disease-causing mutations. These
panels target only desired regions of the genome for
sequencing, eliminating the majority of the genome from analysis.
Targeted sequencing panels can be developed by researchers
or clinicians to include specific genomic regions of interest.
In addition, sequencing panels that target common regions of
interest can be purchased for clinical use; these include
panels that target hotspots for cancer-causing mutations (Rehm,
2013). Targeted sequencing, whether of individual genes or
whole panels of genomic regions, aids in the rapid diagnosis
of many genetic diseases. The results of disease-targeted
sequencing can aid in therapeutic decision making in many
diseases, including many cancers for which the treatments
can be cancer-type specific (Rehm, 2013).

QUESTIONS
Answers are available as supplementary material online
and at http://www.scilogs.com/jid/.

1. The basic methodological steps of NGS include
the following:
A. Template preparation, emulsion PCR, sequencing,
data analysis.
B. Template preparation, sequencing and imaging,
data analysis.
C. Template amplification, sequencing and imaging,
data analysis.
D. Template preparation, sequencing and imaging,
alignment to a reference genome.
E. DNA fragmentation, sequencing, data analysis.

2. Advantages of targeted sequencing as opposed to
full-genome, exome, or transcriptome sequencing
include the following:
A. Affordable and efficient for quickly interrogating
particular genomic regions of interest.
B. Provides a deeper coverage of genomic regions
of interest.
C. Can be utilized in deciding a therapeutic plan of
action for both germline and somatic cancers.
D. Detects and quantifies low-frequency variants
such as rare drug-resistant viral mutations
(e.g., HIV, hepatitis B virus, or microbial pathogens).
E. All of the above.

3. Applications of NGS in medicine include the following:
A. Detecting mutations that play a role in diseases
such as cancer.
B. Identifying genes responsible for inherited skin
diseases.
C. Determining RNA expression levels.
D. Identifying novel virulence factors through
sequencing of bacterial and viral species.
E. All of the above.
`
`
`CONFLICT OF INTEREST
`The authors state no conflict of interest.
`
`SUPPLEMENTARY MATERIAL
Answers and a PowerPoint slide presentation appropriate for journal club
or other teaching exercises are available at
http://dx.doi.org/10.1038/jid.2013.248.
`
`REFERENCES
`Berglund EC, Kiialainen A, Syvänen AC (2011) Next-generation sequencing
`technologies and applications for human genetic history and forensics.
`Invest Genet 2:23
`Cullinane AR, Vilboux T, O’Brien K et al. (2011) Homozygosity mapping
`and whole-exome sequencing to detect SLC45A2 and G6PC3 mutations
`in a single patient with oculocutaneous albinism and neutropenia. J
`Invest Dermatol 131:2017–25
`
`Gogol-Döring A, Chen W (2012) An overview of the analysis of next
`generation sequencing data. Methods Mol Biol 802:249–57
`Lai-Cheong JE, McGrath JA (2011) Next-generation diagnostics for inherited
`skin disorders. J Invest Dermatol 131:1971–3
`Metzker ML (2010) Sequencing technologies—the next generation. Nat Rev
`Genet 11:31–46
`Quail MA, Smith M, Coupland P et al. (2012) A tale of three next generation
`sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and
`Illumina MiSeq sequencers. BMC Genom 13:341
`Rehm HL (2013) Disease-targeted sequencing: a cornerstone in the clinic.
`Nat Rev Genet 14:295–300
`Sanger F, Nicklen S, Coulson AR (1977) DNA sequencing with chain-
`terminating inhibitors. Proc Natl Acad Sci USA 74:5463–7
`Xuan J, Yu Y, Qing T et al. (2012) Next-generation sequencing in the clinic:
`promises and challenges. Cancer Lett; e-pub ahead of print 19 November 2012
`
`
JOURNAL OF INVESTIGATIVE DERMATOLOGY

The official journal of The Society for Investigative Dermatology and European Society for Dermatological Research

Volume 133 Number 8 August 2013
`
`