(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(19) World Intellectual Property Organization
International Bureau

(43) International Publication Date: 9 December 2010 (09.12.2010)

(10) International Publication Number: WO 2010/141433 A2

(51) International Patent Classification:
C12Q 1/68 (2006.01)
C40B 40/06 (2006.01)
C12Q 1/70 (2006.01)

(21) International Application Number: PCT/US2010/036849

(22) International Filing Date: 1 June 2010 (01.06.2010)

(25) Filing Language: English

(26) Publication Language: English

(30) Priority Data:
61/183,377    2 June 2009 (02.06.2009)        US
61/286,742    15 December 2009 (15.12.2009)   US

(71) Applicant (for all designated States except US): THE REGENTS OF THE UNIVERSITY OF CALIFORNIA [US/US]; 1111 Franklin Street, 5th Floor, Oakland, CA 94607-5200 (US).

(72) Inventors; and
(75) Inventors/Applicants (for US only): DING, Shou-Wei [US/US]; 8797 Barnwood Lane, Riverside, CA 92508 (US). WU, Qingfa [CN/US]; 1177 Linden Street, Apt. 19, Riverside, CA 92507 (US).

(74) Agents: EINHORN, Gregory, P. et al.; Gavrilovich, Dodd & Lindsey LLP, 4660 La Jolla Village Drive, Suite 750, San Diego, CA 92122 (US).

(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PE, PG, PH, PL, PT, RO, RS, RU, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.

(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, ML, MR, NE, SN, TD, TG).

Published:
- without international search report and to be republished upon receipt of that report (Rule 48.2(g))
- with sequence listing part of description (Rule 5.2(a))

(54) Title: VIRUS DISCOVERY BY SEQUENCING AND ASSEMBLY OF VIRUS-DERIVED siRNAs, miRNAs, piRNAs
`
(57) Abstract: In one embodiment, the disclosure provides methods and systems for identifying viral nucleic acids in a sample. In another embodiment, the invention provides methods for viral genome assembly and viral discovery using small inhibitory RNAs, or "small silencing" RNAs (siRNAs), micro-RNAs (miRNAs) and/or PIWI-interacting RNAs (piRNAs), including siRNAs, miRNAs and/or piRNAs isolated or sequenced from invertebrate organisms such as insects (Arthropoda), nematodes (Nematoda), Mollusca, Porifera, and other invertebrates, and/or plants, fungi or algae, Cyanobacteria and the like.
`
WO 2010/141433
PCT/US2010/036849

VIRUS DISCOVERY BY SEQUENCING AND ASSEMBLY OF VIRUS-DERIVED siRNAs, miRNAs, piRNAs

REFERENCE TO SEQUENCE LISTING

This application contains a .txt file containing the sequence listing, which is incorporated by reference herein.
`
STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant No. AI052447 awarded by the National Institutes of Health (NIH) and Grant No. 2007-35319-18325 awarded by the USDA. The government has certain rights in the invention.
`
TECHNICAL FIELD

In one embodiment, the disclosure provides methods and systems for identifying viral nucleic acids in a sample. In another embodiment, the invention provides methods for viral genome assembly and viral discovery using small inhibitory RNAs, or “small silencing” RNAs (siRNAs), micro-RNAs (miRNAs) and/or PIWI-interacting RNAs (piRNAs), including siRNAs, miRNAs and/or piRNAs isolated or sequenced from invertebrate organisms such as insects (Arthropoda), nematodes (Nematoda), Mollusca, Porifera, and other invertebrates, and/or plants, fungi or algae, Cyanobacteria and the like.
`
BACKGROUND

Discovery of new viruses is often hindered by difficulties in their amplification in cell culture and/or lack of their cross-reactivity in serological and nucleic acid hybridization assays to known viruses. Many new viruses have been recently identified in environmental and clinical samples using metagenomic approaches, in which viral particles are first purified and viral nucleic acid sequences are then randomly amplified prior to subcloning and sequencing (Delwart, 2007).
`
The Dicer family of host immune receptors mediates antiviral immunity in fungi, plants and invertebrate animals by RNA interference (RNAi) or RNA silencing (1-3). In this immunity, a viral double-stranded RNA (dsRNA) is recognized by Dicer and diced into small interfering RNAs (siRNAs). These virus-derived siRNAs are then loaded into an RNA silencing complex to act as specificity determinants and to guide slicing of the target viral RNAs by an Argonaute protein (AGO) present in the complex. Dicer proteins typically contain an RNA helicase domain, a PAZ domain shared with AGOs, and two tandem type III endoribonuclease (RNase III) domains. Dicer cleaves dsRNA with a simple preference toward a terminus of dsRNA, producing duplex small RNA fragments of discrete sizes progressively from the terminus (4).
`
In addition to siRNAs, microRNAs (miRNAs) and PIWI-interacting RNAs (piRNAs) also guide RNA silencing in similar complexes but with distinct AGOs (4-6). In Drosophila melanogaster, miRNAs and siRNAs are predominantly 22 and 21 nucleotides in length, dependent on Dicer-1 (DCR1) and DCR2 for their biogenesis, and act in silencing complexes containing AGO1 and AGO2 in the AGO subfamily, respectively (4-6). In contrast, ~24-30-nt piRNAs are Dicer-independent and require AGO3, Aubergine (AUB) and PIWI in the PIWI subfamily for their biogenesis (4-6). Genetic analyses (7-10) have clearly demonstrated a role for D. melanogaster DCR2 in the immunity and biogenesis of viral siRNAs targeting diverse positive-strand (+) RNA viruses, including Flock house virus (FHV), cricket paralysis virus, Drosophila C virus (DCV), and Sindbis virus (SINV). Cloning and sequencing of small RNAs from FHV-infected Drosophila cells further indicate that the viral dsRNA replicative intermediates (vRI-dsRNA) are the substrate of DCR2 and the precursor of viral siRNAs (11-12). Drosophila susceptibility to Drosophila X virus (DXV), which contains a dsRNA genome, is influenced by components from both the siRNA (e.g., AGO2 & R2D2) and piRNA (e.g., AUB & PIWI) pathways (13). However, detection of small RNAs derived from any dsRNA virus has not been reported yet (1, 13).
`
Virus-derived small RNAs were first detected in plants infected with a +RNA virus (14). The Dicer proteins involved in the production of siRNAs targeting both +RNA viruses and DNA viruses have been identified in Arabidopsis thaliana (2-3), which encodes AGOs in the AGO subfamily but not in the PIWI subfamily (15). Cloning and sequencing of plant viral siRNAs suggest that they may be processed either from vRI-dsRNA or hairpin regions of single-stranded RNA precursors (16-20). Production of viral siRNAs has also been demonstrated in fungi, silkworms, mosquitoes, and nematodes in response to infection with +RNA viruses, and viral small silencing RNAs produced in fungi and mosquitoes have recently been cloned and sequenced (21-25).
`
The available data thus illustrate that accumulation of virus-derived small silencing RNAs is a common feature of an active immune response to viral infection in diverse eukaryotic host species.
`
SUMMARY

The disclosure provides a method for viral discovery that is independent of either amplification or purification of viral particles. Many human diseases, such as approximately half of all analyzed cases of human encephalitis and gastroenteritis, have no identified etiology. Thus, discovery of new viruses should facilitate identification of human pathogenic viruses, improve the understanding of their transmission and provide diagnostic tools and targets for the development of anti-virals.
`
The disclosure provides methods of identifying viral nucleic acid, assembling viral genomes and discovering viruses based upon the mechanism of invertebrate, plant, algae, fungal etc. processing of viral small inhibitory RNAs, or “small silencing” RNAs (siRNAs), micro-RNAs (miRNAs) and/or PIWI-interacting RNAs (piRNAs), including miRNA-, piRNA-, siRNA- and/or RNAi-mediated viral immunity in plants and invertebrates, including insects (Drosophila melanogaster and mosquitoes) and nematodes (Caenorhabditis elegans), and algae, fungus, Cyanobacteria and the like.
`
In alternative embodiments, the invention provides methods comprising:

(a) (i) obtaining a plurality of naturally occurring 18-28 nucleotide RNA fragments, or siRNAs, or miRNAs and/or piRNAs, to generate an RNA library, or, obtaining a plurality of 18-28 nucleotide RNA fragments, or siRNAs, or miRNAs and/or piRNAs from an organism or organisms, or a plant or plants; and

(ii) determining the sequence of the RNA fragments, or siRNAs, or miRNAs and/or piRNAs, and using those sequences to assemble the RNA fragments, or siRNAs, or miRNAs and/or piRNAs into at least one contiguous unit (“a contig”) comprising a plurality of the nucleotide RNA fragments, siRNAs, miRNAs and/or piRNAs; or

(b) the method of (a), wherein the contigs are assembled using the help of a computer program, wherein optionally the computer program is VELVET.
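The size-selection part of step (a)(i) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `select_small_rnas`, the conversion of U to T, and the rejection of reads containing ambiguous bases are choices made for the example. Only the 18-28 nucleotide window comes from the text above.

```python
from collections import Counter


def select_small_rnas(reads, min_len=18, max_len=28):
    """Count distinct reads whose length lies in [min_len, max_len].

    Hypothetical helper: reads are upper-cased, U is written as T, and any
    read containing a character other than A, C, G or T is discarded.
    """
    counts = Counter()
    for read in reads:
        seq = read.strip().upper().replace("U", "T")
        if min_len <= len(seq) <= max_len and set(seq) <= set("ACGT"):
            counts[seq] += 1
    return counts


if __name__ == "__main__":
    example_reads = ["ACGU" * 5, "acgt" * 5, "ACG", "A" * 30]
    for sequence, n in select_small_rnas(example_reads).items():
        print(sequence, n)
```

Collapsing identical reads into counts keeps the later assembly input small while preserving read abundance, which may be useful when judging how densely a contig is covered.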
`
In alternative embodiments, the methods further comprise determining the sequence of the assembled contiguous unit, or the contig; or further comprise:

(a) searching a database of viral or microorganism sequences using the at least one contiguous sequence to identify a viral or microorganism genome, nucleic acid or protein-encoding sequence, or subsequence thereof, having significant homology to the assembled contiguous unit; or

(b) the method of (a), wherein the database comprises non-redundant nucleotide sequences; or

(c) the method of (a), wherein the database comprises in silico translation sequences,

wherein optionally the assembled contig sequence has significant homology to a known viral genus or genome.
`
In alternative embodiments, the methods further comprise searching a database of viral or microorganism sequences using the at least one contiguous sequence to identify a viral or microorganism genome, nucleic acid or protein-encoding sequence, or subsequence thereof, having at least about 50% to about 100% homology to all or part of the assembled contiguous sequence.
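One way to make the 50% to 100% range concrete is to compute percent identity over an existing pairwise alignment. The sketch below assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it reads "homology" as percent identity over ungapped columns. Both are assumptions for illustration; the function names are hypothetical and the source does not specify how the percentage is calculated.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def is_candidate_hit(aligned_a, aligned_b, min_identity=50.0):
    """True when the identity reaches the lower bound named in the text (about 50%)."""
    return percent_identity(aligned_a, aligned_b) >= min_identity


if __name__ == "__main__":
    print(percent_identity("MKV-LA", "MRVQLA"))
```

In practice a database search tool would report alignment statistics directly; this helper only shows the arithmetic behind a threshold of this kind.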
`
In alternative embodiments, the methods further comprise making a phylogenetic analysis of the identified viral or microorganism genome, nucleic acid or protein-encoding sequence with the contiguous sequence.

In alternative embodiments, the methods further comprise identifying and annotating the phylogenetic analysis of the identified viral sequence with the contiguous sequence.

In alternative embodiments, the obtained RNA or nucleotide sequences are substantially purified or isolated from an organism of interest.

In alternative embodiments, the methods further comprise substantially purifying small RNA fragments, or siRNAs, or miRNAs and/or piRNAs, from an organism of interest and sequencing the RNA fragments to obtain an RNA library.

In alternative embodiments, the methods further comprise removing sequenced segments from the library that overlap with the genomic sequence of the organism of interest from which the RNA was derived.
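The host-subtraction step just described can be sketched as a simple filter. The example assumes exact matching against a host genome held in memory as a single DNA-alphabet string and checks both strands. A real pipeline would more likely use a read-mapping tool that tolerates mismatches, so treat the function names and the matching rule as illustrative assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def subtract_host(sequences, host_genome):
    """Drop sequences that occur exactly in the host genome on either strand."""
    host = host_genome.upper()
    kept = []
    for seq in sequences:
        query = seq.upper()
        if query in host or reverse_complement(query) in host:
            continue  # host-derived, so not a viral candidate
        kept.append(seq)
    return kept


if __name__ == "__main__":
    host = "AAACCCGGGTTTACGATCGA"
    candidates = ["CCCGGG", "TCGATCGT", "TTGACCATGCAT"]
    print(subtract_host(candidates, host))
```

Checking the reverse complement matters because small RNAs can derive from either strand of a host locus.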
`
In alternative embodiments, the methods further comprise filling in gaps between the contiguous sequences. In one embodiment, filling in the gaps between the contiguous sequences comprises use of RT-PCR and/or sequencing to fill in gaps between the contiguous sequences.

In alternative embodiments, the methods further comprise completing a genomic sequence of a virus or a microorganism comprising the contiguous sequence using 5’-RACE and 3’-RACE.
`
In one embodiment, the organism or organisms is/are an invertebrate, an insect (Arthropoda), a nematode (Nematoda), a Mollusca, a Porifera, a plant, a fungus, an alga, a Cyanobacteria; or the organism or organisms are identified or unidentified and are derived from an environmental sample. In one embodiment, the environmental sample is a soil sample, a water sample or an air sample.
`
In one embodiment, the invention provides methods for identifying a virus, comprising:

constructing a small RNA library from an organism or organisms;

deep sequencing the small RNA library;

assembling the sequenced small RNAs using (a) all of the sequenced small RNAs of 18-28 nucleotides in length, or siRNAs, or miRNAs and/or piRNAs; or (b) small RNAs, or siRNAs, or miRNAs and/or piRNAs, of a defined length into a plurality of contigs;

identifying and removing those assembled sequences mapped onto the genome of the organism to provide an enriched set of contigs;

performing a homology search of contigs against known viruses at both the nucleotide and protein levels;

optionally using RT-PCR and sequencing to fill the gaps between the contigs that show limited similarities with a known virus;

completing the full-length genomic sequence of the identified virus with 5’-RACE and 3’-RACE; and

annotating the identified virus.
`
In one embodiment, the organism or organisms is/are an invertebrate, an insect (Arthropoda), a nematode (Nematoda), a Mollusca, a Porifera, a plant, a fungus, an alga, a Cyanobacteria; or the organism or organisms are identified or unidentified and are derived from an environmental sample. In one embodiment, the environmental sample is a soil sample, a water sample or an air sample.
`
The details of one or more embodiments of the disclosure are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the disclosure will be apparent from the description and drawings, and from the claims.

All publications mentioned herein are incorporated herein by reference in full for the purpose of describing and disclosing the methodologies, which are described in the publications, which might be used in connection with the description herein. The publications discussed above and throughout the text are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior disclosure.
`
DESCRIPTION OF DRAWINGS

Figure 1A and Figure 1B show the distribution of assembled viral siRNA contigs on the tripartite and bipartite RNA genomes of (a) CMV and (b) FHV; as discussed in detail in Example 1, below.

Figure 2 shows the discovery of three new viruses by assembly of viral siRNAs. A total of 54, 34 and 19 contigs assembled from sequenced siRNAs were mapped to DmTV, DmBV and DmTRV, respectively. The genome organization of EEV is shown as a reference for DmTRV. The percent protein sequence identities of assembled contigs (red bars) of the three viruses to related viruses are shown above or below; as discussed in detail in Example 1, below.

Figure 3A, Figure 3B and Figure 3C illustrate a phylogenetic analysis of newly identified viruses (indicated by a red arrow) according to the similarities of viral RdRPs with the Clustal W method; as discussed in detail in Example 1, below.

Figure 4A and Figure 4B illustrate the distribution of assembled viral siRNA contigs on the monopartite genome and bipartite RNA genome of SINV and a new nodavirus, respectively; as discussed in detail in Example 1, below.
`
Figure 5 illustrates position and distribution of FHV and SINV siRNA contigs assembled from small RNAs sequenced from (Figure 5A) Drosophila S2 cells infected with the B2-deletion mutant of FHV (11), (Figure 5B) a transgenic C. elegans strain in the RNAi-defective 1 (rde-1) mutant background carrying an FHV RNA1 replicon in which the coding sequence of B2 was replaced by that of GFP (29), and (Figure 5C) adult mosquitoes infected with SINV (22); as discussed in detail in Example 2, below.
`
Figure 6 illustrates discovery of dsRNA viruses DTV (Figure 6A) and DBV (Figure 6B), and +RNA viruses DTrV (Figure 6C) and MNV (Figure 6D) from S2-GMR cells by vdSAR; as discussed in detail in Example 2, below.

Figure 7 illustrates that the S2-GMR cells contained four infectious RNA viruses: Figure 7A illustrates that DTV, DBV, DXV and ANV were all detected by RT-PCR in non-contaminated S2 cells 4 days after inoculation with the supernatant of the S2-GMR cells; Figure 7B illustrates detection of a DI-RNA derived from ANV RNA2 in S2-GMR cells (right lane) and in S2 cells after inoculation with the supernatant of S2-GMR cells (left lane) by Northern blot hybridizations using a probe recognizing the 3’-terminal 120 nt of RNA2; Figure 7C illustrates the structure of the cloned DI-RNA of ANV (top) and mapping of the perfect-matched 21-nt siRNAs sequenced from S2-GMR cells to the positive (blue) and negative (red) strands of ANV RNA2 (20-nt windows) (bottom); as discussed in detail in Example 2, below.

Figure 8 illustrates size distribution (Figure 8A) and aggregate nucleotide composition (Figure 8B) of virus-derived small RNAs in Drosophila OSS cells; as discussed in detail in Example 2, below.

Like reference symbols in the various drawings indicate like elements.
`
DETAILED DESCRIPTION

In alternative embodiments the invention provides methods for viral genome assembly and viral discovery using small inhibitory RNAs, or “small silencing” RNAs (siRNAs), micro-RNAs (miRNAs) and/or PIWI-interacting RNAs (piRNAs), including siRNAs, miRNAs and/or piRNAs isolated or sequenced from invertebrate organisms such as insects (Arthropoda), nematodes (Nematoda), Mollusca, Porifera, and other invertebrates, and/or plants, fungi or algae, Cyanobacteria and the like.

As described in Example 2, we found that viral small silencing RNAs produced by invertebrate animals are overlapping in sequence and can assemble into long contiguous fragments of the invading viral genome from small RNA libraries sequenced by next generation platforms. Based on this finding, we developed an approach of virus discovery in invertebrates by deep sequencing and assembly of total small RNAs (vdSAR) isolated from a host organism of interest.
`
As described in Example 2, alternative embodiments of the invention revealed mixed infection of Drosophila cell lines and adult mosquitoes by multiple RNA viruses, five of which were new. Analysis of small RNAs from the mixed-infected Drosophila cells showed that infection by all three distinct dsRNA viruses triggered production of viral siRNAs with features similar to siRNAs derived from +RNA viruses. Our study also revealed production and assembly of virus-derived piRNAs in Drosophila cells, suggesting a novel function of piRNAs in viral immunity. Thus, the invention’s vdSAR has unique features that allow discovery of new invertebrate and arthropod-borne animal and human viral pathogens.

As used herein and in the appended claims, the singular forms "a," "an," and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "an siRNA" includes a plurality of such siRNAs and reference to "the virus" includes reference to one or more viruses, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although any methods and reagents similar or equivalent to those described herein can be used in the practice of the disclosed methods and compositions, the exemplary methods and materials are now described.
`
Also, the use of “or” means “and/or” unless stated otherwise. Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting.

It is to be further understood that where descriptions of various embodiments use the term “comprising,” those skilled in the art would understand that in some specific instances, an embodiment can be alternatively described using language “consisting essentially of” or “consisting of.”

The disclosure of U.S. Patent No. 7,211,390, describing techniques associated with “deep sequencing,” is incorporated herein by reference.
`
In immunity, viral infection induces production of virus-derived small interfering RNAs (siRNAs), pi-RNAs and miRNAs that subsequently guide specific viral RNA clearance by the RNA interference (RNAi), pi-RNA and miRNA mechanism. In D. melanogaster, for example, siRNAs 21 nucleotides long targeting several positive-strand (+) RNA viruses are produced by Dicer-2 from processing dsRNA replicative intermediates synthesized during viral RNA replication. Assisted by the dsRNA-binding protein R2D2, these viral siRNAs are then loaded in Argonaute-2 to direct viral RNA clearance (Galiana-Arnoux et al., 2006; Wang et al., 2006; Zambon et al., 2006). As a counter defense, viruses encode essential pathogenesis proteins that are viral suppressors of RNAi (VSRs) (Li and Ding, 2006; Mlotshwa et al., 2008). VSRs may inhibit either the production or activity of viral siRNAs by targeting the dsRNA precursors, siRNAs or Argonaute proteins. Several nuclear-replicating DNA viruses produce virus-derived microRNAs following infection of their mammalian host cells and many proteins encoded by mammalian RNA and DNA viruses exhibit VSR activity. However, the current consensus is that in vertebrates viral dsRNA triggers PKR and interferon responses instead of the RNAi response.
`
The disclosure provides a method for virus discovery that is independent of either amplification or purification of viral particles. Many human diseases, such as approximately half of all analyzed cases of human encephalitis and gastroenteritis, have no identified etiology. Thus, discovery of new viruses or the identification of the presence of viral infection can facilitate identification of human pathogenic viruses, improve our understanding of their transmission and provide diagnostic tools and targets for the development of anti-virals.
`
The disclosure is based, in part, on the understanding of the mechanism of RNAi-mediated, including pi-RNA-, miRNA- and siRNA-based, viral immunity. In this immunity, viral infection induces production of virus-derived small interfering RNAs (siRNAs), pi-RNAs and miRNAs, that subsequently guide specific viral RNA clearance by the RNA interference (RNAi) (pi-RNA-, miRNA- and siRNA-based) mechanism. In D. melanogaster, for example, siRNAs 21 nucleotides long targeting several positive-strand (+) RNA viruses are produced by Dicer-2 from processing dsRNA replicative intermediates synthesized during viral RNA replication. Assisted by the dsRNA-binding protein R2D2, these viral siRNAs are then loaded in Argonaute-2 to direct viral RNA clearance (Galiana-Arnoux et al., 2006; Wang et al., 2006; Zambon et al., 2006). As a counter defense, viruses encode essential pathogenesis proteins that are viral suppressors of RNAi (VSRs) (Li and Ding, 2006; Mlotshwa et al., 2008). VSRs may inhibit either the production or activity of viral siRNAs, pi-RNAs and miRNAs by targeting the dsRNA precursors, siRNAs, pi-RNAs and miRNAs or Argonaute proteins. Several nuclear-replicating DNA viruses produce virus-derived microRNAs following infection of their mammalian host cells and many proteins encoded by mammalian RNA and DNA viruses exhibit VSR activity (Ding and Voinnet, 2007). However, the current consensus is that in vertebrates viral dsRNA triggers PKR and interferon responses instead of the RNAi (siRNAs, pi-RNAs and miRNAs) response.
`
Using next generation sequencing technologies, the disclosure demonstrates viral siRNAs, pi-RNAs and miRNAs produced in plants and fruit flies infected with positive-strand RNA viruses, which are closely related to human pathogenic viruses such as poliovirus and West Nile virus. The results show that viral siRNAs produced by the host immune system in response to viral infection are overlapping in sequence and can be assembled back into long continuous fragments (contigs) of the infecting viral RNA genome using assembly programs developed for short read genome sequencing. Unlike individual siRNAs, pi-RNAs and miRNAs, the contigs assembled from viral siRNAs, pi-RNAs and miRNAs can be translated into protein sequences in silico for homology searches to identify new viruses that may be only distantly related to known viruses.
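The in silico translation mentioned above can be sketched as a six-frame translation of a contig. The example uses the standard genetic code; the helper names, the use of `X` for codons containing unrecognized characters, and the frame labels (`+1` to `-3`) are assumptions made for illustration rather than details taken from the disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate codon by codon from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(contig):
    """Return translations for the three forward and three reverse frames."""
    contig = contig.upper().replace("U", "T")
    frames = {}
    for strand, seq in (("+", contig), ("-", reverse_complement(contig))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in six_frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Translating all six frames is useful here because the strand and reading frame of a newly assembled contig are not known in advance.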
`
The disclosure demonstrates that deep sequencing by the next generation technologies and assembly of virus-derived siRNAs, pi-RNAs and miRNAs can be employed as a new approach for virus discovery and identification. Indeed, the examination of a recently sequenced small RNA library (Flynt et al., 2009) made from a Drosophila cell line found that the cell line is infected with at least five distinct RNA viruses. These include two known viruses and three new viruses belonging to different genera not previously described. Since virus infection of plants and invertebrates inevitably results in the production of virus-derived siRNAs, pi-RNAs and miRNAs, this invention does not depend on the ability to either amplify the virus or purify the viral particle to enrich viral nucleic acids, which is essential for the current technologies. Importantly, any viruses detected by the method of the disclosure are live and replication-competent because viral siRNAs, pi-RNAs and miRNAs are products of an active host immune response to viral infection.
`
The observation that individual viral siRNAs, pi-RNAs and miRNAs can be assembled back to longer genome fragments of the invading virus provides an exciting new method for virus discovery by deep sequencing and assembly of viral siRNAs, pi-RNAs and miRNAs. Unlike individual siRNAs, pi-RNAs and miRNAs, the contigs assembled from viral siRNAs, pi-RNAs and miRNAs can be translated into protein sequences in silico for homology searches to identify new viruses that may be distantly related to known viruses.
`
The disclosure provides a framework of the VDsiR comprised of bioinformatics analysis and experimental verification. Small RNA assembly is a useful component of the system; both the number of input sequences and the choice of assembly program affect the output. In a pilot study (described herein), Velvet was found to be a useful program for the project; it employs the principle of de Bruijn graphs to build up continuous sequence from short reads in a short run time (Zerbino et al., 2008).
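The de Bruijn graph principle named above can be illustrated with a toy assembler. This is a simplified sketch and not Velvet's algorithm: it builds a graph whose nodes are (k-1)-mers and whose edges are the distinct k-mers in the reads, then spells one contig per maximal non-branching path. Error correction, coverage handling, reverse complements and isolated cycles are all left out, and the function name is hypothetical.

```python
from collections import defaultdict


def de_bruijn_contigs(reads, k):
    """Return contigs spelled by maximal non-branching paths of a de Bruijn graph.

    Nodes are (k-1)-mers; each distinct k-mer found in the reads is one edge.
    Reverse complements and isolated cycles are ignored in this toy version.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            successors[kmer[:-1]].add(kmer[1:])
            predecessors[kmer[1:]].add(kmer[:-1])

    nodes = set(successors) | set(predecessors)

    def out_edges(node):
        return successors.get(node, set())

    def is_simple(node):
        # Exactly one edge in and one edge out: the path runs straight through.
        return len(predecessors.get(node, ())) == 1 and len(out_edges(node)) == 1

    contigs = []
    for node in sorted(nodes):
        if is_simple(node):
            continue  # interior of a path, never a starting point
        for nxt in sorted(out_edges(node)):
            path = [node, nxt]
            while is_simple(path[-1]):
                (following,) = out_edges(path[-1])
                path.append(following)
            contigs.append(path[0] + "".join(n[-1] for n in path[1:]))
    return contigs


if __name__ == "__main__":
    genome = "ATGGCGTGCAATCCGTA"
    reads = [genome[i:i + 8] for i in range(0, 10, 3)]
    print(de_bruijn_contigs(reads, k=5))
```

The choice of k trades specificity against connectivity: with reads of only 18-28 nucleotides, k must stay well below the read length for overlapping reads to share edges.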
`
The disclosure thus provides, in one embodiment, a method comprising (i) obtaining nucleotide sequences from a small RNA library comprising a plurality of naturally occurring 18-28 nucleotide RNA fragments to obtain a sequenced small RNA library; (ii) assembling the sequences in the sequenced small RNA library into at least one contiguous sequence comprising a plurality of nucleotide RNA fragment sequences; optionally filling in gaps in a sequence by RT-PCR techniques; (iii) searching a database of viral sequences using the at least one contiguous sequence to identify a viral sequence having at least 50% to 100% homology to the contiguous sequence; and (iv) identifying and annotating the phylogenetic analysis of the identified viral sequence with the contiguous sequence.
`
It will be understood that a sequence library may be provided by a third party or made available to a user in any number of ways (e.g., internet, computer readable medium and the like), and thus the process described above can be adapted to carry out the process and identify or annotate a virus accordingly. In some embodiments, however, the library may be a sample library comprising substantially purified RNA from an organism of interest. In such instances, deep sequencing techniques are carried out and a sequence library is created. In yet another embodiment, a sample comprising substantially purified small RNA fragments from an organism of interest is provided, in which case sequencing the RNA fragments to obtain the small RNA library is performed.

In yet another embodiment, if a gross RNA sample from an organism is provided or where increased homology searching is desired, the method may optionally include removing sequenced segments from the library that overlap with the genomic sequence of the organism of interest from which the RNA was derived.
`
In yet another embodiment, the method further comprises completing a genomic sequence of a virus comprising the contiguous sequence using 5’-RACE and 3’-RACE.
`
For example, the disclosure demonstrates in the specific embodiments and proof of principle that a method including the following steps resulted in the identification of 2 known viruses and 3 novel viruses from a D. melanogaster sample: construction of a small RNA library from cell culture or adult insects such as mosquitoes or fruit flies; deep sequencing of the small RNA libraries with an Illumina 2G Analyser; assembly of the sequenced small RNAs by Velvet using either all of the sequenced small RNAs of 18-28 nucleotides in length or small RNAs of specific lengths such as 21-nt and 22-nt, which most likely represent the products of Drosophila Dicer-2 and Dicer-1, respectively, to generate one or more contigs (contigs of virus-derived siRNAs may include features such as specific enrichment of 21- to 22-nt small RNAs, the presence of small RNAs of both polarities and a high density of siRNAs, measured as number of siRNAs/length of contig); identification and removal of those assembled sequences mapped onto the host genome when the genome sequence is known, which reduces the number of candidates for the next steps; homology search of contigs against known viruses at both the nucleotide and protein levels; in an optional embodiment, RT-PCR and sequencing to fill the gaps between the contigs that show limited similarities with a known virus; optionally, completing the full-length genomic sequence of the identified virus with 5’-RACE and 3’-RACE; and annotation and phylogenetic analysis of the identified virus.
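The contig features listed above (enrichment of 21- to 22-nt reads, reads of both polarities, and siRNA density as number of siRNAs per contig length) can be computed with a short sketch. The exact-match mapping rule, the function names and the dictionary keys are assumptions for illustration; the disclosure does not state thresholds for these features, so none are applied here.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def contig_features(contig, reads):
    """Summarize reads that match a contig exactly on either strand."""
    plus = minus = size_21_22 = 0
    for read in reads:
        if read in contig:
            plus += 1
        elif reverse_complement(read) in contig:
            minus += 1
        else:
            continue  # read does not map to this contig
        if len(read) in (21, 22):
            size_21_22 += 1
    mapped = plus + minus
    return {
        "mapped_reads": mapped,
        "plus_strand": plus,
        "minus_strand": minus,
        "fraction_21_22nt": size_21_22 / mapped if mapped else 0.0,
        "density": mapped / len(contig) if contig else 0.0,
    }


if __name__ == "__main__":
    contig = "ATGGCGTGCAATCCGTAGGCTTACGATCCGATTGCAAGCT"
    reads = [contig[0:21], contig[5:27], reverse_complement(contig[10:31])]
    print(contig_features(contig, reads))
```

A contig with reads on only one strand, or without a 21- to 22-nt peak, would on this reading be a weaker candidate for a Dicer-processed viral template.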
`
As used herein, a sample is any sample that can contain a virus. Thus, the sample can be obtained from the environment or from a specific organism (including plants, insects and mammals). An environmental sample can be obtained from any number of sources (as described above), including, for example, insect feces, hot springs, soil and the like. Any source of nucleic acids in purified or non-purified form can be utilized as starting material. Thus, the nucleic acids may be obtained from any source which is contaminated by an infectious organism (e.g., a virus). The sample can be an extract from any bodily sample such as blood, urine, spinal fluid, tissue, vaginal swab, stool, amniotic fluid or buccal mouthwash from any mammalian organism. For non-mammalian (e.g., invertebrate) organisms the sample can be a tissue sample, salivary sample, fecal material or material in the digestive tract of the organism. For example, in horticulture and agricultural testing the sample can be a plant, soil, liquid or other horticultural or agricultural product; in food testing the sample can be fresh food or processed food (for example infant formula, seafood, fresh produce and packaged food); and in environmental testing the sample can be liquid, soil, sewage treatment, sludge and any other sample in the environment.
`
The sample can be processed using techniques known in the art for deep sequencing. In some embodiments, the sample is treated with an RNase inhibitor to prevent degradation of RNA oligonucleotides in t
