Gioia et al. BMC Genomics (2018) 19:334
`https://doi.org/10.1186/s12864-018-4718-6
`
`RESEARCH ARTICLE
`
`Open Access
`
A genome-wide survey of mutations in the Jurkat cell line
Louis Gioia1*, Azeem Siddique2, Steven R. Head2, Daniel R. Salomon1ˆ and Andrew I. Su1
`
`Abstract
`Background: The Jurkat cell line has an extensive history as a model of T cell signaling. But at the turn of the 21st
`century, some expression irregularities were observed, raising doubts about how closely the cell line paralleled
`normal human T cells. While numerous expression deficiencies have been described in Jurkat, genetic explanations
`have only been provided for a handful of defects.
Results: Here, we report a comprehensive catalog of genomic variation in the Jurkat cell line based on
`whole-genome sequencing. With this list of all detectable, non-reference sequences, we prioritize potentially
`damaging mutations by mining public databases for functional effects. We confirm documented mutations in Jurkat
`and propose links from detrimental gene variants to observed expression abnormalities in the cell line.
`Conclusions: The Jurkat cell line harbors many mutations that are associated with cancer and contribute to Jurkat’s
`unique characteristics. Genes with damaging mutations in the Jurkat cell line are involved in T-cell receptor signaling
`(PTEN, INPP5D, CTLA4, and SYK), maintenance of genome stability (TP53, BAX, and MSH2), and O-linked glycosylation
`(C1GALT1C1). This work ties together decades of molecular experiments and serves as a resource that will streamline
`both the interpretation of past research and the design of future Jurkat studies.
`Keywords: Jurkat, Whole-genome sequencing, Cancer, T-cell, Genome stability, T-cell receptor, T-cell acute
`lymphoblastic leukemia
`
`Background
`The Jurkat cell line was isolated in 1977 from the blood
`of a fourteen-year-old boy with Acute Lymphoblastic
`Leukemia [1]. It was one of the first in vitro systems for
`studying T-cell biology and helped to produce an incredi-
`ble number of discoveries and publications (Fig. 1) [2].
`As the workhorse behind a diverse array of molecular
`investigations, the Jurkat cell line revealed the founda-
`tions for our modern understanding of multiple signaling
`pathways. Most notably, studies of Jurkat cells established
`the bulk of what is currently known about T-cell recep-
`tor (TCR) signaling [2]. However, at the turn of the 21st
`century, as the use of Jurkat as a model T-cell line was
`reaching its height, some abnormalities in the cell line
`began to come to light.
`
`*Correspondence: lhgioia@scripps.edu
`ˆDeceased
`1Department of Molecular Medicine, The Scripps Research Institute, La Jolla,
`California 92037, USA
`Full list of author information is available at the end of the article
`
`Problems were first noticed in the form of gene expres-
`sion defects. The most publicized of these defects was
`aberrant PI3K signaling due to the absence of PTEN and
`INPP5D (SHIP) in Jurkat cells [2]. The loss of these two
`central regulators of phosphatidylinositol signaling was
`proposed as the cause of the previously-documented, con-
`stitutive activation of PI3K signaling, a major mediator
`of downstream TCR signaling events [3]. This fundamen-
`tal TCR signaling defect in Jurkat led many researchers
`to question its validity as a model system for T-cell stud-
`ies [2]. Although the number of publications using Jurkat
`dropped off over the following decade, it is still widely
`used in biomedical research (Fig. 1).
`Defect detection up to now has been primarily based
`on top-down approaches, requiring knowledge of signal-
`ing or expression defects, which leads to interrogations of
`specific coding sequences. While multiple genetic defects
`have been described over the past few decades, these top-
`down approaches are limited in scope and have failed to
`provide a broader understanding of Jurkat biology.
`
`© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
`International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
`reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the
`Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
`(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
`
`UPenn Ex. 2077
`Miltenyi v. UPenn
`IPR2022-00855
`Page 1
`
`Fig. 1 Jurkat publication trends. Yearly publication counts for PubMed queries. Representative queries are given in the legend. Note that these
`query descriptions are abbreviations of more detailed search terms, which are provided in the “Methods” section
`
`Modern sequencing technology allows for interrogation
`of the entire genome. In contrast to top-down techniques,
`whole-genome sequencing (WGS) allows us to investigate
`genetic defects from the bottom up, with the potential
`to extend our understanding of abnormalities in Jurkat.
`Thus, in this study, we used shotgun sequencing to per-
`form genome-scale characterization of genomic variants
`in this commonly-used cell line.
`
`Results
`Sequencing and variant callers
`Whole-genome sequencing of the Jurkat cell line pro-
`duced over 366 million 100bp paired-end reads and over
`531 million 150bp paired-end reads, totaling over 116 bil-
`lion sequenced bases. More than 98% of the reads were
`successfully aligned to the hg19 human reference genome
`with the Burrows-Wheeler Aligner [4], totaling over 110
`billion aligned bases. This gave an average coverage of
`∼ 36x across the hg19 reference sequence, with over 10x
`depth of coverage for 78.8% of the genome. The aligned
`reads were then used to detect both small and large
`genomic variants in the Jurkat genome.
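The yield and coverage figures quoted above can be checked with simple arithmetic. The sketch below is only an illustration: the read counts are the lower bounds given in the text, and `HG19_LENGTH` is an assumed approximate length for the hg19 primary assembly that the text does not state.

```python
# Read counts and aligned bases as reported in the text (lower bounds).
READS_100BP = 366_000_000
READS_150BP = 531_000_000
ALIGNED_BASES = 110_000_000_000

# Assumption: approximate length of the hg19 primary assembly in bp.
# This value is not given in the text.
HG19_LENGTH = 3_095_693_983


def total_sequenced_bases(reads_100: int, reads_150: int) -> int:
    """Total bases from the two libraries, counting each read individually."""
    return reads_100 * 100 + reads_150 * 150


def mean_coverage(aligned_bases: int, genome_length: int) -> float:
    """Average depth: aligned bases divided by reference length."""
    return aligned_bases / genome_length


total = total_sequenced_bases(READS_100BP, READS_150BP)
coverage = mean_coverage(ALIGNED_BASES, HG19_LENGTH)
print(f"total sequenced bases: {total:,}")
print(f"mean coverage: {coverage:.1f}x")
```

Because the inputs are lower bounds, the computed depth falls slightly under the reported value of roughly 36x.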
`In order to utilize all of the information available in the
`WGS data, we employed a suite of variant calling tools
`for the identification of all major types of genomic vari-
`ants. Each tool uses a certain type of sequence information
`to identify specific categories of variants. Our variant
`caller suite consisted of four distinct tools and algorithms:
`The Genome Analysis Toolkit, Pindel, BreakDancer, and
`CNVnator [5–8].
`The Genome Analysis Toolkit (GATK) from the Broad
`Institute uses De Bruijn graph-based models to iden-
`tify single-nucleotide substitutions and small insertions
`and deletions. Pindel’s split-read approach can also detect
`small insertions and deletions, as well as inversions, tan-
`dem duplications, and inter-chromosomal translocations.
`BreakDancer compares the distance between aligned read
`pairs to the insert size distribution from the sequencing
`library in order to find large structural variants. CNVnator
`uses read-depth information and a mean-shift algorithm
`to assign copy number levels across the genome and
`identify deletion and duplication events.
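The read-pair idea behind tools such as BreakDancer can be illustrated with a minimal sketch. This is not BreakDancer's actual algorithm: the function name, the labels, and the three-standard-deviation threshold are assumptions chosen for illustration.

```python
def classify_read_pair(insert_size: float,
                       lib_mean: float,
                       lib_sd: float,
                       n_sd: float = 3.0) -> str:
    """Label a read pair by comparing its mapped span to the library insert size.

    A pair that maps much farther apart than the library allows suggests
    that sequence present in the reference is missing from the sample
    (deletion-like). A pair that maps much closer together suggests extra
    sequence in the sample (insertion-like).
    """
    upper = lib_mean + n_sd * lib_sd
    lower = lib_mean - n_sd * lib_sd
    if insert_size > upper:
        return "deletion-like"
    if insert_size < lower:
        return "insertion-like"
    return "concordant"


# Hypothetical library: mean insert size 350 bp, standard deviation 30 bp.
for span in (350, 1200, 150):
    print(span, classify_read_pair(span, lib_mean=350, lib_sd=30))
```

A real caller would also cluster discordant pairs and use read orientation to separate inversions and translocations, which this sketch leaves out.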
`In order for GATK to call small variants, it must be
`told how many alleles to expect at each position. As such,
`an accurate estimate of Jurkat ploidy is required before
`GATK can be used. While both the original 1977 publica-
`tion and the American Type Culture Collection (ATCC)
`report that Jurkat is diploid, other publications refute this
`description. The first karyotypes of the Jurkat cell line
`were published by Snow and Judd in 1987, who found
`that Jurkat was hypotetraploid, possessing fewer than four
`times the haploid number of chromosomes [9]. A few
`years later, tetraploidy was corroborated by an investiga-
`tion of p53 mutations, which found that the Jurkat cell line
`contained 4 separate p53 alleles [10]. More recent reports
`confirm Jurkat tetraploidy. The German Collection of
`Microorganisms and Cell Cultures (DSMZ) describes the
`Jurkat karyotype as a "human flat-moded hypotetraploid
`karyotype with 7.8% polyploidy." In addition, a multicolor-
`Fluorescence In Situ Hybridization study from 2013 found
`within-culture mosaicism on a tetraploid background
`[11].
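To see why the ploidy setting matters for genotyping, it helps to enumerate the genotypes that are possible at a biallelic site. The following sketch is a generic illustration and does not reproduce GATK's genotyping model; the function names are hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence, Tuple


def possible_genotypes(alleles: Sequence[str], ploidy: int) -> List[Tuple[str, ...]]:
    """All unordered genotypes for the given alleles at the given ploidy."""
    return list(combinations_with_replacement(alleles, ploidy))


def alt_allele_fractions(ploidy: int) -> List[float]:
    """Expected alternate-allele fractions for 0..ploidy alternate copies."""
    return [n_alt / ploidy for n_alt in range(ploidy + 1)]


# A diploid site has three genotypes; a tetraploid site has five.
print(len(possible_genotypes(("ref", "alt"), 2)))
print(len(possible_genotypes(("ref", "alt"), 4)))
print(alt_allele_fractions(4))
```

Under a tetraploid model, a variant carried on a single chromosome copy is expected at an allele fraction near 0.25, which a diploid model would have no genotype to represent.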
`
`Variant calls
`Given the previous reports of tetraploidy, we ran GATK
`with a ploidy count of 4. GATK identified nearly 5 mil-
`lion variants, comprising ∼ 3.5 million single-nucleotide
`substitutions, ∼ 1.0 million small deletions, and ∼ 357
`thousand small insertions, across over 4.6 million variant
`loci. Basic metrics for the GATK variant calls are consis-
`tent with normal human samples. The ratio of homozy-
`gous to heterozygous variant loci is 0.635, which is in
`the range of previously reported ratios, and the ratio of
`transitions to transversions is 2.10, which is the expected
`value for human genomes [12]. The number of single-
`nucleotide substitutions is similar to previously reported
`values. However, the total number of indels is higher
`than published values from human WGS studies, which
`generally detect fewer than 700 thousand indels by shot-
`gun sequencing [12]. To date, the highest number of
`indels identified in a single human genome was ∼ 850
`thousand—determined via Sanger sequencing of J.C. Ven-
`ter’s genome [13]. This enrichment for indels, especially
`deletions, in Jurkat is likely to be at least partially due to
`the redundancy of the tetraploid genome.
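The transition-to-transversion ratio used as a quality metric above can be computed directly from a list of substitutions. The sketch below assumes uppercase single-base reference and alternate alleles; the function names and the toy input are illustrative only.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(substitutions: Iterable[Tuple[str, str]]) -> float:
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    transitions = 0
    transversions = 0
    for ref, alt in substitutions:
        if ref == alt:
            continue  # not a substitution
        if is_transition(ref, alt):
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return transitions / transversions


toy_calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_calls))
```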
`The Pindel variant caller detected 1.4 million deletions,
`740 thousand insertions, 18 thousand duplications, 150
`thousand inversions, and 4 inter-chromosomal translo-
`cations. The split-read approach is markedly similar to
`GATK’s method for the detection of small insertions and
`deletions. GATK also uses split-reads, but its detection of
`variants relies on an assembly-based method that is lim-
`ited to small sequence differences between the reads and
`the reference genome. Accordingly, the small indels called
`by both methods should be similar. As expected, in the
`Jurkat call set, over 85% of the deletions and over 65% of
`the insertions that were identified by GATK have direct
`matches in the Pindel calls.
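One simple way to measure the agreement between two call sets is to count exact matches on position and alleles. The sketch below assumes that both call sets have already been normalized to the same representation; the key layout and function name are hypothetical and are not taken from the study's methods.

```python
from typing import Hashable, Iterable

# A call is keyed by (chromosome, 1-based position, reference allele, alternate allele).
# Assumption: indels in both sets are left-aligned and trimmed the same way,
# otherwise identical events can receive different keys.


def fraction_matched(query_calls: Iterable[Hashable],
                     reference_calls: Iterable[Hashable]) -> float:
    """Fraction of query calls with an exact match in the reference set."""
    reference_set = set(reference_calls)
    query = list(query_calls)
    if not query:
        return 0.0
    matched = sum(1 for call in query if call in reference_set)
    return matched / len(query)


# Toy example with made-up coordinates.
caller_a = [("chr1", 100, "AT", "A"), ("chr1", 250, "G", "GCC"), ("chr2", 40, "CTT", "C")]
caller_b = [("chr1", 100, "AT", "A"), ("chr2", 40, "CTT", "C"), ("chr3", 9, "A", "AT")]
print(fraction_matched(caller_a, caller_b))
```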
`BreakDancer identified 6128 deletions, 18 insertions,
`183 inversions, 1981 intra-chromosomal translocations,
`and 113 inter-chromosomal translocations.
`CNV calls from CNVnator are presented in Fig. 2
`by percentage of the genome. A plot of the raw read
`depth density is provided in Additional file 1: Figure S1.
`CNVnator reported a modal copy number of 4 in Jurkat,
`representing over 65% of the genome and corroborat-
`ing reports of tetraploidy. From the CNVnator results,
`we identified 2499 deletion sites (CN ≤ 1), of which
`218 were homozygous (CN = 0), and 1863 duplication
`sites (CN ≥ 5).
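The copy number thresholds stated above translate into a small classification rule. The sketch below uses those thresholds as given in the text; the segment representation as `(length, copy_number)` pairs and the function names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def classify_copy_number(cn: int) -> str:
    """Apply the thresholds from the text to an integer copy number."""
    if cn == 0:
        return "homozygous deletion"
    if cn <= 1:
        return "deletion"
    if cn >= 5:
        return "duplication"
    return "no call"


def modal_copy_number(segments: Iterable[Tuple[int, int]]) -> int:
    """Copy number covering the most bases, given (length, copy_number) segments."""
    bases_per_cn = defaultdict(int)
    for length, cn in segments:
        bases_per_cn[cn] += length
    return max(bases_per_cn, key=bases_per_cn.get)


# Toy segments, not real data.
toy_segments = [(700, 4), (200, 3), (100, 5)]
print(modal_copy_number(toy_segments))
print([classify_copy_number(cn) for cn in (0, 1, 3, 4, 5)])
```

Weighting by segment length mirrors how the modal copy number is reported as a share of the genome rather than a share of segments.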
`The structural variant calls from each tool were com-
`pared and merged with specific considerations made for
`each category of variant and each detection tool (see
`“Methods” section). Short and long insertions and dele-
`tions were defined using a cutoff of 50 bp, in accordance
`with the structural variant databases from NCBI [14].
`The numbers of variants called by each tool, along with
`the proportion of overlapping loci and total number of
`merged calls, are provided in Table 1.
`Most types of variants were called by multiple tools.
`However, the number of variants called by each tool and
`the number of variant calls that were unique to each tool
`varied greatly between variant classes and individual vari-
`ant callers (Table 1). Furthermore, each tool differed in the
`sizes of variants that it called (Additional file 1: Figures
`S2-S8).
`The relative contributions of each variant caller to the
`total set of merged calls are displayed in Fig. 3. Pindel calls
`dominated the merged variant sets, with the exception
`of translocations. This unmatched number of Pindel calls
`can be attributed to the power of the split-read approach.
`On the other hand, Pindel calls are limited in their utility
`due to the tool’s inability to determine allele frequencies.
`In contrast to Pindel’s detection power and lack of allele
`annotations, GATK and CNVnator are both limited in
`the range of variant sizes that they can detect but are
able to consider all alleles. Therefore, while Pindel calls
make up the majority of detected variants, GATK and
CNVnator calls were prioritized in our investigations of
variant consequence.

Fig. 2 Histogram of DNA copy number in Jurkat. Binned copy number alterations as fractions of the genome

Table 1 Variant loci counts from each tool

| Variant type | GATK | Pindel | Breakdancer | CNVnator | Merged |
|---|---|---|---|---|---|
| Substitutions | 3,520,988 | | | | 3,520,988 |
| Short Hom. Deletion | 170,397 | | | | 170,397 |
| Short Deletion | 841,001 (70%) | 1,239,299 (47%) | 47 (0%) | | 1,460,321 |
| Short Insertion | 326,446 (65%) | 616,298 (35%) | | | 729,727 |
| Long Hom. Deletion | 326 (0%) | | | 108 (0%) | 434 |
| Long Deletion | 1904 (61%) | 118,610 (1.4%) | 6081 (10%) | 2499 (1.2%) | 125,397 |
| Long Insertion | 1039 (31%) | 125,918 (0.25%) | 18 (0%) | | 126,657 |
| Duplication | | 17,762 (22%) | | 1863 (24%) | 15,288 |
| Inversion | | 149,545 (0.0087%) | 183 (7.1%) | | 149,715 |
| Intra. Translocation | | | 1981 | | 1981 |
| Inter. Translocation | | 4 (0%) | 113 (0%) | | 117 |

The percentage of sites that overlap the other tools is provided where applicable
`
Fig. 3 Comparison of variant loci counts from each tool. a Total number of merged variant loci called by all tools for different variant types.
b Fraction of merged variant loci called by each tool for different variant types

Comparisons to databases
After creating the merged variant sets, we compared
them to databases of previously identified variants in
order to assess the novelty of the genomic variants that
were detected in Jurkat. We used dbSNP and DGV as
resources for known short and long variants, respectively
[15, 16]. Both of these databases contain the variants that
were identified by the 1000Genomes project in addition
to variants cataloged by other sources. Comparisons of
short variants (single-nucleotide substitutions,
short deletions, and short insertions) to variants found
in the 1000Genomes project and dbSNP are given in
`Fig. 4. Single-nucleotide substitutions showed the great-
`est number of matches, while fewer than half of the short
`insertions and deletions were found in dbSNP. An even
`greater reduction in the number of database matches
`was seen in the long structural variant database compar-
`isons (Fig. 4). The differences in the number of database
`matches between single-nucleotide variants, short indels,
and long structural variants are likely due to several factors.
The limited feasibility of structural variant detection, combined
with the paucity of studies investigating these larger
variants, is a major contributor to these differences, but
`the increased mutational sample space of larger variants
`may also play a role.
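The drop in database matches from substitutions to long structural variants is easier to compare as percentages. The sketch below only redoes the arithmetic on the rounded counts given in the Fig. 4 caption, so its results are approximate; the dictionary layout is an illustrative choice.

```python
# (database matches, Jurkat variant loci), rounded values from the Fig. 4 caption.
FIG4_COUNTS = {
    "substitutions": (3_290_000, 3_520_000),
    "short deletions": (652_000, 1_460_000),
    "short insertions": (323_000, 730_000),
    "long deletions": (6_380, 125_000),
    "long insertions": (286, 127_000),
    "duplications": (1_270, 15_300),
}


def percent_matched(matches: int, total: int) -> float:
    """Share of variant loci with a database match, in percent."""
    return 100.0 * matches / total


match_pct = {name: percent_matched(m, t) for name, (m, t) in FIG4_COUNTS.items()}
for name, pct in match_pct.items():
    print(f"{name}: {pct:.2f}%")
```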
`We also compared our SNV and small indel calls to
`those found in Jurkat by the COSMIC Cell Line project
`[17]. Our WGS approach identified nearly 10x as many
`SNVs as were detected by COSMIC via microarray. How-
`ever, of the ∼ 408 thousand Jurkat SNVs in COSMIC,
`we uncovered over 383 thousand (94%) matching single-
`nucleotide variants. Within the matching SNV calls, geno-
`types between the two call sets agreed at over 97% of loci.
`The same level of agreement was observed for both the
`∼ 174 thousand homozygous COSMIC calls and the ∼
`210 thousand heterozygous COSMIC calls. Deletion and
`insertion calls showed less overlap, but we were able to
`find 67% of the 18 thousand COSMIC deletion calls and
`40% of the 2260 COSMIC insertion calls in our data.
`Our final comparison to previously identified variants
`focused on rare, pathogenic variants from the ClinVar
`database. After removing records without assertion cri-
`teria, corresponding to a review status of zero stars, 10
`Jurkat variants were reported as pathogenic by ClinVar
`(Table 2). Interestingly, 6 of the 10 variants, involving 5
`separate genes, are thought to cause cancer. The other
`pathogenic ClinVar matches are associated with severe
`developmental defects. Long deletions and duplications
`from Jurkat were also found in ClinVar, but the annota-
`tions do not contain gene information and are generally
`less informative (Additional files 2 and 3).
Moving from established to predicted effects, we
used SnpEff to predict the functional consequences
of the GATK-called small variants. SnpEff identified
9997 synonymous and 10,984 nonsynonymous mutations.
Among the nonsynonymous mutations, 252 variants
are nonsense mutations and 10,732 variants are
missense mutations. ‘High Impact’ functional effects
were predicted for 1141 of the small variant loci, of
which 747 variants were determined to be rare (MAF
< 0.001) in the Exome Aggregation Consortium (ExAC)
dataset of over 60 thousand human samples [18]. These
rare, high-impact variants were predicted to affect
678 genes.
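The rare, high-impact filter described in this paragraph can be sketched as a two-condition selection. The record type, the field names, and the choice to treat variants that are absent from ExAC as rare are all assumptions made for illustration; only the impact label and the MAF cutoff of 0.001 come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class AnnotatedVariant:
    gene: str                   # gene symbol from the annotation
    impact: str                 # SnpEff impact label, e.g. "HIGH" or "MODERATE"
    exac_maf: Optional[float]   # minor allele frequency in ExAC, None if absent


def rare_high_impact(variants: Iterable[AnnotatedVariant],
                     maf_cutoff: float = 0.001) -> List[AnnotatedVariant]:
    """Keep HIGH-impact variants whose ExAC MAF is below the cutoff.

    Assumption: a variant with no ExAC record is treated as rare.
    """
    return [
        v for v in variants
        if v.impact == "HIGH" and (v.exac_maf is None or v.exac_maf < maf_cutoff)
    ]


def affected_genes(variants: Iterable[AnnotatedVariant]) -> Set[str]:
    """Unique genes hit by the given variants."""
    return {v.gene for v in variants}


# Toy records with placeholder gene names.
toy_variants = [
    AnnotatedVariant("GENE_A", "HIGH", 0.0002),
    AnnotatedVariant("GENE_A", "HIGH", None),
    AnnotatedVariant("GENE_B", "HIGH", 0.12),
    AnnotatedVariant("GENE_C", "MODERATE", 0.0001),
]
selected = rare_high_impact(toy_variants)
print(len(selected), sorted(affected_genes(selected)))
```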
`A second set of ‘High Impact’ variants was created
`from the homozygous deletion calls that intersected cod-
`ing exons. This high-impact, homozygous deletion set
`includes 120 variant loci across 129 genes.
`All sets of variants, including those of high impact,
`appear to be distributed across the genome (Fig. 5). How-
`ever, even if the mutations are randomly distributed, it
`is still possible that some biological processes are more
`affected than others. The two sets of highly impacted
`genes were combined, producing a set of 781 unique
`genes. This list of likely damaged genes was used to probe
`selected gene set databases from MSigDB [19]. The top 5
`enriched gene sets are displayed in Table 3.
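Gene set overlap of this kind is commonly scored with a hypergeometric upper-tail probability. The sketch below shows that general test and is not necessarily the exact statistic reported by MSigDB; the size of the background gene universe is not given in the text and would have to be assumed.

```python
from math import comb


def hypergeom_upper_tail(overlap: int, universe: int, set_size: int, draws: int) -> float:
    """P(X >= overlap) when `draws` genes are sampled without replacement
    from `universe` genes, of which `set_size` belong to the gene set."""
    max_overlap = min(set_size, draws)
    numerator = sum(
        comb(set_size, i) * comb(universe - set_size, draws - i)
        for i in range(overlap, max_overlap + 1)
    )
    return numerator / comb(universe, draws)


# Small worked example: universe of 10 genes, gene set of 4, list of 3 genes,
# probability that at least 2 of the 3 fall in the gene set.
print(hypergeom_upper_tail(2, universe=10, set_size=4, draws=3))
```

With the figures from this study, `draws` would be the 781 damaged genes and `set_size` the size of each gene set, but any p-value obtained this way depends on the assumed universe and need not match Table 3.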

Fig. 4 Jurkat variants with database matches. Jurkat variant loci that have matches in dbSNP (short variants) and DGV (long variants) as percentage
of total Jurkat variant sites for each type of variant. Number of database matches over the number of Jurkat variant loci: 3.29M / 3.52M substitutions;
652K / 1.46M short deletions; 323K / 730K short insertions; 6.38K / 125K long deletions; 286 / 127K long insertions; 1.27K / 15.3K duplications

Table 2 Jurkat variants found in the ClinVar database

| rsID | Jurkat AF | Gene | Phenotype | ClinVar accession |
|---|---|---|---|---|
| ClinVar substitutions | | | | |
| rs63750636 | 1.0 | MSH2 | Lynch syndrome | RCV000076405.3 |
| rs397517342 | 0.75 | CDH23 | Usher syndrome type 1D | RCV000039224.2 |
| rs397516435 | 0.25 | TP53 | Li-Fraumeni syndrome | RCV000205265.3 |
| ClinVar short deletions | | | | |
| rs63750075 | - | MSH6 | Lynch syndrome | RCV000074711.2 |
| rs398122841 | - | BAX | Carcinoma of colon | RCV000010120.5 |
| rs397508104 | - | KCNQ1-(AS1) | Long QT syndrome | RCV000046039.3 |
| rs750664956 | - | ASPM | Not provided | RCV000217980.1 |
| rs786204835 | - | PURA | Not provided | RCV000169739.5 |
| ClinVar short insertions | | | | |
| rs397507178 | - | RAD50 | Hereditary cancer | RCV000030958.3 |
| rs398122840 | - | BAX | Carcinoma of colon | RCV000010119.5 |

As might be expected from a cancer cell line, the damaged
genes in Jurkat are involved in genome, cell cycle, and
cytoskeleton maintenance, as well as sugar processing.
`The enrichment of damaged genes that are involved in the
`immune system is particularly interesting given the Jurkat
`cell line’s role in establishing our current understanding of
`T-cell immune responses.
`
While the gene set enrichment analysis aided in
categorizing the many genetic aberrations in the Jurkat
cell line, most of the top-enriched sets are broad, suggesting
gross defects across general biological processes.
These findings reinforce the growing body of literature
that has cataloged numerous irregularities in Jurkat biology,
but they also imply that the deviation from normal
T-cell biology may be more extensive than previous studies
had reported.

Fig. 5 Genomic variation distributions. Distributions of multiple types of variants across the Jurkat genome. Plotted data listed from outside-in: 1.
hg19 genome ideogram (gray); 2. Density of SnpEff “High Impact” SNVs with rare ExAC allele frequencies (gold); 3. Homozygous deletions that lie in
coding exons (red); 4. Deletions longer than 25 kb (blue); 5. Insertions longer than 50 bp that lie in coding exons (green); 6. Inversions longer than
25 kb (cyan); 7. Interchromosomal translocations (center)

Table 3 Gene sets enriched for highly impacted genes

GO: CHROMOSOME ORGANIZATION
Overlap: 51/1009
p-value = 0.00011
Genes: APBB1, ATXN3, ATXN7, BAZ2A, BCORL1, BRD8, CDC14A, CDCA5, CDYL, CENPT, CLASP1, CREBBP, EHMT1, GATAD2B, GTF2H3, HDAC4, KAT2A, KDM5B, KIF23, KNTC1, MLH3, MSH2, NCAPD2, NDC80, NOC2L, PIBF1, PRIM2, PTGES3, RAD50, RBL2, RSF1, SETX, SLX4, SMARCC2, SMC3, TCF7L2, TEP1, TET1, TEX14, TOP1MT, TP53, TTN, USP15, VPRBP, YEATS2, ZNF304, ZNF462

GO: CELL CYCLE
Overlap: 62/1316
p-value = 0.00015
Genes: ADCY3, ANAPC5, APBB1, ARHGEF2, BAX, CDC14A, CDC27, CDCA5, CDK14, CECR2, CENPT, CEP164, CKAP2, CLASP1, DYNC1H1, DYNC1I2, FANCI, HINFP, HSP90AA1, INTS3, IQGAP3, KIAA0430, KIAA1377, KIF23, KNTC1, KRT18, MACF1, MAP3K8, MAP9, MCM8, MKI67, MLH3, MNS1, MSH2, NCAPD2, NDC80, NUP214, NUP98, OFD1, ORC1, PHLDA1, PIBF1, PRIM2, PSMD3, PTEN, PYHIN1, RAD50, RBL2, RUVBL1, SMC3, SON, TCF7L2, TEX14, THAP1, TP53, TP53BP1, TPR, TRIOBP, TSC1, TTK, TTN, ZFHX3

GO: CARBOHYDRATE DERIVATIVE BIOSYNTHETIC PROCESS
Overlap: 34/595
p-value = 0.00017
Genes: ADCY2, ADCY3, ADCY9, ALG1L2, ALG9, B3GALT1, B3GNT6, BCAN, BMPR2, C1GALT1C1, CANT1, CHST15, CHSY1, GAL3ST4, GPC6, GUCY2C, GXYLT1, HAS3, KIAA2018, MUC16, MUC19, MUC3A, MUC6, NDST4, OMD, PHLDA1, PIGS, PRKCSH, SLC25A13, ST3GAL3, ST3GAL5, TET1, UGCG, UGP2

GO: NEGATIVE REGULATION OF ORGANELLE ORGANIZATION
Overlap: 25/387
p-value = 0.00020
Genes: ARHGEF2, CDC14A, CKAP2, CLASP1, DYNC1H1, KIAA1377, KIF23, LIMA1, MAP9, MSH2, NDC80, NOC2L, OFD1, OTUB1, PIBF1, RAD50, SMC3, SPTA1, SPTAN1, TET1, TEX14, TPR, TRIOBP, TTK, UBQLN4

GO: IMMUNE SYSTEM PROCESS
Overlap: 85/1984
p-value = 0.00023
Genes: ABCB5, ADAM17, AGBL5, AIM2, AP3B1, APOB, ARHGEF2, BAX, C1orf177, C7, CCL13, CD14, CD177, CEACAM8, CLNK, CREBBP, CTLA4, CYFIP2, DEFB126, DHX58, DYNC1H1, DYNC1I2, DYNC2H1, ENDOU, ENPP3, F2, FN1, HDAC4, HLA-DRB5, HNRNPK, HSH2D, HSP90AA1, IGJ, IL10RB, IL27RA, IL2RG, ILF2, INPP5D, IPO7, ITGA6, KIF23, KIF3C, KIR2DS4, KLC2, LILRA3, MAP3K1, MAP3K8, MSH2, NCAM1, NLRC3, NLRC5, OAS1, OTUB1, PAPD4, PIBF1, PODXL, PRKACG, PSMD3, PTEN, RHOH, SAMHD1, SARM1, SEC31A, SECTM1, SHC1, SLC3A2, SLFN11, SPEF2, SPTA1, STAT5B, SYK, SYNCRIP, TAB2, TAPBP, TEK, TMIGD2, TNFSF4, TNK2, TRIL, TRIM10, TSC1, ULBP1, VPRBP, WDR7, WIPF1
`
`Defective pathways
`By leveraging the deep history of the Jurkat cell line, in
`combination with our pathogenic and high-impact variant
`lists, we have distinguished three core pathways that are
`defective due to genomic aberrations in Jurkat—namely
`TCR signaling, genome stability, and O-linked glycosyla-
`tion. This analysis is not exhaustive. Rather, we focused
`on pathways that are well-supported by both the literature
`and our genomic analysis.
`
`TCR signaling
`The damaged genes affecting T-cell receptor signaling are
`PTEN, INPP5D, CTLA4, and SYK. TCR signaling in Jurkat
`was first called into question due to the lack of PTEN and
`INPP5D expression [3, 20]. Both PTEN and INPP5D are
`lipid phosphatases that regulate PI3K signaling by degrad-
`ing PtdIns(3,4,5)P3. PTEN mutations in Jurkat were first
`described by Sakai et al. in 1998. They found two sep-
`arate alterations in exon 7 "without normal conformers
`present," both of which introduced stop codons [21].
`We detected the same two heterozygous variants. SnpEff
`annotated one of these mutations as a frameshift vari-
`ant and the other as a stop-gained variant, predicting that
`both of these variants would result in loss of function.
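The two annotation classes mentioned here, stop-gained and frameshift, follow simple rules that can be sketched in a few lines. This is a simplified illustration and not SnpEff's implementation; the function names are hypothetical, and the example codons are generic rather than the actual Jurkat sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_stop_gained(ref_codon: str, alt_codon: str) -> bool:
    """True when a coding codon is changed into a stop codon."""
    ref = ref_codon.upper()
    alt = alt_codon.upper()
    return ref not in STOP_CODONS and alt in STOP_CODONS


def is_frameshift(ref_allele: str, alt_allele: str) -> bool:
    """True when a coding indel changes the length by a non-multiple of three."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0


# Generic examples: a glutamine codon (CAG) or an arginine codon (CGA)
# becomes a stop codon after a single C>T substitution.
print(is_stop_gained("CAG", "TAG"))
print(is_stop_gained("CGA", "TGA"))
# A one-base insertion shifts the reading frame; a three-base deletion does not.
print(is_frameshift("A", "AG"))
print(is_frameshift("ACGT", "A"))
```

Both classes are usually predicted to cause loss of function because they truncate or scramble the downstream protein sequence.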
INPP5D (SHIP1) has long been known to not be
expressed in the Jurkat cell line [3]. We have identified
a single-nucleotide substitution that changes codon
`317 from glutamine to a stop codon, as well as a
`47 bp heterozygous deletion from hg19.chr2:234068130–
`234068177. These same mutations were detected in 2009
`via targeted sequencing [22]. Admittedly, the lack of allele
`resolution in our data precludes us from making defini-
`tive claims about these mutations, as we cannot distin-
`guish which alleles were affected. Fortunately, the targeted
`sequencing study found the stop codon on one allele and
`the 47 bp deletion on the others, both of which should
`block the production of a full length INPP5D transcript.
`CTLA4 is a CD28 homolog that transmits an inhibitory
`signal to T cells. In 1993, Lindsten et al. noticed that
`“CTLA4 mRNA is not expressed nor induced in the Jurkat
T cell line” [23]. However, no explanation for this lack of
CTLA4 induction has been proposed. More recent
`investigations have detected both the protein and the tran-
`script, although the transcript was less abundant in Jurkat
`than in peripheral blood mononuclear cells [24]. This
`finding seems to support the hypothesis that the CTLA4
`protein is accumulated in the cytosol [24]. Our analyses
`revealed a heterozygous, stop-gained, single-nucleotide
`substitution that converts codon 20 to a stop codon.
`
`This mutation was found in around half of the mapped
`reads and might be responsible for the decreased CTLA4
`expression that has been observed in Jurkat cells, although
`other mechanisms may be at play.
`SYK is a member of the Syk family of non-receptor
`tyrosine kinases. It functions similarly to ZAP70 in trans-
`mitting signals from the T-cell receptor. In 1995, Fargnoli
`et al. reported that SYK is not expressed in the Jurkat
`cell line and contains a guanine insertion that causes a
`frameshift at codon 34. We identified the same heterozy-
`gous insertion in our sample, which is predicted to result
`in loss of function of the transcript, yet the mechanism
`behind the lack of expression of the other allele remains
`an open question.
`Interestingly, Fargnoli et al. proposed that the lack of
`SYK expression in Jurkat “may have facilitated the ini-
`tial identification and characterization of ZAP70 as the
`major ζ-associated protein” [25]. On the other hand, while
`the lack of SYK expression in Jurkat was subsequently
`confirmed, reconstitution studies suggest that SYK and
`ZAP70 occupy distinct roles in TCR signaling, with SYK
`displaying 100-fold greater kinase activity than ZAP70
`[26, 27].
`
`Genome stability
`TP53, BAX, and MSH2 encode tumor suppressors
`involved in maintaining genomic stability that are severely
`mutated in Jurkat. The product of the TP53 gene is
`p53, which is a known deficiency in the Jurkat cell line
`[20]. In 1990, Cheng and Haas detected a heterozygous,
`stop-gained single-nucleotide substitution in codon 196
`(R196*) in Jurkat cells. They proposed that this muta-
`tion “may play a role in the genesis or in the tumorigenic
`progression of leukemic T cells” [10]. We detected the
`same heterozygous mutation (rs397516435) in exon 6 of
`the TP53 gene and found that this mutation is associated
`with Li-Fraumeni syndrome [28], which is an autosomal
`dominant hereditary disorder that causes the early onset
`of tumors. This mutation is likely responsible for the
`consistent reports of p53 deficiencies in Jurkat cells.
`While loss of p53’s protective effects is normally thought
`of as the mechanism behind tumorigenesis, in some cases,
`truncated p53 can gain oncogenic functions [29]. Recent
`studies have revealed that stop-gained mutations in exon
6 of TP53 produce a truncated p53 isoform that seems
`to partially escape nonsense-mediated decay. These iso-
`forms, termed p53ψ, lack canonical p53 transcriptional
`activity. Instead, they localize to the mitochondria, where
`they activate a pro-tumorigenic cellular program by regu-
`lating mitochondrial transition pore permeability through
`interaction with cyclophilin D [30]. The Jurkat cell line’s
`expression of a p53ψ isoform may contribute to the
`previously-reported, exaggerated Ca2+ release upon TCR
`activation [31].
`
BAX is a member of the Bcl-2 gene family and
helps induce apoptosis. In the Jurkat cell line, BAX is
not expressed due to the presence of two heterozygous
frameshift mutations in codon 41 [32]. All alleles
are affected. We identified the same two variants,
rs398122841 and rs398122840, each of which was found
in approximately half of the mapped reads.
`Investigations into microsatellite instability revealed
`that MSH2 is not expressed in Jurkat due to a stop-gained
`point mutation in exon 13 [33]. We identified the same
`variant as a homozygous single-nucleotide substitution
(rs63750636). MSH2 is involved in DNA mismatch repair,
`and this stop-gained variant is associated with hereditary
`nonpolyposis colorectal cancer [34].
`
`O-linked glycosylation
The Jurkat cell line’s inability to properly synthesize
O-glycans, due to deficient core 1 synthase,
glycoprotein-N-acetylgalactosamine 3-beta-galactosyltransferase 1
(C1GALT1) activity, was first noticed in 1990 [35]. This
`deficiency causes Jurkat to express the Tn antigen, which
`is associated with cancer and other pathologies. In 2002,
`Ju and Cummings reported that a single-nucleotide
`deletion in COSMC (C1GALT1C1), a chaperone for
`C1GALT1, was responsible for Jurkat’s truncated O-
`glycans. The deletion causes a frameshift and introduces
`a stop codon in the only exon of the COSMC gene. Ju and
`Cummings assumed that the Jurkat cell line had retained
`its diploid, male genome and possessed only one copy of
`the X chromosome. We now know that the Jurkat cell line
`has two copies of the X chromosome, but consistent with
`the original report, we have determined through deep
`sequencing that the mutation is, indeed, homozygous
`across Jurkat’s two X chromosomes [36].
`
`Discussion
`We performed a bottom-up search for abnormalities in
`the Jurkat genome using short-read sequencing. We detect
`numerous examples of each examined variant type and
`use various strategies to tie these variants to functional
`effects. Our analysis identifies multiple dysfunctional
`pathways in the Jurkat cell line.
`While some of the variants were previously detected
`using top-down methods, we were able to add hundreds of
`potentially damaging variants to the list of Jurkat’s genetic
`defects. Gene set enrichment analysis revealed that many
`of the affected genes lie in pathways that are commonly
defective in cancer. The great number of potentially damaged
genes, combined with the large-scale genomic rearrangements
in Jurkat, makes it difficult to pinpoint the
cause of Jurkat’s biological abnormalities. However, some
`of the better-studied mutations, such as those reported as
`pathogenic by ClinVar, are likely to have significant effects
`on important signaling pathways.
`
`In addition to these putatively damaging variants, we
`identified millions of mutations across all categories of
`genomic variants. The effects of these variants are less cer-
`tain, but our comprehensive variant catalog will facilitate
`further investigations of the Jurkat genome and allow for
`re-analysis of Jurkat variants as more information about
`their effects becomes available.
`Using our list of variants, we were also able to exten-
`sively search the literature for previously identified defects
`in Jurkat. We found a number of reports describing
`the same variants that we had independently identified,
`confirming the presence of these mutations in extramu-
`ral Jurkat samples. Uncovering these past publications
`required precise knowledge of damaged genes in Jurkat.
`They were difficult, if not impossible, to find using general
`queries and were published over a decade ago in a range
`of journals. Furthermore, with the exception of the PTEN
`and INPP5D defects, these reports had never been consol-
`idated into a single resource, making our documentation
`of previous reports the first review of damaged genes in
`the Jurkat cell line.
`The defects in these genes have the potential to con-
`found prior findings in Jurkat, but the loss is unlikely
`to put a dent in the vast amount of knowledge that we
`have gained from this cell line. In fact, Jurkat’s expression
`deficiencies open the door for reconstitution experiments
`that, in other systems, would first require suppression of
`the gene products. Many studies have already put this idea
`into action. Transgenic expression of INPP5D and SYK
`constructs has already generated breakthroughs in our
`understanding of their biological activities [26, 27, 37, 38].
`Likewise
