`
`Nucleic Acids Research, 1994, Vol. 22, No. 8 1327-1334
`
`Mutations to nonsense codons in human genetic disease:
`implications for gene therapy by nonsense suppressor
`tRNAs
`
`Jennifer Atkinson and Robin Martin*
`Krebs Institute for Biomolecular Research, The University of Sheffield, PO Box 594, Firth Court,
`Western Bank, Sheffield S10 2UH, UK
`
`Received January 28, 1994; Revised and Accepted March 7, 1994
`
`ABSTRACT
`Nonsense suppressor tRNAs have been suggested as
`potential agents for human somatic gene therapy.
`Recent work from this laboratory has described
`significant effects of 3' codon context on the efficiency
of human nonsense suppressors. A rapid increase in
the number of reports of human diseases caused by
nonsense codons prompted us to determine how the
spectrum of mutation to UAG, UAA or UGA
codons, and their respective 3' contexts, might affect
the efficiency of human suppressor tRNAs employed
for purposes of gene therapy. This paper presents a
`survey of 179 events of mutations to nonsense codons
`which cause human germline or somatic disease. The
`analysis revealed a ratio of approximately 1:2:3 for
`mutation to UAA, UAG and UGA respectively. This
`pattern is similar, but not identical, to that of naturally
`occurring stop codons. The 3' contexts of new
`mutations to stop were also analysed. Once again, the
`pattern was similar to the contexts surrounding natural
`termination signals. These results imply there will be
`little difference in the sensitivity of nonsense mutations
`and natural stop codons to suppression by nonsense
`suppressor tRNAs. Analysis of the codons altered by
`nonsense mutations suggests that efforts to design
human UAG suppressor tRNAs charged with Trp, Gln,
and Glu; UAA suppressors charged with Gln and Glu,
`and UGA suppressors which insert Arg, would be an
`essential step in the development of suppressor tRNAs
`as agents of human somatic gene therapy.
`
`INTRODUCTION
`Nonsense mutations cause the premature termination of protein
`synthesis, since in the normal course of translation, there are no
`aminoacyl-tRNAs whose anticodons match the UAG, UAA or
`UGA nonsense codons. Nonsense suppressors can be created
`however, by mutating the tRNA so that the suppressor is able
`to match one of the termination signals. A proportion of full
`length gene product is now produced. In 1982, Y. W. Kan and
`
`*To whom correspondence should be addressed
`
`colleagues published a paper in Nature reporting the construction
`of a human nonsense suppressor tRNA and the successful in vitro
suppression of a UAG mutation at codon 17 of the β-globin gene
(1). The mRNA containing the nonsense mutation was obtained
from a patient suffering from β-thalassemia and it was suggested
`that nonsense suppression might one day prove to be a useful
`technique for the somatic gene therapy of human diseases caused
`by mutation to nonsense codons (1).
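As an illustrative sketch (not taken from the paper), the idea of mutating a tRNA so that its anticodon matches a termination signal can be shown at the sequence level: under standard Watson-Crick pairing, the matching anticodon is the reverse complement of the stop codon. The function name `suppressor_anticodon` is hypothetical, and wobble pairing and modified bases are deliberately ignored.

```python
# Watson-Crick complements for RNA bases (wobble pairing is ignored here).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def suppressor_anticodon(stop_codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with a stop codon (5'->3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(stop_codon))


for stop in ("UAG", "UAA", "UGA"):
    print(stop, "->", suppressor_anticodon(stop))
```

Whether a tRNA carrying such an anticodon is still charged with its original amino acid is a separate question, which the Discussion takes up.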
`Although there has been relatively little work in this area in
`the intervening years, there are several attractive aspects to such
`a strategy. First, tRNA genes have strong promoters, which are
`active in all cell types. The promoters for eukaryotic tRNA genes
`lie within the structural sequences encoding the tRNA molecule
itself (2). Although there are elements which regulate
transcriptional activity within the 5' upstream region (3), the
`length of an active transcriptional unit may be considerably less
`than 500 base pairs, and thus accommodation within a delivery
`vector presents no problem. Secondly, once they have been
`transcribed and processed, tRNAs have low rates of degradation.
`Finally, gene therapy with a nonsense suppressor would maintain
`the endogenous, physiological controls over the target gene which
`contains the nonsense codon. On the down side, nonsense
`suppressors may cause readthrough of natural stop codons. In
`addition, the presence of nonsense mutations can lead to the
`aberrant splicing of introns, and to reduced levels of complete
`mRNA (4,5). As these events are both nuclear in location, they
`are probably beyond the reach of cytoplasmic suppressors. Of
`course, only a fraction of mutations leading to human genetic
`disease are caused by nonsense mutations. However, if an
`effective mechanism for gene therapy by nonsense suppression
`could one day be developed, it would then be applicable to similar
`mutations in a wide range of genes.
`One aspect which was not considered in the in vitro experiments
`(1) was the context sensitivity of the efficiency of nonsense
`suppression. Recently, we have described the way in which the
`3' codon context affects the efficiency of UAG suppressor tRNAs
`in human tissue culture cells (6,7). In general, the efficiency of
`suppression varies according to the immediate 3' base in the
pattern C > G > U > A, although it is probable that there are
`effects of the next 3' base as well. The efficiency of nonsense
`
`GDX 1019
`
`
`
`
Table 1. Nonsense mutations in human genes resulting in genetic disease.

5' codon  Affected  3' codon  Stop  Site      Gene or disease
          codon               codon
CTG(leu)  TTG(leu)  AGT(leu)  TAG   L261X     Acid sphingomyelinase
AGG(arg)  CAA(gln)  AGT(ser)  TAA   Q1041X    Adenomatous polyposis coli (APC)
AGA(arg)  CAA(gln)  TCA(ser)  TAA   Q1067X    APC-gastric cancer
CTG(leu)  CAG(gln)  GGT(gly)  TAG   Q1338X    APC
GCA(ala)  GAA(glu)  ATA(ile)  TAA   E1306X    APC
AAA(lys)  CAA(gln)  ATT(ile)  TAA   Q12X      AMP deaminase
AAG(lys)  TGG(trp)  GCC(ala)  TGA   W717X     Androgen receptor
GCT(ala)  GAA(glu)  CTG(leu)  TAA   E358X     Anti-Mullerian hormone
AAG(lys)  TTA(leu)  GTA(val)  TAA   L140X     Antithrombin III
GGC(gly)  CGA(arg)  ATC(ile)  TGA   R197X     Antithrombin III
TGC(cys)  CGA(arg)  CTC(leu)  TGA   R129X     Antithrombin III
GTG(val)  AAG(lys)  GTG(val)  TAG   K217X     α1-antitrypsin (emphysema)
AGG(arg)  CAG(gln)  GAG(glu)  TAG   Q84X      Apolipoprotein A-I
TTC(phe)  CGA(arg)  GAG(glu)  TGA   R2486X    Apolipoprotein B
AAA(lys)  CAG(gln)  CAT(his)  TAG   Q1450X    Apolipoprotein B
ACT(thr)  TCA(ser)  TCA(ser)  TAA   S3750X    Apolipoprotein B
ACA(thr)  CGA(arg)  CTC(leu)  TGA   R19X      Apolipoprotein C-II
CTG(leu)  TAC(tyr)  GAG(glu)  TAG   Y37X      Apolipoprotein C-II
CTG(leu)  TAC(tyr)  GAG(glu)  TAA   Y37X      Apolipoprotein C-II
GCC(ala)  TGG(trp)  GGC(gly)  TAG   W210X     Apolipoprotein E
CTG(leu)  TGG(trp)  GCC(ala)  TGA   W98X      APRT deficiency
ACC(thr)  CAG(gln)  AGG(arg)  TAG   Q39X      Beta-globin (β-thalassemia)
GGC(gly)  AAG(lys)  GTG(val)  TAG   K17X      Beta-globin (β-thalassemia)
GTC(val)  TGT(cys)  GTG(val)  TGA   C112X     Beta-globin (β-thalassemia)
AAA(lys)  GAA(glu)  TTC(phe)  TAA   E121X     Beta-globin (β-thalassemia)
TTT(phe)  GAG(glu)  TCC(ser)  TAG   E43X      Beta-globin (β-thalassemia)
GTG(val)  CAG(gln)  GCA/U(ala)  TAG   Q127X     Beta-globin (β-thalassemia)
AGT(ser)  GAG(glu)  CTG(leu)  TAG   E90X      Beta-globin (β-thalassemia)
TTC(phe)  CAA(gln)  GAG(glu)  TAA   Q309X     Cholesteryl ester transfer protein
CTT(leu)  GGA(gly)  GAA(glu)  TGA   G542X     Cystic fibrosis
CAA(gln)  CGA(arg)  GCA(ala)  TGA   R553X     Cystic fibrosis
CAG(gln)  TGG(trp)  AGG(arg)  TGA   W1282X    Cystic fibrosis
ATA(ile)  TGG(trp)  AAA(lys)  TAG   W1316X    Cystic fibrosis
TCT(ser)  CAG(gln)  TTT(phe)  TAG   Q493X     Cystic fibrosis
AGC(ser)  CGA(arg)  GTC(val)  TGA   R1162X    Cystic fibrosis
ACA(thr)  TGG(trp)  AAC(asn)  TAG   W846X     Cystic fibrosis
ATG(met)  CGA(arg)  TCT(ser)  TGA   R1158X    Cystic fibrosis
GCA(ala)  TGC(cys)  CAA(gln)  TGA   C524X     Cystic fibrosis
GAG(glu)  CGA(arg)  GAA(glu)  TGA   nt2510    Dystrophin-DMD
AAG(lys)  GAA(glu)  CTT(leu)  TAA   nt3714    Dystrophin-DMD
GTC(ala)  GAG(glu)  AA        TAG   nt2522    Dystrophin-DMD
CCA(pro)  TGG(trp)  ACA(thr)  TAG   nt6002    Erythropoietin receptor (EPOR)
TGG(leu)  CGA(arg)  TTC(phe)  TGA   R-5X      Factor VIII (Haem-A)
TAT(tyr)  TGG(trp)  CAT(his)  TGA   W225X     Factor VIII (Haem-A)
CTA(leu)  CGA(arg)  ATG(met)  TGA   R336X     Factor VIII (Haem-A)
GTC(val)  CGA(arg)  TTT(phe)  TGA   R427X     Factor VIII (Haem-A)
AAC(asn)  CGA(arg)  AGC(ser)  TGA   R583X     Factor VIII (Haem-A)
TTG(leu)  CGA(arg)  CAG(gln)  TGA   R795X     Factor VIII (Haem-A)
AAT(asn)  CAG(gln)  AGC(ser)  TAG   Q1686X    Factor VIII (Haem-A)
ACA(thr)  CGA(arg)  CAC(his)  TGA   R1696X    Factor VIII (Haem-A)
ATT(ile)  CGA(arg)  TGG(trp)  TGA   R1941X    Factor VIII (Haem-A)
GTA(val)  CGA(arg)  AAA(lys)  TGA   R1966X    Factor VIII (Haem-A)
TAT(tyr)  CGA(arg)  GGA(gly)  TGA   R2116X    Factor VIII (Haem-A)
GCT(ala)  CGA(arg)  TAC(tyr)  TGA   R2147X    Factor VIII (Haem-A)
GCT(ala)  CGA(arg)  CTT(leu)  TGA   R2209X    Factor VIII (Haem-A)
CTT(leu)  CGA(arg)  ATT(ile)  TGA   R2307X    Factor VIII (Haem-A)
GTT(val)  CAA(gln)  GGG(gly)  TAA   nt6406    Factor IX (Haem-B)
GCA(ala)  CGA(arg)  GAA(glu)  TGA   nt6460    Factor IX (Haem-B)
TTT(phe)  GAA(glu)  AAC(asn)  TAA   nt64712   Factor IX (Haem-B)
TTT(phe)  TGG(trp)  AAG(lys)  TAG   nt6688    Factor IX (Haem-B)
AAG(lys)  CAG(gln)  TAT(tyr)  TAG   nt6693    Factor IX (Haem-B)
GAT(asp)  CAG(gln)  TGT(cys)  TAG   nt10400   Factor IX (Haem-B)
TGT(cys)  GAG(glu)  TCC(ser)  TAG   nt10406   Factor IX (Haem-B)
TGT(cys)  TGG(trp)  TGT(cys)  TGA   nt10468   Factor IX (Haem-B)
TGG(trp)  TGT(cys)  CCC(pro)  TGA   nt10471   Factor IX (Haem-B)
AGA(arg)  TGC(cys)  GAG(glu)  TGA   nt17700   Factor IX (Haem-B)
TAT(tyr)  CGA(arg)  CTT(leu)  TGA   nt17761   Factor IX (Haem-B)
ACC(thr)  CAA(gln)  TCA(ser)  TAA   nt20497   Factor IX (Haem-B)
GGT(gly)  CAA(gln)  TCC(phe)  TAA   nt2055    Factor IX (Haem-B)
CCT(pro)  TGG(trp)  CAG(gln)  TAG   nt20561   Factor IX (Haem-B)
`
`
Table 1. (continued).

5' codon  Affected  3' codon  Stop  Site      Gene or disease
          codon               codon
CCT(pro)  TGG(trp)  CAG(gln)  TGA   nt20562   Factor IX (Haem-B)
TGG(trp)  CAG(gln)  GTA(val)  TAG   nt20563   Factor IX (Haem-B)
TGT(cys)  GGA(gly)  GGC(gly)  TGA   nt30072   Factor IX (Haem-B)
AAT(asn)  GAA(glu)  AAA(lys)  TAA   nt30090   Factor IX (Haem-B)
AAA(lys)  TGG(trp)  ATT(ile)  TAG   nt30097   Factor IX (Haem-B)
AAG(lys)  CGA(arg)  AAT(asn)  TGA   nt30863   Factor IX (Haem-B)
ATT(ile)  CGA(arg)  ATT(ile)  TGA   nt30875   Factor IX (Haem-B)
AAG(lys)  GAA(glu)  TCA(tyr)  TAA   nt31001   Factor IX (Haem-B)
GGC(gly)  TAT(tyr)  GTA(val)  TAA   nt31039   Factor IX (Haem-B)
GGC(gly)  TGG(trp)  GGA(gly)  TGA   nt31051   Factor IX (Haem-B)
CTT(leu)  CAG(gln)  TAC(tyr)  TAG   nt31091   Factor IX (Haem-B)
CAG(gln)  TAC(tyr)  CTT(leu)  TAG   nt31096   Factor IX (Haem-B)
GAC(asp)  CGA(arg)  GCC(ala)  TGA   nt31118   Factor IX (Haem-B)
ACA(thr)  TGT(cys)  CTT(leu)  TGA   nt31129   Factor IX (Haem-B)
CTT(leu)  CGA(arg)  TCT(ser)  TGA   nt31133   Factor IX (Haem-B)
GAT(asp)  TCA(ser)  TGT(cys)  TGA   nt31200   Factor IX (Haem-B)
GAA(gln)  GGA(gly)  GAT(asp)  TGA   nt31208   Factor IX (Haem-B)
TTC(phe)  TTA(leu)  ACT(thr)  TGA   nt31257   Factor IX (Haem-B)
AGC(ser)  TGG(trp)  GGT(gly)  TGA   nt31276   Factor IX (Haem-B)
GAA(glu)  GAG(glu)  TGT(cys)  TAG   nt31283   Factor IX (Haem-B)
AAC(asn)  TGG(trp)  ATT(ile)  TGA   nt31342   Factor IX (Haem-B)
GAA(glu)  AAA(lys)  ACA(thr)  TAA   nt31352   Factor IX (Haem-B)
TCA(ser)  CGA(arg)  CTT(val)  TGA   R185X     Fanconi anemia-group C gene
AGT(ser)  TGG(trp)  GAT(asp)  TGA   nt5574    Fibrillin gene (Marfan syndrome)
GCC(ala)  TGC(cys)  ACC(thr)  TGA   C720X     Fructose intolerance-Aldolase B
TGG(trp)  GAA(glu)  AAG(lys)  TAA   E375X     α-L-Fucosidase (fucosidosis)
CCA(pro)  GAA(glu)  AAC(asn)  TAA   E357X     Fumarylacetoacetate hydrolase
TTG(leu)  GAA(glu)  CTG(leu)  TAA   E364X     Fumarylacetoacetate hydrolase
GAT(asp)  CGA(arg)  GGG(gly)  TGA   R359X     Glucocerebrosidase (Gaucher dis.)
CTG(leu)  CGA(arg)  GAC(asp)  TGA   R186X     Glucokinase-NID diabetes
GAC(asp)  GAG(glu)  AGC(ser)  TAG   E279X     Glucokinase
AGA(arg)  TGG(trp)  ACC(thr)  TGA   W343X     Glycoprotein Ib alpha
CTC(leu)  CGA(arg)  GGT(gly)  TGA   R137X     β-hexosaminidase A-Tay Sachs
TGG(trp)  CGA(arg)  GAG(glu)  TGA   R393X     β-hexosaminidase A-Tay Sachs
CCC(pro)  TGG(trp)  CCT(pro)  TGA   W26X      β-hexosaminidase A-Tay Sachs
GGG(gly)  TGG(trp)  AAT(asn)  TAG   W171X     Type II 3β-hydroxysteroid dehydrog.
ATC(ile)  GAA(glu)  AGG(arg)  TAA   exon2     Hypothyroidism TSH B subunit gene
CAG(gln)  TAC(tyr)  GTC(val)  TAG   Y64X      IDUA (Hurler syndrome)
CCA(pro)  CAG(gln)  CCG(pro)  TAG   Q310X     IDUA alpha-L-iduronidase
GTG(val)  CGA(arg)  ATC(ile)  TGA   R897X     Insulin receptor (leprechaunism)
ATT(ile)  CGA(arg)  GGA(gly)  TGA   R372X     Insulin receptor (leprechaunism)
CTT(leu)  CGA(arg)  G         TGA   R988X     Insulin receptor (diabetes)
AAC(asn)  CAG(gln)  AGT(ser)  TAG   Q672X     Insulin receptor (leprechaunism)
CTT(leu)  CGA(arg)  GAG(glu)  TGA   R1000X    Insulin receptor
GTG(val)  TAC(tyr)  CTT(leu)  TAG   Y167X     LDL receptor (hypercholesterolemia)
TTC(phe)  CAG(gln)  TGC(cys)  TAG   Q12X      LDL receptor
CTG(leu)  TGC(cys)  CTC(leu)  TGA   C660X     LDL receptor (hypercholesterolemia)
GTC(val)  TAC(tyr)  AAC(asn)  TAA   Y83X      Lecithin cholesterol acyltransferase
GGA(gly)  CAG(gln)  GAT(asp)  TGA   Q106X     Lipoprotein lipase
ATG(met)  TAT(tyr)  GAG(glu)  TAA   Y61X      Lipoprotein lipase
AAA(lys)  TGG(trp)  AAG(lys)  TGA   W382X     Lipoprotein lipase
AGT(ser)  TGG(trp)  GTG(val)  TAG   W64X      Lipoprotein lipase
AAG(lys)  TCA(ser)  GGC(gly)  TGA   S447X     Lipoprotein lipase
CTT(leu)  CGA(arg)  GAA(glu)  TGA   nt2746    OCRL-1 oculocerebrorenal synd. Lowe
CCC(pro)  TAT(tyr)  AAT(asn)  TAA   Y209X     Ornithine aminotransferase
TTA(leu)  TAC(tyr)  CCT(pro)  TAG   Y299X     Ornithine aminotransferase
CTT(leu)  CGA(arg)  GAG(glu)  TGA   R426X     Ornithine aminotransferase
GCT(ala)  CGA(arg)  GTG(val)  TGA   R141X     Ornithine transcarbamylase
GCT(ala)  CGA(arg)  GTG(val)  TGA   R109X     Ornithine transcarbamylase
CCT(pro)  CAG(gln)  CAT(his)  TAG   Q192X     p53 Squamous cell carcinoma
GCC(ala)  AAG(lys)  TCT(ser)  TAG   K120X     p53 Li Fraumeni syndrome
TTT(phe)  TGC(cys)  CAA(gln)  TGA   C135X     p53 Hepatocellular carcinoma
TTT(phe)  CGA(arg)  CAT(his)  TGA   R213X     p53 Ovarian carcinoma, gastric tumour
CCC(pro)  CAG(gln)  CCA(pro)  TAG   Q317X     p53 Esophageal carcinoma
TAT(tyr)  GAG(glu)  CCG(pro)  TAG   E221X     p53 Osteocarcinoma
CTG(leu)  GGA(gly)  CGA(arg)  TGA   Y226X     p53 Ovarian carcinoma
TGC(cys)  CAA(gln)  CTG(leu)  TAA   Q136X     p53 Esophageal carcinoma
CAC(his)  GAG(glu)  CTG(leu)  TAG   E298X     p53 Hepatocellular carcinoma
GAG(glu)  GAA(glu)  GAG(glu)  TAA   E286X     p53 Esophageal carcinoma
TTC(phe)  CGA(arg)  GAG(glu)  TGA   R342X     p53 Breast cancer
GTG(val)  GAA(glu)  GGA(gly)  TAA   E198X     p53 Hepatocellular carcinoma
ATC(ile)  CGA(arg)  GTG(val)  TGA   R196X     p53 Fibrous histiocytoma
CCT(pro)  GAG(glu)  GTT(val)  TAG   E224X     p53 Ovarian carcinoma
CTG(leu)  TGG(trp)  GTT(val)  TGA   W146X     p53 Esophageal carcinoma
AAG(lys)  CAG(gln)  TCA(ser)  TAG   Q165X     p53 Esophageal carcinoma
`
`
Table 1. (continued).

5' codon  Affected  3' codon  Stop  Site      Gene or disease
          codon               codon
GAG(glu)  TAT(tyr)  TTG(leu)  TAG   Y205X     p53 Ovarian carcinoma
TTC(phe)  CGA(arg)  GTC(val)  TGA   R261X     Phenylalanine hydroxylase (PKU)
TTA(leu)  TCA(ser)  GAG(glu)  TGA   S359X     Phenylalanine hydroxylase (PKU)
TCA(ser)  CGA(arg)  GAT(asp)  TGA   R111X     Phenylalanine hydroxylase (PKU)
CAT(his)  GGA(gly)  TCC(ser)  TGA   G272X     Phenylalanine hydroxylase (PKU)
TAC(tyr)  TGG(trp)  TTT(phe)  TAG   W326X     Phenylalanine hydroxylase (PKU)
CAG(gln)  TAC(tyr)  TGC(cys)  TAA   Y356X     Phenylalanine hydroxylase (PKU)
CTC(leu)  CGA(arg)  CCT(pro)  TGA   R243X     Phenylalanine hydroxylase (PKU)
CTT(leu)  CGA(arg)  GAT(asp)  TGA   R584X     Platelet glycoprotein IIb
GGC(gly)  TGG(trp)  CAC(his)  TAG   W198X     Porphobilinogen deaminase
?AC()     TAT(tyr)  GAG(glu)  TAA   Y145X     Prion protein
ACC(thr)  TGG(trp)  GGA(gly)  TAG   W29X      Protein C (PROC)
GCT(ala)  CGA(arg)  GGA(gly)  TGA   R732X     Procollagen II (COL2A1)
AAG(lys)  GAG(glu)  GTC(val)  TAG   E249X     Rhodopsin
AAG(lys)  CAG(gln)  CTG(leu)  TAG   nt687     SRY sex reversal
TTC(phe)  TGG(trp)  CCT(pro)  TAG   W406X     Steroid 21 hydroxylase
CTG(leu)  CAG(gln)  GAG(glu)  TAG   Q318X     Steroid 21 hydroxylase
CTC(leu)  CGA(arg)  GGA(gly)  TGA   R189X     Triosephosphate isomerase-anaemia
GGG(gly)  TCA(ser)  GTG(val)  TGA   S2213X    Tyrosine amino transferase
ATC(ile)  CGA(arg)  GTG(val)  TGA   R417X     Tyrosine amino transferase
ATC(ile)  CGA(arg)  GCC(ala)  TGA   R52X      Tyrosine amino transferase
GTC(val)  TGG(trp)  ATG(met)  TAG   W178X     Tyrosinase (oculocutaneous albinism)
?AC()     TGG(trp)  GCA(ala)  TGA   W71X      V2 receptor (X-linked NDI)
CTG(leu)  CAG(gln)  ATG(met)  TAG   Q119X     V2-Vasopressin receptor (diabetes)
AAG(lys)  TAC(tyr)  CGC(arg)  TAA   nt970     Vitamin D receptor (rickets)
TGC(cys)  CAG(gln)  TTC(phe)  TAG   Q149X     Vitamin D receptor (rickets)
GTC(val)  CGA(arg)  GTG(val)  TGA   R2535X    Von Willebrand Factor
CCC(pro)  CGA(arg)  GAG(glu)  TGA   R1659X    Von Willebrand type III
GAA(glu)  CGA(arg)  AGG(arg)  TGA   nt1084    WT1-tumour suppressor-Wilms tumour
CAG(gln)  CGA(arg)  AAG(lys)  TGA   -         WT1-tumour suppressor zinc finger 3
TCT(ser)  TAT(tyr)  CTT(leu)  TAA   Y116X     XP-A-Xeroderma pigmentosa
GTC(val)  CGA(arg)  CAG(gln)  TGA   R207X     XP-A-Xeroderma pigmentosa
CGG(arg)  CGA(arg)  GCA(ala)  TGA   R228X     XP-A-Xeroderma pigmentosa
AAC(asn)  CGA(arg)  GAA(glu)  TGA   R211X     XP-A-Xeroderma pigmentosa
`
Entries are sorted alphabetically according to the gene which has been mutated or the common name of the resulting disease. Where the 3' and 5' context are not
discernible from the paper describing the mutation they were determined from the published sequence or from the EMBL and GenBank databases held at Daresbury,
UK. Where the site of the mutation is known, this is indicated as either the number of the codon preceded by the altered amino acid (in single letter code), and
followed by X to indicate a terminator, or alternatively, as the nucleotide (nt) which has been mutated. This list can be supplied annotated with references, on request
to RM, by electronic mail or on receipt of an IBM type disc. The list in Table 1 is not exhaustive. Others have independently published, and are constantly updating,
a database of 880 single base pair substitutions which give rise to human genetic disease (14). A fraction of these will be mutations to stop codons. That database
does not however include information on the full 5' and 3' codon contexts.
`
`suppression can vary by as much as an order of magnitude
`between the most efficient and the least efficient 3' contexts
`(Phillips-Jones, Hill, Atkinson and Martin: In Preparation). This
pattern of context effects in human cells is quite different to that
which operates in E. coli (6,8). There are also significant
differences in the efficiency of suppressors for UAG, UAA
and UGA codons (9). The successful application of nonsense
suppressor tRNAs as agents for human gene therapy might
`therefore depend on both the proportions of UAG, UAA and
`UGA codons, and the spectrum of 3' codon contexts, amongst
`nonsense mutations that give rise to human genetic disease.
`Moreover, the likelihood that suppressor tRNAs would give rise
`to detrimental effects by reading through natural termination
`codons, will be determined by the differential distribution of
`nonsense codons favourable for suppression, between the
`population of nonsense mutations, and the population of natural
`stop codons.
`Given the number of nonsense mutations which have been
`described in human genes since the original proposal (1), we
believe it is now possible to review the pattern of mutations giving
`rise to premature translational termination, with an eye to the
`potential use of nonsense suppressors as agents of somatic gene
`
`therapy. In this communication, we have surveyed the literature
`for reports of point mutations which lead to nonsense codons in
`human genes, and compared the distribution of the three
`termination signals and their 3' contexts, with that of natural stop
`codons.
`
`RESULTS
`The spectrum of mutations to nonsense codons in human
`genetic disease
`A total of 179 unique point mutations to nonsense codons were
`identified in human genes from a search of literature reports in
a CD-ROM data base. Of these, 21 were either germ line or
somatic cell mutations in the tumour suppressor genes p53 and
APC. The mutational events we identified are listed in Table 1.
The affected codon and the encoded amino acid are given for
the site of the mutation, and its 5' and 3' neighbours. Genes
are sorted alphabetically according to the most commonly used
name for either the gene product, or the genetic disease. This
list can be supplied, annotated with references, on request to RM,
by electronic mail or on receipt of an IBM type disc.
`
`
`
`
Table 2. The distribution of point mutations amongst codons with the potential to mutate to UAG, UAA
or UGA stop codons in human genetic disease.

Stop  Nucleotide    Affected codon  Number  Base change  C-T
TAG   1st position  AAG Lys              3  A:T-T:A
                    CAG Gln             23  C:G-T:A      *
                    GAG Glu              9  G:C-T:A
      2nd position  TCG Ser              0
                    TGG Trp             13  G:C-A:T      *
                    TTG Leu              1  T:A-A:T
      3rd position  TAC Tyr              6  C:G-G:C
                    TAT Tyr              1  T:A-G:C
TAA   1st position  AAA Lys              1  A:T-T:A
                    CAA Gln             10  C:G-T:A      *
                    GAA Glu             14  G:C-T:A
      2nd position  TCA Ser              1  C:G-A:T
                    TTA Leu              1  T:A-A:T
      3rd position  TAC Tyr              3  C:G-A:T
                    TAT Tyr              5  T:A-A:T
TGA   1st position  AGA Arg              0
                    CGA Arg             55  C:G-T:A      *
                    GGA Gly              5  G:C-T:A
      2nd position  TCA Ser              4  C:G-G:C
                    TTA Leu              1  T:A-G:C
      3rd position  TGC Cys              5  C:G-A:T
                    TGG Trp             15  G:C-A:T      *
                    TGT Cys              3  T:A-A:T

Entries in Table 1 were scored for the codon affected and the base change involved in mutation to the
nonsense codon. Mutations arising from a C-T deamination are indicated by a *.
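The set of "affected codons" in Table 2 follows directly from the genetic code: they are the sense codons that lie one base substitution away from a stop codon. A minimal sketch of that enumeration, assuming the standard genetic code and the DNA alphabet used in the tables (the helper name `sense_neighbours` is hypothetical):

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def sense_neighbours(stop):
    """Sense codons one substitution away from `stop`, keyed by codon position (1-3)."""
    result = {}
    for pos in range(3):
        for base in BASES:
            codon = stop[:pos] + base + stop[pos + 1:]
            # Skip the stop codon itself and any neighbour that is also a stop.
            if base != stop[pos] and CODON_TABLE[codon] != "*":
                result.setdefault(pos + 1, []).append(codon)
    return result


for stop in ("TAG", "TAA", "TGA"):
    print(stop, sense_neighbours(stop))
```

Neighbours that are themselves stop codons (for example TAA next to TAG) are excluded, which is why TAA has seven candidate codons while TAG and TGA have eight each.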
`
[Figure 1 graphic; axis categories UAG, UAA, UGA]
`
`Figure 1. The frequency with which UAG, UAA and UGA termination codons
`occur as human disease causing mutations compared with the frequency of UAG,
`UAA and UGA as natural stop codons. The frequency of termination codons
`produced by nonsense mutation was taken from Table 1. The frequency of naturally
`occurring stop codons was taken from a sample of 1422 genes kindly supplied
`by Paul Sharp and Andrew Lloyd.
`
Figure 1 illustrates the frequency of mutations to the three
termination codons amongst the mutant alleles listed in Table 1:
UAG (31%), UAA (18%) and UGA (51%). Figure 1 also shows
the frequency of natural UAG, UAA and UGA codons used to
terminate protein synthesis at the ends of human genes. In human
cells, natural termination codon usage divides UAG (23%), UAA
(30%) and UGA (47%) (10-12). Whilst UGA codons are the
most frequent stop in both populations, the frequency of UAA
terminators is greater for natural stops than amongst new
mutations. The reverse is true for UAG. Overall, the two patterns
are significantly different (χ² = 12.1, P = 0.002).
Table 2 shows the distribution amongst the possible base
changes at 1st, 2nd or 3rd codon positions which lead to the
creation of TAG, TAA and TGA mutations. TAG stops are
derived largely from CAG (Gln) and TGG (Trp) codons, TAA
mutations from CAA (Gln) and GAA (Glu), and TGA codons
originate predominantly from mutations in CGA (Arg) and TGG
(Trp). The C-T alteration far outweighs any other change which
is seen. This is particularly so for mutations to TGA, for which
the CGA (Arg) codon is especially susceptible. The reasons for
this are thought to be well understood (13,14). C-T transition
mutations are most likely caused by the spontaneous chemical
deamination of cytosine to give uracil. This leads to a U:G
mispair. U:G mispairs will become fixed as a C:G-T:A
mutation, if DNA replication precedes the detection and removal
of uracil by DNA uracil glycosylase. Where cytosine exists in
mammalian genomes as 5-methyl cytosine, in the doublet CpG,
cytosine deamination leads to a T:G mispair. The high rate of
mutation at these sites suggests that the T:G mispair is less readily
detected, or less faithfully repaired, than the U:G mispair.
Conversely, methylation of cytosine at the 5 position may elevate
the rate of spontaneous deamination.

The 3' codon context of mutations to nonsense codons in
human genetic disease
The distribution of 3' codon contexts amongst the 179 instances
of nonsense mutations is shown in Figure 2. The 3' codon context
found around natural termination codons is also displayed. The
patterns of 3' contexts amongst mutations to UAG and UAA are
not significantly different from the 3' bases flanking natural UAG
and UAA termination codons (χ² = 7.2, P = 0.066 and χ² =
0.072, P = 0.995 respectively). There is a significant difference
however between new mutations to UGA and natural stops (χ²
= 8.1, P = 0.043). There is a lower frequency of A, and a higher
representation of G 3' to natural UGA stop codons, than in new
mutations to UGA. There is no difference in the pattern of 3'
contexts between nonsense mutations and natural stops when
UAG, UAA and UGA are combined (χ² = 3.6, P = 0.303).

Figure 2. The 3' context of human disease causing nonsense mutations compared to the 3' context of natural stop codons. The 3' context of disease causing nonsense
mutations was taken from Table 1. The frequency of the 3' context of naturally occurring stop codons was calculated from a sample of 1422 genes kindly supplied
by Paul Sharp and Andrew Lloyd.
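The paper does not state how its χ² values were computed. As a hedged illustration, the stop codon comparison can be approached as a goodness-of-fit test of the mutation counts against the quoted natural usage (23%, 30%, 47%). The counts below are summed from Table 2 as read here (UAG 56, UAA 35, UGA 88), which is an assumption about the underlying data; the function name `chi_square_gof` is hypothetical.

```python
import math

# Mutations to each stop codon, summed from the Table 2 "Number" column.
observed = {"UAG": 56, "UAA": 35, "UGA": 88}
# Natural stop codon usage in human genes, as quoted in the text.
natural_share = {"UAG": 0.23, "UAA": 0.30, "UGA": 0.47}


def chi_square_gof(observed, expected_share):
    """Pearson chi-square statistic for observed counts against expected proportions."""
    total = sum(observed.values())
    return sum(
        (observed[k] - total * expected_share[k]) ** 2 / (total * expected_share[k])
        for k in observed
    )


chi2 = chi_square_gof(observed, natural_share)
# For 2 degrees of freedom the chi-square tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

Under these assumptions the statistic comes out close to the reported χ² = 12.1 with P near 0.002, which suggests (but does not prove) that a test of this kind was used. The 3' context comparisons involve four bases and so would have 3 degrees of freedom, for which the closed-form tail probability above no longer applies.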
`
`DISCUSSION
`We present in this paper a survey of mutations to nonsense codons
`which give rise to human somatic cell and germ line diseases.
`As early as 1982, it was suggested that gene therapy of this class
`of disease loci might be attempted with human tRNA genes
`mutated to recognise stop codons (1). Readthrough at the
`nonsense mutation, by the suppressor, will restore a proportion
`of wild type gene function. Given the rapid progress being made
`in the identification of different nonsense mutations in human
`genes, and recent findings on the determination of suppressor
`efficiencies, it seems an appropriate moment to describe the
patterns of mutation which occur and relate these to the possibility
of suppressor tRNA gene therapy. In particular, experiments with
reporter gene constructs have revealed differences in the
effectiveness of suppressors according to which of the three
`codons UAG, UAA or UGA is to be read, and also the contexts
`in which these termination signals lie (6,7,9). This survey reveals
`that nonsense mutations occur in an approximate ratio of 1:2:3,
`for UAA, UAG and UGA respectively. Studies with human
`nonsense suppressors (9) suggest that suppressor efficiency varies
UAG = UGA > UAA. The two most efficient suppressors can
`therefore recognise some 80% of nonsense mutations which lead
`to human genetic disease.
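The approximate 1:2:3 ratio and the "some 80%" figure can be checked with a few lines of arithmetic. This sketch uses the per-stop totals summed from Table 2 as read here (an assumption; the percentages quoted in the Results differ slightly from these totals).

```python
# Mutations to each stop codon, summed from the Table 2 "Number" column.
counts = {"UAA": 35, "UAG": 56, "UGA": 88}
total = sum(counts.values())

# Share of all nonsense mutations contributed by each stop codon.
shares = {stop: n / total for stop, n in counts.items()}
# Ratio of each stop codon relative to UAA, the rarest class.
ratio_to_uaa = {stop: n / counts["UAA"] for stop, n in counts.items()}
# Fraction of mutations readable by UAG and UGA suppressors together.
coverage = shares["UAG"] + shares["UGA"]

for stop in counts:
    print(f"{stop}: share {shares[stop]:.1%}, ratio to UAA {ratio_to_uaa[stop]:.1f}")
print(f"UAG + UGA coverage: {coverage:.1%}")
```

On these counts the ratio is nearer 1 : 1.6 : 2.5, which rounds to the 1:2:3 quoted in the text, and the combined UAG and UGA share is about four fifths of all mutations.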
`When a suppressor tRNA reads a stop codon, the amino acid
`which is inserted is determined by the identity of the tRNA whose
`anticodon was mutated to match the termination triplet. At some
`sites, it might not matter which amino acid is inserted, so long
as translation is restored for the full length of the gene. At
`other sites, it might be important to restore authentic, wild type
`gene product. In this case the suppressor has to insert the amino
`acid corresponding to the codon in the unmutated gene. Our
`analysis reveals that C:G-T:A transitions predominate in the
formation of stop codons. Trp, Gln and Glu codons are changed
`most frequently to UAG; Glu and Gln codons are changed most
`frequently to UAA; and overwhelmingly it is Arg and to a lesser
`extent Trp codons which give rise to UGA. To be widely
`applicable then, suppressor gene therapy would have to generate
efficient suppressors from Trp, Gln, Glu and Arg tRNAs. Studies
`on the determination of tRNA 'identity elements', have shown
`that those bases in a tRNA molecule which are responsible for
`binding to the correct aminoacyl-tRNA synthetase enzyme,
`sometimes lie in the anticodon loop (15). Thus, when nonsense
`suppressors are created by mutagenesis of bases in this region,
`the tRNA may be charged with a different amino acid. Upon
`translation of a nonsense codon, this restores a normal length
`protein, but one which contains an amino acid substitution. For
`example, in E.coli UAG nonsense suppressors derived from
tRNATrp are charged with Gln as well as Trp (16). Rapid
`advances are being made in this area. For bacterial tRNAs, it
`is now largely known for which tRNAs mutation to a nonsense
`suppressor gives rise to altered amino acid insertions (17).
Interestingly, site directed mutagenesis can be used to control
the extent of mischarging, and retain tRNA aminoacyl identity
(18). It should not be long before similar information is available
`for human tRNAs. Research with bacterial tRNAs, has also
`established that the strongest nonsense suppressors are formed
`by altering the anticodon of tRNAs which normally read codons
`beginning with U (19). Whilst it is anticipated that similar rules
`will apply to human tRNAs, little work has been carried out on
`this aspect.
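The choice of which tRNA identities to pursue can be read off Table 2 by tallying, for each stop codon, the amino acids whose codons were most often converted. The sketch below does this with the Table 2 counts as read here; the tuple layout and the helper name `top_amino_acids` are illustrative assumptions, not part of the original analysis.

```python
from collections import defaultdict

# (stop codon, amino acid of the affected codon, number of mutations) from Table 2.
TABLE2 = [
    ("TAG", "Lys", 3), ("TAG", "Gln", 23), ("TAG", "Glu", 9), ("TAG", "Ser", 0),
    ("TAG", "Trp", 13), ("TAG", "Leu", 1), ("TAG", "Tyr", 6), ("TAG", "Tyr", 1),
    ("TAA", "Lys", 1), ("TAA", "Gln", 10), ("TAA", "Glu", 14), ("TAA", "Ser", 1),
    ("TAA", "Leu", 1), ("TAA", "Tyr", 3), ("TAA", "Tyr", 5),
    ("TGA", "Arg", 0), ("TGA", "Arg", 55), ("TGA", "Gly", 5), ("TGA", "Ser", 4),
    ("TGA", "Leu", 1), ("TGA", "Cys", 5), ("TGA", "Trp", 15), ("TGA", "Cys", 3),
]


def top_amino_acids(stop, n):
    """Return the n amino acids most often lost to `stop`, most frequent first."""
    tally = defaultdict(int)
    for row_stop, amino_acid, number in TABLE2:
        if row_stop == stop:
            tally[amino_acid] += number
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(tally.items(), key=lambda item: (-item[1], item[0]))
    return [amino_acid for amino_acid, _ in ranked[:n]]


print("UAG:", top_amino_acids("TAG", 3))
print("UAA:", top_amino_acids("TAA", 2))
print("UGA:", top_amino_acids("TGA", 2))
```

The leading entries reproduce the amino acids named in the text: Gln, Trp and Glu for UAG, Glu and Gln for UAA, and Arg followed by Trp for UGA.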
`Recent studies from this laboratory have established that the
`3' codon context has a substantial effect on the efficiency of
`human UAG suppressor tRNAs in human cells (6,7). It seems
`likely that similar rules will apply to UGA codons (20). Our
`researches have shown that UAG codons flanked by 3' A are
`very inefficiently suppressed, whereas those followed by a 3'
`C or G are suppressed some five to ten fold more efficiently for
`a given concentration of tRNA. In prokaryote and lower
`eukaryote organisms it is believed that the choice between the
`three termination codons and their 3' codon contexts, is under
translational selection pressure (11,12,21,22). In contrast, we and
others have argued that in mammalian cells, 3' termination codon
`contexts are shaped by mutation, and not by selection for optimum
`performance (23). This contention is reinforced by the present
`study. Mutations to nonsense codons in human disease loci are
`found in a similar range of 3' contexts to that observed for natural
`stop codons. Nonsense mutations in the human genome are fairly
`evenly divided between 3' contexts of A, C, G or U. In general,
`3' G is most common and 3' U is least frequently observed. This
`distribution of bases matches very well the distribution observed
`3' to natural stop codons (23). These patterns are largely
`determined by the local G + C content of the human genome,
`which is known to consist of substantial blocks or 'isochores'
`of sequences which differ widely in their richness for G + C
`(24,25). Given that the proportions of UAG, UAA and UGA
`are similar for new mutations and natural stop codons, the balance
of probabilities is that termination codon choice is not subject
`to translational selection in human cells either.
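To make the context rule concrete, the sketch below orders a handful of Table 1 entries by the base immediately 3' of the new stop codon, using the C > G > U > A ranking reported for UAG suppression. Applying the same ranking to UGA sites is an assumption (the text only says similar rules seem likely), the ranking is ordinal rather than quantitative, and the names `CONTEXT_RANK` and `rank_by_context` are hypothetical.

```python
# Ordinal ranking of the base 3' to the stop codon: lower is more favourable.
CONTEXT_RANK = {"C": 0, "G": 1, "U": 2, "A": 3}


def rank_by_context(sites):
    """Sort (label, stop codon, 3' codon) entries from most to least favourable context."""
    def key(site):
        # Table 1 uses the DNA alphabet, so convert T to U before ranking.
        three_prime_base = site[2][0].replace("T", "U")
        return CONTEXT_RANK[three_prime_base]
    return sorted(sites, key=key)


# A few rows taken from Table 1: (site, stop codon, 3' codon).
examples = [
    ("Cystic fibrosis G542X", "TGA", "GAA"),
    ("Cystic fibrosis R553X", "TGA", "GCA"),
    ("Cystic fibrosis W1282X", "TGA", "AGG"),
    ("Beta-globin Q39X", "TAG", "AGG"),
    ("Beta-globin K17X", "TAG", "GTG"),
    ("p53 R213X", "TGA", "CAT"),
]

for label, stop, three_prime in rank_by_context(examples):
    print(f"{label}: {stop} followed by {three_prime}")
```

Entries followed by A, such as the beta-globin Q39X site, fall at the bottom of this ordering, which is where the five to ten fold lower suppression efficiency described above would be expected.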
`The findings of this study have important implications for
`assessing the likelihood that suppressor tRNAs will be detrimental
`to the physiology of the cell, if they cause readthrough at a
`significant number of natural termination codons. C-terminal
`extended species may be degraded prematurely, they may have
`reduced enzyme activities, or they could display codominant,
`negative properties in their interaction with other proteins. Even
`short C-terminal extensions can have serious consequences for
`some polypeptides. For example, mutations which eliminate the
natural stop codon of the α-globin gene give rise to a C-terminal
`extension of 31 amino acids. This causes a severe, dominant form
`of thalassemia (4). Of course, in the case of gene therapy by a
`suppressor tRNA, the level of the tRNA could be adjusted so
that readthrough at a natural stop codon may be as little as
`5-10%, if this concentration of suppressor proved sufficient to
`reverse the mutant phenotype. Readthrough of this intensity at
`natural termination codons, may not present so drastic an
`outcome, in the presence of 90-95% of correctly terminated
`polypeptide chains.
`This review of nonsense mutations and natural stop codons,
`suggests that both populations are similar in their proportions
`of UAG, UAA and UGA, and in the distributions of their 3'
`contexts. Where differences exist, these are in favour of
suppression therapy. UAG and UGA mutations account for 82%
`of human mutations to stop, whereas UAG and UGA comprise
`only 70% of natural termination codons. Contrary to some earlier
`suggestions (26), natural stop codons in human cells do not seem
`to be protected in any special way from translational readthrough
`by their immediate 3' contexts. Studies have shown that there
`is no significant evidence to support the widespread belief that
`multiple stop codons are employed by cells to provide a fail-safe
`mechanism for terminating protein synthesis (22,27). There are
`indications from E. coli though, that the nature of the C-terminal
`amino acids within the nascent polypeptide, can influence the
`efficiency of translational termination (28,29). Moreover, surveys
`of bacterial gene sequences have suggested preferences for certain
`amino acids at the C-terminus, which could reflect on the
`efficiency of stop decoding (11,30). If C-terminal amino acids
`are selected to improve the efficiency of translational termination
`in human cells, this could increase the specificity of nonsense
`suppressors for stop mutations over natural termination codons.
`However, this appears unlikely in the light of the studies which
`show that the counterparts to bacterial preferences in mRNA
`sequences relating to codon usage and 3' codon context effects,
`are missing in human cells (23,31).
`
`ACKNOWLEDGEMENTS
`JA is the recipient of an MRC postgraduate studentship. RM is
`supported by a Royal Society University Research Fellowship.
`The Krebs Institute is a SERC centre for molecular recognition.
`This work benefited from the use of the SEQUENET facility.
`
`