`
`Mutation in the DNA mismatch
repair gene homologue hMLH1
`is associated with hereditary
`non-polyposis colon cancer
`
C. Eric Bronner*, Sean M. Baker*,
Paul T. Morrison†, Gwynedd Warren†,
Leslie G. Smith*, Mary Kay Lescoe§,
Michael Kane‖, Christine Earabino†,
James Lipford‖, Annika Lindblom¶,
Pia Tannergård¶, Roni J. Bollag‡#,
Alan R. Godwin‡#, David C. Ward‡**,
Magnus Nordenskjöld¶, Richard Fishel§,
Richard Kolodner‖† & R. Michael Liskay*††
`
* Department of Molecular and Medical Genetics,
L103, Oregon Health Sciences University,
3181 S.W. Sam Jackson Park Road, Portland,
Oregon 97201-3098, USA
† Molecular Biology Core Facility, Dana Farber Cancer Institute,
Boston, Massachusetts 02115, USA
‡ Department of Genetics, and ** Molecular Biophysics and
Biochemistry, Yale University School of Medicine, New Haven,
Connecticut 06510, USA
§ Department of Microbiology and Molecular Genetics,
Markey Center for Molecular Genetics, University of Vermont,
Burlington, Vermont 05405, USA
‖ Division of Cellular and Molecular Biology,
Dana Farber Cancer Institute, and Department of Biological Chemistry
and Molecular Pharmacology, Harvard Medical School, Boston,
Massachusetts 02115, USA
¶ Department of Molecular Medicine, Karolinska Hospital,
S-171 76 Stockholm, Sweden
`
THE human DNA mismatch repair gene homologue, hMSH2, on
chromosome 2p is involved in hereditary non-polyposis colon
cancer (HNPCC)¹,². On the basis of linkage data, a second
HNPCC locus was assigned to chromosome 3p21-23 (ref. 3). Here
we report that a human gene encoding a protein, hMLH1 (human
MutL homologue), homologous to the bacterial DNA mismatch
repair protein MutL, is located on human chromosome 3p21.3-23.
We propose that hMLH1 is the HNPCC gene located on 3p
because of the similarity of the hMLH1 gene product to the yeast
DNA mismatch repair protein, MLH1 (refs 4, 5), the coincident location
of the hMLH1 gene and the HNPCC locus on chromosome 3,
and hMLH1 missense mutations in affected individuals from a
chromosome 3-linked HNPCC family.
An unidentified human gene located on chromosome 3p21-23
is involved in some HNPCC kindreds³. Examination of tumour
DNA from the chromosome 3-linked kindreds revealed dinucleotide
repeat instability similar to that observed for other
HNPCC families⁶ and several types of sporadic tumours⁷⁻¹⁰.
Because dinucleotide repeat instability is characteristic of a
defect in DNA mismatch repair⁵,¹¹,¹², HNPCC linked to chromosome
3p21-23 could result from a mutation in a second DNA
mismatch repair gene. Repair of mismatched DNA in Escherichia
coli requires a number of genes, including mutS, mutL and
mutH. Defects in any one of these genes result in a general
elevation of spontaneous mutation rates¹³. Studies with the yeast
Saccharomyces cerevisiae have identified three DNA mismatch
repair genes: a mutS homologue, MSH2 (ref. 14), and two mutL
homologues, PMS1 (refs 15-17) and MLH1 (ref. 4). Each of these three genes plays
`
†† To whom correspondence should be addressed.
# Present addresses: Institute for Molecular Medicine and Genetics, CB-2803 Sanders
Research and Education Building, Medical College of Georgia, Augusta, Georgia 30912, USA
(R.J.B.); Department of Human Genetics, Bldg 533, Suite 5400, University of Utah, Salt Lake
City, Utah 84112, USA (A.R.G.).
`
an indispensable role in DNA replication fidelity, including the
stabilization of dinucleotide repeats⁵. Because the DNA mismatch
repair gene homologue hMSH2 is the chromosome 2p
HNPCC gene¹,²,¹², other mismatch repair homologues, such as
those related to mutL, are logical candidates for the HNPCC
gene on chromosome 3.
To clone mammalian MLH genes, we used polymerase chain
reaction (PCR) techniques such as those used to identify the
yeast MSH1, MSH2 and MLH1 genes and the human MSH2
gene¹,²,⁴,¹⁴. As template in the PCR, we used complementary
DNA synthesized from poly(A)+-enriched RNA prepared from
cultured primary human fibroblasts. The degenerate oligonucleotides
we used were targeted at the amino-terminal amino-acid
sequences KELVEN and GFRGEA (single-letter code; Fig.
1), two of the most conserved regions of the MutL family of
proteins previously described for bacteria and yeast¹⁶,¹⁸,¹⁹. Two
PCR products of the expected size were identified, cloned,
sequenced and shown to encode a predicted amino-acid sequence
with homology to MutL-like proteins. These two fragments
generated by PCR were used to isolate human cDNA and genomic
DNA clones, one of which, hMLH1, is described here.
The hMLH1 cDNA nucleotide sequence shown in Fig. 1
encodes an open reading frame of 2,268 base pairs (bp). Figure 1
also shows the hMLH1 protein, which consists of 756 amino
acids and shares 41% identity with the protein product of the
yeast DNA mismatch repair gene, MLH1 (ref. 4). The regions of the
hMLH1 protein most similar to yeast MLH1 correspond to
amino acids 11-317, showing 55% identity, and the last 13 amino
acids, which are identical between the two proteins. Furthermore,
the predicted amino-acid sequences of the human and
mouse MLH1 proteins show at least 74% identity (data not
shown).
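The relationship between the quoted open-reading-frame length and the protein length can be checked with a short calculation. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the 2,268-bp figure counts coding codons without the stop codon.

```python
# Figures quoted in the text: a 2,268-bp open reading frame
# and a predicted protein of 756 amino acids.
ORF_BP = 2268
PROTEIN_AA = 756


def codons_in_orf(orf_bp: int) -> int:
    """Return the number of whole codons in an open reading frame.

    Assumes (hypothetically) that orf_bp excludes the stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 bp")
    return orf_bp // 3


if __name__ == "__main__":
    # Each codon is three nucleotides, so 2,268 bp should give 756 codons.
    print(codons_in_orf(ORF_BP), PROTEIN_AA)
```

Under that assumption the two numbers reported in the text are mutually consistent.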
As a first step to determine whether hMLH1 was a candidate
for the HNPCC locus on human chromosome 3p21-23 (ref. 3), we used
two separate hMLH1 genomic fragments to map the hMLH1
gene by fluorescence in situ hybridization (FISH)²⁰,²¹. Examination
of several metaphase chromosome spreads localized
hMLH1 to chromosome 3p21.3-23 (Fig. 2). As independent confirmation
of the location of hMLH1 on chromosome 3, we used
PCR with a pair of hMLH1-specific oligonucleotides and
Southern blotting with a hMLH1-specific probe to analyse DNA
from the NIGMS2 rodent/human cell panel (Coriell Institute
for Medical Research, Camden, New Jersey, USA). Results of
both techniques indicated chromosome 3 linkage (data not
shown). We also mapped the mouse MLH1 gene by FISH to
chromosome 9 band E (data not shown); this is a position of
synteny to human chromosome 3p (ref. 22). Therefore, the hMLH1
gene localizes to 3p21.3-23, within the genomic region implicated
in chromosome 3-linked HNPCC families³.
Next we analysed blood samples from affected and unaffected
individuals from two chromosome-3 candidate HNPCC
families³ for mutations. Family 1 showed significant linkage
(l.o.d. score = 3.01 at recombination fraction of 0) between
HNPCC and a marker on 3p. For family 2, the reported l.o.d.
score (1.02) was below the commonly accepted level of significance,
and thus only suggested linkage to the same marker on
3p. Linkage analysis of family 2 with the microsatellite marker
D3S1298 on 3p21.3 gave a more significant l.o.d. score of 1.88
at a recombination fraction of 0 (unpublished data). Initially,
we screened for mutations in two PCR-amplified exons of the
hMLH1 gene by direct DNA sequencing. In these two exons
from three affected individuals of family 1 there were no differences
from the expected sequence. In family 2, four individuals
affected with colon cancer are heterozygous for a C to T substitution
in an exon encoding amino acids 41-69, which corresponds
to a highly conserved region of the protein (Fig. 3). For one
affected individual, we screened PCR-amplified cDNA for additional
sequence differences. The combined sequence information
obtained from the two exons and cDNA of this one affected
individual represents 95% (all but the first 116 bp) of the open
`© 1994 Nature Publishing Group
`
`NATURE · VOL 368 · 17 MARCH 1994
`
`
reading frame. We observed no nucleotide changes other than
the C to T substitution. Also, four members of family 2, predicted
by linkage data to be carriers (data not shown), and as
yet unaffected, were heterozygous for the same C to T substitution.
Two of these predicted carriers are below and two are
`
[Fig. 1a. Nucleotide sequence of the hMLH1 cDNA (nucleotides 1-2,484) and predicted amino-acid sequence of the hMLH1 protein.]
`
`LETTERS TO NATURE
`
above the mean age of onset (50 yr) in this particular family.
Two unaffected individuals examined from this same family,
both predicted to be non-carriers (data not shown), showed the
expected normal sequence at this position. Linkage analysis that
utilizes the C to T substitution in family 2 gives a l.o.d. score of
2.23 at a recombination fraction of 0. Using low-stringency cancer
diagnostic criteria, we calculated a l.o.d. score of 2.53. These
data indicate that the C to T substitution shows significant linkage
to the HNPCC in family 2.
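For orientation, the l.o.d. score is the base-10 logarithm of the ratio between the likelihood of the data at a given recombination fraction and the likelihood under no linkage (recombination fraction 0.5). The sketch below uses the simplest textbook case, phase-known and fully informative meioses, which is an assumption made here for illustration; the scores quoted in the text come from pedigree analysis with gene frequencies and diagnostic criteria, which this function does not model. The function name and the example counts are hypothetical.

```python
import math


def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point l.o.d. score for phase-known, fully informative meioses.

    Z(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where n is the number of meioses and k the number of recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("recombinants must lie between 0 and n_meioses")
    linked = (theta ** n_recombinants) * (
        (1.0 - theta) ** (n_meioses - n_recombinants)
    )
    unlinked = 0.5 ** n_meioses
    if linked == 0.0:
        # A recombinant is impossible at theta = 0, so linkage is excluded.
        return float("-inf")
    return math.log10(linked / unlinked)


if __name__ == "__main__":
    # Illustrative counts only: ten non-recombinant meioses at theta = 0.
    print(round(lod_score(10, 0, 0.0), 2))
```

In this simplified model each non-recombinant meiosis contributes log10(2), about 0.30, at a recombination fraction of 0, so roughly ten such meioses are needed to reach the conventional threshold of 3.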
To determine whether this C to T substitution was a polymorphism,
we sequenced the same exon amplified from the genomic
DNA from 48 unrelated individuals and observed only the
normal sequence. We have examined an additional 26 unrelated
individuals using allele-specific oligonucleotide (ASO) hybridization
analysis²³. None of these 74 unrelated individuals carry
the C to T substitution. Therefore, the C to T substitution
observed in family 2 individuals is not likely to be a polymorphism.
We did not detect this same C to T substitution in affected
individuals from the second chromosome 3-linked family, family
1 (ref. 3). We are continuing to study family 1 for mutations in hMLH1.
`
FIG. 1 Structure of human MLH1. a, The nucleotide sequence of the
hMLH1 cDNA and the amino-acid sequence of the protein predicted to
be encoded by the hMLH1 cDNA. The underlined DNA sequences are
the regions of cDNA that correspond to the degenerate PCR primers
that were originally used to amplify a portion of the MLH1 gene (nucleotides
118-135 and 343-359). b, Alignment of the predicted human
MLH1 and S. cerevisiae MLH1 protein sequences. Amino-acid identities
are indicated by boxes, and gaps are indicated by dashes. c, Phylogenetic
tree of MutL-related proteins. The phylogenetic tree was constructed
using the predicted amino-acid sequences of seven MutL-related
proteins: human MLH1; mouse MLH1; S. cerevisiae MLH1; S.
cerevisiae PMS1; E. coli MutL; S. typhimurium MutL and S. pneumoniae
HexB. The required sequences were obtained from GenBank release
7.3.
METHODS. Human MutL-related sequences were amplified by PCR
using degenerate oligonucleotide primers
(5'-CTTGATTCTAGAGC(T/C)TCNCCNC(T/G)(A/G)AANCC-3' and
5'-AGGTCGGAGCTCAA(A/G)GA(A/G)(T/C)TNGTNGANAA-3').
The template for PCR was double-stranded cDNA
synthesized from cultured primary human fibroblast poly(A)+ mRNA. PCR
was in 50 µl containing cDNA template, 1.0 mM each primer, 5 IU of
Taq polymerase (Cetus), 50 mM KCl, 10 mM Tris buffer pH 7.5 and
1.5 mM MgCl₂. PCR was for 35 cycles of 1 min at 94 °C, 1 min at 43 °C
and 1.5 min at 62 °C. Fragments of the expected size, ~212 bp, were
cloned into pUC19 and sequenced. The cloned MLH1 PCR product was
labelled with a random primer labelling kit (RadPrime, Gibco BRL) and
used to probe human cDNA and genomic cosmid libraries by standard
procedures. DNA sequencing of double-stranded plasmid DNAs was
as previously described¹. DNA sequence alignments and contiguous
sequences were constructed using Sequencher 2.0.10 (Gene Codes
Corporation, Ann Arbor, Michigan, USA). The pairwise protein sequence
alignment was with DNAStar MegAlign (DNASTAR Inc., Madison,
Wisconsin, USA) using the clustal method³⁰. Pairwise alignment parameters
were ktuple of 1, gap penalty of 3, window of 5 and diagonals of
5. The phylogenetic tree was generated with the PILEUP program of the
Genetics Computer Group software using a gap penalty of 3 and a
length penalty of 0.1. The reported DNA sequence has been submitted
to GenBank (accession number U07343).
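The degenerate primers in the legend are written with alternatives in parentheses, such as (T/C), and with N for any of the four bases. As a rough illustration of what "degenerate" means here, the sketch below counts how many distinct oligonucleotides each primer pool contains. The parser and the variable names are hypothetical and handle only the notation used in the legend.

```python
import re


def degeneracy(primer: str) -> int:
    """Count the distinct sequences represented by a degenerate primer.

    Handles parenthesised alternatives such as (A/G) and the symbol N
    (any of A, C, G, T); every other position contributes a factor of 1.
    """
    primer = primer.replace(" ", "")
    count = 1
    pattern = r"\(([ACGT](?:/[ACGT])+)\)|([ACGTN])"
    for group, base in re.findall(pattern, primer):
        if group:
            # A parenthesised choice multiplies the pool by its size.
            count *= len(set(group.split("/")))
        elif base == "N":
            count *= 4
    return count


# Primer sequences as given in the Fig. 1 legend (5' to 3').
PRIMER_1 = "CTTGATTCTAGAGC(T/C)TCNCCNC(T/G)(A/G)AANCC"
PRIMER_2 = "AGGTCGGAGCTCAA(A/G)GA(A/G)(T/C)TNGTNGANAA"

if __name__ == "__main__":
    print(degeneracy(PRIMER_1), degeneracy(PRIMER_2))
```

Each primer contains three two-fold positions and three N positions, so each pool holds 2 × 2 × 2 × 4 × 4 × 4 = 512 sequences.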
`
[Fig. 1b. Alignment of the predicted human MLH1 (756 amino acids) and S. cerevisiae MLH1 (769 amino acids) protein sequences.]
`
`
FIG. 2 Human MLH1 localizes to 3p21.3-23 by
fluorescence in situ hybridization. a, A metaphase
spread showing hybridization of hMLH1 is
presented. Biotinylated hMLH1 genomic probes
were hybridized to banded human metaphase
chromosomes as previously described²⁰,²¹.
Detection was with fluorescein isothiocyanate
(FITC)-conjugated avidin (green signal); chromosomes,
shown in blue, were counterstained with
4',6-diamino-2-phenylindole (DAPI). Images were
obtained with a cooled CCD camera, enhanced,
pseudocoloured and merged with the following
programs: CCD Image Capture; NIH Image 1.4;
Adobe Photoshop and Genejoin Maxpix, respectively.
b, Composite of chromosome 3 from
multiple metaphase spreads aligned with a
human chromosome 3 ideogram. Region of
hybridization (distal portion of 3p21.3-3p23) is
indicated in the ideogram by a vertical bar.
`
FIG. 3 Detection of hMLH1 mutations in an HNPCC family showing
linkage to chromosome 3p. a, Sequence chromatograms showing
identification of a C to T substitution that produces a non-conservative
amino-acid substitution at position 44 of the hMLH1 protein. DNA
samples obtained from the blood of individuals of interest were used
as templates in PCR with primers (5'-GAAAGGTCCTGACTCTTCC-3' and
5'-ATGTACATTAGAGTAGTTGC-3') that amplify the exon encoding amino
acids 40-69. The PCR products were prepared, purified and analysed
using an Applied Biosystems 377 sequencer as previously described¹.
Sequence analysis of one unaffected (top, plus and minus strands) and
one affected individual (lower, plus and minus strands) is presented.
The position of the heterozygous nucleotide is indicated by an arrow.
Analysis of the sequence chromatographs indicated that there is
sufficient T signal in the C peak and enough A signal in the G peak for the
affected individuals to be heterozygous at this site. Analysis of linkage of
the C to T substitution in family 2 used standard methods³. Gene frequency
is estimated at 0.001. Allele frequencies are assumed to be
equal. All individuals with the mutation (all 4 affecteds, and 4 unaffecteds)
share the same haplotype with markers on 3p21.3 (D3S643/
D3S1478, D3S1298, D3S1029, Not73, D3S1100, D3S966, D3S1007).
The markers have been mapped to this chromosomal
region with human-chromosome-3-mouse
hybrids and haplotype analysis in two chromosome
3-linked families (unpublished data). The
mean age of onset of colon cancer in family 2
(50 yr) is higher than in family 1 and other HNPCC
families²⁵,²⁶. b, Amino-acid sequence alignments
of the highly conserved region of the MLH family
of proteins surrounding the site of the predicted
amino-acid substitution. Bold type indicates
position of the predicted serine to phenylalanine
amino-acid substitution in affected individuals,
and the serine or alanine residues conserved at
this position in MutL-like proteins. Dots indicate positions of highest
amino-acid conservation. For the mouse MLH1 protein the dots indicate
that the sequence has not been obtained. Sequences were aligned as
described for the phylogenetic tree in Fig. 1. For determination of the
likely secondary structures for the normal primary sequence versus the
altered sequence in affected individuals from family 2, we used the
Protean secondary structure module in the DNAStar program (DNASTAR
Inc.). The Chou-Fasman parameters were set at 103-alpha region
threshold and 105-beta region threshold. Garnier-Robson calculated
alpha and beta decision constants were both set to 0.

[Fig. 3 panels. a, Sequence chromatograms. b, Alignment rows, top to bottom: human MLH1 affected; human MLH1 normal; mouse MLH1; S. cerevisiae MLH1; S. cerevisiae PMS1; E. coli MutL; S. typhimurium MutL; S. pneumoniae HexB.]
`
`
We suggest that the observed C to T substitution in the coding
region of hMLH1 represents the mutation that is the basis for
HNPCC in family 2 (ref. 3). DNA sequence and ASO analyses did not
detect the C to T substitution in 74 unrelated individuals. Thus
the C to T substitution is not simply a polymorphism. The
observed C to T substitution is expected to produce a serine to
phenylalanine change at position 44 (Fig. 3). This amino-acid
substitution is a non-conservative change in a conserved region
of the protein (Figs 1 and 3). Secondary structure predictions
using Chou-Fasman parameters (Fig. 3 legend) suggest a
helix-turn-beta-sheet structure with position 44 located in the turn.
The observed Ser to Phe substitution at position 44 lowers the
prediction for this turn considerably, suggesting that the predicted
amino-acid substitution alters the conformation of the
hMLH1 protein. Therefore, we propose that hMLH1 represents
a second DNA mismatch repair gene that is involved in HNPCC.
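The serine to phenylalanine change can be illustrated with the standard genetic code. The sketch below translates codons 43-45 as read from the cDNA sequence in Fig. 1a (AAA TCC ACA, encoding Lys-Ser-Thr) before and after a C to T change. Placing the change at the second base of codon 44 is an inference made here, since that is the only single C to T change in TCC that yields phenylalanine; the text itself does not state the nucleotide position, and the names used are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Codons 43-45 as read from Fig. 1a; the mutant carries the inferred
# C to T change at the middle base of codon 44 (TCC to TTC).
NORMAL = "AAATCCACA"
MUTANT = "AAATTCACA"

if __name__ == "__main__":
    print(translate(NORMAL), translate(MUTANT))
```

The normal sequence translates to KST and the altered sequence to KFT, matching the serine to phenylalanine substitution at position 44 described in the text.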
At present, we have no direct evidence that the hMLH1 gene
is involved in the correction of DNA mispairs. In bacteria and
yeast, mutations affecting DNA mismatch repair cause increases
in the rate of spontaneous mutation, including additions and
deletions within dinucleotide repeats⁴,⁵,¹³,¹⁵,¹⁶,²⁴. In humans,
mutation of hMSH2 is the basis of chromosome-2 HNPCC¹,²,
tumours of which show microsatellite instability and an apparent
defect in mismatch repair¹². Chromosome 3-linked HNPCC is
also associated with instability of dinucleotide repeats³. Combined
with these observations, the high degree of conservation
between the human MLH1 protein and the yeast DNA mismatch
repair protein MLH1 suggests that hMLH1 is likely to function
in DNA mismatch repair. During the isolation of the hMLH1
gene, a second MLH gene, which does not map to chromosome
3, was identified and is predicted to encode a protein with strong
similarity to yeast PMS1 (our unpublished observations). This
suggests that mammalian DNA mismatch repair, like that in
yeast⁴, may require at least two MutL-like proteins.
We have described here a second DNA mismatch repair gene
homologue, hMLH1, which is likely to be the hereditary
non-polyposis colon cancer gene localized to chromosome 3p21-23 (ref. 3).
Like other HNPCC families²⁵,²⁶, chromosome 3p families show
apparent predisposition to several types of cancers³. The availability
of the hMLH1 and hMSH2 gene sequences will aid the
screening of HNPCC families for mutations in either gene. In
addition, although loss of heterozygosity (LOH) of linked markers
is not a feature of either the 2p or 3p forms of HNPCC³,⁶,
LOH involving the 3p21.3-23 region has been observed in several
human cancers²⁷⁻²⁹. Although speculative, this raises the possibility
that hMLH1 mutation may play some role in these
tumours. Finally, defects in additional DNA mismatch repair
genes may be the basis for the cancer in HNPCC families without
chromosome 2p or 3p involvement³.
`
`Received 8 February; accepted 28 February 1994.
`
1. Fishel, R. et al. Cell 75, 1027-1038 (1993).
2. Leach, F. et al. Cell 75, 1215-1225 (1993).
3. Lindblom, A., Tannergård, P., Werelius, B. & Nordenskjöld, M. Nature Genet. 5, 279-282 (1993).
4. Prolla, T. A., Christie, D.-M. & Liskay, R. M. Molec. cell. Biol. 14, 407-415 (1994).
5. Strand, M., Prolla, T. A., Liskay, R. M. & Petes, T. D. Nature 365, 274-276 (1993).
6. Aaltonen, L. A. et al. Science 260, 812-816 (1993).
7. Ionov, Y., Peinado, M. A., Malkhosyan, S., Shibata, D. & Perucho, M. Nature 363, 558-561 (1993).
8. Han, H.-J., Yanagisawa, A., Kato, Y., Park, J.-G. & Nakamura, Y. Cancer Res. 53, 5087-5089 (1993).
9. Risinger, J. I. et al. Cancer Res. 53, 5100-5103 (1993).
10. Thibodeau, S. N., Bren, G. & Schaid, D. Science 260, 816-819 (1993).
11. Levinson, G. & Gutman, G. A. Nucleic Acids Res. 15, 5323-5338 (1987).
12. Parsons, R. et al. Cell 75, 1227-1236 (1993).
13. Modrich, P. A. Rev. Genet. 25, 229-253 (1991).
14. Reenan, R. A. & Kolodner, R. D. Genetics 132, 963-973 (1992).
15. Williamson, M. S., Game, J. C. & Fogel, S. Genetics 110, 609-646 (1985).
16. Kramer, W., Kramer, B., Williamson, M. S. & Fogel, S. J. Bact. 171, 5339-5346 (1989).
17. Bishop, D. K., Anderson, J. & Kolodner, R. D. Proc. natn. Acad. Sci. U.S.A. 86, 3713-3717 (1989).
18. Mankovich, J. A., McIntyre, C. A. & Walker, G. C. J. Bact. 171, 5325-5331 (1989).
19. Prudhomme, M., Martin, B., Mejean, V. & Claverys, J. J. Bact. 171, 5332-5338 (1989).
20. Lichter, P. et al. Science 247, 64-69 (1990).
21. Boyle, A., Feltquate, D. M., Dracopoli, N., Housman, D. & Ward, D. C. Genomics 12, 106-115 (1992).
22. Lyon, M. F. & Kirby, M. C. Mouse Genome 91, 40-80 (1993).
23. Wu, D. Y., Nozari, G., Schold, M., Conner, B. J. & Wallace, R. B. DNA 8, 135-142 (1989).
24. Reenan, R. A. & Kolodner, R. D. Genetics 132, 975-985 (1992).
25. Bishop, T. D. & Thomas, H. Cancer Surv. 9, 585-604 (1990).
26. Lynch, H. T. et al. Gastroenterology 104, 1535-1549 (1993).
27. Latif, F. et al. Cancer Res. 52, 1451-1456 (1992).
28. Naylor, S. L., Johnson, B. E., Minna, J. D. & Sakaguchi, A. Y. Nature 329, 451-454 (1987).
29. Ali, I. U., Lidereau, R. & Callahan, R. J. natn. Cancer Inst. 81, 1815-1820 (1989).
30. Higgins, D., Bleasby, A. & Fuchs, R. Comput. Applic. Biosci. 8, 189-191 (1992).
`
ACKNOWLEDGEMENTS. We thank A. Harris, D.-M. Christie, M. Robatzek, K. Fish and T. Desai
for technical assistance, J. Garber, F. Li and S. Verselis for control DNA samples, M. Litt and
M. Forte for the NIGMS2 panel DNA, S. Friend for technical advice and T. Prolla for his comments
on the manuscript. C.E.B. was a recipient of an American Cancer Society fellowship and S.M.B.
received a fellowship from the C. G. Swebilius Fund. This work was supported by NIH research
grants to R.M.L., R.D.K., R.F. and D.C.W., a Cancer Center Core Grant to the D.F.C.I. from the
NIH, and grants from Bert von Kantzow's Fund and the Stockholm Cancer Society. C.E.B. and
S.M.B. contributed equally to this work.
`
`Destabilization of the
`complete protein secondary
`structure on binding
`to the chaperone GroEL
`
Ralph Zahn*, Claus Spitzfaden†, Marcel Ottiger†,
Kurt Wüthrich† & Andreas Plückthun*‡

* Max-Planck-Institut für Biochemie, Protein Engineering Group,
Am Klopferspitz, D-82152 Martinsried, Germany
† Institut für Molekularbiologie und Biophysik, ETH-Hönggerberg,
CH-8093 Zürich, Switzerland
`
PROTEIN folding in vivo is mediated by helper proteins, the molecular
chaperones¹⁻³, of which Hsp60 and its Escherichia coli variant
GroEL are some of the best characterized. GroEL is an oligomeric
protein with 14 subunits each of Mr 60K (refs 4-6), which possesses weak,
co-operative ATPase activity⁷⁻⁹ and high plasticity¹⁰. GroEL
seems to interact with non-native proteins, binding one or two
molecules per 14-mer¹¹⁻¹⁹ in a 'central cavity'²⁰, but little is known
about the conformational state of the bound polypeptides. Here
we use nuclear magnetic resonance techniques to show that the
interaction of the small protein cyclophilin²¹,²² with GroEL is
reversible by temperature changes, and all amide protons in
GroEL-bound cyclophilin are exchanged with the solvent, although
this exchange does not occur in free cyclophilin. The complete
secondary structure of cyclophilin must be disrupted when bound
to GroEL.
Exchange of cyclophilin amide protons (Fig. 1) in the presence
of GroEL was studied under conditions in which cyclophilin
could bind to GroEL, that is, at 30 °C and pH 6.0 (see below).
An equimolar mixture of ¹⁵N-labelled cyclophilin and GroEL
was heated in D₂O to 30 °C for 8 h to induce cyclophilin binding,
followed by cooling to 6 °C for 14 h, to induce protein-chaperone
dissociation (Fig. 2a). This cycle was repeated three times
to ensure that most cyclophilin molecules were bound at least
once even if the turnover of bound cyclophilin was slow (at
pH 6.0 and 30 °C, only 50% of cyclophilin is bound to GroEL
at equilibrium). Before the nuclear magnetic resonance (NMR)
experiments, cyclophilin and GroEL were separated by cation-exchange
chromatography in D₂O at 6 °C. The resulting cyclophilin
fractions were pooled and concentrated to a protein concentration
of 0.4 mM. A two-dimensional [¹⁵N,¹H]-correlation
([¹⁵N,¹H]-COSY) spectrum of this solution was completely
empty (not shown), demonstrating that all amide protons had
been exchanged with deuterium.
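The reasoning behind repeating the heating and cooling cycle three times can be made concrete with a simple probability estimate. The sketch below assumes, purely for illustration, that each cycle independently leaves a given cyclophilin molecule bound with probability 0.5 (the equilibrium fraction quoted in the text); the source does not state that the cycles are independent or that equilibrium is reached in each, and the function name is hypothetical.

```python
def fraction_bound_at_least_once(p_bound_per_cycle: float, n_cycles: int) -> float:
    """Fraction of molecules bound in at least one of n independent cycles."""
    if not 0.0 <= p_bound_per_cycle <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if n_cycles < 0:
        raise ValueError("number of cycles must be non-negative")
    # Complement of never being bound in any cycle.
    return 1.0 - (1.0 - p_bound_per_cycle) ** n_cycles


if __name__ == "__main__":
    # Assumed: 50% bound per cycle, three cycles as described in the text.
    print(fraction_bound_at_least_once(0.5, 3))
```

Under these assumptions about 87.5% of molecules would have been bound at least once after three cycles, compared with 50% after a single cycle.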
To show that the cyclophilin recovered in D₂O from the
GroEL-cyclophilin complex is in the native folded form, it was
`
‡ To whom correspondence should be addressed at: Biochem. Institut, Universität Zürich,
Winterthurerstr. 190, CH-8057 Zürich, Switzerland.
`