REVIEW
`
`Thioesterases: A new perspective based
`on their primary and tertiary structures
`
`David C. Cantu, Yingfei Chen, and Peter J. Reilly*
`
`Department of Chemical and Biological Engineering, Iowa State University, Ames, Iowa 50011
`
`Received 19 April 2010; Accepted 7 May 2010
`DOI: 10.1002/pro.417
`Published online 17 May 2010 proteinscience.org
`
`Abstract: Thioesterases (TEs) are classified into EC 3.1.2.1 through EC 3.1.2.27 based on their
`activities on different substrates, with many remaining unclassified (EC 3.1.2.–). Analysis of primary
`and tertiary structures of known TEs casts a new light on this enzyme group. We used strong
`primary sequence conservation based on experimentally proved proteins as the main criterion,
`followed by verification with tertiary structure superpositions, mechanisms, and catalytic residue
`positions, to accurately define TE families. At present, TEs fall into 23 families almost completely
`unrelated to each other by primary structure. It is assumed that all members of the same family
`have essentially the same tertiary structure; however, TEs in different families can have markedly
`different folds and mechanisms. Conversely, the latter sometimes have very similar tertiary
`structures and catalytic mechanisms despite being only slightly or not at all related by primary
`structure, indicating that they have common distant ancestors and can be grouped into clans. At
`present, four clans encompass 12 TE families. The new constantly updated ThYme (Thioester-
`active enzYmes) database contains TE primary and tertiary structures, classified into families and
`clans that are different from those currently found in the literature or in other databases. We
`review all types of TEs, including those cleaving CoA, ACP, glutathione, and other protein
`molecules, and we discuss their structures, functions, and mechanisms.
`
`Keywords: clan; primary structure; protein family; tertiary structure; thioesterases; ThYme
`
Additional Supporting Information may be found in the online version of this article.

Grant sponsor: U.S. National Science Foundation; Grant number: EEC-0813570.
*Correspondence to: Peter J. Reilly, Department of Chemical and Biological Engineering, 2114 Sweeney Hall, Iowa State University, Ames, IA 50011-2230. E-mail: reilly@iastate.edu.

Introduction
The thioesterases (TEs), or thioester hydrolases,
comprise a large enzyme group whose members hydrolyze
the thioester bond between a carbonyl group
and a sulfur atom. They are classified by the Nomenclature
Committee of the International Union of
`Biochemistry and Molecular Biology (NC-IUBMB)
`into EC (enzyme commission) 3.1.2.1 to EC 3.1.2.27,
`as well as EC 3.1.2.– for unclassified TEs.1 Sub-
`strates of 15 of these 27 groupings contain coenzyme
`A (CoA), two contain acyl carrier proteins (ACPs),
`four have glutathione or its derivatives, one has
`ubiquitin, and two contain other moieties. In addi-
`tion, three groupings have been deleted.
The EC classification system is based on enzyme
function and substrate identity, and it was first formulated
when very few amino acid sequences (primary
structures) and three-dimensional (tertiary)
structures of enzymes were available. Another way
to classify enzymes is by primary structure into families
and by tertiary structure into clans or superfamilies.
Some databases are built this way: Pfam2
has a collection of protein families and domains, and
SCOP3 classifies protein structures into classes,
folds, families, and superfamilies. Other databases
treat certain enzyme groups more specifically. For
instance, MEROPS4 is a major database for peptidases,
and CAZy5 covers carbohydrate-active enzymes.

Published by Wiley-Blackwell. © 2010 The Protein Society
PROTEIN SCIENCE 2010 VOL 19:1281–1295
Exhibit 2076

`It is common to observe that members of more
`than one EC grouping are found in one enzyme fam-
`ily based on similar amino acid sequences, implying
`that they have a common ancestor, mechanism, and
`tertiary structure. Conversely, members of a single
`EC grouping may be located in more than one
`enzyme family, being totally or almost totally unre-
`lated in primary structure and potentially in mecha-
`nism and tertiary structure.
`A further observation is that members of two
`different enzyme families may have very similar ter-
`tiary structures and mechanisms even though their
`primary structures are very different. This may
`imply that they are members of the same clan or
`superfamily, descended from a more distant common
`ancestor.
`In this work, TE primary and tertiary struc-
`tures will be analyzed to conclude how TEs are di-
`vided (and united) into families and clans. Struc-
`tures, mechanisms, and catalytic
`residues are
`compared between families and clans. We compare
`our findings with existing databases such as Pfam
`and SCOP. Results also appear in a new continu-
`ously updated database, ThYme (Thioester-active
`enzYmes, http://www.enzyme.cbirc.iastate.edu) that
`includes families and clans of enzyme groups that
`are part of the fatty acid synthesis cycle, TEs among
`them.
`
`Family identification
`Family members must have strong (>15%, but typi-
`cally >30%) sequence similarity and near-identical
`tertiary structures, and they must share general
`mechanisms as well as catalytic residues located in
`the same position.
In general, TE families were identified in the
following way: (1) experimentally confirmed TE
sequences were used as queries, (2) a series of successive
Basic Local Alignment Search Tool (BLAST)6
searches and comparison among results reduced
query sequences to a few representative ones, (3) the
catalytic domains of representative query sequences
were subjected to BLAST to populate the families,
(4) experimentally confirmed TEs were surveyed to
search for missing potential TE families, and (5) the
uniqueness of the families was confirmed by multiple
sequence alignments (MSAs), by tertiary structure
superposition and comparison, and by catalytic
residue positions. Methods are detailed in Supporting
Information.
`
`Clan identification
`Two or more families are grouped into a clan if all
`the sequences within them show some (<15%)
`sequence similarity, if their structures are strongly
`similar (narrowing the search to families with the
`same fold), and if they share similar active sites and
`general mechanisms. To consider all aspects of clan
`classification criteria, several methods are used to
`combine sequence and structural analysis. In addi-
`tion, catalytic mechanisms of members of each fam-
`ily were gathered from the literature, and positions
`of catalytic residues were determined to verify that
`they coincided. A more detailed description of these
`methods is found in the Supporting Information.
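
The family and clan criteria stated above can be read as a simple decision rule. The following Python sketch is only an illustration of that reading: the function name `classify_pair`, its boolean inputs, and the use of 15% as a hard cutoff are assumptions made here, not the authors' procedure, which combines BLAST, MSAs, structure superposition, and human judgment.

```python
def classify_pair(similarity_pct, same_fold, same_catalytic_positions, same_mechanism):
    """Return a tentative grouping for two enzyme sequences.

    Hypothetical helper: all inputs are assumed to come from earlier
    analysis steps (alignment, superposition, literature survey).
    """
    # Both families and clans require matching structure, active site, and mechanism.
    if not (same_fold and same_catalytic_positions and same_mechanism):
        return "not grouped"
    # Strong similarity (>15%, typically >30%) is the family criterion.
    if similarity_pct > 15.0:
        return "same family"
    # Weaker similarity with matching structure points to a shared clan.
    return "same clan"


print(classify_pair(35.0, True, True, True))   # same family
print(classify_pair(8.0, True, True, True))    # same clan
print(classify_pair(8.0, False, True, True))   # not grouped
```

In practice the boundary cases would be settled by inspection rather than by a fixed threshold, so the rule is best treated as a first-pass filter.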
`
`ThYme database
All the sequences in each family are displayed on
the ThYme database website (http://www.enzyme.cbirc.iastate.edu).
These sequences are taken, using
a series of scripts, from the BLAST results of the
catalytic domains of the representative query
sequences. Matching accessions, taxonomical data,
protein names, and EC numbers are taken from
UniProt7 and GenBank8 databases. Each TE family
`is shown on a page where sequences are arranged
`into archaea, bacteria, and eukaryota, then alpha-
`betically by species. In each row, a single sequence
`or group of sequences with 100% identical catalytic
`domains are shown with their protein name and
`UniProt and/or GenBank accession codes. EC num-
`bers are shown only when they appear in a sequen-
`ce’s UniProt or GenBank annotation. If a crystal
`structure is known, the Protein Data Bank (PDB,
`http://www.rcsb.org) accession code also appears.
`ThYme will be continuously updated: the content of
`each family will grow as GenBank, UniProt, and
`PDB do. However, to create a new family, or to
`merge or delete existing ones, human judgment and
`manual changes will be necessary.
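
The row layout described above (identical catalytic domains share a row; rows are ordered by domain of life, then alphabetically by species) can be sketched as follows. This is not the ThYme code: the record fields, the accession strings, and the choice to merge identical catalytic domains only within one species are assumptions made for illustration.

```python
from collections import defaultdict

# Display order stated in the text: archaea, bacteria, eukaryota.
DOMAIN_ORDER = {"archaea": 0, "bacteria": 1, "eukaryota": 2}


def build_family_rows(records):
    """Group records with identical catalytic domains, then order the rows."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["domain"], rec["species"], rec["catalytic_domain"])
        groups[key].append(rec["accession"])
    rows = [
        {"domain": domain, "species": species, "accessions": sorted(accessions)}
        for (domain, species, _sequence), accessions in groups.items()
    ]
    rows.sort(key=lambda r: (DOMAIN_ORDER[r["domain"]], r["species"].lower(), r["accessions"]))
    return rows


# Hypothetical records; accessions and sequences are placeholders.
records = [
    {"accession": "X3", "domain": "eukaryota", "species": "Homo sapiens", "catalytic_domain": "MKV"},
    {"accession": "X2", "domain": "bacteria", "species": "Escherichia coli", "catalytic_domain": "MAA"},
    {"accession": "X1", "domain": "bacteria", "species": "Escherichia coli", "catalytic_domain": "MAA"},
    {"accession": "X4", "domain": "archaea", "species": "Sulfolobus solfataricus", "catalytic_domain": "MTT"},
]
rows = build_family_rows(records)
for row in rows:
    print(row["domain"], row["species"], row["accessions"])
```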
`
`Thioesterase families
`Use of BLAST with TE query sequences followed by
`construction of MSAs and superposition of tertiary
`structures yielded 23 families almost completely
`unrelated by primary structure (Table I).
`Enzymes in families TE1–TE13 hydrolyze sub-
`strates with various acyl moieties and CoA, those
`in TE14–TE19 attack bonds between acyl groups
`and ACP, and those in TE20 and TE21 cleave the
`bonds between acyl groups and proteins. Members
`of TE22 and TE23 break bonds between acyl groups
`and glutathione and its derivatives (Table II). The
`sulfur-carrying moiety in CoA and ACP is a pante-
`thiene residue, whereas glutathione itself carries
`
Table I. Thioesterase Families and Common Names of their Members

Family | Producing organisms [a] | Genes and/or other names of family members
TE1 | A, B, E | Ach1
TE2 | A, B, E | Acot1–Acot6, BAAT thioesterase
TE3 | A, B | tesA, acyl-CoA thioesterase I, protease I, lysophospholipase L1
TE4 | B, E | tesB, acyl-CoA thioesterase II, Acot8
TE5 | B | tesC (ybaW), acyl-CoA thioesterase III
TE6 | A, B, E | Acot7 (BACH), Acot11 (BFIT, Them1), Acot12 (CACH), YciA
TE7 | B, E | Acot9, Acot10
TE8 | A, B, E | Acot13 (Them2)
TE9 | B | YbgC
TE10 | B | 4HBT-I
TE11 | B | 4HBT-II, EntH (YbdB)
TE12 | B, E | DNHA-CoA hydrolase
TE13 | A, B | paaI, paaD
TE14 | B, E | FatA, FatB
TE15 | B | Thioesterase CalE7
TE16 | A, B, E | TE domain of FAS (Thioesterase I), TE domain of PKS or NRP (type I thioesterase (TE I))
TE17 | B | TE domain of PKS
TE18 | B, E | Thioesterase II, type II thioesterase (TE II)
TE19 | B | luxD
TE20 | E | ppt1, ppt2, palmitoyl-protein thioesterase
TE21 | A, B, E | apt1, apt2, acyl-protein thioesterase, phospholipase, carboxylesterase
TE22 | A, B, E | S-formylglutathione hydrolase, esterase A, esterase D
TE23 | A, B, E | Hydroxyglutathione hydrolase, glyoxalase II

[a] A, archaea; B, bacteria; E, eukaryota. Most prevalent producers are bolded in the original table.
`
`the sulfur moiety, and in non-ACP proteins, the sul-
`fur-carrying moiety is built up mainly from a cyste-
`ine residue.
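
The assignment of families to thiol carriers given above is a lookup by family number. A minimal Python sketch, assuming family labels of the form "TE<number>" (the function and constant names are hypothetical):

```python
# Family-number ranges and carriers as stated in the text.
CARRIER_RANGES = [
    (range(1, 14), "CoA"),           # TE1-TE13
    (range(14, 20), "ACP"),          # TE14-TE19
    (range(20, 22), "protein"),      # TE20-TE21
    (range(22, 24), "glutathione"),  # TE22-TE23 (glutathione and its derivatives)
]


def carrier_for_family(family):
    """Return the sulfur-carrying moiety class for a label such as 'TE14'."""
    if not family.startswith("TE"):
        raise ValueError(f"unexpected family label: {family!r}")
    number = int(family[2:])
    for numbers, carrier in CARRIER_RANGES:
        if number in numbers:
            return carrier
    raise ValueError(f"no such TE family: {family!r}")


print(carrier_for_family("TE13"))  # CoA
print(carrier_for_family("TE19"))  # ACP
print(carrier_for_family("TE23"))  # glutathione
```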
`
`All tertiary structures within each family have
`almost identical cores and very strong overall resem-
`blance (Table III) shown by RMSDave values of <1.8
`
Table II. Thioesterase Functions and Substrate Specificities

Family | General function | EC number | Preferred substrate specificity (if known)
TE1 | Acyl-CoA hydrolase | 3.1.2.1, 2.8.3.– | Acetyl-CoA
TE2 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.2, 2.3.1.65 | Palmitoyl-CoA, bile-acid-CoA
TE3 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.20, 3.1.1.2, 3.1.1.5 | Medium- to long-chain acyl-CoA
TE4 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.2, 3.1.2.27 | Short- to long-chain acyl-CoA, palmitoyl-CoA, choloyl-CoA
TE5 | Acyl-CoA hydrolase | 3.1.2.– | Long-chain acyl-CoA, 3,5-tetradecadienoyl-CoA
TE6 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.1, 3.1.2.2, 3.1.2.18, 3.1.2.19, 3.1.2.20 | Short- to long-chain acyl-CoA, C4–C18
TE7 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.1, 3.1.2.2, 3.1.2.20 | Short- to long-chain acyl-CoA
TE8 | Acyl-CoA hydrolase | 3.1.2.– | Short- to long-chain acyl-CoA, C6–C18
TE9 | Acyl-CoA hydrolase | 3.1.2.–, 3.1.2.18 | Short- to long-chain acyl-CoA, 4-hydroxybenzoyl-CoA
TE10 | Acyl-CoA hydrolase | 3.1.2.23 | 4-Hydroxybenzoyl-CoA
TE11 | Acyl-CoA hydrolase | 3.1.2.– | 4-Hydroxybenzoyl-CoA
TE12 | Acyl-CoA hydrolase | 3.1.2.– | 1,4-Dihydroxy-2-naphthoyl-CoA
TE13 | Acyl-CoA hydrolase | 3.1.2.– | Short- and medium-chain acyl-CoA, several hydroxyphenylacetyl-CoA substrates
TE14 | Acyl-ACP hydrolase | 3.1.2.–, 3.1.2.14 | Short- to long-chain acyl-ACP, C8–C18
TE15 | Acyl-ACP hydrolase | – | –
TE16 | Acyl-ACP hydrolase | 3.1.2.14 [a] | Long-chain acyl-ACP, various polyketides and nonribosomal peptides
TE17 | Acyl-ACP hydrolase | 3.1.2.14 [b] | Several polyketides
TE18 | Acyl-ACP hydrolase | 3.1.2.–, 3.1.2.14 | Medium-chain acyl-ACP, various polyketides and nonribosomal peptides
TE19 | Acyl-ACP hydrolase | 2.3.1.– | Myristoyl-ACP
TE20 | Protein-palmitoyl hydrolase | 3.1.2.–, 3.1.2.22 | Palmitoyl-protein
TE21 | Protein-acyl hydrolase | 3.1.2.–, 3.1.1.1 | –
TE22 | Glutathione hydrolase | 3.1.2.12, 3.1.1.1, 3.1.1.6 | S-Formylglutathione
TE23 | Glutathione hydrolase | 3.1.2.6 | D-Lactoylglutathione

[a] TE domain. FASs, PKSs, and NRPs can have several EC numbers such as 2.3.1.85, 2.3.1.94, 2.3.1.–, 2.7.7.–, and 5.1.1.–.
[b] TE domain of PKSs.
`
Table III. Thioesterase Folds

Family | Fold | RMSDave (Å) | Pave (%) | PDB files
TE1 | NagB | 1.25 | 96.4 | 2G39, 2NVV
TE2 | α/β-Hydrolase | 1.00 | 96.6 | 3HLK, 3K2I
TE3 | Flavodoxin-like | 0.58 | 96.6 | 1IVN, 1J00, 1JRL, 1U8U, 1V2G, 3HP4
TE4 | HotDog | 0.90 | 33.3 | 1C8U, 1TBU
TE5 | HotDog | – | – | 1NJK
TE6 | HotDog | 1.39 | 75.9 | 3B7K, 2Q2B, 2V1O, 2QQ2, 1YLI, 3BJK, 3D6L
TE7 | HotDog | – | – | –
TE8 | HotDog | 0.58 | 88.3 | 2H4U, 3F5O, 2F0X, 2CY9
TE9 | HotDog | 1.19 | 88.8 | 2PZH, 1S5U, 3HM0, 1Z54
TE10 | HotDog | 0.67 | 97.1 | 1BVQ, 1LO7, 1LO8, 1LO9
TE11 | HotDog | 0.87 | 93.9 | 1Q4S, 1Q4T, 1Q4U, 1VH9, 2B6E, 1SC0, 3LZ7
TE12 | HotDog | – | – | 2VEU
TE13 | HotDog | 0.43 | 94.6 | 2FS2, 1PSU, 2DSL0, 1J1Y, 1WLU, 1WLV, 1WM6, 1WN3
TE14 | HotDog | 1.65 | 87.7 | 2OWN, 2ESS
TE15 | HotDog | – | – | 2W3X
TE16 | α/β-Hydrolase | 1.51 | 66.9 | 2VZ8 [a], 2VZ9 [a], 2PX6, 1XKT, 2ROQ [b], 2CB9, 2CBG, 2VSQ, 1JMK
TE17 | α/β-Hydrolase | 1.67 | 82.4 | 1MO2, 1KEZ, 1MN6, 2H7X, 2H7Y, 2HFK, 2HFJ, 1MNA, 1MNQ
TE18 | α/β-Hydrolase | 0.83 | 97.2 | 3FLA, 3FLB, 2RON [b], 2K2Q [b]
TE19 | α/β-Hydrolase | – | – | 1THT
TE20 | α/β-Hydrolase | 1.41 | 91.2 | 1EH5, 3GRO, 1EI9, 1EXW, 1PJA
TE21 | α/β-Hydrolase | 0.82 | 96.7 | 1FJ2, 1AUO, 1AUR, 3CN7, 3CN9
TE22 | α/β-Hydrolase | 1.69 | 78.9 | 3FCX, 3C6B, 2UZ0, 1PV1, 3I6Y, 3E4D, 3LS2
TE23 | Lactamase | 1.67 | 78.5 | 2QED, 1XM8, 2P18, 2GCU, 2Q42, 1QH3, 1QH5, 2P1E

[a] 2VZ8 and 2VZ9 have TE domains in their FASTA format. Therefore, these were picked up by BLAST, but their PDB files do not include the TE domain, and they were not included in the RMSD calculation.
[b] NMR-resolved structures not included in RMSD calculation.
`
Å and Pave values of >75% (see Supporting Information
for definitions), with two exceptions. TE4 has a
Pave value of 33.3% because it has only two crystal
structures, of which one monomer (1C8U) is a double
HotDog, whereas another monomer (1TBU) is
incomplete with only a single HotDog. Similarly, in
TE16 the Pave value is 65.8% because the TE domain
of one structure (2VSQ) is smaller than the rest.
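
The two exceptions can be recovered directly from the Table III values. The Python sketch below applies the stated limits (RMSDave < 1.8 Å, Pave > 75%) to numbers transcribed from the table; the names `TABLE_III` and `find_exceptions` are illustrative only, and families without a reported value are skipped. Table III lists 66.9% for TE16, whereas the paragraph above quotes 65.8%; the sketch uses the table value.

```python
# (RMSDave in angstroms, Pave in percent) transcribed from Table III.
# None marks families for which the table reports no value.
TABLE_III = {
    "TE1": (1.25, 96.4), "TE2": (1.00, 96.6), "TE3": (0.58, 96.6),
    "TE4": (0.90, 33.3), "TE5": None, "TE6": (1.39, 75.9),
    "TE7": None, "TE8": (0.58, 88.3), "TE9": (1.19, 88.8),
    "TE10": (0.67, 97.1), "TE11": (0.87, 93.9), "TE12": None,
    "TE13": (0.43, 94.6), "TE14": (1.65, 87.7), "TE15": None,
    "TE16": (1.51, 66.9), "TE17": (1.67, 82.4), "TE18": (0.83, 97.2),
    "TE19": None, "TE20": (1.41, 91.2), "TE21": (0.82, 96.7),
    "TE22": (1.69, 78.9), "TE23": (1.67, 78.5),
}


def find_exceptions(table, rmsd_limit=1.8, p_limit=75.0):
    """List families that miss either the RMSDave or the Pave limit."""
    exceptions = []
    for family, values in table.items():
        if values is None:
            continue  # fewer than two usable structures, nothing to compare
        rmsd_ave, p_ave = values
        if rmsd_ave >= rmsd_limit or p_ave <= p_limit:
            exceptions.append(family)
    return exceptions


exceptions = find_exceptions(TABLE_III)
print(exceptions)  # ['TE4', 'TE16']
```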
`Of the families whose members hydrolyze acyl-
`CoAs, all have HotDog9,10 folds (Table III, Figs. 1
`and 2) except for TE1, TE2, and TE3. TE1 enzymes
`have NagB folds, and they have acetyl-CoA hydro-
`lase (EC 3.1.2.1) activity as well as acetate or succi-
`nate-CoA transferase (EC 2.8.3.–) activity. They are
`found mainly in bacteria and fungi, although they
`are also present in archaea. Enzymes coded by the
`acetyl-CoA hydrolase ACH1 gene from Saccharomy-
`ces cerevisiae are present in TE1.11 Fungal enzymes
`in this family are involved with acetate levels and
`CoA transfer in mitochondria.12
TE2 enzymes have α/β-hydrolase13 folds (Figs. 3
and 4). They are mainly found in eukaryotes (animals),
but they are also present in bacteria. They
have mostly palmitoyl-CoA hydrolase (EC 3.1.2.2) and bile
acid-CoA:amino acid N-acyltransferase (BAT) (EC
2.3.1.65) activities. The acyl-CoA TE (Acot) enzymes
ACOT1, ACOT2, ACOT4, and ACOT6 from Homo
sapiens are present in this family, as well as the
Acot1 through Acot6 enzymes from Mus musculus,
Rattus norvegicus, and similar species.14 Also in
TE2 are the BAAT TEs that transfer bile acid from
bile acid-CoA to amino acids in the liver; these conjugates
later solubilize fatty acids in the gastrointestinal
tract.15
`Enzymes in TE3 are part of the SGNH hydro-
`lase superfamily with a flavodoxin-like fold. They
`are mainly found in bacteria and have acyl-CoA hy-
`drolase (EC 3.1.2.20), arylesterase (EC 3.1.1.2), and
`lysophospholipase (EC 3.1.1.5) activities. Some TE3
`enzymes come from the tesA gene, and they are
`located in the periplasm and are involved in fatty
`acid synthesis.16 TE3 enzymes are also called acyl-
`CoA thioesterase I, protease I, and lysophospholi-
`pase L1, and the genes that code for them, tesA,
`apeA, and pldC, respectively, are nearly identical.17
`The rest of the acyl-CoA hydrolase families have
`HotDog folds. TE4 enzymes, present in bacteria and
`eukaryotes, are acyl-CoA hydrolases as well as palmi-
`toyl-CoA (EC 3.1.2.2) and choloyl-CoA (EC 3.1.2.27)
hydrolases. The Acot8 gene encodes peroxisomal
`TEs,18 which are found in TE4. Also in this family
`are acyl-CoA thioesterase II enzymes, encoded by the
`tesB gene, that can hydrolyze a broad range of me-
`dium- to long-chain acyl-CoA thioesters, but whose
`physiological function is not known.19
`TE5 acyl-CoA enzymes, also known as thioester-
`ase IIIs, are present in bacteria. They are encoded
`by the tesC (or ybaW) gene and are long-chain acyl-
`CoA TEs preferring 3,5-tetradecadienoyl-CoA as a
`substrate.20
`TE6 members, present in eukaryotes, bacteria,
`and archaea, have acyl-CoA hydrolase activities
`
`Figure 1. Superimposed tertiary structures of single representatives of each TE family in a clan: TE-A acyl-CoA hydrolases
`from Escherichia coli (TE5) (green), Helicobacter pylori (TE9) (red), Pseudomonas sp. (TE10) (yellow), and Prochlorococcus
`marinus (TE12) (blue).
`
`Figure 2. Superimposed tertiary structures of single representatives of each TE family in a clan: TE-B acyl-CoA hydrolases
`from Homo sapiens (TE8) (blue), Arthrobacter sp. (TE11) (red), and E. coli (TE13) (yellow).
`
`Figure 3. Superimposed tertiary structures of single representatives of each TE family in a clan: TE-C acyl-ACP hydrolases
`from Homo sapiens (TE16) (blue), Saccharopolyspora erythraea (TE17) (red), and Amycolatopsis mediterranei (TE18) (yellow).
`
`with various specificities. Acot enzymes 7, 11, and
`12, present in eukaryotes, are found in TE6. Acot7
`enzymes (also known as BACH: brain acyl-CoA hy-
`drolases) are expressed mainly in brain tissue and
`preferentially attack C8–C18 acyl-CoA chains.21
`Acot11 (also known as BFIT: brown fat inducible TE,
`or Them1: TE superfamily member 1) enzymes are
`specific toward medium- and long-chain acyl-CoA
`molecules, and they may be involved with obesity in
humans.22 Acot12 (also known as CACH: cytoplasmic
acyl-CoA hydrolase) enzymes in humans hydrolyze
acetyl-CoA.23 Many bacterial TE6 sequences are
`YciA TEs that hydrolyze a wide range of acyl-CoA
`thioesters and may help to form membranes.24 They
`preferentially attack butyryl, hexanoyl, lauroyl, and
`palmitoyl-CoA substrates.25
`TE7 enzymes are acyl-CoA TEs found in eukar-
`yota and bacteria. In this family are the Acot9 and
`Acot10 enzymes (previously known as MT-ACT48),
`which are expressed in the mitochondria and have
`short- to long-chain acyl-CoA TE activity, showing
`preference for C14 chains.26
`
`Figure 4. Superimposed tertiary structures of single representatives of each TE family in a clan: TE-D protein-acyl hydrolases
`from Bos taurus (TE20) (blue) and Homo sapiens (TE21) (yellow).
`
`Most TE8 members, mainly present in eukar-
`yota but also in bacteria, are acyl-CoA thioesterase
`13 (Acot13) enzymes, also known as TE superfamily
`member 2 (Them2). Enzymes in this family hydro-
`lyze short-to-long acyl-CoA (C4–C18) chains, prefer-
`ring the latter.27
`TE9 members are found only in bacteria, and
they have acyl-CoA hydrolase activity, mostly
unclassified (EC 3.1.2.–), but ADP-dependent short-chain
acyl-CoA hydrolases (EC 3.1.2.18) and
4-hydroxybenzoyl-CoA hydrolases (EC 3.1.2.23) are
also found. The YbgC TEs are found in this family;
`some hydrolyze primarily short-chain acyl-CoA thio-
`esters,28 whereas others prefer long-chain acyl-CoA
`thioesters.29 Also, the TE domain of methylketone
`synthase, MKS2, recently discovered in tomato, is
`found in TE9.30
The enzymes in TE10 and TE11 are found only
in bacteria, and most have 4-hydroxybenzoyl-CoA
hydrolase (EC 3.1.2.23) activity. They, along with
other enzymes, convert 4-chlorobenzoate to
4-hydroxybenzoate in soil-dwelling bacteria.31 Also in
TE11 are the EntH (YbdB) TEs, involved with enterobactin
(an iron chelator) biosynthesis in Escherichia
coli.32 This is a unique example of a HotDog-fold
enzyme involved in nonribosomal peptide
biosynthesis.
Most TE12 enzymes are 1,4-dihydroxy-2-naphthoyl
(DNHA)-CoA hydrolases, involved in vitamin
K1 biosynthesis,33 and they are found mostly in bacteria.
TE13 enzymes occur in archaea and bacteria.
Most are either PaaI or PaaD enzymes in the phenylacetic
acid degradation pathway, and they are
part of the paa gene cluster.34
`TE14–TE19 enzymes hydrolyze acyl-ACP thio-
`esters, with those in TE14 and TE15 having HotDog
folds, whereas the rest have α/β-hydrolase folds.
`TE14 enzymes are found in bacteria and plants;
`they have acyl-ACP hydrolase (EC 3.1.2.14) activity.
`Many plant enzymes in this family have been exper-
`imentally characterized: they contain FatA and FatB
`enzymes and can hydrolyze C8–C18 acyl-ACP thioest-
`ers.35 All TE14 bacterial sequences come from
`genomic or structural genomic studies.
`TE15 is a small family whose enzymes are pres-
`ent mainly in bacteria. Among them is the TE
`CalE7 involved with enediyne biosynthesis. After
substrate-ACP hydrolysis, these enzymes decarboxylate
the product before release.36 Enzymes in this
family are among the few TEs with HotDog domains
involved with polyketide biosynthesis.
`TE16 enzymes occur in both eukaryotes and
`bacteria, and they have oleoyl-ACP hydrolase (EC
`3.1.2.14) activity. They include the TE domains of
`fatty acid synthases (FASs), also known as Thioes-
`terase I, that terminate fatty acid synthesis,37 and
`the TE domain of polyketide synthases (PKSs) and
nonribosomal peptide synthases (NRPs), also known
as Type I thioesterases (TE I), that terminate polyketide
biosynthesis38 or nonribosomal peptide
biosynthesis.39 In the case of NRPs, instead of an ACP
as the carrier molecule, a peptidyl carrier protein
(PCP) is used.
`TE17 enzymes are only found in bacteria,
`mainly in Streptomyces. They are the TE domains of
`various PKSs. FASs, PKSs, and NRPs are large mul-
`timodular enzymes with many domains having dif-
`ferent functions. Only the TE domains were used to
`identify these family members.
`Enzymes in TE18 are present in eukaryotes and
`bacteria and mainly have oleoyl-ACP hydrolase (EC
3.1.2.14) activity. Some enzymes in this family are
S-acyl fatty acid synthetases/thioester hydrolases
(Thioesterase II).40 They work with FASs to produce
medium-chain (C8–C12) fatty acids in milk.41 The
Type II thioesterases (TE IIs) are found in TE18;
these enzymes play an important role in polyketide
and nonribosomal peptide biosynthesis by removing
aberrant acyl chains from multimodular polyketide
synthases and nonribosomal peptide synthases.42,43
TE18 enzymes are independent TEs, not integrated
into the multimodular FASs, PKSs, or NRPs.
`TE19 enzymes are classified as acyltransferases
`(EC 2.3.1.–), but they hydrolyze acyl-ACP molecules,
`mainly myristoyl-ACP.44 These enzymes divert fatty
`acids to the luminescent system in certain bacteria.
`TE20 members, found only in eukaryotes, are
`palmitoyl-protein TEs (EC 3.1.2.22) encoded by PPT
`genes. They hydrolyze the thioester bond between a
`palmitoyl group and a cysteine residue in proteins.45
`Mutations in PPT enzymes have been linked to neu-
`ronal ceroid lipofuscinosis, a genetic neurodegenera-
`tive disorder.46
TE21 enzymes were originally identified as
lysophospholipases,47 but they are also acyl-protein
APT1 TEs.48 They hydrolyze thioester bonds
between acyl chains and cysteine residues on proteins.
Many proteins in this family also have carboxylesterase
(EC 3.1.1.1) activity.
Among TE22 enzymes are S-formylglutathione
hydrolases (EC 3.1.2.12) catalyzing formaldehyde
detoxification; they hydrolyze S-formylglutathione
into formate and glutathione.49 Also in TE22 are
enzymes with acetyl esterase (EC 3.1.1.6) and carboxylesterase
(EC 3.1.1.1) activity.
`TE23 members are hydroxyglutathione hydro-
`lases (EC 3.1.2.6), also known as glyoxalase II
`enzymes, that hydrolyze S-D-lactoyl-glutathione to
`glutathione and lactic acid in methylglyoxal detoxifi-
`cation.50 TE23 enzymes occur in archaea, bacteria,
and eukaryotes and have a metallo-β-lactamase fold.51
`
`Correspondence to EC groupings
`These TE families bear rather limited resemblance
`to EC numbers representing TEs. For instance, ace-
`tyl-CoA hydrolases (3.1.2.1) occur in TE1, TE6, and
`
Table IV. Thioesterase Core Secondary Structure Elements

Fold | Clan | Family | Secondary structural element [a]
HotDog | TE-A | TE5 | β-α-β-β-β-β
HotDog | TE-A | TE9 | α-β-β-β-β
HotDog | TE-A | TE10 | β-α-β-β-β-β
HotDog | TE-A | TE12 | β-α-β-β-β-β
HotDog | TE-B | TE8 | β-β-α-β-β-β-β
HotDog | TE-B | TE11 | β-β-α-β-β-β-β
HotDog | TE-B | TE13 | β-β-α-β-β-β-β
HotDog | – | TE4 | α-β-β-β-β-β-β-α-β-β-β-β
HotDog | – | TE6 | β-α-β-β-β-β
HotDog | – | TE14 | β-α-β-β-β-β-β-α-β-β-β-β
HotDog | – | TE15 | β-α-β-β-β-β
α/β-Hydrolase | TE-C | TE16 | β-α-β-α-β-α-β-β-α-β-α
α/β-Hydrolase | TE-C | TE17 | β-α-β-α-β-α-β-β-α-β-α
α/β-Hydrolase | TE-C | TE18 | β-β-α-β-α-β-α-β-β-α-β-α
α/β-Hydrolase | TE-D | TE20 | β-α-β-α-α-β-α-β-α-β-α-β-α
α/β-Hydrolase | TE-D | TE21 | β-α-β-β-α-β-α-β-β-α-β-α
α/β-Hydrolase | – | TE2 | β-β-β-α-β-α-β-α-β-β-α-β-α
α/β-Hydrolase | – | TE19 | β-β-β-α-β-α-β-α-β-α-β-α-β-α-α
α/β-Hydrolase | – | TE22 | β-β-β-β-α-β-α-β-α-β-α-β-α-β-α

[a] α, α-helix; β, β-strand.
`
`TE7; palmitoyl-CoA hydrolases (EC 3.1.2.2) are
`found in TE2, TE4, TE6, and TE7; oleoyl-ACP hy-
`drolases (EC 3.1.2.14) occur in TE14 and TE16–
`TE18, and acyl-CoA hydrolases (EC 3.1.2.20) are
found in TE3, TE6, and TE7. Conversely, of the 24
EC numbers remaining after three deletions, only 11
(EC 3.1.2.1, 3.1.2.2, 3.1.2.6, 3.1.2.12, 3.1.2.14,
3.1.2.18, 3.1.2.19, 3.1.2.20, 3.1.2.22, 3.1.2.23, and
3.1.2.27), along with unclassified TEs
(EC 3.1.2.–), occur in significant numbers among the
`23 TE families. Of course, further EC numbers char-
`acteristic of TEs will likely appear as more TEs are
`sequenced and characterized.
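
The many-to-many relation between EC numbers and TE families described above can be made explicit by inverting the mapping. The Python sketch below uses only the four EC numbers and family lists quoted in this section; the variable and function names are illustrative.

```python
from collections import defaultdict

# EC number -> TE families, as listed in the text of this section.
EC_TO_FAMILIES = {
    "3.1.2.1": ["TE1", "TE6", "TE7"],
    "3.1.2.2": ["TE2", "TE4", "TE6", "TE7"],
    "3.1.2.14": ["TE14", "TE16", "TE17", "TE18"],
    "3.1.2.20": ["TE3", "TE6", "TE7"],
}


def families_to_ec(ec_to_families):
    """Invert the mapping so that each family lists its EC numbers."""
    result = defaultdict(list)
    for ec, families in ec_to_families.items():
        for family in families:
            result[family].append(ec)
    return dict(result)


family_to_ec = families_to_ec(EC_TO_FAMILIES)
# Families that carry more than one of the listed EC numbers.
multi = {f: ecs for f, ecs in family_to_ec.items() if len(ecs) > 1}
print(multi)
```

With these four entries, TE6 and TE7 each carry three EC numbers, while each EC number is spread over three or four families.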
`
`Other thioesterases
`Ubiquitin carboxyl-terminal hydrolases (EC 3.1.2.15)
`cleave a wide variety of products from the C-termi-
`nal glycine residue of ubiquitin. They were first
`identified as thiolesterases because they cleave dithi-
`othreitol from ubiquitin, and they were thought to
`also hydrolyze ubiquitin-glutathione and other ubiq-
`uitin thiolesters.52 It was later shown that they hy-
`drolyze amides and other groups from ubiquitin.53
`These enzymes belong to a larger class of peptidases
`called deubiquitinating enzymes that hydrolyze ly-
`sine-glycine amide bonds in ubiquitinated proteins.54
`
`Several families of these enzymes can be found in
`MEROPS, the peptidase database. We identified 11
`ubiquitin thiolesterase families by the methods
`described above, but we have not included them
`here or in the ThYme database, as peptidase activity
`is their main function, and they can be found in
`MEROPS.
`Certain acyl transferases (EC 2.3.1.–), for exam-
`ple, 2.3.1.9, 2.3.1.16, 2.3.1.38, and 2.3.1.39 among
`others, can hydrolyze acyl-CoA or acyl-ACP sub-
`strates and later join the liberated acyl group to
`another acyl-CoA or acyl-ACP molecule. Although
`they hydrolyze thioesters, this is not their main
`function, and therefore, we also decided not
`to
`include these enzymes here.
`
`Thioesterase clans
`TE families 4–6 and 8–15, all with members having
`HotDog crystal structures, were subjected to the
`methods described above and two clans were found:
`TE-A comprising families TE5, TE9, TE10, and
`TE12; and TE-B with TE8, TE11, and TE13.
`PSI–BLAST6 analysis suggested that TE5, TE9,
`TE10, and TE12 should be grouped into one clan
`and TE8, TE11, and TE13 into another, because
`slight sequence similarities among these families
were found. Secondary structure element analysis of
the structures pointed to TE5, TE6, TE10, TE12,
and TE15 (having five β-strands) being placed in one
clan and TE8, TE11, and TE13 (having six β-strands)
being placed in another (Table IV); visual
inspection suggested the same two groupings, with
the first also including TE9. All crystal structures in
candidate families of both possible clans were tested
with superpositions and RMSD analysis (Figs. 1
and 2, Table V). These different tests led to the two
clans being defined. Members of TE-A are all acyl-CoA
hydrolases active on many substrates including
short, long, branched, and aromatic acyl chains. Catalytic
residues (see below) in TE6 are placed differently
than those of other TE-A families, and therefore,
TE6 was not included in this clan. The
`different substrate specificities, catalytic residues,
`and mechanism (see below) of TE15 members sug-
`gested that it also be excluded from TE-A. TE-B
`enzymes are also acyl-CoA hydrolases, except for the
`YbdB TEs in TE11 involved with enterobactin bio-
`synthesis. TE4, TE7 (which has no known tertiary
`structure), and TE14 enzymes are sufficiently differ-
`ent from members of TE-A and TE-B that they were
`
Table V. RMSD Analysis of TE Clan Members

Clan | RMSDmin (Å) | RMSDave (Å) | RMSDmax (Å) | Pmin (%) | Pave (%) | Pmax (%) | Cutoff (Å)
TE-A | 1.14 | 1.33 | 1.53 | 77.5 | 87.1 | 90.9 | 3.81
TE-B | 0.11 | 0.97 | 2.02 | 72.3 | 86.8 | 100.0 | 3.80
TE-C | 1.81 | 1.94 | 2.13 | 52.6 | 58.3 | 75.2 | 3.82
TE-D | 0.44 | 1.45 | 2.00 | 67.0 | 80.9 | 100.0 | 3.79
`
`not considered for placement in either clan; the first
`2 are acyl-CoA hydrolases, whereas the third is an
`acyl-ACP hydrolase.
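
The secondary-structure step of the HotDog clan analysis amounts to counting β-strands in the core element strings of Table IV. The Python sketch below does this for the HotDog families, with the strings transcribed from the table (α for helix, β for strand); the helper names are illustrative and the count is only one of the several tests the text describes.

```python
from collections import defaultdict

# Core secondary-structure elements of HotDog families, from Table IV.
HOTDOG_ELEMENTS = {
    "TE5": "β-α-β-β-β-β",
    "TE9": "α-β-β-β-β",
    "TE10": "β-α-β-β-β-β",
    "TE12": "β-α-β-β-β-β",
    "TE8": "β-β-α-β-β-β-β",
    "TE11": "β-β-α-β-β-β-β",
    "TE13": "β-β-α-β-β-β-β",
    "TE4": "α-β-β-β-β-β-β-α-β-β-β-β",
    "TE6": "β-α-β-β-β-β",
    "TE14": "β-α-β-β-β-β-β-α-β-β-β-β",
    "TE15": "β-α-β-β-β-β",
}


def strand_count(elements):
    """Count β-strands in a dash-separated element string."""
    return elements.split("-").count("β")


def group_by_strands(table):
    """Map each β-strand count to the families that show it."""
    groups = defaultdict(list)
    for family, elements in table.items():
        groups[strand_count(elements)].append(family)
    return dict(groups)


groups = group_by_strands(HOTDOG_ELEMENTS)
print(groups[5])  # ['TE5', 'TE10', 'TE12', 'TE6', 'TE15']
print(groups[6])  # ['TE8', 'TE11', 'TE13']
```

The five-strand group reproduces the candidate list named in the text before TE6 and TE15 were excluded on catalytic grounds; TE9, with four strands in its core string, enters TE-A only through visual inspection and superposition.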
TE families 2 and 16 through 22, whose members
all have α/β-hydrolase crystal structures, yielded
two clans: TE-C comprising TE16, TE17, and
TE18, and TE-D with TE20 and TE21.
Both sequence analysis and secondary structure
element arrangement suggested only one clan of
TE16, TE17, and TE18 (Table IV). Visual inspection
suggested the two clans described above, and they
were confirmed by superpositions, RMSD analysis,
and the position of catalytic residues (Figs. 3 and 4,
Table V). Families in TE-C contain acyl-ACP hydrolases
present in multidomain FASs, PKSs, and
NRPs, as well as independent acyl-ACP TEs
involved in those pathways. TE-D enzymes hydrolyze
palmitoyl and other acyl groups from protein
`surfaces. TE2, an acyl-CoA hydrolase, TE19, a myr-
`istoyl-ACP hydrolase, and TE22, active on glutathi-
`one-activated molecules, are not part of either clan.
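
The clan assignments given in this section can be collected in a small lookup, which also confirms the count of 12 families in four clans quoted in the abstract. This is a bookkeeping sketch in Python; the names `CLANS` and `clan_of` are illustrative.

```python
# Clan membership as stated in the text.
CLANS = {
    "TE-A": ["TE5", "TE9", "TE10", "TE12"],
    "TE-B": ["TE8", "TE11", "TE13"],
    "TE-C": ["TE16", "TE17", "TE18"],
    "TE-D": ["TE20", "TE21"],
}
ALL_FAMILIES = [f"TE{n}" for n in range(1, 24)]


def clan_of(family):
    """Return the clan of a family, or None if it is not in any clan."""
    for clan, members in CLANS.items():
        if family in members:
            return clan
    return None


assigned = [f for f in ALL_FAMILIES if clan_of(f) is not None]
unassigned = [f for f in ALL_FAMILIES if clan_of(f) is None]
print(len(assigned), len(unassigned))  # 12 11
print(unassigned)
```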
`
`TE tertiary structures, catalytic residues, and
`mechanisms
`Catalytic mechanisms and residues of each TE fam-
`ily were gathered from crystal structure articles.
`The PDB files, proposed catalytic residues, and pro-
`ducing organisms of the relevant TEs are listed in
`Table VI.
`HotDog-fold enzymes lack defined nonsolvated
`binding pockets and conserved catalytic residues,24
`thus a variety of catalytic residues and mechanisms
`exist.
`In TE-A, only TE9 and TE10 can be further an-
`alyzed, as TE5 and TE12 at present have only one
`crystal structure each with no corresponding refer-
`eed article. In TE9, the YbgC structure 2PZH is a
`tetramer of two dimers. After comparing this struc-
`ture to 1LO9 in TE10 and other YbgC crystals, the
`authors proposed that His18, Tyr7, and Asp11 play
`important roles in catalysis.29
`TE10 4-hydroxybenzoyl-CoA TEs have homote-
`trameric quaternary structures. It was suggested
`from structures 1LO7, 1LO8, and 1LO9 that hydro-
`gen bonds and the positive end of a helix dipole
`moment make the thioester carbonyl group more
`susceptible to a nucleophilic attack by Asp17
`through an acyl-enzyme intermediate.55
`TE-B families include TE8, TE11, and TE13.
`Members of TE8 are tetramers composed of two Hot-
`Dog dimers. Based on a crystal structure of a human
`Them2 enzyme (3F5O), it was proposed that Gly57
`and Asn50 bind and polarize the thioester carbonyl
`group, whereas Asp65 and Ser85 orient and activate
`the water nucleophile.56
`In TE11, Arthrobacter sp. strain SU 4-hydroxy-
`benzoyl-CoA TE crystal structures reveal a tetra-
`meric enzyme with a dimer of dimers. Structures
`1Q4S, 1Q4T, and 1Q4U led to the proposal that
`Gly65 polarizes the carbonyl group for a nucleophilic
`attack carried out by Glu73.57
`Both TE10 and TE11 are 4-hydroxybenzoyl-CoA
`TEs of similar substrate specificities and metabolic
`functions; however, their tertiary and quaternary
`structures are different and they use different active-
`site regions and residues for catalysis. This supports
`placing these two families in two different clans.
`TE13 PaaI TE from Thermus thermophilus HB8 yielded
`homotetrameric quaternary structures 1WLU, 1J1Y,
`1WM6, 1WLV, and 1WN3. From those
`structures, a study proposed that these enzymes use
`an induced-fit mechanism to hydrolyze the substrate
`via an Asp48-activated water nucleophile.58 Compar-
`ison of the structure of another PaaI, from E. coli
`(2FS2) with the Arthrobacter TE11 structures, as
`well as site-directed mutagenesis, pointed to a mech-
`anism similar to that in TE11: Gly53 prepares the
`thioester for a nucleophilic attack from Asp61.59 4-
`Hydroxybenzoyl-CoA enzymes from TE11 and the
`PaaI enzymes from TE13 catalyze two different reac-
`tions in different organisms, and
