PLoS Comput Biol. 2008 Aug 1;4(8):e1000142.
doi: 10.1371/journal.pcbi.1000142.

Evolutionarily conserved substrate substructures for automated annotation of enzyme superfamilies

Ranyee A Chiang et al. PLoS Comput Biol.

Abstract

The evolution of enzymes affects how well a species can adapt to new environmental conditions. During enzyme evolution, certain aspects of molecular function are conserved while other aspects can vary. Aspects of function that are more difficult to change or that need to be reused in multiple contexts are often conserved, while those that vary may indicate functions that are more easily changed or that are no longer required. In analogy to the study of conservation patterns in enzyme sequences and structures, we have examined the patterns of conservation and variation in enzyme function by analyzing graph isomorphisms among enzyme substrates of a large number of enzyme superfamilies. This systematic analysis of substrate substructures establishes the conservation patterns that typify individual superfamilies. Specifically, we determined the chemical substructures that are conserved among all known substrates of a superfamily and the substructures that are reacting in these substrates and then examined the relationship between the two. Across the 42 superfamilies that were analyzed, substantial variation was found in how much of the conserved substructure is reacting, suggesting that superfamilies may not be easily grouped into discrete and separable categories. Instead, our results suggest that many superfamilies may need to be treated individually for analyses of evolution, function prediction, and guiding enzyme engineering strategies. Annotating superfamilies with these conserved and reacting substructure patterns provides information that is orthogonal to information provided by studies of conservation in superfamily sequences and structures, thereby improving the precision with which we can predict the functions of enzymes of unknown function and direct studies in enzyme engineering. Because the method is automated, it is suitable for large-scale characterization and comparison of fundamental functional capabilities of both characterized and uncharacterized enzyme superfamilies.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Substructure definitions.
(A) The conserved substructure (c) (blue square) is the maximal set of bonds that are present in all the substrates of a superfamily, together with their adjacent atoms. (B) The reacting substructure (r) (red triangle) is calculated by finding the maximal set of bonds in a substrate that are not present in the product, their adjacent atoms, and the atoms that form new bonds in the product. (C) f_c is the fraction of the conserved substructure (blue square) that is reacting (red triangle overlap) and is calculated as (r ∩ c)/c. (D) f_r is the fraction of the reacting substructure (red triangle) that is conserved (blue square overlap) and is calculated as (r ∩ c)/r.
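The two fractions defined in panels (C) and (D) can be illustrated with a minimal Python sketch. It assumes each substructure has already been reduced to a set of atom (or bond) identifiers in a shared numbering; the function name `overlap_fractions` and the example identifiers are hypothetical and are not taken from the paper.

```python
def overlap_fractions(conserved, reacting):
    """Return (f_c, f_r) for two collections of atom or bond identifiers.

    f_c = |r ∩ c| / |c|  (fraction of the conserved substructure that reacts)
    f_r = |r ∩ c| / |r|  (fraction of the reacting substructure that is conserved)
    An empty substructure yields 0.0 here; that convention is an assumption.
    """
    conserved, reacting = set(conserved), set(reacting)
    shared = conserved & reacting
    f_c = len(shared) / len(conserved) if conserved else 0.0
    f_r = len(shared) / len(reacting) if reacting else 0.0
    return f_c, f_r


# Hypothetical example: atoms 3 and 4 are both conserved and reacting.
conserved_atoms = {1, 2, 3, 4}
reacting_atoms = {3, 4, 5}
f_c, f_r = overlap_fractions(conserved_atoms, reacting_atoms)
print(f"f_c = {f_c:.2f}, f_r = {f_r:.2f}")
```

In this toy case half of the conserved atoms react and two thirds of the reacting atoms are conserved.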
Figure 2. Summary of superfamilies and their conserved substrate substructures.
Because the portion of the conserved substructure that is reacting often varies among members within one superfamily, we do not highlight the reacting substructure in this figure. (See Figure 4 for plots of the distribution of this variation over all superfamilies and Table S2 for values of variation for each superfamily.)
Figure 3. Distribution of overlap between conserved and reacting substructures.
(A) Distribution of the average fraction of the conserved substructure that is reacting, for bonds (orange stripe) and for atoms (blue solid). (B) Scatter plot of average f_r versus average f_c. The average f_c and average f_r are calculated using atoms. Each superfamily is represented by a blue diamond. The plot is colored to orient the reader within the plot and to roughly indicate where the different overlap patterns fall: (I) completely nonoverlapping (red), (II) partially overlapping (green), (III) completely overlapping (orange), (IV) reacting is part of conserved substructure (blue), (V) conserved is part of reacting substructure (purple). (C) Five types of overlap patterns. The conserved substructure (blue circle) can have the following overlap (purple) with the reacting substructure (red circle): (I) completely nonoverlapping, (II) partially overlapping, (III) completely overlapping, (IV) reacting is part of conserved, (V) conserved is part of reacting.
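The five overlap patterns in panel (C) map naturally onto set relations. The sketch below is one illustrative way to label a single conserved/reacting pair, assuming both substructures are given as sets of atom identifiers; the function name `overlap_pattern` is hypothetical. The regions in panel (B) are drawn from per-superfamily averages, so they only roughly correspond to these exact categories.

```python
def overlap_pattern(conserved, reacting):
    """Label the overlap between two substructures with a pattern I to V."""
    c, r = set(conserved), set(reacting)
    if c.isdisjoint(r):
        return "I"    # completely nonoverlapping
    if c == r:
        return "III"  # completely overlapping
    if r < c:
        return "IV"   # reacting is a proper part of conserved
    if c < r:
        return "V"    # conserved is a proper part of reacting
    return "II"       # partially overlapping


# Hypothetical atom sets, one per pattern.
examples = {
    "I": ({1, 2}, {3, 4}),
    "II": ({1, 2, 3}, {3, 4}),
    "III": ({1, 2}, {1, 2}),
    "IV": ({1, 2, 3}, {2, 3}),
    "V": ({2, 3}, {1, 2, 3}),
}
for expected, (c, r) in examples.items():
    print(expected, overlap_pattern(c, r))
```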
Figure 4. Variation in the overlap between the conserved substructure and reacting substructure.
(A) Variation in the fraction of the conserved substructure that is reacting. Distribution of the observed standard deviation in f_c within each superfamily, for bonds (orange stripe) and atoms (blue solid). (B) Variation in which part of the conserved substructure is reacting. Average pairwise overlap in the reacting and conserved substructure (r ∩ c), for bonds (orange stripe) and atoms (blue solid). In both plots, superfamilies with less variation are found on the left side of the distributions and those with more variation on the right.
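As a rough illustration of the quantity plotted in panel (A), the sketch below computes the standard deviation of f_c across the members of one superfamily. It assumes a non-empty conserved substructure shared by all members and one reacting atom set per member. The use of the population standard deviation (`statistics.pstdev`) is an assumption, since the caption does not say which estimator was used, and all names and data are hypothetical.

```python
from statistics import pstdev


def fraction_conserved_reacting(conserved, reacting):
    """f_c = |r ∩ c| / |c| for one member; assumes `conserved` is non-empty."""
    conserved = set(conserved)
    return len(conserved & set(reacting)) / len(conserved)


def f_c_spread(conserved, reacting_by_member):
    """Standard deviation of f_c over the members of a superfamily."""
    values = [
        fraction_conserved_reacting(conserved, reacting)
        for reacting in reacting_by_member.values()
    ]
    return pstdev(values)


# Hypothetical superfamily: four conserved atoms, two member reactions.
conserved_atoms = {1, 2, 3, 4}
reacting_by_member = {
    "member_a": {1},           # f_c = 0.25
    "member_b": {1, 2, 3, 9},  # f_c = 0.75
}
print(f_c_spread(conserved_atoms, reacting_by_member))
```

A superfamily in which every member reacts on the same share of the conserved substructure would give a spread of zero and fall at the left edge of the distribution.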
Figure 5. Protein structures with unknown function can be annotated with superfamily-conserved substructures.
This partial list includes superfamilies with between four and nine proteins of unknown function. See Table S3 for the full list.
Figure 6. Enzyme engineering strategy.
Two previously demonstrated examples of using superfamily analysis to guide the engineering of enzymes to perform new functions. In the top example, error-prone PCR produced a single point mutation in muconate lactonizing enzyme II (MLE) that enabled it to catalyze the o-succinylbenzoate synthase (OSBS) reaction (k_cat/K_M = 2×10^3 M^−1 s^−1). In the lower example, a single mutation was rationally designed based on a comparison of the active sites of Ala-Glu epimerase (AEE) and OSBS. The resulting mutant enabled AEE to catalyze the OSBS reaction as well (k_cat/K_M = 12.5 M^−1 s^−1). In both examples, the superfamily-conserved substrate substructure (blue) and the associated partial reaction were not changed during the engineering experiment. The changes that were made to the reaction are in the portion of the substrates that is not conserved in the superfamily (black). The diverse products of the native MLE, OSBS, and AEE reactions are also shown (grey).
