Genes (Basel). 2021 Jun 24;12(7):963.
doi: 10.3390/genes12070963.

Open Issues for Protein Function Assignment in Haloferax volcanii and Other Halophilic Archaea

Friedhelm Pfeiffer et al. Genes (Basel).

Abstract

Background: Annotation ambiguities and annotation errors are a general challenge in genomics. While a reliable protein function assignment can be obtained by experimental characterization, this is expensive and time-consuming, and the number of such Gold Standard Proteins (GSP) with experimental support remains very low compared to proteins annotated by sequence homology, usually through automated pipelines. Even a GSP may give a misleading assignment when used as a reference: the homolog may be close enough to support isofunctionality, but the substrate of the GSP is absent from the species being annotated. In such cases, the enzymes cannot be isofunctional. Here, we examined a variety of such issues in halophilic archaea (class Halobacteria), with a strong focus on the model haloarchaeon Haloferax volcanii.

Results: Annotated proteins of Hfx. volcanii were identified for which public databases tend to assign a function that is probably incorrect. In some cases, an alternative, probably correct, function can be predicted or inferred from the available evidence, but this has not been adopted by public databases because experimental validation is lacking. In other cases, a probably invalid specific function is predicted by homology, and while there is evidence that this assigned function is unlikely, the true function remains elusive. We listed 50 of those cases, each with detailed background information, so that a conclusion about the most likely biological function can be drawn. For reasons of brevity and comprehension, only the key aspects are listed in the main text, with detailed information being provided in a corresponding section of the Supplementary Materials.

Conclusions: Compiling, describing and summarizing these open annotation issues and functional predictions will benefit the scientific community in the general effort to evaluate protein function assignments more rigorously and to document them in greater detail. By highlighting the gaps and likely annotation errors currently in the databases, we hope this study will provide a framework for experimentalists to systematically confirm (or disprove) our function predictions or to uncover yet more unexpected functions.

Keywords: Gold Standard Protein; Haloferax volcanii; annotation error; genome annotation; haloarchaea.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Illustration of the haloarchaeal cobalamin and heme biosynthesis pathways and of the major cobalamin biosynthesis gene cluster. (A) Biosynthesis pathways. This illustration is based on the corresponding KEGG map 00860. Small circles represent pathway intermediates and have their names assigned. Pathway intermediates upstream of precorrin-2 are not displayed. The circle for sirohydrochlorin is highlighted in red, as this is the branchpoint for heme and cobalamin biosynthesis in haloarchaea. Enzymatic reactions are shown by arrows, the EC numbers being provided in rectangular boxes. Rectangles are colored when the enzyme has been reconstructed for haloarchaea (blue: heme biosynthesis; dark yellow: de novo cobalamin biosynthesis; light yellow: late cobaltochelatase, which may be a salvage reaction). Gene names in green are adopted from KEGG and represent those from bacterial model pathways. Consecutive arrowheads indicate reaction series that are not shown in detail for space reasons. Additionally, some enzymes of the heme biosynthesis pathway are omitted for space reasons. For enzymatic reactions that are considered to be open issues, Hfx. volcanii locus tags are provided. For two pathway gaps (white boxes in the cobalt-early pathway), the type of reaction is indicated (oxidoreductase and ~CH3, indicating a methylation reaction). The question mark after HVO_B0058 indicates that this protein, currently co-attributed to EC 2.1.1.272, is a candidate for the yet-unassigned EC 2.1.1.195 reaction. We note that haloarchaea might use a deviating biosynthesis pathway, e.g., by swapping the methylation and oxidoreductase reactions (not illustrated). (B) The major cobalamin cluster, encoded on megaplasmid pHV3. Arrows are used to indicate the coding strand and are roughly drawn to scale. If assigned, the gene name is provided in addition to the Hfx. volcanii locus tag. Locus tags in red indicate genes that are part of the cobalamin cluster.
Figure 2
The structure of the C1 coenzymes tetrahydrofolate and methanopterin and two enzymes that act on the attached C1 compound. (A) The structures of tetrahydromethanopterin (top) and tetrahydrofolate (bottom) illustrate the similarities and differences between these C1 coenzymes. The common pteridine-based ring system is highlighted in yellow, and the initial biosynthesis step that generates this ring system is catalyzed by homologous enzymes (topic (b)). Two methanopterin-specific methyl groups are outlined by dashed ovals. N5 and N10, which are involved in the binding of the C1 compound, are colored red. (B) Two enzymatic reactions that alter the oxidation level of the C1 compound are illustrated. The methanogenic and haloarchaeal enzymes are homologous, even though they use distinct C1 coenzymes (topic (c)). It should be noted that MTH-1752 uses coenzyme F420 (not illustrated, Section 3.4, topic (c)), and this might also hold true for HVO_1937.
Figure 3
Biosynthesis of polar lipids. A key intermediate is CDP-archaeol, which is generated from archaeol (displayed as fully saturated) by CarS. Members of the InterPro:IPR000462 family then transfer the CDP-archaeol to the hydroxyl group (alcohol group) of the target molecule (backbone: serine, glycerol and myo-inositol). Subsequent modifications contribute to the diversity of polar lipids.

