Brief Bioinform. 2024 Jan 22;25(2):bbae050.
doi: 10.1093/bib/bbae050.

A comprehensive computational benchmark for evaluating deep learning-based protein function prediction approaches


Wenkang Wang et al. Brief Bioinform. 2024.

Abstract

Proteins are the basic units that carry out biological functions and play an important role in life activities. Accurately annotating protein functions is crucial for understanding the intricate mechanisms of life and for developing effective treatments for complex diseases. Traditional biological experiments struggle to keep pace with the growing number of known proteins. With the development of high-throughput sequencing technology, a wide variety of biological data makes it possible to predict protein functions accurately by computational methods, and many such methods have been proposed. Because application scenarios are diverse, a comprehensive evaluation of these methods is needed to determine the suitability of each algorithm for specific cases. In this study, we present a comprehensive benchmark, BeProf, to process data and evaluate representative computational methods. We first collect the latest datasets and analyze the data characteristics. Then, we investigate and summarize 17 state-of-the-art computational methods. Finally, we propose a novel comprehensive evaluation metric, design eight application scenarios and evaluate the performance of existing methods on these scenarios. Based on the evaluation, we provide practical recommendations for different scenarios, enabling users to select the most suitable method for their specific needs. All of these resources are available at https://csuligroup.com/BEPROF and https://github.com/CSUBioGroup/BEPROF.

Keywords: benchmark; deep learning; protein; protein function.


Figures

Figure 1
The main processes of existing approaches in AFP (left) and the loosely hierarchical structure of GO terms (right). The structure of GO terms can be visualized as a graph in which each node represents a specific GO term and each edge represents a relationship between two terms. In AFP, two types of relationships ('is_a' and 'part_of') are used. These relationships establish a hierarchy resembling a "parent-child" relationship, allowing functions to be transferred reliably between related terms.
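The transfer of functions along 'is_a' and 'part_of' edges described in this caption can be sketched as an ancestor propagation step. This is a minimal illustration, not the benchmark's own code: the toy ontology, the placeholder term identifiers and the function names `ancestors` and `propagate` are all assumptions made for the example.

```python
# Hypothetical toy ontology: child term -> list of (relationship, parent term).
# The identifiers are illustrative placeholders, not real GO accessions.
TOY_GO = {
    "GO:C": [("is_a", "GO:B")],
    "GO:B": [("is_a", "GO:A")],
    "GO:D": [("part_of", "GO:B"), ("regulates", "GO:A2")],
    "GO:A": [],
    "GO:A2": [],
}

# Only these two relationship types are followed, as stated in the caption.
ALLOWED = {"is_a", "part_of"}


def ancestors(term, ontology, allowed=ALLOWED):
    """Return every term reachable from `term` through allowed relationships."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for rel, parent in ontology.get(current, []):
            if rel in allowed and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(annotations, ontology):
    """Extend each protein's term set with all ancestors of its terms."""
    return {
        protein: set(terms).union(*(ancestors(t, ontology) for t in terms))
        for protein, terms in annotations.items()
    }


if __name__ == "__main__":
    # "P1" is a hypothetical protein annotated only with GO:D.
    print(sorted(propagate({"P1": ["GO:D"]}, TOY_GO)["P1"]))
```

In this sketch the 'regulates' edge is ignored, so a protein annotated with the placeholder term GO:D inherits GO:B and GO:A but not GO:A2.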
Figure 2
Distribution of protein sequence lengths. Most sequences (89%) are shorter than 1000 residues, while only 0.56% are longer than 3000.
Figure 3
Distribution of the number of known functions per protein in three ontologies (top) and distribution of the number of known proteins per GO term in three ontologies (bottom).
Figure 4
Distribution of IC values of GO terms in three ontologies (top) and the depths of GO terms in three ontologies (bottom).
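The excerpt does not state how the information content (IC) of a GO term is computed. One common frequency-based form, assumed here purely for illustration, is IC(t) = log2(N / n_t), where N is the number of annotated proteins and n_t the number of proteins annotated with term t. The benchmark may instead use a conditional form that accounts for parent terms. The function name and the toy annotations below are hypothetical.

```python
import math
from collections import Counter


def information_content(annotations):
    """Frequency-based IC per term: log2(N / n_t).

    `annotations` maps a protein id to an iterable of its (already
    propagated) terms. This definition is an assumption for the sketch.
    """
    n_proteins = len(annotations)
    counts = Counter(t for terms in annotations.values() for t in set(terms))
    return {term: math.log2(n_proteins / n) for term, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical proteins and placeholder terms.
    toy = {
        "P1": {"A", "B"},
        "P2": {"A"},
        "P3": {"A", "B"},
        "P4": {"A", "C"},
    }
    print(information_content(toy))
```

Under this definition a term shared by every protein carries no information, and rarer terms receive higher values.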
Figure 5
Predictive performance comparison on difficult proteins (low similarity with training proteins) in terms of Fmax and Smin.
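For readers unfamiliar with Fmax, the sketch below follows the protein-centric definition commonly used in CAFA-style evaluations: at each score threshold, precision is averaged over proteins with at least one prediction, recall is averaged over all proteins, and Fmax is the best resulting F-measure. The threshold grid, the input layout and the function name `fmax` are assumptions for this example, and the paper's exact settings are not given in this excerpt. Smin is not covered here.

```python
def fmax(y_true, y_score, thresholds=None):
    """Protein-centric Fmax.

    y_true:  list of sets, the true terms of each protein.
    y_score: list of dicts (term -> predicted score), one per protein.
    Assumes every protein has at least one true term.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 100)]
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for truth, scores in zip(y_true, y_score):
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & truth)
            # Precision only counts proteins with at least one prediction.
            if predicted:
                precisions.append(hits / len(predicted))
            if truth:
                recall_sum += hits / len(truth)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(y_true)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


if __name__ == "__main__":
    # Hypothetical single protein with true terms A and B.
    print(fmax([{"A", "B"}], [{"A": 0.9, "C": 0.8}]))
```

In the usage example, thresholds above 0.8 keep only the correct term A (precision 1, recall 0.5), which gives the best F-measure of 2/3.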
Figure 6
Predictive performance comparison on disordered proteins in terms of AUPR.
Figure 7
Predictive performance comparison on disordered proteins in terms of M-AUPR.
Figure 8
Predictive performance of existing methods on the functions of disordered regions in terms of AUPR. GO terms are sorted from top to bottom by depth, from shallow to deep.
Figure 9
Generalizability of existing methods to HUMAN species. The blue part indicates that several HUMAN proteins are contained in the training set, while the yellow part indicates that these proteins are removed from the training set.
Figure 10
Generalizability of existing methods to MOUSE species. The blue part indicates that several MOUSE proteins are contained in the training set, while the yellow part indicates that these proteins are removed from the training set.
Figure 11
Distribution of AUPR on different GO terms grouped by different depths, as calculated by existing methods.
Figure 12
Performance comparison of AUPR on different GO terms grouped by different frequencies.
Figure 13
Distribution of AUPR on different GO terms grouped by different IC values.


