Anal Chem. 2009 Dec 1;81(23):9819-23.
doi: 10.1021/ac901335x.

Software tool for researching annotations of proteins: open-source protein annotation software with data visualization


Vivek N Bhatia et al. Anal Chem. 2009.

Abstract

To derive biological meaning and build testable hypotheses from proteomics experiments, assignments of proteins identified by mass spectrometry or other techniques must be supplemented with additional notation, such as information on known protein functions, protein-protein interactions, or biological pathway associations. Collecting, organizing, and interpreting these data often requires the input of experts in the biological field of study, in addition to the time-consuming search for and compilation of information from online protein databases. Furthermore, visualizing this volume of information can be challenging because few easy-to-use, freely available tools exist for the task. In response to these constraints, we have designed software that automates the annotation and visualization of proteomics data in order to accelerate the pace of research. Here we present the Software Tool for Researching Annotations of Proteins (STRAP), a user-friendly, open-source C# application. STRAP automatically obtains gene ontology (GO) terms associated with proteins in a proteomics results ID list using the freely accessible UniProtKB and EBI GOA databases. Summarized in an easy-to-navigate tabular format, STRAP results include meta-information on each protein in addition to complementary GO terminology. This information can also be edited by the user so that in-house expertise on particular proteins may be integrated into the larger data set. STRAP provides a sortable tabular view for all terms, as well as graphical representations of GO-term association data in pie charts (biological process, cellular component, and molecular function) and bar charts (cross comparison of sample sets) to aid in the interpretation of large data sets and differential analysis experiments. Furthermore, proteins of interest may be exported as a unique FASTA-formatted file to allow for customizable re-searching of mass spectrometry data, and gene names corresponding to the proteins in the lists may be encoded in the Gaggle microformat for further characterization, including pathway analysis. STRAP, a tutorial, and the C# source code are freely available from http://cpctools.sourceforge.net.
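
The FASTA export mentioned in the abstract can be pictured with a minimal Python sketch. This is not STRAP's C# code: the function name `to_fasta`, the tuple layout, the 60-character line width, and the rule of writing each accession only once are all assumptions made for illustration, and the identifiers in the usage note are placeholders rather than real UniProt accessions.

```python
def to_fasta(records, width=60):
    """Format (accession, description, sequence) tuples as FASTA text.

    Each accession is written once; later duplicates are skipped.
    This de-duplication rule is an assumption, not STRAP's documented behavior.
    """
    seen = set()
    lines = []
    for accession, description, sequence in records:
        if accession in seen:
            continue
        seen.add(accession)
        # Header line: '>' followed by the identifier and an optional description.
        lines.append(f">{accession} {description}".rstrip())
        # Sequence lines, wrapped at a fixed width.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"
```

Calling `to_fasta([("PROT_A", "example protein", "MKV")])` would yield one header line followed by the sequence, which a search engine could then use as a custom database.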


Figures

Figure 1
Schematic representation of STRAP functionality, including data input and output. STRAP can read protein lists in UniProt entry or accession number format obtained from plain text files, as well as from Mascot and TPP ProteinProphet results files. STRAP then gathers protein GO-term annotation data from the public UniProtKB and the EBI GOA databases and allows editing of this data, providing the capacity to integrate in-house expertise on the proteins of study. STRAP can save the annotations to disk or export them to the Gaggle framework via Firegoose.
Figure 2
Graphical user interface of STRAP showing the main protein annotation view. All columns in the main annotation table, including the GO category column, can be sorted, which lets users group proteins by any gene ontology category. Annotation attributes can be edited to include or eliminate GO-term associations with particular protein entries according to the users’ expertise.
Figure 3
STRAP’s built-in Gene Ontology Term Browser. The browser presents all GO terms associated with a particular protein entry, as well as each GO term’s complete lineage.
Figure 4
STRAP pie chart rendering. To graphically display the GO-term sub-categories for each data set, STRAP can generate pie charts for each of the three main GO categories, wherein each slice represents a sub-category. Each pie slice is labeled with the GO sub-category name, the number of GO annotations within the category, and the percentage of annotations associated with that particular GO term. Shown is a Biological Process pie chart generated from the GO terms associated with a set of 10 proteins, using the example dataset as described within the text. Note that each unit of the pie represents one GO term rather than one protein, as one protein can be assigned multiple GO terms.
Figure 5
STRAP bar chart rendering for the comparison of multiple data sets. STRAP can generate bar charts allowing for comparison and visualization of large datasets based upon GO terms. This bar graph compares the number of Biological Process GO-term annotations among three sets of proteins as described within the text.
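
The counting behind the pie and bar charts in Figures 4 and 5 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not STRAP's implementation: the function names `slice_table` and `compare_sets`, the dictionary-based input, and any protein-to-term assignments passed in are hypothetical. The sketch counts one unit per GO annotation rather than per protein, as the Figure 4 caption notes.

```python
from collections import Counter

def slice_table(go_terms_by_protein):
    """Return (sub-category, count, percent) rows for one data set.

    Input maps a protein identifier to its list of GO sub-category labels.
    Every annotation counts once, so a protein with several terms
    contributes several units.
    """
    counts = Counter(
        term for terms in go_terms_by_protein.values() for term in terms
    )
    total = sum(counts.values())
    if total == 0:
        return []
    return [(term, n, 100.0 * n / total) for term, n in counts.most_common()]

def compare_sets(named_sets):
    """Count annotations per sub-category for each named data set.

    Returns {sub-category: {set name: count}}, with 0 where a set has
    no annotation for that sub-category.
    """
    per_set = {
        name: Counter(t for terms in data.values() for t in terms)
        for name, data in named_sets.items()
    }
    all_terms = sorted(set().union(*per_set.values()))
    return {
        term: {name: per_set[name][term] for name in per_set}
        for term in all_terms
    }
```

The rows from `slice_table` correspond to the three labels on each pie slice (name, annotation count, percentage), and the nested dictionary from `compare_sets` holds one bar height per data set for each sub-category.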

