<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" >
<html lang="en">
<head>
<!--***********************change issue date and title below**********************-->
<title>Implementation of New Guidelines for the Structure and Nomenclature of Protein Concepts in MeSH. NLM Technical Bulletin. 2003 Mar-Apr</title>
<meta name="DC.Subject.IssueNum" content="331" />
<meta name="DC.Subject.IssueCover" content="/pubs/techbull/ma03/ma03_issue_cover.html" />
<meta name="DC.Subject.Keyword" content="Medical Subject Headings" />
<meta name="DC.Subject.Keyword" content="MEDLINE" />
<meta name="DC.Subject.Keyword" content="PubMed" />
<meta name="DC.Subject.Keyword" content="Release" />
<meta name="DC.Subject.Keyword" content="Substance Name" />
<meta name="DC.Subject.Keyword" content="Supplementary Concept Records" />
<meta name="DC.Subject.Keyword" content="Year-End Processing" />
</head>
<body link="#476B47" alink="#476B47" vlink="#476B47" text="#000000" bgcolor="ffffff">
<style type="text/css">
#skip, .skip, .skipnavigation {
position:absolute;
left:0px;
top:-500px;
width:1px;
height:1px;
overflow:hidden;
}
</style>
<div class="skipnavigation"><a title="Skip the navigation on this page" href="#skipnav" class="skipnavigation">Skip Navigation Bar</a></div>
<center>
<a id="skipnav" name="skipnav"></a>
<table border="0" width="550">
<tr><td width="550" align="center" colspan="3">
<img src="/pubs/techbull/new_tb_graphics/header_final.gif" border="0" alt="NLM Technical Bulletin Header"/>
<br />
<img src="/pubs/techbull/new_tb_graphics/toc_331.gif" border="0" alt="Article Navigation Bar" usemap="#subheader_issues"/>
<map name="subheader_issues" id="subheader_issues">
<area shape="RECT" coords="30,20,165,40" href="/pubs/techbull/ma03/ma03_issue_cover.html" alt="Table of Contents">
<area shape="RECT" coords="215,20,265,40" href="/pubs/techbull/tb.html" alt="NLM Technical Bulletin Home Page">
<area shape="RECT" coords="310,20,395,40" href="/pubs/techbull/back_issues.html" alt="Back Issues">
</map>
</td></tr>
<!--************************indicate date posted below**********************************-->
<!--************************add dates corrected to subsequent lines with [corrected] tag***************************-->
<tr>
<td width="30">&nbsp;</td>
<td width="470"><font size="2"><strong>April 23, 2003</strong> [posted]</font><br /></td>
<td width="30">&nbsp;</td>
</tr>
<tr><td width="30">&nbsp;</td>
<td width="470">&nbsp;</td>
</tr>
<!--************************indicate title of article below*********************-->
<tr>
<td width="30">&nbsp;</td>
<td width="470"><font size="5"><strong>Implementation of New Guidelines for the
Structure and Nomenclature of Protein Concepts in MeSH</strong></font><br /></td>
<td width="30">&nbsp;</td>
</tr>
<tr>
<td width="30">&nbsp;</td>
<td width="470">
<br />
<p>
<!--************************change graphic to match first letter of article*************************-->
<img src="/pubs/techbull/new_tb_graphics/o.gif" border="0" align="left" alt="drop cap letter for o"/>
ver the past 23 years, individual proteins appearing in the literature have been indexed using supplemental concept records (SCRs).
During this period, the creation of a new protein SCR was linked to the first appearance of protein sequence data in an article
cited in MEDLINE. The recent increase in published sequence data and the concurrent use of short acronym names for proteins have
made it necessary to revise the MeSH protein thesaurus and develop a new system to accurately index and retrieve protein-related information.
</p>
<p>
Under the new guidelines, individual proteins are represented by SCRs, while descriptor records represent protein classes. An individual
protein is defined in MeSH as a unique protein from a single species. Protein subunits, alternative mRNA splice variants and
polymorphic variants of the same protein may be included as subordinate concepts within the same record. To avoid confusing
proteins with similar or even identical names, the name of the protein is followed by the name of the organism from which it is derived. The
preferred name of each protein is the approved name found in curated genome databases, followed by the organism name. The
curated genome databases have rooted out duplicated and obsolete protein names. Using a curated protein name followed by the
organism name for each protein therefore yields highly specific, non-duplicated protein terminology. Specific examples of
reformatted protein names are shown below.
</p>
<p>
<strong>Examples of Organism-Specific Proteins</strong><br /><br />
Bzz1 protein, <strong>S cerevisiae</strong><br />
Cdc24 protein, <strong>S pombe</strong><br />
CYK-3 protein, <strong>C elegans</strong><br />
PagP protein, <strong>E coli</strong><br />
RPC5 protein, <strong>human</strong><br />
St7r protein, <strong>mouse</strong><br />
TRIPTYCHON protein, <strong>Arabidopsis</strong><br />
</p>
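<p>
The naming pattern in the examples above can be sketched in a few lines of Python. This is only an illustration and not part of any NLM system:
the function names format_protein_name and split_protein_name are hypothetical, and the sketch assumes that every organism-specific name
has the form "symbol protein, organism" with the organism after the last comma.
</p>

```python
def format_protein_name(symbol, organism):
    """Build an organism-specific name of the form 'SYMBOL protein, ORGANISM'.

    Hypothetical helper: it mirrors the pattern of the examples above and
    does not consult any curated genome database.
    """
    return "{} protein, {}".format(symbol.strip(), organism.strip())


def split_protein_name(name):
    """Split an organism-specific name into a (symbol, organism) pair.

    Assumes the organism follows the last comma and the part before it ends
    in ' protein'. Raises ValueError when the name does not fit that pattern.
    """
    head, sep, organism = name.rpartition(",")
    head = head.strip()
    organism = organism.strip()
    suffix = " protein"
    if not sep or not organism or not head.endswith(suffix):
        raise ValueError("not an organism-specific protein name: " + repr(name))
    return head[: -len(suffix)], organism


if __name__ == "__main__":
    print(format_protein_name("Bzz1", "S cerevisiae"))
    print(split_protein_name("PagP protein, E coli"))
```

<p>
Keeping the organism as a separate field is what distinguishes proteins that share a symbol across species.
</p>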
<p>
To handle the vast number of new organism-specific proteins discussed in the literature effectively, the creation of new protein SCRs
is limited to proteins from specific model organisms and to proteins of special biomedical importance, such as proteins
directly involved in pathogenesis and those used as therapeutic agents or as diagnostic reagents. All other proteins will be represented
by coordination of a MeSH protein class descriptor and a MeSH organism descriptor. The list of MeSH model organisms includes: human,
mouse, rat, Drosophila, Xenopus, <em>S cerevisiae</em>, <em>S pombe</em>, <em>E coli</em>,
Zebrafish, <em>C elegans</em> and Arabidopsis. In the future, additional model organism categories may be added to the protein SCRs.
These categories will initially be represented by the existing organism-specific SCRs that were previously considered
biomedically important, and could be supplemented by new SCRs created from protein information derived from
authoritative sources.
</p>
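<p>
The rule described above can be summarized as a small decision function. The Python sketch below is a simplification under stated assumptions:
the function name representation_for and the special_importance flag are hypothetical, and the judgment of "special biomedical importance"
is made by people, not computed. Only the list of model organisms is taken from the text.
</p>

```python
# Model organisms as listed in the article; the spellings are an assumption
# about how an organism would be written when passed to the function below.
MODEL_ORGANISMS = frozenset([
    "human", "mouse", "rat", "Drosophila", "Xenopus", "S cerevisiae",
    "S pombe", "E coli", "Zebrafish", "C elegans", "Arabidopsis",
])


def representation_for(organism, special_importance=False):
    """Return how a protein would be represented under the guidelines.

    'SCR' means an individual, organism-specific supplemental concept record.
    'descriptor coordination' means a protein class descriptor combined with
    an organism descriptor. The special_importance flag stands in for a human
    decision (pathogenesis, therapeutic agent, diagnostic reagent).
    """
    if organism in MODEL_ORGANISMS or special_importance:
        return "SCR"
    return "descriptor coordination"


if __name__ == "__main__":
    print(representation_for("mouse"))
    print(representation_for("cattle"))
    print(representation_for("cattle", special_importance=True))
```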
<p>
The existing supplemental concept records for proteins are being revised to conform to these new guidelines. Current SCRs that represent
individual proteins are being reformatted to include organism-specific protein terms and official preferred terms from curated databases.
Non-specific terms are being removed. SCRs that represent a class of proteins are being promoted to MeSH descriptors, while specific
proteins found in those records are being demoted to organism-specific SCRs. In addition, current SCRs that represent multiple proteins, or the
same protein from multiple organisms, will be broken into individual protein SCRs.
</p>
<p>
In each case, appropriate maintenance will be performed on MEDLINE citations. Some of this maintenance was done for the 2003 system as part of year-end processing,
some may occur throughout calendar year 2003, and the majority should be completed by the end of year-end processing for the 2004 system.
Some situations are straightforward: the old SCR in a MEDLINE Name of Substance element is simply replaced by the new form of the name. Other situations are
complex and involve breaking up a single SCR that previously referred to multiple proteins into individual, organism-specific SCRs. These require search strategies against
PubMed to isolate the citations that need to be updated with one or more of the new SCRs. For example, last year the SCR transcription factor TFIIA was promoted to a
MeSH Heading and two new SCRs were created: TOA1 protein, S cerevisiae and TOA2 protein, S cerevisiae. During year-end processing, these two searches were run against PubMed:
</p>
<ol>
<li>TOA1 [tw] AND transcription factor TFIIA [nm] AND medline [sb]</li>
<li>TOA2 [tw] AND transcription factor TFIIA [nm] AND medline [sb]</li>
</ol>
<p>
The new SCR TOA1 protein, S cerevisiae was added as a Name of Substance to the citations retrieved by Search 1, while TOA2 protein, S cerevisiae
was added as a Name of Substance to the citations retrieved by Search 2.
</p>
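<p>
The two searches above follow a common template, which the Python sketch below reproduces. It is an illustration only: the function names
maintenance_search and maintenance_plan are hypothetical, the code builds search strings and does not contact PubMed, and the pairing of a text word
with a new SCR simply restates the TFIIA example.
</p>

```python
def maintenance_search(text_word, old_scr):
    """Build a search string in the form used for the two searches above.

    [tw] restricts to text words, [nm] to the Name of Substance field and
    'medline [sb]' to the MEDLINE subset, as in the article's examples.
    """
    return "{} [tw] AND {} [nm] AND medline [sb]".format(text_word, old_scr)


def maintenance_plan(old_scr, new_scrs_by_text_word):
    """Pair each search string with the new SCR to add to the citations found.

    new_scrs_by_text_word maps a text word (for example 'TOA1') to the new
    organism-specific SCR name. Returns a list of (search, new_scr) pairs.
    """
    return [
        (maintenance_search(text_word, old_scr), new_scr)
        for text_word, new_scr in new_scrs_by_text_word.items()
    ]


if __name__ == "__main__":
    plan = maintenance_plan(
        "transcription factor TFIIA",
        {
            "TOA1": "TOA1 protein, S cerevisiae",
            "TOA2": "TOA2 protein, S cerevisiae",
        },
    )
    for search, new_scr in plan:
        print(search, "=", new_scr)
```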
<p>
Thus far, we have identified and revised over 10,000 organism-specific protein SCRs, of which 5,800 are from model organisms.
Based on the current rate of editing, 70% of existing protein SCRs will have been revised by the release of 2004 MeSH. Upon completion
of this project, we anticipate having approximately 30,000 organism-specific proteins represented as individual protein SCRs in MeSH.
</p>
<br /><br />
<p>
<!--************************indicate article author and section below*******************************-->
<strong>By James M. Pash, Ph.D.</strong><br />
<strong>MeSH Section</strong><br />
</p>
<!--************************indicate indexing terms below**********************-->
<!--
<p><strong>Indexing Terms</strong></p>
<p>MeSH, New Guidelines for the Structure and Nomenclature of Protein Concepts</p>
<p>Proteins, New Guidelines for the MeSH Structure and Nomenclature</p>
<p>Supplemental Concept Records (SCR), New Guidelines for the Structure and Nomenclature</p>
-->
<!--************************end indexing terms*********************************-->
<img src="/pubs/techbull/new_tb_graphics/black_pixel.gif" width="450" height="1" alt="black line separating article from citation"/>
<p>
<!--************************change citation information below*****************************-->
<em>Pash JM. Implementation of New Guidelines for
the Structure and Nomenclature of Protein Concepts in MeSH. NLM Tech Bull. 2003 Mar-Apr;(331):e10.</em>
<!--************************end citation information********************************-->
</p>
</td><td width="30">&nbsp;</td></tr>
</table>
<br />
<br />
<img src="/pubs/techbull/new_tb_graphics/footer_331.gif" border="0" alt="Article Navigation Bar" usemap="#footer_final"/>
<map name="footer_final" id="footer_final">
<area shape="RECT" coords="265,5,315,30" href="/pubs/techbull/tb.html" alt="NLM Technical Bulletin Home Page">
<area shape="RECT" coords="330,5,420,30" href="/pubs/techbull/back_issues.html" alt="Back Issues">
<area shape="RECT" coords="435,5,480,30" href="/pubs/techbull/new_index.html" alt="Index">
<area shape="RECT" coords="5,5,90,20" href="/pubs/techbull/ma03/ma03_nci.html" alt="Previous Page">
<area shape="RECT" coords="445,6,504,20" href="/pubs/techbull/ma03/ma03_cancer_subset.html" alt="Next Article">
</map>
</center>
<!-- BEGIN NLM FOOTER -->
<center>
<font size="2" face="helvetica, arial"><a href="/nlmhome.html">U.S. National Library of Medicine</a>, 8600 Rockville Pike, Bethesda, MD 20894<br /><a href="http://www.nih.gov/">National Institutes of Health</a>, <a href="//www.hhs.gov/">Department of Health &amp; Human Services</a><br /><a href="/copyright.html">Copyright</a>, <a href="/privacy.html">Privacy</a>, <a href="/accessibility.html">Accessibility</a>, <a href="http://www.nih.gov/icd/od/foia/index.htm">Freedom of Information Act (FOIA)</a><br/><a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html">HHS Vulnerability Disclosure</a>
<br />
<!-- ******************MODIFY "LAST UPDATED" ******************* -->
Last updated: 11 April 2012
</font>
</center>
<!-- END NLM FOOTER -->
<!-- ******************MODIFY EXPDATE AND EMAIL BELOW****************** -->
<!-- EXPDATE="2015-03-20" -->
<!-- EMAIL="nlmtechbull@mail.nlm.nih.gov" -->
<script src="//assets.nlm.nih.gov/jquery/jquery-latest.min.js"></script><script src="/core/nlm-notifyExternal/1.0/nlm-notifyExternal.min.js"></script></body>
</html>