<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" >
<html lang="en">
<head>
<!--***********************change issue date and title below**********************-->
<title>Implementation of New Guidelines for the Structure and Nomenclature of Protein Concepts in MeSH. NLM Technical Bulletin. 2003 Mar-Apr</title>
<meta name="DC.Subject.IssueNum" content="331" />
<meta name="DC.Subject.IssueCover" content="/pubs/techbull/ma03/ma03_issue_cover.html" />

<meta name="DC.Subject.Keyword" content="Medical Subject Headings" />
<meta name="DC.Subject.Keyword" content="MEDLINE" />
<meta name="DC.Subject.Keyword" content="PubMed" />
<meta name="DC.Subject.Keyword" content="Release" />
<meta name="DC.Subject.Keyword" content="Substance Name" />
<meta name="DC.Subject.Keyword" content="Supplementary Concept Records" />
<meta name="DC.Subject.Keyword" content="Year-End Processing" />
</head>

<body link="#476B47" alink="#476B47" vlink="#476B47" text="#000000" bgcolor="ffffff"><noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-MT6MLL" height="0" width="0" style="display:none;visibility:hidden" title="googletagmanager"></iframe></noscript><script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='//www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-MT6MLL');</script>
<style type="text/css">
#skip, .skip, .skipnavigation {
position:absolute;
left:0px;
top:-500px;
width:1px;
height:1px;
overflow:hidden;
}
</style>
<div class="skipnavigation"><a title="Skip the navigation on this page" href="#skipnav" class="skipnavigation">Skip Navigation Bar</a></div>

<center>
<a id="skipnav" name="skipnav"></a>
<table border="0" width="550">

<tr><td width="550" align="center" colspan="3">
<img src="/pubs/techbull/new_tb_graphics/header_final.gif" border="0" alt="NLM Technical Bulletin Header"/>
<br />
<img src="/pubs/techbull/new_tb_graphics/toc_331.gif" border="0" alt="Article Navigation Bar" usemap="#subheader_issues"/>

<map name="subheader_issues" id="subheader_issues">
<area shape="RECT" coords="30,20,165,40" href="/pubs/techbull/ma03/ma03_issue_cover.html" alt="Table of Contents">
<area shape="RECT" coords="215,20,265,40" href="/pubs/techbull/tb.html" alt="NLM Technical Bulletin Home Page">
<area shape="RECT" coords="310,20,395,40" href="/pubs/techbull/back_issues.html" alt="Back Issues">

</map>
</td></tr>

<!--************************indicate date posted below**********************************-->
<!--************************add dates corrected to subsequent lines with [corrected] tag***************************-->

<tr>
<td width="30">&nbsp;</td>
<td width="470"><font size="2"><strong>April 23, 2003</strong> [posted]</font><br /></td>
<td width="30">&nbsp;</td>
</tr>

<tr><td width="30">&nbsp;</td>
<td width="470">&nbsp;</td>
</tr>

<!--************************indicate title of article below*********************-->
<tr>
<td width="30">&nbsp;</td>
<td width="470"><font size="5"><strong>Implementation of New Guidelines for the
Structure and Nomenclature of Protein Concepts in MeSH</strong></font><br /></td>
<td width="30">&nbsp;</td>
</tr>

<tr>
<td width="30">&nbsp;</td>
<td width="470">

<br />
<p>
<!--************************change graphic to match first letter of article*************************-->

<img src="/pubs/techbull/new_tb_graphics/o.gif" border="0" align="left" alt="drop cap letter for o"/>

ver the past 23 years, individual proteins appearing in the literature were indexed using supplemental concept records (SCRs).
During this period, the creation of a new protein SCR was linked to the first appearance of protein sequence data in an article
cited in MEDLINE. The recent increase in published sequence data and the concurrent use of short acronym names for proteins have
made it necessary to revise the MeSH protein thesaurus and develop a new system to accurately index and retrieve protein-related information.
</p>

<p>
Under the new guidelines, individual proteins are represented by SCRs, while descriptor records represent protein classes. An individual
protein is defined in MeSH as a unique protein from a single species. Protein subunits, alternative mRNA splice variants, and
polymorphic variants of the same protein may be included as subordinate concepts within the same record. To avoid confusing
proteins with similar or even identical names, the name of each protein is followed by the name of the organism from which it is derived. The
preferred name of each protein is the approved name found in curated genome databases, followed by the organism name. The
curated genome databases have rooted out duplicate and obsolete protein names, so combining a curated protein name with the
organism name yields highly specific, non-duplicated protein terminology. Specific examples of
reformatted protein names are shown below.
</p>

<p>
<strong>Examples of Organism-Specific Proteins</strong><br /><br />

Bzz1 protein, <strong>S cerevisiae</strong><br />
Cdc24 protein, <strong>S pombe</strong><br />
CYK-3 protein, <strong>C elegans</strong><br />
PagP protein, <strong>E coli</strong><br />
RPC5 protein, <strong>human</strong><br />
St7r protein, <strong>mouse</strong><br />
TRIPTYCHON protein, <strong>Arabidopsis</strong><br />
</p>
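
<p>
The naming pattern in the examples above can be illustrated with a short Python sketch. The function name <code>scr_preferred_name</code> is hypothetical and is not part of any NLM tool; the sketch only assumes that a curated protein name and an organism name are already known, and shows how the two combine into a single organism-specific term.
</p>
<pre>
```python
# Hypothetical helper, not an NLM tool: it only illustrates the
# "NAME protein, ORGANISM" pattern seen in the examples above.
def scr_preferred_name(curated_name, organism):
    """Return the curated protein name followed by the organism name."""
    name = curated_name.strip()
    org = organism.strip()
    if not name or not org:
        raise ValueError("both a curated protein name and an organism are required")
    return f"{name} protein, {org}"


# Inputs taken from the example list in the article.
examples = [
    ("Bzz1", "S cerevisiae"),
    ("CYK-3", "C elegans"),
    ("RPC5", "human"),
]
for curated_name, organism in examples:
    print(scr_preferred_name(curated_name, organism))
```
</pre>
<p>
Because the organism is part of the term itself, two proteins that share a curated name in different species still receive distinct terms.
</p>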

<p>
To handle the vast number of new organism-specific proteins discussed in the literature, the creation of new protein SCRs
is limited to proteins from specific model organisms and to proteins of special biomedical importance, such as proteins
directly involved in pathogenesis and those used as therapeutic agents or diagnostic reagents. All other proteins will be represented
by coordinating a MeSH protein class descriptor with a MeSH organism descriptor. The list of MeSH model organisms includes human,
mouse, rat, Drosophila, Xenopus, <em>S cerevisiae</em>, <em>S pombe</em>, <em>E coli</em>,
zebrafish, <em>C elegans</em>, and Arabidopsis. In the future, additional model organism categories may be added to the protein SCRs.
These categories will initially be represented by the existing organism-specific SCRs that were previously considered
biomedically important, and could be supplemented by new SCRs created from protein information derived from
authoritative sources.
</p>
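
<p>
The rule described above can be read as a simple decision. The following Python sketch is only an illustration under stated assumptions: the function name <code>representation</code>, its return labels, and the <code>biomedically_important</code> flag are hypothetical, and the organism list is copied from the paragraph above rather than from any authoritative MeSH file.
</p>
<pre>
```python
# Model organisms as listed in the paragraph above (compared case-insensitively).
MODEL_ORGANISMS = frozenset(
    name.lower()
    for name in (
        "human", "mouse", "rat", "Drosophila", "Xenopus", "S cerevisiae",
        "S pombe", "E coli", "Zebrafish", "C elegans", "Arabidopsis",
    )
)


def representation(organism, biomedically_important=False):
    """Return a hypothetical label for how a protein would be represented.

    A protein gets its own SCR if it comes from a model organism or is
    flagged as biomedically important; otherwise it is represented by
    coordinating a protein class descriptor with an organism descriptor.
    """
    if organism.strip().lower() in MODEL_ORGANISMS or biomedically_important:
        return "SCR"
    return "class descriptor + organism descriptor"


print(representation("mouse"))
print(representation("chicken"))
print(representation("chicken", biomedically_important=True))
```
</pre>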

<p>
The existing supplemental concept records for proteins are being revised to conform to these new guidelines. Current SCRs that represent
individual proteins are being reformatted to include organism-specific protein terms and official preferred terms from curated databases.
Non-specific terms are being removed. SCRs that represent a class of proteins are being promoted to MeSH descriptors, while specific
proteins found in the record are being demoted to organism-specific SCRs. In addition, current SCRs that represent multiple proteins, or the
same protein from multiple organisms, will be broken into individual protein SCRs.
</p>

<p>
In each case, appropriate maintenance will be performed on MEDLINE citations. Some of this maintenance was done for the 2003 system as part of year-end processing,
some may occur throughout calendar year 2003, and the majority should be accomplished with the successful completion of year-end processing for the 2004 system.
Some situations are straightforward: the old SCR in a MEDLINE Name of Substance element is simply replaced by the new form of the name. Other situations are
complex and involve breaking up a single SCR that previously referred to multiple proteins into individual, organism-specific SCRs. These require search strategies against
PubMed to isolate the citations to which one or more of the new SCRs must be added. For example, last year the SCR transcription factor TFIIA was promoted to a
MeSH Heading and two new SCRs were created: TOA1 protein, S cerevisiae and TOA2 protein, S cerevisiae. During year-end processing, these two searches were run against PubMed:
</p>

<ol>

<li>TOA1 [tw] AND transcription factor TFIIA [nm] AND medline [sb]</li>

<li>TOA2 [tw] AND transcription factor TFIIA [nm] AND medline [sb]</li>

</ol>

<p>
The new SCR TOA1 protein, S cerevisiae was added as a new Name of Substance to the citations found by Search 1, while TOA2 protein, S cerevisiae
was added as a new Name of Substance to the citations found by Search 2.
</p>
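
<p>
The two searches above share one template, which can be sketched in Python as follows. This is an illustration only: the function names <code>maintenance_query</code> and <code>plan_split</code> are hypothetical, the sketch builds the search strings but does not contact PubMed, and it does not describe the tooling actually used for year-end processing.
</p>
<pre>
```python
def maintenance_query(text_word, old_scr):
    """Build a search of the form shown in the list above."""
    return f"{text_word} [tw] AND {old_scr} [nm] AND medline [sb]"


def plan_split(old_scr, new_scrs):
    """Map each new SCR name to the search that isolates its citations.

    new_scrs maps a text word (for example a protein symbol) to the new
    organism-specific SCR name that should be added to matching citations.
    """
    return {
        new_name: maintenance_query(text_word, old_scr)
        for text_word, new_name in new_scrs.items()
    }


plan = plan_split(
    "transcription factor TFIIA",
    {
        "TOA1": "TOA1 protein, S cerevisiae",
        "TOA2": "TOA2 protein, S cerevisiae",
    },
)
for new_name, query in plan.items():
    print(f"search: {query}")
    print(f"  add Name of Substance: {new_name}")
```
</pre>
<p>
Each search pairs a text word for the specific protein with the old, broader substance name, so that only citations already indexed under the old SCR and mentioning that protein receive the new SCR.
</p>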

<p>
Thus far we have identified and revised over 10,000 organism-specific protein SCRs, of which 5,800 are from model organisms.
Based on the current rate of editing, 70% of existing protein SCRs will be completed by the release of 2004 MeSH. Upon completion
of this project, we anticipate having approximately 30,000 organism-specific proteins represented as individual protein SCRs in MeSH.
</p>

<br /><br />
<p>
<!--************************indicate article author and section below*******************************-->

<strong>By James M. Pash, Ph.D.</strong><br />
<strong>MeSH Section</strong><br />

</p>
<!--************************indicate indexing terms below**********************-->
<!--
<p><strong>Indexing Terms</strong></p>
<p>MeSH, New Guidelines for the Structure and Nomenclature of Protein Concepts</p>
<p>Proteins, New Guidelines for the MeSH Structure and Nomenclature</p>
<p>Supplemental Concept Records (SCR), New Guidelines for the Structure and Nomenclature</p>
-->
<!--************************end indexing terms*********************************-->

<img src="/pubs/techbull/new_tb_graphics/black_pixel.gif" width="450" height="1" alt="black line separating article from citation"/>

<p>
<!--************************change citation information below*****************************-->
<em>Pash JM. Implementation of New Guidelines for
the Structure and Nomenclature of Protein Concepts in MeSH. NLM Tech Bull. 2003 Mar-Apr;(331):e10.</em>
<!--************************end citation information********************************-->
</p>

</td><td width="30"> </td></tr>
</table>
<br />

<br />

<img src="/pubs/techbull/new_tb_graphics/footer_331.gif" border="0" alt="Article Navigation Bar" usemap="#footer_final"/>

<map name="footer_final" id="footer_final">
<area shape="RECT" coords="265,5,315,30" href="/pubs/techbull/tb.html" alt="NLM Technical Bulletin Home Page">
<area shape="RECT" coords="330,5,420,30" href="/pubs/techbull/back_issues.html" alt="Back Issues">
<area shape="RECT" coords="435,5,480,30" href="/pubs/techbull/new_index.html" alt="Index">
<area shape="RECT" coords="5,5,90,20" href="/pubs/techbull/ma03/ma03_nci.html" alt="Previous Page">
<area shape="RECT" coords="445,6,504,20" href="/pubs/techbull/ma03/ma03_cancer_subset.html" alt="Next Article">

</map>
</center>

<!-- BEGIN NLM FOOTER -->
<center>
<font size="2" face="helvetica, arial"><a href="/nlmhome.html">U.S. National Library of Medicine</a>, 8600 Rockville Pike, Bethesda, MD 20894<br /><a href="http://www.nih.gov/">National Institutes of Health</a>, <a href="//www.hhs.gov/">Department of Health &amp; Human Services</a><br /><a href="/copyright.html">Copyright</a>, <a href="/privacy.html">Privacy</a>, <a href="/accessibility.html">Accessibility</a>, <a href="http://www.nih.gov/icd/od/foia/index.htm">Freedom of Information Act (FOIA)</a><br/><a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html">HHS Vulnerability Disclosure</a>
<br />
<!-- ******************MODIFY "LAST UPDATED" ******************* -->
Last updated: 11 April 2012
</font>
</center>

<!-- END NLM FOOTER -->
<!-- ******************MODIFY EXPDATE AND EMAIL BELOW****************** -->
<!-- EXPDATE="2015-03-20" -->
<!-- EMAIL="nlmtechbull@mail.nlm.nih.gov" -->
<script src="//assets.nlm.nih.gov/jquery/jquery-latest.min.js"></script><script src="/core/nlm-notifyExternal/1.0/nlm-notifyExternal.min.js"></script></body>
</html>