<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="X-UA-Compatible" content="IE=8;" />
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:site" content="@NLM_NIH" />
<meta name="twitter:title" content="Incorporating Values for Indexing Method in MEDLINE/PubMed XML. NLM Technical Bulletin. 2018 Jul&#8211;Aug" />
<meta name="twitter:description" content="The NLM Technical Bulletin is your source to stay informed about NLM products and services." />
<meta name="twitter:image" content="https://www.nlm.nih.gov/pubs/techbull/images/nlm_tech_bulletin_graphic_twitter.jpg" />
<meta property="og:url" content="https://www.nlm.nih.gov/pubs/techbull/tb.html" />
<meta property="og:type" content="article" />
<meta property="og:title" content="Incorporating Values for Indexing Method in MEDLINE/PubMed XML. NLM Technical Bulletin. 2018 Jul&#8211;Aug" />
<meta property="og:description" content="The NLM Technical Bulletin is your source to stay informed about NLM products and services." />
<meta property="og:image" content="https://www.nlm.nih.gov/pubs/techbull/images/nlm_tech_bulletin_graphic_facebook.jpg" />
<link type="text/css" href="/pubs/techbull/styles/reset.css" rel="stylesheet" />
<link type="text/css" href="/pubs/techbull/styles/technicalBulletin.css" rel="stylesheet" />
<!--Call jQuery-->
<script src="//assets.nlm.nih.gov/jquery/jquery-latest.min.js"></script>
<script src="//assets.nlm.nih.gov/jquery/jquery-migrate-latest.min.js"></script>
<script src="/pubs/techbull/scripts/techbull.js" type="text/javascript" language="javascript"></script>
<!--[if lte IE 8]>
<script type="text/javascript" src="/scripts/PIE.js"></script>
<![endif]-->
<link type="text/css" href="/pubs/techbull/styles/print.css" rel="stylesheet" media="print"/>
<title>Incorporating Values for Indexing Method in MEDLINE/PubMed XML. NLM Technical Bulletin. 2018 Jul&#8211;Aug</title>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" title="The Dublin Core metadata Element Set" />
<meta name="DC.Title" content="Incorporating Values for Indexing Method in MEDLINE/PubMed XML" />
<meta name="DC.Publisher" content="U.S. National Library of Medicine" />
<meta name="DC.Date.Issued" content="2018-07-27" />
<meta name="DC.Date.Modified" content="2024-02-21" />
<meta name="NLMDC.Date.LastReviewed" content="2018-08-14" />
<meta name="NLM.Contact.Email" content="nlmtechbull@mail.nlm.nih.gov" />
<meta name="DC.Type" content="Newsletters" />
<meta name="NLM.Permanence.Level" content="Permanent: Stable Content" />
<meta name="DC.Rights" content="Public Domain" />
<meta name="DC.Language" content="eng" />
<meta name="DC.Subject.Keyword" content="MEDLINE" />
<meta name="DC.Subject.Keyword" content="Indexing" />
<meta name="DC.Subject.Keyword" content="Medical Text Indexer" />
<meta name="DC.Subject.Keyword" content="Extensible Markup Language" />
<meta name="DC.Subject.Keyword" content="Medical Subject Headings" />
<meta name="DC.Subject.Keyword" content="Document Type Definition" />
<!-- Google Tag Manager -->
<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src='//www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);})(window,document,'script','dataLayer','GTM-MT6MLL');</script>
<!-- End Google Tag Manager -->
</head>
<body>
<!-- Google Tag Manager -->
<noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-MT6MLL" height="0" width="0" style="display:none;visibility:hidden" title="googletagmanager"></iframe></noscript>
<!-- End Google Tag Manager -->
<div class="skipnavigation"><a title="Skip the navigation on this page" href="#skipnav" class="skipnavigation">Skip Navigation Bar</a></div>
<div>
<div class="header">
<img src="/pubs/techbull/images/tb_logo_113.jpg" alt="National Library of Medicine Technical Bulletin" title="National Library of Medicine Technical Bulletin" /><img src="/pubs/techbull/images/nlm_masthead_113.jpg" alt="National Library of Medicine Technical Bulletin" title="National Library of Medicine Technical Bulletin" usemap="#nlm_masthead_113" />
</div>
<div class="search_box">
<form method="get" action="//vsearch.nlm.nih.gov/vivisimo/cgi-bin/query-meta" target="_self" name="searchForm" class="searchForm">
<label class="displaynone" for="search">Search</label>
<input name="query" id="search" type="text" class="search-input inactive" size="50" onfocus="this.value=''" value="Search here for NLM Technical Bulletin articles" aria-label="Search NLM Technical Bulletin" />
<input type="hidden" name="v:project" value="technical-bulletin" />
</form>
</div>
</div>
<div id="nav">
<!--Open drop-->
<ul class="topnav">
<li class="currentissue"><a href="//www.nlm.nih.gov/pubs/techbull/current_issue.html">Current Issue</a> <img class="separator" src="/pubs/techbull/images/whitelinetransparentbackground.gif " alt=""/></li>
<li class="archive"><a href="//www.nlm.nih.gov/pubs/techbull/back_issues.html">Previous Issues</a> <img class="separator" src="/pubs/techbull/images/whitelinetransparentbackground.gif " alt=""/></li>
<li class="about"><a href="//www.nlm.nih.gov/pubs/techbull/about.html">About</a> <img class="separator" src="/pubs/techbull/images/whitelinetransparentbackground.gif " alt=""/></li>
<li class="staycurrent"><a href="//www.nlm.nih.gov/pubs/techbull/stay_current.html">Stay Current <img class="emaillogo" src="/pubs/techbull/images/email_20px.gif" alt="E-Mail Sign Up" style="margin-top: -4px;"/> <img class="rsslogo" src="/pubs/techbull/images/rss_20px.gif" alt="RSS Feed" style="margin-top: -4px;"/></a></li>
</ul>
<!--Close drop-->
</div>
<div class="body">
<a id="skipnav" name="skipnav"></a>
<div class="syndicate">
<p class="tableOfContents"><strong>Table of Contents: <a href="/pubs/techbull/ja18/ja18_issue_cover.html">2018 JULY&#8211;AUGUST No. 423</a></strong></p>
<p class="prevnext"><span class="buttons">
<span class="previous"><a href="ja18_medline2022.html">Previous</a></span> <span class="next"><a href="ja18_bibliographic_record_format_change.html">Next</a></span>
</span></p>
<hr class="hr1" />
<h1 class="articleH1">Incorporating Values for Indexing Method in MEDLINE/PubMed XML</h1>
<p class="tbyearmonth">Incorporating Values for Indexing Method in MEDLINE/PubMed XML. NLM Tech Bull. 2018 Jul-Aug;(423):e2.</p>
<div class="articleactions">2018 August 15 <span class="status">[posted]</span>
<br />
2018 October 04
<span class="status">
[<a href="#note">Editor's note added</a>]
</span>
</div>
<div class="articleParagraph">
<p><a name="note"></a><em>[Editor's note: This change was implemented in PubMed on October 2, 2018.]</em>
</p>
<p>
The MEDLINE/PubMed DTD was modified in 2017 to incorporate the attribute "<a href="https://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html#indexingmethod">IndexingMethod</a>" for the element &lt;MedlineCitation&gt; (see <a href="https://www.nlm.nih.gov/bsd/licensee/elements_descriptions.html">MEDLINE/PubMed XML Element Descriptions and their Attributes</a>). Values will now be applied to this attribute, as appropriate, in citations indexed for MEDLINE to document the method by which the set of <a href="https://www.nlm.nih.gov/mesh/intro_record_types.html">Medical Subject Headings (MeSH) indexing terms</a> was determined for a citation. IndexingMethod values are intended for computational analysis of MEDLINE XML and are not searchable in PubMed. They are particularly important for researchers who use MEDLINE indexing as a gold standard for training machine learning algorithms, because they make it possible to distinguish, in the MEDLINE XML, citations that were indexed solely by humans from those indexed by a semi-automated method (algorithm results reviewed by a human) or an automated method (algorithm alone).
</p>
<p>
IndexingMethod is an implied attribute, meaning that it is present only if a value is specified. If the IndexingMethod attribute is not present, <strong>the citation was indexed entirely by humans</strong>.
</p>
<p>
The values to be added are:</p>
<dl>
<dd><strong>Curated</strong>: MeSH indexing is provided algorithmically, and a human has reviewed (and possibly modified) the algorithm results<br /></dd>
<dd><strong>Automated</strong>: MeSH indexing is provided algorithmically, with no human review<br /></dd>
</dl>
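<p>
The rule above (an absent attribute means human indexing; otherwise the value is Curated or Automated) can be sketched in a few lines. This is a minimal illustration, assuming Python's standard <code>xml.etree.ElementTree</code> module. The helper name <code>indexing_method</code> and the label "Human" for the absent-attribute case are illustrative choices, not values defined by the DTD, and the citation elements are built in code rather than read from a real MEDLINE file.
</p>

```python
import xml.etree.ElementTree as ET

# "Human" is an illustrative label for citations that lack the attribute;
# it is NOT a value defined by the MEDLINE/PubMed DTD.
HUMAN = "Human"


def indexing_method(citation):
    """Return the indexing method recorded on a MedlineCitation element.

    IndexingMethod is an implied attribute, so when it is absent the
    citation is treated as fully human indexed.
    """
    return citation.get("IndexingMethod", HUMAN)


# Stand-in MedlineCitation elements, built here only for demonstration.
curated = ET.Element("MedlineCitation", {"IndexingMethod": "Curated"})
automated = ET.Element("MedlineCitation", {"IndexingMethod": "Automated"})
human = ET.Element("MedlineCitation")  # attribute omitted

methods = [indexing_method(c) for c in (curated, automated, human)]
print(methods)
```

<p>
Supplying the default at the point of lookup keeps the "absent means human" rule in one place, so downstream code does not have to test for a missing attribute itself.
</p>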
<p>
The algorithm that currently supports MEDLINE indexing is the <a href="https://ii.nlm.nih.gov/MTI/index.shtml">Medical Text Indexer</a> (MTI), a product of the National Library of Medicine (NLM) Indexing Initiative.
</p>
<p>
Beginning in September 2018, these values will be added as appropriate for newly completed MEDLINE citations. For previously completed citations that were indexed by one of these methods, values will be added with the 2019 <a href="https://www.nlm.nih.gov/bsd/licensee/baseline.html">MEDLINE/PubMed baseline</a> file that is produced in December.
</p>
<dl>
<dd><strong>Curated</strong><br /></dd>
<dd>Citations with the value <strong>Curated</strong> are those for which MTI has been the "first line indexer," and a human has reviewed (and potentially modified) the results. This includes citations from <a href="https://web.archive.org/web/20190709043554/https://ii.nlm.nih.gov/MTI/MTIFL_Journal_List.pdf">approximately 650 journals</a> that currently have all citations completed by <a href="https://ii.nlm.nih.gov/MTI/MTIFL.shtml">MTI First Line</a> (MTIFL), and citations from issues of other journals for which humans have reviewed the MTI indexing for the citation. Upon implementation, approximately 18% of newly completed citations will have the value of <strong>Curated</strong>. With the 2019 MEDLINE/PubMed baseline, approximately 450,000 previously completed citations will have this value added. <br /><br />
</dd>
<dd><strong>Automated</strong><br /></dd>
<dd>Citations with the value <strong>Automated</strong> are citations for comments, which currently represent approximately 5% of newly completed citations. With the 2019 MEDLINE/PubMed baseline, the value <strong>Automated</strong> will also be applied to OLDMEDLINE citations (approximately 2 million), previously completed comments (approximately 250,000 citations), and citations for an experimental automatically indexed batch that was completed in 2016 (approximately 11,000 citations). <br /></dd>
</dl>
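<p>
For readers who want to separate these citation sets in a file, for example to keep only human-indexed citations when assembling training data, the following is a rough sketch and not an NLM-provided tool. It assumes Python's standard library and the usual PubmedArticleSet / PubmedArticle / MedlineCitation / PMID nesting. The function names <code>tally_methods</code> and <code>make_article</code>, the "Human" label, and the PMIDs "1" through "4" are hypothetical and exist only for this demonstration.
</p>

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def tally_methods(source):
    """Count citations per indexing method in a PubMed XML stream.

    Returns (counts, human_pmids). Citations without the implied
    IndexingMethod attribute are counted under the label "Human".
    """
    counts = Counter()
    human_pmids = []
    # iterparse reads the stream incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "MedlineCitation":
            continue
        method = elem.get("IndexingMethod", "Human")
        counts[method] += 1
        if method == "Human":
            human_pmids.append(elem.findtext("PMID"))
        elem.clear()  # drop the finished citation's children to save memory
    return counts, human_pmids


def make_article(pmid, method=None):
    """Build a tiny stand-in PubmedArticle (demo data, not real records)."""
    article = ET.Element("PubmedArticle")
    attrs = {"IndexingMethod": method} if method else {}
    citation = ET.SubElement(article, "MedlineCitation", attrs)
    ET.SubElement(citation, "PMID").text = pmid
    return article


root = ET.Element("PubmedArticleSet")
root.extend([
    make_article("1"),
    make_article("2", "Curated"),
    make_article("3", "Automated"),
    make_article("4"),
])

counts, human_pmids = tally_methods(io.BytesIO(ET.tostring(root)))
print(dict(counts), human_pmids)
```

<p>
With a real file, the in-memory stream would be replaced by an opened (and, if needed, decompressed) file object passed to the same function.
</p>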
<p>
Citations completed by an indexing method of <strong>Automated</strong> or <strong>Curated</strong> represent a small proportion of all MEDLINE citations. MEDLINE citations that have been completed by a human indexing method currently number approximately 22 million.
</p>
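<p>
As a rough check on "small proportion," the approximate figures quoted in this article can be combined. This is back-of-the-envelope arithmetic only: it assumes that the groups do not overlap and that the rounded counts describe comparable points in time, neither of which the article states. It also ignores citations newly completed after implementation.
</p>

```python
# Approximate, rounded figures quoted in this article.
automated = 2_000_000 + 250_000 + 11_000  # OLDMEDLINE + comments + 2016 batch
curated = 450_000                          # added with the 2019 baseline
human = 22_000_000                         # completed by a human indexing method

# Assumption: the three groups are disjoint, so they can simply be summed.
non_human = automated + curated
total = non_human + human
share_pct = 100 * non_human / total

print(f"Automated or Curated: {non_human:,} of {total:,} ({share_pct:.1f}%)")
```

<p>
Under those assumptions, about 2.7 million of roughly 24.7 million citations, or around 11%, would carry an IndexingMethod value.
</p>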
<p>
While MEDLINE indexing has traditionally involved full human curation, these automated and semi-automated methods of MEDLINE indexing have been explored in recent years to increase our efficiency and focus expert human effort in key areas to keep up with the ever-expanding volume of biomedical literature. In addition, NLM recently initiated <a href="https://www.nlm.nih.gov/pubs/techbull/ja18/ja18_medline2022.html">MEDLINE 2022: A Five-Year Development Plan</a> to maintain the usefulness of MEDLINE as a tool for discovering and analyzing the biomedical literature. One of the goals of the MEDLINE 2022 project is to implement a range of indexing methods to ensure the timely assignment of MeSH to MEDLINE citations. Providing XML data on the method used to index citations for MEDLINE supports our effort to be transparent about all facets of the MEDLINE 2022 project.
</p>
<p>
Additional information about the projects and citation sets mentioned in this article can be found here:
</p>
<dl>
<dd><a href="https://ii.nlm.nih.gov/">NLM Indexing Initiative</a><br /></dd>
<dd><a href="https://ii.nlm.nih.gov/MTI/index.shtml">MTI and MTIFL</a> <br /></dd>
<dd><a href="https://www.nlm.nih.gov/databases/databases_oldmedline.html">OLDMEDLINE Data</a> <br /></dd>
<dd><a href="https://www.nlm.nih.gov/pubs/techbull/ja18/ja18_medline2022.html">MEDLINE 2022: A Five-Year Development Plan</a><br /></dd>
</dl>
<p>Please send any comments and questions regarding changes to the MEDLINE indexing process to the <a href="https://support.nlm.nih.gov/?deptID=28054&amp;from=https://www.nlm.nih.gov/">NLM Support Center</a>.</p>
</div>
<p class="articleParagraph">
</p>
</div>
<div class="footer">
<p class="footerLeft"><span class="footerissn"><strong>ISSN 2161-2986 (Online)</strong> Content not copyrighted; freely reproducible.</span><br/>
<a href="/">National Library of Medicine</a> 8600 Rockville Pike, Bethesda, MD 20894
<br/>
<a href="//www.nlm.nih.gov/socialmedia/index.html">Connect with NLM</a>,
<a href="//www.nlm.nih.gov/web_policies.html">Web Policies</a>,
<a href="//www.nlm.nih.gov/careers/jobopenings.html">Careers</a>,
<a href="//www.nlm.nih.gov/accessibility.html">Accessibility</a>,
<a href="//www.usa.gov/" id="anch_34">USA.gov</a>,
<a href="//www.hhs.gov/vulnerability-disclosure-policy/index.html">HHS Vulnerability Disclosure</a>
<br/>
<a href="//www.nih.gov/">NIH</a>,
<a href="https://www.hhs.gov/">HHS</a>,
<a href="//www.nih.gov/institutes-nih/nih-office-director/office-communications-public-liaison/freedom-information-act-office">FOIA</a>,
<a class="supportLink" href="//support.nlm.nih.gov?from=" target="_blank">NLM Support Center</a>
</p>
<p class="footerRight">
<strong>Last updated:</strong> 21 February 2024</p>
</div>
</div>
<map id="nlm_masthead_113" name="nlm_masthead_113">
<area shape="rect" alt="NLM Technical Bulletin" coords="1,15,396,45" href="//www.nlm.nih.gov/pubs/techbull/tb.html" title="NLM Technical Bulletin" />
<area shape="rect" alt="National Library of Medicine" coords="0,47,203,62" href="//www.nlm.nih.gov/" title="National Library of Medicine" />
<area shape="rect" coords="207,47,396,62" href="//www.nih.gov/" alt="National Institutes of Health" title="" />
</map>
<!--*****************************Content end*******************************-->
<script src="/scripts/support.js"></script>
<script src="/core/nlm-notifyExternal/1.0/nlm-notifyExternal.min.js"></script>
</body>
</html>