Comparative Study
J Am Med Inform Assoc. 2006 Jan-Feb;13(1):96-105.
doi: 10.1197/jamia.M1909. Epub 2005 Oct 12.

Using citation data to improve retrieval from MEDLINE

Elmer V Bernstam et al. J Am Med Inform Assoc. 2006 Jan-Feb.

Abstract

OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant.

DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as being included in a pre-existing bibliography of important literature in surgical oncology.

RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects performance of PageRank more than simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms.

CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
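To make the two best-performing strategies concrete, here is a minimal Python sketch of ranking by simple citation count and by PageRank over a citation graph, scored by the number of important articles among the first k results. It is an illustration, not the authors' implementation: the toy graph, the function names, the damping factor of 0.85, the fixed iteration count, and the even redistribution of rank from articles that cite nothing are all assumptions that the abstract does not specify.

```python
from collections import defaultdict


def citation_counts(citations):
    """Count incoming citations; `citations` maps citing id -> list of cited ids."""
    counts = defaultdict(int)
    for citing, cited_list in citations.items():
        counts[citing] += 0  # make sure uncited articles still appear
        for cited in cited_list:
            counts[cited] += 1
    return dict(counts)


def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank (damping and iteration count are assumed values)."""
    nodes = set(citations)
    for cited_list in citations.values():
        nodes.update(cited_list)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        dangling = 0.0
        for node in nodes:
            out = citations.get(node, [])
            if out:
                share = damping * rank[node] / len(out)
                for cited in out:
                    new_rank[cited] += share
            else:
                # Articles with no references: spread their rank evenly.
                dangling += damping * rank[node]
        for node in nodes:
            new_rank[node] += dangling / n
        rank = new_rank
    return rank


def rank_by(scores):
    """Order article ids by descending score, breaking ties by id."""
    return sorted(scores, key=lambda article: (-scores[article], article))


def hits_at_k(ranked_ids, important_ids, k):
    """Number of important articles among the first k ranked results."""
    return sum(1 for article in ranked_ids[:k] if article in important_ids)


# Hypothetical toy citation graph: A, B and E cite C; C and E cite D.
toy_citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": [], "E": ["D", "C"]}
toy_important = {"C", "D"}  # stand-in for a curated bibliography

by_count = rank_by(citation_counts(toy_citations))
by_pagerank = rank_by(pagerank(toy_citations))
print(hits_at_k(by_count, toy_important, 2), hits_at_k(by_pagerank, toy_important, 2))
```

Citation count looks only at how many articles cite a paper, whereas PageRank also weights each citation by the rank of the citing article. That difference is one plausible reason the two methods could respond differently to citation lag, since recently published articles have few citing articles of any rank.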

Figures

Figure 1. Experimental and query design.
Figure 2. Hit and recall-precision curves.
Figure 3. Comparison of PageRank and citation count using 2001 versus 2005 data (hit curves and recall-precision curves).
