Nucleic Acids Res. 2009 Jul;37(Web Server issue):W141-6.
doi: 10.1093/nar/gkp353. Epub 2009 May 8.

MedlineRanker: flexible ranking of biomedical literature

Jean-Fred Fontaine et al. Nucleic Acids Res. 2009 Jul.

Abstract

The biomedical literature is represented by millions of abstracts available in the Medline database. These abstracts can be queried with the PubMed interface, which provides a keyword-based Boolean search engine. This approach shows limitations in the retrieval of abstracts related to very specific topics, as it is difficult for a non-expert user to find all of the most relevant keywords related to a biomedical topic. Additionally, when searching for more general topics, the same approach may return hundreds of unranked references. To address these issues, text mining tools have been developed to help scientists focus on relevant abstracts. We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program automatically deduces the words that are most discriminative in comparison to a random selection of abstracts. These words are used to score other abstracts, including those from recent publications that have not yet been annotated, which can then be ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time. MedlineRanker is free to use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
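
The ranking idea described in the abstract (learn discriminative words from topic abstracts versus a background sample, then score and rank other abstracts) can be illustrated with a minimal sketch. The abstract does not specify the weighting formula, so the smoothed log-odds weight below is an assumption chosen for illustration, not necessarily the method MedlineRanker uses. The function names (`tokenize`, `word_weights`, `score`, `rank`) and the toy texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Return the set of distinct lowercase alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def word_weights(training, background):
    """Weight each word by a smoothed log-odds of its document frequency
    in the training set versus the background set (assumed formula)."""
    train_df = Counter(w for doc in training for w in tokenize(doc))
    back_df = Counter(w for doc in background for w in tokenize(doc))
    weights = {}
    for word in set(train_df) | set(back_df):
        # Add-one smoothing keeps the ratio finite for unseen words.
        p_train = (train_df[word] + 1) / (len(training) + 2)
        p_back = (back_df[word] + 1) / (len(background) + 2)
        weights[word] = math.log(p_train / p_back)
    return weights


def score(text, weights):
    """Sum the weights of the known words that occur in the text."""
    return sum(weights.get(w, 0.0) for w in tokenize(text))


def rank(candidates, weights):
    """Sort candidate texts from most to least relevant."""
    return sorted(candidates, key=lambda doc: score(doc, weights), reverse=True)


# Toy data, invented for illustration only.
training = [
    "microarray analysis of protein aggregation",
    "protein aggregation in microarray experiments",
]
background = [
    "clinical trial of aspirin",
    "survey of hospital staffing",
]
candidates = [
    "hospital staffing trial",
    "microarray study of protein aggregation",
]

weights = word_weights(training, background)
ranked = rank(candidates, weights)
```

In this sketch, words frequent in the training abstracts but rare in the background receive positive weights, so candidates that share topic vocabulary rise to the top of the ranking. Words absent from both sets contribute nothing to the score.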

Figures

Figure 1.
The results page is composed of several sections. The table of significant abstracts (A), here related to microarray and protein aggregation, is sorted by ascending P-values and shows article titles and PubMed identifiers (PMIDs). Discriminative words are highlighted in titles or in abstracts, which are displayed in a popup window hyperlinked from their PMID (B). The performance of the ranking is shown using a table and the corresponding Receiver Operating Characteristic curve plotting the sensitivity versus the false positive rate (C). The last section contains the table of discriminative words (D), which is sorted by decreasing weights (the most important words at the top).
Figure 2.
Parameter estimation. The number of abstracts in the background and training sets has an impact on the ROC area for various biomedical topics. The y-axis shows the mean ROC area after leave-one-out cross validations over 10 random background sets using 1000 training set abstracts (left column), or 10 bootstrapped training sets using the rest of Medline as background set (right column).
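
The ROC area reported in the figures summarizes how well a ranking separates relevant from non-relevant abstracts. It equals the probability that a randomly chosen relevant abstract is scored higher than a randomly chosen non-relevant one, with ties counted as one half. The sketch below computes it directly from that definition; the function name `roc_area` and the example scores are hypothetical and do not come from the paper.

```python
def roc_area(positive_scores, negative_scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher.
    Ties count as half a win."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Invented scores: two relevant and two non-relevant abstracts.
area = roc_area([0.9, 0.4], [0.5, 0.1])
```

An area of 1.0 means every relevant abstract outranks every non-relevant one, while 0.5 corresponds to a random ordering. The pairwise loop is quadratic in the number of abstracts, so it suits small evaluation sets rather than all of Medline.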
