DNorm: Disease Named Entity Recognition and Normalization with Pairwise Learning to Rank
Authors: Robert Leaman, Rezarta Islamaj Dogan and Zhiyong Lu (PI)
Research highlights (demo)
DNorm is an automated method for determining which diseases are mentioned in biomedical text, a task known as disease normalization. Diseases are central to many lines of biomedical research, so this task matters for work ranging from etiology (e.g. gene-disease relationships) to clinical aspects (e.g. diagnosis, prevention, and treatment). DNorm is a high-performing and mathematically principled framework for learning similarities between mentions and concept names directly from training data. It is the first technique to use machine learning to normalize disease names and the first method to employ pairwise learning to rank in a normalization task. DNorm achieved the best performance in the 2013 ShARe/CLEF shared task on disease normalization in clinical notes.
Method overview
The technique consists of a series of processing steps summarized in Figure 1 and described below.
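As a rough illustration of what learning mention-to-name similarities with pairwise learning to rank can look like, here is a minimal sketch. This page does not spell out the scoring function or the update rule, so everything in the sketch is an assumption: a bilinear score `m^T W n` over unit-length bag-of-words vectors, a weight matrix `W` that starts as the identity, and a margin-based update applied whenever the correct name fails to outrank an incorrect one by at least 1. The vocabulary, names, and training pairs are toy data, and the function names (`vectorize`, `score`, `train`, `normalize`) are hypothetical, not part of the DNorm software.

```python
import math

def vectorize(text, vocab):
    """Unit-length bag-of-words vector over a fixed vocabulary (toy stand-in)."""
    tokens = text.lower().split()
    vec = [float(tokens.count(term)) for term in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

def score(W, m, n):
    """Assumed bilinear similarity m^T W n between a mention and a name."""
    return sum(m[i] * W[i][j] * n[j]
               for i in range(len(m)) for j in range(len(n)))

def train(pairs, names, vocab, lr=0.1, epochs=20):
    """Pairwise updates: push the correct name above every other name."""
    d = len(vocab)
    # Start from the identity, so the untrained score is plain cosine similarity.
    W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    name_vecs = {name: vectorize(name, vocab) for name in names}
    for _ in range(epochs):
        for mention, correct in pairs:
            m = vectorize(mention, vocab)
            pos = name_vecs[correct]
            for other, neg in name_vecs.items():
                if other == correct:
                    continue
                # Update only when the margin between correct and incorrect is below 1.
                if score(W, m, pos) - score(W, m, neg) < 1.0:
                    for i in range(d):
                        for j in range(d):
                            W[i][j] += lr * m[i] * (pos[j] - neg[j])
    return W

def normalize(W, mention, names, vocab):
    """Return the highest-scoring concept name for a mention."""
    m = vectorize(mention, vocab)
    return max(names, key=lambda name: score(W, m, vectorize(name, vocab)))

# Toy data, invented for illustration only.
vocab = ["breast", "cancer", "carcinoma", "cf", "cystic", "fibrosis", "mammary"]
names = ["cystic fibrosis", "breast cancer"]
pairs = [("mammary carcinoma", "breast cancer"), ("cf", "cystic fibrosis")]

W = train(pairs, names, vocab)
print(normalize(W, "carcinoma", names, vocab))
```

In this toy setup the mention "carcinoma" shares no token with either name, so an untrained cosine-style score cannot separate them. After training, the learned weights link "carcinoma" to the tokens of "breast cancer", which is the kind of synonym relationship a learned similarity can capture and pure string matching cannot.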

Results
We evaluated the system on the NCBI Disease Corpus test set at the abstract level: each prediction is an association between a disease concept and an abstract, rather than an individual mention.
Method | Precision | Recall | F-measure |
NLM Lexical Normalization | 0.218 | 0.685 | 0.331 |
MetaMap | 0.502 | 0.665 | 0.572 |
Inference Method | 0.533 | 0.662 | 0.591 |
BANNER + Lucene | 0.612 | 0.647 | 0.629 |
BANNER + cosine similarity | 0.649 | 0.674 | 0.661 |
DNorm (BANNER + pLTR) | 0.803 | 0.763 | 0.782 |
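The abstract-level evaluation described above can be sketched as set comparison over (abstract, concept) pairs. The sketch below is only an illustration under stated assumptions: it pools all pairs before computing the measures (micro-averaging), which this page does not specify, and the abstract and concept identifiers are made up. The function names `f_measure` and `evaluate` are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (balanced F-measure)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def evaluate(gold, predicted):
    """Precision, recall, and F-measure over sets of (abstract_id, concept_id) pairs.

    Each concept counts once per abstract, no matter how many times it is mentioned.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall, f_measure(precision, recall)

# Made-up identifiers, for illustration only.
gold = {("abstract1", "C001"), ("abstract1", "C002"), ("abstract2", "C003")}
predicted = {("abstract1", "C001"), ("abstract2", "C003"), ("abstract2", "C004")}

precision, recall, f = evaluate(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F={f:.3f}")
```

The same `f_measure` helper can be used to check the table: for the DNorm row, `round(f_measure(0.803, 0.763), 3)` gives 0.782, matching the reported F-measure.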
Downloads
DNorm Software
NCBI Disease Corpus
DNorm-tagged PubMed results in PubTator
DNorm RESTful API
Please cite
- Robert Leaman, Rezarta Islamaj Doğan and Zhiyong Lu. DNorm: Disease Name Normalization with Pairwise Learning to Rank. Bioinformatics (2013) 29 (22): 2909-2917, doi:10.1093/bioinformatics/btt474
- Robert Leaman, Ritu Khare and Zhiyong Lu. NCBI at 2013 ShARe/CLEF eHealth Shared Task: Disorder Normalization in Clinical Notes with DNorm. Working Notes of the Conference and Labs of the Evaluation Forum (2013)
- Robert Leaman and Zhiyong Lu. Automated Disease Normalization with Low Rank Approximations. Proceedings of BioNLP 2014: pp 24-28