Distributed modules for text annotation and IE applied to the biomedical domain
- PMID: 16085453
- DOI: 10.1016/j.ijmedinf.2005.06.011
Abstract
Biological databases contain facts from the scientific literature that have been curated by hand to ensure high quality. Curation is time-consuming and can be supported by information extraction methods. We present a server software infrastructure that makes it easy to plug in modules that identify biologically interesting pieces of text, which are then presented to the curator in a web interface. The available modules identify UniProt, UMLS and GO terminology, gene and protein names, mutations, and protein-protein interactions. UniProt, UMLS and GO concepts are automatically linked to the original source. The mutation module is based on syntax patterns, and the protein-protein interaction module relies on chunk parsing. All modules run as separate servers, possibly distributed across different machines, and can be combined into processing pipelines as necessary. Communication is based on XML-annotated text streams: each server processes the XML elements it is designed for and may add further information in the form of XML annotation. The server and the underlying software are available to the public.
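The pipeline idea in the abstract can be illustrated with a minimal Python sketch. It is not the published implementation: plain functions stand in for the separate servers, and the element names (`text`, `s`, `mutation`), the function names, and the point-mutation pattern are all assumptions made for illustration. Each module reads an XML string, handles only the elements it is designed for, and returns the stream with added annotation.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical syntax pattern for point mutations such as "A123T"
# (one-letter amino acid, position, one-letter amino acid).
# The patterns used by the actual mutation module are not given in the abstract.
MUTATION = re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]\d+[ACDEFGHIKLMNPQRSTVWY]\b")


def split_sentences(xml: str) -> str:
    """Module 1 (assumed): wrap each sentence of the root <text> in an <s> element."""
    root = ET.fromstring(xml)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", root.text or "") if s]
    root.text = None
    for sentence in sentences:
        ET.SubElement(root, "s").text = sentence
    return ET.tostring(root, encoding="unicode")


def tag_mutations(xml: str) -> str:
    """Module 2 (assumed): inside each <s>, wrap pattern matches in <mutation>.

    Simplification: every <s> is assumed to contain plain text only.
    """
    root = ET.fromstring(xml)
    for s in list(root.iter("s")):
        text = s.text or ""
        s.text = None
        pos = 0
        last = None  # most recently created <mutation> element
        for m in MUTATION.finditer(text):
            before = text[pos:m.start()]
            if last is None:
                s.text = before
            else:
                last.tail = before
            last = ET.SubElement(s, "mutation")
            last.text = m.group()
            pos = m.end()
        rest = text[pos:]
        if last is None:
            s.text = rest
        else:
            last.tail = rest
    return ET.tostring(root, encoding="unicode")


def run_pipeline(xml: str, modules) -> str:
    """Pass the XML stream through each module in order."""
    for module in modules:
        xml = module(xml)
    return xml


result = run_pipeline(
    "<text>The A123T variant binds. No change here.</text>",
    [split_sentences, tag_mutations],
)
print(result)
# <text><s>The <mutation>A123T</mutation> variant binds.</s><s>No change here.</s></text>
```

Because every module consumes and produces the same annotated stream, modules can be reordered or added without changing the others; in a distributed setup each function would instead sit behind its own network endpoint.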