Predicting biomedical document access as a function of past use
- PMID: 21917645
- PMCID: PMC3341785
- DOI: 10.1136/amiajnl-2011-000325
Abstract
Objective: To determine whether past access to biomedical documents can predict future document access.
Materials and methods: The authors used 394 days of query logs (August 1, 2009 to August 29, 2010) from PubMed users in the Texas Medical Center, the largest medical center in the world. They evaluated two document access models based on the work of Anderson and Schooler. The first is based on how frequently a document was accessed. The second is based on both frequency and recency of access.
Results: The model based only on frequency of past access was highly correlated with the empirical data (R²=0.932), whereas the model based on frequency and recency had a much lower correlation (R²=0.668).
Discussion: The frequency-only model accurately predicted whether a document would be accessed based on past use. Modeling accesses as a function of frequency requires storing only the number of accesses and the creation date for each document. This model has low storage overhead and is computationally efficient, making it scalable to large corpora such as MEDLINE.
Conclusion: It is feasible to accurately model the probability of a document being accessed in the future based on past accesses.
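As a rough illustration of why the frequency-only approach is cheap, the sketch below keeps just the two fields named in the Discussion (access count and creation date) and derives a per-day access rate from them. The abstract does not give the authors' functional form, so the rate formula, the names DocStats and access_rate, and the r_squared helper are assumptions made for illustration, not the published method.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DocStats:
    """The only per-document state a frequency-only model needs to keep."""
    access_count: int   # total number of past accesses
    created: date       # document creation date


def access_rate(doc: DocStats, today: date) -> float:
    """Accesses per day since creation (an assumed, simplified frequency score)."""
    age_days = max((today - doc.created).days, 1)  # guard against a zero-day age
    return doc.access_count / age_days


def r_squared(xs, ys):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    doc = DocStats(access_count=30, created=date(2010, 1, 1))
    print(access_rate(doc, date(2010, 1, 31)))
```

A score like this could then be compared against observed access probabilities with r_squared, which is the kind of goodness-of-fit figure reported in the Results.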