COMPUTATIONAL BIOLOGY BRANCH
Biomedical information retrieval and text analysis


Understanding PubMed® user search behavior through log analysis

Abstract: An investigation of user search behaviors was conducted through the analysis of one month of PubMed logs. Each step of users' interactions with PubMed during a biomedical search process is characterized in detail with evidence from PubMed logs. Despite sharing many features in common with general Web searches, biomedical information searches have unique characteristics that were evidenced in this study. An analysis of these characteristics plays a critical role in identifying users' information needs and their search habits and, in turn, provides useful insight to improve biomedical information retrieval through PubMed.

Authors: Rezarta Islamaj Dogan, G. Craig Murray, Aurélie Névéol and Zhiyong Lu

Please use this as a reference when citing this work: Database (2009) Vol. 2009, bap018; doi:10.1093/database/bap018

Contacts: luzh@ncbi.nlm.nih.gov; islamaj@ncbi.nlm.nih.gov


Instructions for data download


Results on comparing log data of March 08 vs. February 09

To investigate temporal factors and other ephemeral trends,
we analyzed the same kind of log data for February 2009 and compared
the results with those for March 2008.

Comparing user actions (note that March 2008 has 31 days, whereas February 2009 has only 28)

                                    March 08     February 09

Queries                           58,026,098      58,666,967
Abstract View                     67,093,786      65,049,452
Fulltext View                     27,581,850      23,507,979

Avg Queries / Day                  1,871,815       2,095,249
Avg Abstract View / Day            2,164,319       2,323,195
Avg Fulltext View / Day              889,740         839,571

Total Number of User Actions     152,701,734     147,224,398
Total Number of User Sessions     23,017,461      28,011,966
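
Because the two months differ in length, the per-day rows are the ones to compare directly. The following is a minimal sketch of that normalization, using the counts listed above; the function names are illustrative and are not taken from the original analysis. Recomputed this way, the February averages match the listed values, while the March averages land within a few counts of them rather than matching exactly.

```python
# Monthly counts as listed above; day counts as stated in the text.
DAYS = {"March 08": 31, "February 09": 28}
COUNTS = {
    "Queries": {"March 08": 58_026_098, "February 09": 58_666_967},
    "Abstract View": {"March 08": 67_093_786, "February 09": 65_049_452},
    "Fulltext View": {"March 08": 27_581_850, "February 09": 23_507_979},
}


def per_day(action, month):
    """Average number of actions of one type per day, rounded to an integer."""
    return round(COUNTS[action][month] / DAYS[month])


def total_actions(month):
    """Sum of queries, abstract views and full-text views for one month."""
    return sum(by_month[month] for by_month in COUNTS.values())


for action in COUNTS:
    print(action, {month: per_day(action, month) for month in DAYS})
print("Total", {month: total_actions(month) for month in DAYS})
```

The totals computed by `total_actions` reproduce the "Total Number of User Actions" row, which suggests that row is the sum of the three action types.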



Comparing query statistics: Here we show the proportion of queries by number of tokens (white-space separated) for both data sets.
As can be seen from the figure below, there are no major differences, and both results suggest that PubMed queries are short.
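
As a rough illustration of how such a distribution could be computed, the sketch below splits each query on white space and reports the share of queries at each token count. The function name and the sample queries are hypothetical; they are not drawn from the PubMed logs.

```python
from collections import Counter


def token_length_proportions(queries):
    """Map number of white-space separated tokens to the share of queries."""
    lengths = Counter(len(query.split()) for query in queries)
    total = sum(lengths.values())
    return {n: count / total for n, count in sorted(lengths.items())}


# Hypothetical queries, for illustration only.
sample = ["brca1", "breast cancer", "lu z", "aspirin heart attack"]
print(token_length_proportions(sample))
```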



Comparing click positions: Here we show the proportion of user clicks according to the position of returned results for both sets of data.


The figure above shows the click positions on the first result page.


The figure above shows the click positions on the other result pages (ratio is computed per page).
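
One possible reading of this split can be sketched as follows: clicks are separated into those on the first result page and those on later pages, and each group is normalized by within-page position. The page size of 20, the function name, and the sample ranks are assumptions made for illustration; the original analysis may have normalized differently.

```python
from collections import Counter

PAGE_SIZE = 20  # assumption: number of results shown per page


def click_position_proportions(click_ranks, page_size=PAGE_SIZE):
    """Return (first_page, later_pages): share of clicks at each within-page position."""
    first, later = Counter(), Counter()
    for rank in click_ranks:  # rank is 1-based over the whole result list
        page, offset = divmod(rank - 1, page_size)
        target = first if page == 0 else later
        target[offset + 1] += 1

    def normalize(counter):
        total = sum(counter.values())
        return {pos: n / total for pos, n in sorted(counter.items())}

    return normalize(first), normalize(later)


# Hypothetical click ranks, for illustration only.
ranks = [1, 1, 2, 5, 21, 22, 41, 21]
first_page, later_pages = click_position_proportions(ranks)
print(first_page)
print(later_pages)
```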

Comparing search result size: Here we show the proportion of queries by the number of results returned (on a log scale) for both data sets.
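
A log-scale grouping of this kind can be sketched by bucketing each query by the order of magnitude of its result count. The bucket labels, the separate bucket for zero-result queries, and the sample counts are assumptions made for illustration, not details from the original analysis.

```python
from collections import Counter


def result_size_proportions(result_counts):
    """Share of queries per order-of-magnitude bucket of result count."""
    buckets = Counter()
    for n in result_counts:
        if n == 0:
            key = "0"  # log scale is undefined at zero, so keep a separate bucket
        else:
            # For a positive integer, digit count minus one equals floor(log10(n)).
            key = f"10^{len(str(n)) - 1}"
        buckets[key] += 1
    total = sum(buckets.values())
    return {key: count / total for key, count in buckets.items()}


# Hypothetical result counts, for illustration only.
sizes = [0, 7, 12, 95, 1500]
print(result_size_proportions(sizes))
```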