NCBI Resources Highlighted in 2025 Nucleic Acids Research Database Issue

The 2025 Nucleic Acids Research Database Issue features papers from NCBI staff on ClinVar, PubChem, GenBank, RefSeq, and more. The citations are available in PubMed, with full text available in PubMed Central (PMC). To read an article, click on the PMCID number listed below.

Database resources of the National Center for Biotechnology Information in 2025

PMCID: PMC11701734

NCBI provides online information resources for biology, including the GenBank® nucleic acid sequence repository and the PubMed® repository of citations and abstracts published in life science journals. NCBI is currently developing the NIH Comparative Genomics Resource (CGR) to facilitate reliable comparative genomics analyses with an NCBI Toolkit and community collaboration.

ClinVar: updates to support classifications of both germline and somatic variants

PMCID: PMC11701624

ClinVar is a free, public database of human genetic variants and their relationships to disease, with >3 million variants submitted by >2800 organizations across the world. The database was recently updated to support three types of classifications: germline classifications, plus oncogenicity and clinical impact classifications for somatic variants.

PubChem 2025 update

PMCID: PMC11701573

PubChem is a large and highly integrated public chemical database resource at NIH. In the past two years, significant updates were made to PubChem. With additions from over 130 new sources, PubChem now contains >1000 data sources, 119 million compounds, 322 million substances and 295 million bioactivities.

GenBank 2025 update

PMCID: PMC11701615

GenBank® is a comprehensive, public data repository that contains 34 trillion base pairs from over 4.7 billion nucleotide sequences for 581,000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage.

The international nucleotide sequence database collaboration (INSDC): enhancing global participation

PMCID: PMC11701530

The members of the International Nucleotide Sequence Database Collaboration (INSDC) have built systems to collect, archive and disseminate sequence data for more than four decades. The three collaborating organizations are the National Library of Medicine, National Center for Biotechnology Information (NLM-NCBI) in the United States; the Research Organization of Information and Systems, National Institute of Genetics (ROIS-NIG) in Japan; and the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI). They formalized their relationship by adopting an arrangement that documents their commitment to free and open access to genomic sequences.

The evolution of dbSNP: 25 years of impact in genomic research

PMCID: PMC11701571

The Single Nucleotide Polymorphism Database (dbSNP), established in 1998 by NCBI, has been a critical resource in genomics for cataloging small genetic variations. Originally focused on single nucleotide polymorphisms (SNPs), dbSNP has since expanded to include a variety of genetic variants, playing a key role in genome-wide association studies (GWAS), population genetics, pharmacogenomics, and cancer research. 

NCBI RefSeq: reference sequence standards through 25 years of curation and annotation

PMCID: PMC11701664

The Reference Sequence (RefSeq) resource created at NCBI leverages both automatic processes and expert curation to create a robust set of reference sequences of genomic, transcript and protein data spanning the tree of life. RefSeq continues to refine its annotation and quality control processes and utilize better quality genomes resulting from advances in sequencing technologies as well as RNA-Seq data to produce high-quality annotated genomes, ortholog predictions across more organisms and other products that are easily accessible through multiple NCBI resources. 

NCBI Taxonomy: enhanced access via NCBI Datasets

PMCID: PMC11701650

The NCBI Taxonomy resource has long been a trusted, curated hub for organism names, classifications, and links to related data for all taxonomic nodes. NCBI Datasets is an improved way to leverage the rich data available at NCBI so users can effectively browse, search, and download information. 

COG database update 2024

PMCID: PMC11701660

The Clusters of Orthologous Genes (COG) database, originally created in 1997, has been updated to reflect the constantly growing collection of completely sequenced prokaryotic genomes. This update increased the genome coverage from 1309 to 2296 species, including 2103 bacteria and 193 archaea, in most cases, with a single representative genome per genus.
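
For readers who prefer to open the articles from a script rather than by clicking, the PMCIDs listed above can be turned into links. The following is a minimal sketch, not an official NCBI tool: the URL pattern in `PMC_ARTICLE_URL` is an assumption about how PMC article pages are currently addressed, and the helper name `pmc_url` is hypothetical.

```python
import re

# Assumption: PMC article pages are addressed by this pattern.
# Verify it in a browser before relying on it.
PMC_ARTICLE_URL = "https://pmc.ncbi.nlm.nih.gov/articles/{pmcid}/"

# PMCIDs copied from the article list in this post.
PMCIDS = [
    "PMC11701734",  # NCBI database resources in 2025
    "PMC11701624",  # ClinVar
    "PMC11701573",  # PubChem
    "PMC11701615",  # GenBank
    "PMC11701530",  # INSDC
    "PMC11701571",  # dbSNP
    "PMC11701664",  # RefSeq
    "PMC11701650",  # NCBI Taxonomy
    "PMC11701660",  # COG
]


def pmc_url(pmcid: str) -> str:
    """Return the assumed article URL for a PMCID such as 'PMC11701734'."""
    cleaned = pmcid.strip().upper()
    # A PMCID is the prefix 'PMC' followed by digits only.
    if not re.fullmatch(r"PMC\d+", cleaned):
        raise ValueError(f"not a valid PMCID: {pmcid!r}")
    return PMC_ARTICLE_URL.format(pmcid=cleaned)


if __name__ == "__main__":
    for pmcid in PMCIDS:
        print(pmc_url(pmcid))
```

Running the script prints one link per article; the validation step rejects anything that is not `PMC` followed by digits, so a mistyped identifier fails loudly instead of producing a dead link.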

Stay up to date

Follow us on Twitter @NCBI and join our mailing list to keep up to date with NCBI news.

Questions?

If you have questions or would like to provide feedback about NCBI products or tools, please reach out to us at info@ncbi.nlm.nih.gov.
