NCBI Insights

Automated Lineage Definitions Now Available in NCBI Virus SARS-CoV-2 Variants Overview


Recently, NCBI Virus SARS-CoV-2 Variants Overview moved from a manual to an automated process for selecting the mutations required to define a lineage (e.g., Omicron, BA.2, JN.1). With this update, the SARS-CoV-2 Variants Overview covers all SARS-CoV-2 lineages and is no longer limited to lineages with CDC status. The SARS-CoV-2 Variants Overview website reports results from analyzing both GenBank and unassembled Sequence Read Archive (SRA) sequence data. It allows you to view geographic and frequency trends of records assigned to Pango lineages and to search for sequence records using lineage-defining or other mutations (example shown in Figure 1).

Figure 1: On the “Lineage Frequency and Location” tab, select a Pango lineage such as JN.1 to access details including the lineage-defining mutations, the change in frequency of the lineage in GenBank and SRA records, and the geographic locations where the samples were collected. 

Automated lineage definition 

Pangolin in UShER mode is run daily to assign lineages to all GenBank sequences. The new automated lineage definition pipeline identifies the mutations that are characteristic and unique to each Pango lineage. Specifically, the pipeline creates a set of mutations defining a single lineage according to the following rules: 

The pipeline subsequently uses these predefined sets of mutations to assign lineages to SRA and GenBank sequence records. A sequence record must contain all (100%) of the lineage-defining mutations to be assigned to that lineage. After a sequence is released to the public, it is usually classified into a lineage and available through the NCBI Virus SARS-CoV-2 Variants Overview webpage within a couple of days. Mutation sets are recalculated weekly, and all sequence records are reclassified periodically. More details on the lineage definition pipeline can be found in the NCBI Virus help documentation.
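The 100% matching requirement can be pictured as a subset check. The Python sketch below is only an illustration under stated assumptions, not the NCBI pipeline itself: the lineage names, the mutation labels, and the function `matching_lineages` are all invented for this example. How the real pipeline handles a record that satisfies more than one definition is not described here, so the sketch simply returns every match.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical definitions for illustration only.
# These are NOT real Pango lineage definitions or real mutation sets.
LINEAGE_DEFINITIONS: Dict[str, FrozenSet[str]] = {
    "EXAMPLE.1": frozenset({"S:A10V", "S:K20N", "N:P30L"}),
    "EXAMPLE.2": frozenset({"S:A10V", "ORF1a:T40I"}),
}


def matching_lineages(
    observed: Iterable[str],
    definitions: Dict[str, FrozenSet[str]],
) -> List[str]:
    """Return lineages whose defining mutations are all present in `observed`.

    A lineage matches only if 100% of its defining mutations are observed;
    extra mutations in the record do not prevent a match.
    """
    observed_set = set(observed)
    return sorted(
        name
        for name, required in definitions.items()
        # Skip empty definitions so they do not trivially match every record.
        if required and required <= observed_set
    )


# A record carrying every EXAMPLE.1 mutation plus one unrelated mutation.
record = ["S:A10V", "S:K20N", "N:P30L", "M:Q5H"]
print(matching_lineages(record, LINEAGE_DEFINITIONS))
```

A record missing even one defining mutation, such as `["S:A10V", "S:K20N"]`, matches neither example lineage under this rule.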

Stay up to date 

Follow us on social media @NCBI and join our mailing list to keep up to date with NCBI Virus and other NCBI news. 

Questions? 

Please reach out to us with questions or feedback. 
