{"id":13217,"date":"2024-05-02T13:00:44","date_gmt":"2024-05-02T17:00:44","guid":{"rendered":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/?p=13217"},"modified":"2024-05-02T13:10:23","modified_gmt":"2024-05-02T17:10:23","slug":"automated-lineage-definitions-ncbi-virus-sars-cov-2-variants-overview","status":"publish","type":"post","link":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/2024\/05\/02\/automated-lineage-definitions-ncbi-virus-sars-cov-2-variants-overview\/","title":{"rendered":"Automated Lineage Definitions Now Available in NCBI Virus SARS-CoV-2 Variants Overview"},"content":{"rendered":"

Recently, the NCBI Virus SARS-CoV-2 Variants Overview moved from a manual to an automated process for selecting the mutations required to define a lineage (e.g., Omicron, BA.2, JN.1). With this update, the SARS-CoV-2 Variants Overview covers all SARS-CoV-2 lineages and is no longer limited to lineages with CDC status. The SARS-CoV-2 Variants Overview website reports results from analyzing both GenBank and unassembled Sequence Read Archive (SRA) sequence data. It allows you to view geographic and frequency trends of records assigned to Pango lineages and to search for sequence records using lineage-defining or other mutations (example shown in Figure 1).<\/p>\n


Figure 1: On the “Lineage Frequency and Location” tab, select a Pango lineage such as JN.1 to access details including the lineage-defining mutations, the change in frequency of the lineage in GenBank and SRA records, and the geographic locations where the samples were collected.<\/p>\n

Automated lineage definition<\/h5>\n

Pangolin in UShER mode is run daily to assign lineages to all GenBank sequences. The new automated lineage definition pipeline identifies the mutations that are characteristic of and unique to each Pango lineage. Specifically, the pipeline creates a set of mutations defining a single lineage according to the following rules:<\/p>\n