{"id":11469,"date":"2023-05-25T08:46:46","date_gmt":"2023-05-25T12:46:46","guid":{"rendered":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/?p=11469"},"modified":"2023-05-25T09:01:36","modified_gmt":"2023-05-25T13:01:36","slug":"randomized-data-ncbi-virus","status":"publish","type":"post","link":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/2023\/05\/25\/randomized-data-ncbi-virus\/","title":{"rendered":"Download Randomized Data Subset from NCBI Virus"},"content":{"rendered":"

Do you need a smaller dataset for your analyses of virus data? In response to your feedback, NCBI Virus now allows you to download a randomized subset of your results for nucleotide, protein, or RefSeq genome sequences from any supported virus (Figure 1). This option is useful for viruses such as SARS-CoV-2 or Influenza A that have very large numbers of records, where the entire dataset may present a challenge. In such cases, a smaller representative sample is easier to work with to support your analysis. You can also reduce the bias in a dataset by getting a representative number of records for each country or host (Figure 2).<\/p>\n

Figure 1: Virus Download Results menu with the option to \u201cDownload a randomized subset of all records (up to 2,000)\u201d\u00a0<\/em><\/p>\n

The \u201cDownload a randomized subset\u201d option provides two types of random subsets:<\/p>\n