{"id":13138,"date":"2024-04-22T10:21:13","date_gmt":"2024-04-22T14:21:13","guid":{"rendered":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/?p=13138"},"modified":"2024-04-22T10:21:13","modified_gmt":"2024-04-22T14:21:13","slug":"cleaner-blast-databases-more-accurate-results","status":"publish","type":"post","link":"https:\/\/ncbiinsights.ncbi.nlm.nih.gov\/2024\/04\/22\/cleaner-blast-databases-more-accurate-results\/","title":{"rendered":"Cleaner BLAST Databases for More Accurate Results"},"content":{"rendered":"
<p>Do you use BLAST to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address this problem, we now use the NCBI quality assurance tools listed below to systematically remove these misleading sequences from the default nucleotide (nt) and protein (nr) BLAST databases.<\/p>\n<p>This process has removed approximately 2.23% of sequences from nr and 0.01% from nt. Lists of nucleotide and protein sequences identified as contaminant or misclassified are available from our FTP site.<\/p>\n<p>BLAST is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.<\/p>\n<h5>Stay up to date<\/h5>\n<p>Follow us on social @NCBI and join our mailing list to keep up to date with BLAST and other CGR news.<\/p>\n<h5>Questions?<\/h5>\n<p>We want to hear from you! Try it out and let us know what you think. We are making ongoing improvements based on your feedback. If you have questions or would like to provide feedback, please reach out to us at info@ncbi.nlm.nih.gov.<\/p>\n","protected":false},"excerpt":{"rendered":"Removing contaminated sequences using NCBI quality assurance tools\u00a0 Do you use BLAST to identify a sequence or the evolutionary scope of a gene? That can be challenging if contaminated and misclassified sequences are in the BLAST databases and show up in your search results. To address this problem, we now use the NCBI quality assurance … Continue reading Cleaner BLAST Databases for More Accurate Results\n