dbVar non-redundant SV (NR SV) datasets include more than 2.2 million deletions, 1.1 million insertions, and 300,000 duplications. These data are aggregated from over 150 studies including 1000 Genomes Phase 3, Simons Genome Diversity Project, ClinGen, ExAC, and others. You can use NR SV data files to filter and annotate variants in a broad range of applications:
- Clinicians can filter patients’ genome data to find SVs that overlap with variants previously reported as clinically significant.
- Researchers can compare the results of their own genome-wide SV surveys with dbVar NR data to identify variants that are novel or rare, flag those that may be pathogenic, and in some cases obtain allele frequencies for matching variants. They can also annotate SV data with NR SV and other genomic annotations to prioritize the variants most likely to affect biological function.
- Developers of variant analysis pipelines can use dbVar NR data to help identify novel variants, calibrate their algorithms, or simply integrate the data into downstream analysis tools and workflows.
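As a rough illustration of the overlap-based filtering described in the list above, here is a minimal Python sketch that matches a query SV against a small in-memory set of reference SVs using reciprocal overlap. Everything in it is an assumption made for illustration: the `SV` record layout (0-based, half-open coordinates), the function names, and the 50% reciprocal-overlap threshold are hypothetical and are not taken from dbVar's files or tools. Records from the actual NR SV files would first need to be parsed into this form.

```python
from typing import List, NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal SV record (not the dbVar file layout)."""
    chrom: str
    start: int    # assumed 0-based, inclusive
    end: int      # assumed exclusive
    sv_type: str  # e.g. "DEL", "INS", "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def find_matches(query: SV, reference: List[SV],
                 min_overlap: float = 0.5) -> List[SV]:
    """Return reference SVs of the same type that reciprocally overlap
    the query by at least min_overlap (an assumed, adjustable threshold)."""
    return [
        ref for ref in reference
        if ref.sv_type == query.sv_type
        and reciprocal_overlap(query, ref) >= min_overlap
    ]


if __name__ == "__main__":
    # Toy reference set standing in for parsed NR SV records.
    reference = [
        SV("chr1", 1000, 2000, "DEL"),
        SV("chr1", 5000, 6000, "DEL"),
        SV("chr1", 1000, 2000, "DUP"),
    ]
    query = SV("chr1", 1200, 2100, "DEL")
    for match in find_matches(query, reference):
        print(match)
```

A linear scan like this is only practical for small inputs. For genome-wide call sets, an interval index or a dedicated interval tool would be a more realistic choice, and point-like insertions would need a distance-based rule rather than reciprocal overlap, since zero-length intervals never overlap under this definition.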
dbVar’s NR SV reference data are updated monthly to incorporate new database submissions. We welcome your feedback on the content and usability of these files so that we can improve them.
For more information, please see our GitHub site, which includes brief tutorials and access to NR SV datasets by FTP.