<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<TITLE> [dbsnp-announce] dbSNP Human Build 142 (GRCh38 and GRCh37.p13)
</TITLE>
<LINK REL="Index" HREF="index.html" >
<LINK REL="made" HREF="mailto:dbsnp-announce%40ncbi.nlm.nih.gov?Subject=Re%3A%20%5Bdbsnp-announce%5D%20dbSNP%20Human%20Build%20142%20%28GRCh38%20and%20GRCh37.p13%29&In-Reply-To=%3Cmailman.111797.1413495923.20652.dbsnp-announce%40ncbi.nlm.nih.gov%3E">
<META NAME="robots" CONTENT="index,nofollow">
<style type="text/css">
pre {
white-space: pre-wrap; /* css-2.1, current FF, Opera, Safari */
}
</style>
<META http-equiv="Content-Type" content="text/html; charset=us-ascii">
<LINK REL="Next" HREF="000148.html">
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>[dbsnp-announce] dbSNP Human Build 142 (GRCh38 and GRCh37.p13)</H1>
<B>Public announcements regarding dbSNP</B>
<A HREF="mailto:dbsnp-announce%40ncbi.nlm.nih.gov?Subject=Re%3A%20%5Bdbsnp-announce%5D%20dbSNP%20Human%20Build%20142%20%28GRCh38%20and%20GRCh37.p13%29&In-Reply-To=%3Cmailman.111797.1413495923.20652.dbsnp-announce%40ncbi.nlm.nih.gov%3E"
TITLE="[dbsnp-announce] dbSNP Human Build 142 (GRCh38 and GRCh37.p13)">dbsnp-announce at ncbi.nlm.nih.gov
</A><BR>
<I>Thu Oct 16 17:45:22 EDT 2014</I>
<P><UL>
<LI>Next message: <A HREF="000148.html">[dbsnp-announce] FW: Important: Security Measures to Increase; Parking and Transportation Restrictions on Tuesday, December 2
</A></li>
<LI> <B>Messages sorted by:</B>
<a href="date.html#147">[ date ]</a>
<a href="thread.html#147">[ thread ]</a>
<a href="subject.html#147">[ subject ]</a>
<a href="author.html#147">[ author ]</a>
</LI>
</UL>
<HR>
<!--beginarticle-->
<PRE>Oct 16, 2014
dbSNP Human Build 142 (GRCh38 and GRCh37.p13)
dbSNP has released human Build 142, which is based on the GRCh38 assembly (<A HREF="http://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/">http://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/</A>), as well as on the GRCh37.p13 assembly. dbSNP remapped all submissions (including 1000 Genomes Phase III data) submitted on GRCh37 to GRCh38, and provided them in this release (see Notes below).
We are providing Build 142 on both assemblies, as we did for Build 141, to support researchers who are still using GRCh37 as well as those who are ready to migrate to GRCh38.
Human Build 142 has a total of 112 million refSNPs (RS), including 51 million new RS created in Build 142, primarily from 1000 Genomes Phase III variants as well as from other large sequencing projects (see table below). B142 is composed primarily of SNV refSNPs, but also includes indels, deletions, microsatellites, and short complex substitutions.
==================================================================================
Large submissions included in Build 142:
handle        ss_cnt      rs_cnt
===========   =========   ========
1000Genomes   141719341   85435166
EVA-GONL       20708240   20640754
JMKIDD_LAB     15880481   15601067
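
As a rough illustration of the table above, dividing ss_cnt by rs_cnt gives the average number of submitted SNPs (ss) per refSNP (rs) for each handle. This sketch is not part of the dbSNP release: only the counts come from the table, and the names LARGE_SUBMISSIONS and ss_per_rs are assumptions chosen for the example.

```python
# Counts copied from the "Large submissions included in Build 142" table.
# The names below (LARGE_SUBMISSIONS, ss_per_rs) are illustrative only.
LARGE_SUBMISSIONS = {
    "1000Genomes": (141719341, 85435166),
    "EVA-GONL": (20708240, 20640754),
    "JMKIDD_LAB": (15880481, 15601067),
}


def ss_per_rs(ss_cnt, rs_cnt):
    """Return the average number of submitted SNPs (ss) per refSNP (rs)."""
    return ss_cnt / rs_cnt


if __name__ == "__main__":
    for handle, (ss_cnt, rs_cnt) in LARGE_SUBMISSIONS.items():
        print(f"{handle:12s} {ss_per_rs(ss_cnt, rs_cnt):.3f} ss per rs")
```

A ratio close to 1 suggests that most submissions from that handle correspond to their own refSNP, while a higher ratio suggests that several submissions were clustered into the same refSNP.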
==================================================================================
Component Availability Dates:
Component         Date Available
dbSNP web query   Oct 16, 2014
FTP data          Oct 16, 2014
Entrez Search     Oct 16, 2014
==================================================================================
BUILD 142 NOTES:
* Build Summary: <A HREF="http://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi?build_id=142">http://www.ncbi.nlm.nih.gov/projects/SNP/snp_summary.cgi?build_id=142</A>
* The 1000 Genomes Phase III release in dbSNP Build 142 does not include ChrX, ChrY, or MT. These chromosomes will be included in a future dbSNP build pending their submission from 1000 Genomes.
* A small list of 1000 Genomes Phase III variants (70353 ss) that require updates was submitted after Build 142 had started. Because of the late submission, these variant updates were not included in Build 142, so we have posted the list of the variants on the FTP site in this file: (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/1kg_70K_phase_update.txt.gz">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/1kg_70K_phase_update.txt.gz</A>)
* dbSNP remapped all variants that were submitted on GRCh37 to GRCh38. For your convenience, we have created a list of those RS that failed to map to GRCh38 and have provided their GRCh37 coordinates: (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_without_mapping_b142.bcp">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_without_mapping_b142.bcp</A>)
* dbSNP remapped to GRCh38 a small number of Submitted SNPs (ss) from 1000G Phase III that were submitted on GRCh37. These ss had different positions than those upon which clustering was based, and as a result, we were unable to aggregate the frequency data for 119 rs. Consequently, we will not be reporting the frequency for these 119 rs at the refSNP level. The submission allele frequencies for these rs are located in this file: (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/b142_rs_with_two_GRCh37_positions_from_1000GPhase3.txt">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/b142_rs_with_two_GRCh37_positions_from_1000GPhase3.txt</A>)
* The computations for the following VCF common variant files do not include the 1000 Genomes Phase III population frequencies. These files will be updated and available in approximately two weeks:
<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/</A>
common_all.vcf.gz
common_all_papu.vcf.gz
common_and_clinical-latest.vcf.gz
common_and_clinical-latest_papu.vcf.gz
common_no_known_medical_impact-latest.vcf.gz
common_no_known_medical_impact-latest_papu.vcf.gz
* Because the number of dbSNP index terms has exceeded the Entrez SNP system limit due to dbSNP's rapid growth, we have temporarily dropped the indexing of Submitted SNP (ss) numbers as a short-term solution to this problem. We are currently working to upgrade our search indexing system to restore full ss searches to Entrez SNP in a future release. In the meantime, users can directly access the dropped ss numbers by using the dbSNP Batch Query Service at: <A HREF="http://www.ncbi.nlm.nih.gov/projects/SNP/batchquery.html">http://www.ncbi.nlm.nih.gov/projects/SNP/batchquery.html</A>, or by using this CGI: <A HREF="http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ss.cgi?subsnp_id=ss329">http://www.ncbi.nlm.nih.gov/projects/SNP/snp_ss.cgi?subsnp_id=ss329</A>
* There are differences in variation mapping location, gene annotation, and functional consequence between GRCh37.p13 and GRCh38. Reports of these variation mapping and functional consequence differences are available at: <A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/consequence_change/">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/consequence_change/</A>
* There is a set of co-located refSNPs in B141 that were merged in B142. A list of these co-located refSNPs is available at: (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_colocated.bcp">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_colocated.bcp</A>)
* There are a number of refSNPs (rs) whose genome orientation flipped (1) between Build 141 and Build 142 on GRCh38 and (2) between GRCh37.p13 and GRCh38 in Build 142. The lists of the affected rs numbers are:
1) Between Build 141 and Build 142 on GRCh38 (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_with_changed_orientation.bcp">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/rs_with_changed_orientation.bcp</A>)
2) Between GRCh37.p13 and GRCh38 in Build 142 (<A HREF="ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/b142_rs_GRCh37_to_GRCh38_orientation_flip.txt">ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/misc/known_issues/b142/b142_rs_GRCh37_to_GRCh38_orientation_flip.txt</A>)
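
The known-issues lists above can be used to screen a set of rs numbers before moving an analysis from GRCh37.p13 to GRCh38. The following is a minimal sketch, not a dbSNP tool. It assumes each list has already been downloaded and is tab-delimited with the rs number in the first column; this announcement does not describe the file layout, so that assumption should be checked against the actual files. The function names and the sample rows are hypothetical.

```python
import io


def load_rs_ids(handle):
    """Collect rs numbers from the first tab-delimited column of a list.

    ASSUMPTION: the rs number is in the first column, with or without
    an "rs" prefix. Verify this against the real file before use.
    """
    ids = set()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        first = fields[0].strip()
        if not first:
            continue  # skip blank lines
        if first.lower().startswith("rs"):
            first = first[2:]
        if first.isdigit():
            ids.add(int(first))
    return ids


def screen(rs_numbers, unmapped, flipped):
    """Label each rs number with the known issue it appears under, if any."""
    report = {}
    for rs in rs_numbers:
        if rs in unmapped:
            report[rs] = "no GRCh38 mapping"
        elif rs in flipped:
            report[rs] = "orientation flipped"
        else:
            report[rs] = "not listed"
    return report


if __name__ == "__main__":
    # Made-up rows standing in for the downloaded lists.
    unmapped = load_rs_ids(io.StringIO("rs111\t1\t12345\nrs222\t2\t67890\n"))
    flipped = load_rs_ids(io.StringIO("333\n444\n"))
    for rs, status in screen([111, 333, 555], unmapped, flipped).items():
        print(f"rs{rs}: {status}")
```

Loading each list into a set keeps the membership checks fast even for lists with millions of rs numbers.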
==================================================================================
REPORT, SEARCH AND DOWNLOAD NOTES:
RefSNP Reports
The refSNP page will report the positions of variations on both the GRCh38 assembly and the GRCh37.p13 assembly.
Entrez Search
Although B142 maps to both GRCh38 and GRCh37.p13, Entrez SNP is indexed with GRCh38 positions and annotations only.
FTP Downloads
The FTP directory structure for B142 is described in the file human_ftp_directory_change.txt, available at: <A HREF="ftp://ftp.ncbi.nih.gov/snp/release-notes/Build142/">ftp://ftp.ncbi.nih.gov/snp/release-notes/Build142/</A>
=================================================================================
GENERAL NOTES:
* The Variation Reporter (<A HREF="http://www.ncbi.nlm.nih.gov/variation/tools/reporter">http://www.ncbi.nlm.nih.gov/variation/tools/reporter</A>) now supports both GRCh37 and GRCh38 query locations and will be updated to Build 142 in about a week (est. Oct 23, 2014). For more information on the Variation Reporter, see: <A HREF="http://www.ncbi.nlm.nih.gov/variation/tools/reporter/docs/help">http://www.ncbi.nlm.nih.gov/variation/tools/reporter/docs/help</A>.
* Variation Viewer 1.2 for human is now available (<A HREF="http://www.ncbi.nlm.nih.gov/variation/view">http://www.ncbi.nlm.nih.gov/variation/view</A>) and will be updated to Build 142 in about a week (est. Oct 23, 2014).
o Please see the Variation Viewer release notes for updates (<A HREF="http://www.ncbi.nlm.nih.gov/variation/view/release-notes/">http://www.ncbi.nlm.nih.gov/variation/view/release-notes/</A>). You can get started using Variation Viewer by watching the &quot;NCBI Variation Viewer Introduction&quot; video or by visiting the Variation Viewer FAQ or Help pages. Please contact us at <A HREF="http://www.ncbi.nlm.nih.gov/mailman/listinfo/dbsnp-announce">snp-admin at ncbi.nlm.nih.gov</A> if you have questions or would like to provide feedback.
o Variation Viewer YouTube Video: <A HREF="https://www.youtube.com/watch?v=rnWZ9MFBwUM">https://www.youtube.com/watch?v=rnWZ9MFBwUM</A>
o Variation Viewer Help Page: <A HREF="http://www.ncbi.nlm.nih.gov/variation/view/help/">http://www.ncbi.nlm.nih.gov/variation/view/help/</A>
o Variation Viewer FAQ: <A HREF="http://www.ncbi.nlm.nih.gov/variation/view/faq/">http://www.ncbi.nlm.nih.gov/variation/view/faq/</A>
* The 1000Genomes browser (<A HREF="http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes">http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes</A>) currently displays Phase I data. Once the 1000Genomes project releases the Phase III data (including calls on sex chromosomes and mitochondria), NCBI will work to update the browser to display Phase III data. In the meantime, you can download the autosomal Phase III data from the NCBI FTP site (<A HREF="ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/">ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/</A>).
Contact <A HREF="http://www.ncbi.nlm.nih.gov/mailman/listinfo/dbsnp-announce">snp-admin at ncbi.nlm.nih.gov</A> if you have any questions or concerns.
dbSNP Production Team
National Center for Biotechnology Information (NCBI)
National Library of Medicine
National Institutes of Health
</PRE>
<!--endarticle-->
<HR>
<hr>
<a href="http://www.ncbi.nlm.nih.gov/mailman/listinfo/dbsnp-announce">More information about the dbsnp-announce
mailing list</a><br>
</body></html>