dbGaP Submission System Reference for GPAs
Submission System URL: https://dbgap.ncbi.nlm.nih.gov/ss/dbgapss.cgi
For Assistance
dbgap-sp-help@ncbi.nlm.nih.gov - Technical or administrative, the dbGaP submissions help desk
GDS@mail.nih.gov - Policy-related, the Genomic Data Sharing staff at the Office of Science Policy
feolo@ncbi.nlm.nih.gov - Other dbGaP-related, Mike Feolo, dbGaP team lead
Genomic Data Sharing FAQ: https://sharing.nih.gov/faqs#/genomic-data-sharing-policy.htm
SUBMISSION SYSTEM TABS
Once logged into the Submission System, GPAs are directed to their Registered Studies page, with additional tabs for DACs, Staff, and Profile across the top.
Registered Studies
In the Registered Studies list, filters are available for GPA, DAC, Registration Status, and SRA. The list may also be sorted by clicking the column headers.
To export study information, click Report by Study or Report by Consent Group. The reports will include only the studies currently filtered.
DACs
The DACs tab shows the Data Access Committees within your Institute or Center, and any others in which you play a role, such as DAC Chair.
Staff
To add a GPA Administrator, who will have permissions to create and edit studies for you, go to the Staff tab, click Add Another Member, enter their name, click Look Up, click on their name, click Add, enter the Effective Date, and click Save. You will receive the confirmation message "Info: Staff information has been updated successfully." Expiration Date is optional.
To remove a staff member, set the Expiration Date to today's date and click Save.
Profile
Profile information is taken from the NIH Enterprise Directory (NED), but the phone number and position/title fields may be edited.
REGISTRATION STEP ONE
Registering a new study is a two-step process. First, on the Registered Studies tab, click the "Register new study" button near the top, and complete the form. Second, edit the new study to add more details and complete the review.
The first section is Study Registration Information. Note that there is a 255-character limit for the study name. The vast majority of studies will have Study Mode: Normal. A Collection is a virtual study under which other studies are grouped; it has no data of its own. Do NOT mark a study as Collection unless you have discussed it with dbGaP staff in advance. Studies marked as Collection without this prior discussion will be changed back to ‘Normal’ Study Mode.
In addition to the Admin IC, other Institutes and Centers may be identified as Admin and/or Funding.
Target data delivery date and Target public release date are optional, to provide information to dbGaP curators. If a target release date is included, you will receive an email reminder on the given date. Dates may come from the Basic Study Information sheet, the data sharing plan, or the PO or PI.
Study Data Types
When 'Data submission to dbGaP is expected' is set to 'Yes', the section will expand so that you will be able to identify the data types. The format has been updated as of September 2021.
Choose 'yes' if the data type will be deposited in NCBI. Choose 'no' if the data type is not applicable, or if it will be deposited only in an External Data Source.
Clicking “yes” will expand some sections to add details.
Sequence data (NGS) has three possible configurations:
- Data will be deposited in NCBI: choose 'yes' for Sequence, with NCBI Storage, and choose 'Yes' for 'Sequence Read Archive submission is expected' below.
- Sequence metadata will be submitted to NCBI, but the data will be deposited in an External Data Source: choose 'yes' for Sequence, with Cloud Data Storage, add an external data source below, choose 'Yes' for 'Sequence Read Archive submission is expected' below, and specify the non-NCBI cloud storage location below.
- Both sequence data and metadata will be deposited in an External Data Source: choose 'no' for Sequence, add an external data source below, and also choose 'No' for 'Sequence Read Archive submission is expected' below.
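The three sequence configurations above can be summarized as a small lookup. This is only an illustrative sketch: the function and field names are hypothetical and are not part of the Submission System, which is filled in through its web form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SequenceAnswers:
    """Form answers for one sequence configuration (hypothetical structure)."""
    sequence: str                  # answer for the Sequence data type
    storage: Optional[str]         # storage option shown when Sequence is 'yes'
    add_external_source: bool      # whether to add an external data source
    sra_expected: str              # 'Sequence Read Archive submission is expected'
    specify_cloud_location: bool   # whether to give the non-NCBI cloud location


def sequence_answers(data_at_ncbi: bool, metadata_at_ncbi: bool) -> SequenceAnswers:
    """Map where sequence data and metadata will live to the form answers."""
    if data_at_ncbi:
        # Data deposited in NCBI.
        return SequenceAnswers("yes", "NCBI Storage", False, "Yes", False)
    if metadata_at_ncbi:
        # Metadata at NCBI, data at an external data source.
        return SequenceAnswers("yes", "Cloud Data Storage", True, "Yes", True)
    # Both data and metadata at an external data source.
    return SequenceAnswers("no", None, True, "No", False)
```

For example, `sequence_answers(False, True)` corresponds to the second configuration, where only the sequence metadata is submitted to NCBI.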
If “Association Analysis” is set to “yes,” then it will ask if the analyses should be included in the Compilation of Aggregate Genomic Data (CADA), a collection of analyses across many dbGaP studies that can be accessed with a single Data Access Request.
Clicking "Add external data source" will expand the External Data Source section.
External Data Source
Choosing a Name of External DB will auto complete the URL and Help desk fields. Approval telemetry reports of authorized users (whitelists) will be synced with the external data source every six hours until this method can be superseded by RAS.
A common arrangement at this time is for sequence data to be at an external data source while genotype data is deposited in dbGaP, in which case the four questions would be answered Yes - No - Yes - Yes. Other configurations are supported, including no data at dbGaP.
Sequence Read Archive (SRA)
The Sequence Read Archive section describes sequencing data that will be submitted to NCBI, as opposed to an external data source. Also choose 'Yes' if sequence metadata will be submitted to dbGaP, even if the data itself will be deposited in a separate cloud location.
When 'Sequence Read Archive submission is expected' is set to 'Yes', the section will expand. In the past, all data would be stored at NCBI, but now there are cloud options. If Google or Amazon cloud is clicked, you will be asked to provide the emails of the cloud service administrator and data steward. The CSA is the person or group responsible for implementing your IC's cloud policy, security and billing. Your CIO should be able to define who this person or group is for this study/project. The data steward is the study-specific person or group responsible for data integrity, submission of metadata to SRA, and answering questions from users about the data if they arise.
Principal Investigator and Submitter
To add the Principal Investigator, enter the first name, last name, and/or grant number, click 'Look up', and choose the name from the list. A PI may have only one eRA account but multiple listings if they are affiliated with multiple organizations. If the investigator is not found, please email dbgap-sp-help@ncbi.nlm.nih.gov and include the eRA Commons username to be imported into the system. If an account is not found in the system, another option is to register the user with Account Type: Virtual, but this means they will not have access to the study registration.
Check 'I understand that this PI (NED:XXXXXXXXXX) will have streamlined access' in order to grant the PI access to their own data. This is customarily done for all studies. The system will automatically create a project for the PI in Authorized Access. This project will not be provisioned with a DAR, or require SO or DAC approval. It will also not expire.
Enter the PI institution from the Institutional Certification.
It is optional to add a PI Assistant/Submitter. If one is added, they will become the main point of contact for the study and will receive the invitation to the Submission Portal instead of the PI.
If an External Data Source was registered above, then you will have the option to add External Data Submitters. These individuals will have authorization to submit to the external data source -- they may not be the same as those who will submit files directly to dbGaP.
It is not necessary for PIs or Submitters to log into the Submission System, but they may log in if they wish, or if a Review by PI is requested.
To log into the Submission System with an eRA Commons account, enter the username and password under "Authenticator App." Login.gov credentials are not accepted. eRA passwords can be reset here, if needed:
https://public.era.nih.gov/ams/public/accounts/password/reset.era
Certification
Nearly all dbGaP studies are Controlled Access, but a study may be Unrestricted if it has no individual level data. For example, it may have only Association Analysis data.
The current status of the Institutional Certification determines the Submission Certification answer. Note that the fourth option, "Study is prior to the GWAS policy...." is there primarily to cover some older studies and should rarely, if ever, be used for new studies.
Program Officer
NIH-funded studies should have a Program Officer, but otherwise you may check 'No PO is assigned to this study'.
Register Study
Once the form is complete, click the Register Study button. You will get the confirmation message "Info: Study "Study Name" has been created successfully."
REGISTRATION STEP TWO
The newly registered study will be opened for you to complete the second step of the registration. If you wish to complete it later, it will also appear on your Registered Studies list.
Note that the Registration Status will be Incomplete until the Missing items are added. The 'Processing Status: open' link goes to the public Study Status Report.
The links next to 'Missing' will take you to the relevant sections to be updated.
Clicking the 'Grant PI access to this page' button will invite the PI or PI Assistant (if entered) to the Submission System to complete the study registration, but they will not be able to complete all sections. This option is not used in most cases.
Super Admin (SA) Study Management
The SA Study Management section is where you will see curatorial assignments and Super Admin notes.
Policy
The Policy section will appear only if the study is Controlled Access. Click Edit to make some required changes.
By default, the research statement and public summary are displayed, but these may be unchecked.
For non-NIH studies the submitters might want to add email addresses to be notified when a Data Access Request is approved, but this is optional.
Genomic Summary Results Selection
The Genomic Summary Results selection is required. NIH provides GSR from most studies through unrestricted access. However, datasets considered to have particular sensitivities related to individual privacy or potential for group harm may be designated as “sensitive” by the submitting institution. GSR from any such data sets will only be available through controlled access.
Data Use Certification (DUC)
The Universal Data Use Certification is used for most studies, but you may uncheck the selection and upload a custom DUC in the files section below.
Update the Acknowledgment Statement to provide specific points that should be included in an acknowledgement, such as sources of support and collaborators who have made subjects or samples available. Consider citing a specific publication that comprehensively describes the origin of the dataset. The text of this field will be combined with the NIH Model DUC in an automatically generated template file.
The IC Specific Term of Data Access section is for terms of data access specific to the sponsoring Institute or Center but not included in the universal DUC. The text of this field will be combined with the universal DUC in an automatically generated template file.
Institutional Certification, Study Description
Uploading a Submission Certification (Institutional Certification) is required. New Institutional Certification forms are available at https://sharing.nih.gov/genomic-data-sharing-policy/institutional-certifications/completing-an-institutional-certification-form. Please ask the data submitter to transition to the use of these forms, if possible.
A Collaborative Agreement is not required, but it is recommended if the Data Use Limitations include Collaboration Required (COL).
Study Description is the Basic Study Information Form, which is not a required file, but may be helpful for future reference, especially if a different GPA may be assigned at some point.
On clicking Update study, you will get the confirmation message “Info: Study has been updated successfully.”
If the Institutional Certification form was electronically signed and includes Data Use Limitations, then uploading it to the system will partially populate the Consent Groups section. See notes below regarding this feature.
Consent Groups
The next Missing item should be 'Consent Group'. Go to the Consent Groups section and click Edit.
If the Institutional Certification form was electronically signed and included Data Use Limitations, then it should have partially populated the Consent Groups section. Note that:
- The DAC still needs to be chosen manually for each consent group.
- "Disease-Specific" consents: the disease must still be chosen from the Disease/Trait/Exposure list, which will update the consent abbreviation.
- "Disease-Specific" consents: the "Related disorders" box must be checked manually, if applicable.
- "Other" consents: the consent group title is filled automatically, but the abbreviation and DULs must be entered manually.
- "Other" consents: DUL modifiers need to be entered manually in the consent abbreviation, title, and DULs fields.
- There is a known issue where the GSO modifier is not automatically filled. To update the text, uncheck and recheck the Genetic Studies Only box.
- Carefully check the Consent Groups section against the DULs page to ensure that no errors were introduced.
All individual-level data submitted to dbGaP should have a consent group of at least GRU. This applies to all samples collected at any time.
For each Consent Group, select one Consent Group Type: General Research Use (GRU), Health/Medical/Biomedical (HMB), Disease-Specific (DS), Exchange Area (EA), or Other. Then select any consent group modifiers that apply.
If the type is Disease-Specific, a Disease/Trait/Exposure must be chosen. If the disease is not in the list, you may add it by clicking "Diseases section" (all current data in the form will be lost).
Once the consent group has been specified, standard text will be generated inside the "Data Use Limitations" box. Save the consent group if the autogenerated data use limitation is accurate. If modifications need to be made to the data use limitation, select "Other" as your consent group and draft your own data use limitations.
Note: "Other" may be selected when it is definitive that no standardized consent group and modifier listed above can be used as the data use limitation of a study. "Other" is not an official designation and should not be used as the Consent Group Title or Abbreviation. The GPA and PI should determine a Consent Group Title and Abbreviation that best represents the data use limitation. Since Abbreviations are used in file names, and file names have character limits, please choose a concise Abbreviation.
DAC must be chosen from the drop-down menu for each consent group.
Click "Update study" after adding each consent group. If a new disease needs to be added, all current data in the form will be lost.
The Registration Status should now be Awaiting GPA's Approval.
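As a checklist aid, the consent group rules described above can be expressed as a short validation sketch. The class and function names are hypothetical, and the checks cover only the rules stated in this section; the Submission System performs its own validation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Consent Group Types listed in this guide.
CONSENT_TYPES = {"GRU", "HMB", "DS", "EA", "Other"}


@dataclass
class ConsentGroup:
    """One consent group as entered in the form (hypothetical structure)."""
    consent_type: str
    abbreviation: str
    dac: Optional[str] = None
    disease: Optional[str] = None


def consent_group_problems(group: ConsentGroup) -> List[str]:
    """Return the problems found in one consent group (empty list if none)."""
    problems = []
    if group.consent_type not in CONSENT_TYPES:
        problems.append("Unknown consent group type")
    if group.consent_type == "DS" and not group.disease:
        problems.append("Disease-Specific requires a Disease/Trait/Exposure")
    if group.consent_type == "Other" and group.abbreviation.strip().lower() == "other":
        problems.append('"Other" should not be used as the Abbreviation')
    if not group.dac:
        problems.append("Missing: DAC for Consent Group")
    return problems
```

A group with type GRU and a DAC assigned returns an empty list, while a Disease-Specific group without a disease or a group without a DAC is flagged.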
Streamlined Access
Additional investigators may be granted streamlined access by editing the 'Investigators with streamlined access' section. The system will automatically create a project for the submitting investigators in Authorized Access. This project will not be provisioned with a DAR, or require SO or DAC approval. It will also not expire.
Important note: if the study has ever previously been Completed by GPA, then after adding new accounts with streamlined access you must again click the Apply button next to the ‘GPA review’ section, described below. The new streamlined access projects will appear in AA the following day.
GPA Review
To complete the registration, check all boxes in GPA Review and click Apply. The Registration Status in the upper right corner will change to Completed by GPA and the PI or PI Assistant will be invited to the dbGaP Submission Portal.
The "Register sub-study" button applies only to cohort studies with parent-child study configurations. Please contact dbgap-sp-help@ncbi.nlm.nih.gov prior to creating substudies.
To create a new substudy for a released parent-child study, please have the study submitter fill out the Study Data Outline (SDO) in the Submission Portal for the parent study. Once the SDO is completed, a new version is created for the parent and all substudies in the Submission Portal and the Submission System. The GPA will then see the "Register sub-study" button.
FAQ AND TROUBLESHOOTING
Submission Portal
- GPAs may log in to the dbGaP Submission Portal to see their studies by using the "NIH Account" button.
https://submit.ncbi.nlm.nih.gov/dbgap/
Who is invited to the dbGaP Submission Portal?
- If no PI Assistant/Submitter is registered, then the PI will be invited to the Submission Portal.
- If a PI Assistant/Submitter is registered, then the assistant will be invited instead of the PI.
- To invite the PI as well, or any other submitters, write to dbgap-sp-help@ncbi.nlm.nih.gov.
- Note that when the assistant is invited, the PI is CCed on the email, so the PI will sometimes co-opt the assistant's invitation code.
- If the PI or PI Assistant is changed after the study status has already been Completed by GPA, then invitations will not be sent. Email dbgap-sp-help@ncbi.nlm.nih.gov to request Submission Portal invitations for the new PI or PI Assistant.
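The invitation rules above can be sketched as a small function. This is an illustration only: the function name, parameters, and example addresses are hypothetical, and it models just the automatic invitation, not invitations requested through the help desk.

```python
from typing import Dict, List, Optional


def portal_invitation(
    pi_email: str,
    assistant_email: Optional[str] = None,
    changed_after_completion: bool = False,
) -> Dict[str, List[str]]:
    """Return who is invited ('to') and who is copied ('cc') automatically."""
    if changed_after_completion:
        # PI or PI Assistant changed after Completed by GPA:
        # no invitation is sent; one must be requested by email.
        return {"to": [], "cc": []}
    if assistant_email:
        # The assistant is invited instead of the PI; the PI is CCed.
        return {"to": [assistant_email], "cc": [pi_email]}
    # No assistant registered: the PI is invited.
    return {"to": [pi_email], "cc": []}
```

An empty result indicates that the GPA should write to dbgap-sp-help@ncbi.nlm.nih.gov to request the invitation.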
How do I know who is registered to submit in the Submission Portal?
- The Submitter(s) section of the study status report shows who has accepted an invitation to the Submission Portal, e.g. https://www.ncbi.nlm.nih.gov/gap/study/status/13212.
- dbGaP developers are also working on making the submitters section visible when GPAs log into a study in the Submission Portal.
Virtual accounts vs. eRA/NIH
- When registering the PI and PI Assistant, there are two Account Types: eRA/NIH or Virtual. Only the eRA/NIH account type can log into the registration system to view the study.
- If an account is not coming up in the lookup, you can request that its eRA Commons username be imported by emailing dbgap-sp-help@ncbi.nlm.nih.gov.
Unrestricted access vs. GSR
- "Unrestricted access" applies to the entire study. This should rarely be chosen, as nearly all dbGaP studies are Controlled Access.
- "Unrestricted GSR" means that the Genomic Summary Results may be shared publicly, which applies to most studies.
GPA Administrators
- GPA Administrators have the same access and permissions as their GPAs, so that they may help with study registrations. To add a GPA Admin, go to the Staff tab in the Submission System. The Expiration Date is optional. To remove a staff member, set the Expiration Date to today's date.
Where did the Policy section go?
- If there is no Policy section, it is likely because the registration is configured so that no data are expected or the certification is set to unrestricted access, rather than controlled access. An Institutional Certification is not expected in this case.
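The condition described above can be written as a one-line check. This is a hypothetical sketch of the rule as stated in this guide, not the system's actual logic, and the string values are assumptions.

```python
def policy_section_visible(data_expected: bool, access: str) -> bool:
    """Policy appears only when data are expected and access is controlled."""
    return data_expected and access == "Controlled Access"
```

If the function would return False for a study that should be Controlled Access, review the 'Data submission to dbGaP is expected' and Certification answers in the registration.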
Missing DAC for Consent Group
- Each consent group must have a DAC assigned. If any are omitted, then the main study registration page will show the error "Missing: DAC for Consent Group" in the upper right hand corner.
Unlocking
- When a study is Completed by GPA or Released, the GPA is still able to edit all fields except for Consent Groups and Admin IC. Clicking (Unlock) will send a request to the study curator to allow these changes to be made. Generally these should not be changed in a released version, but in a new version of the study.
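The locking rule above can be sketched as follows. The names are hypothetical, and the status and field strings are taken from this guide rather than from any system interface.

```python
# Statuses in which some fields are locked, per this guide.
LOCKING_STATUSES = {"Completed by GPA", "Released"}
# Fields the GPA cannot edit without an unlock request.
LOCKED_FIELDS = {"Consent Groups", "Admin IC"}


def needs_unlock(status: str, field: str) -> bool:
    """True if editing this field requires clicking (Unlock) first."""
    return status in LOCKING_STATUSES and field in LOCKED_FIELDS
```

For instance, editing Consent Groups on a Released study requires an unlock request, while editing the Program Officer does not.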
External Data Submitters vs. Submission Portal Submitters
- External Data Submitters are listed in the study registration so that they may submit data to the external data source, e.g. GDC. They should not be invited to the dbGaP Submission Portal unless they are also submitting the dbGaP portion, such as consent metadata.
dbGaP Submission Guides
- How to Submit chart - https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetPdf.cgi?document_name=HowToSubmit.pdf
- dbGaP Study Submission Guide - https://www.ncbi.nlm.nih.gov/gap/docs/submissionguide/
- dbGaP Molecular Data Submission Guide - https://www.ncbi.nlm.nih.gov/gap/docs/moleculardatasection/
- dbGaP Special Studies Submission Guide - https://www.ncbi.nlm.nih.gov/gap/docs/specialstudies/
- File templates - https://ftp.ncbi.nlm.nih.gov/dbgap/dbGaP_Submission_Guide_Templates/
Study Versions
- After a study is released a new version can be created in order to add data, update consent groups, or make other changes to the submission. To create a new version, ask the submitter to update the Study Data Outline in the Submission Portal, which will automatically create the new study version in the Submission System.
https://submit.ncbi.nlm.nih.gov/dbgap/
- If the submitter has trouble accessing the Submission Portal, or if the study does not exist in the portal, write to dbgap-sp-help@ncbi.nlm.nih.gov for assistance.
- The new study version will have the status Awaiting GPA's Approval, which will need to be updated to Completed by GPA before it can be released.
Deleting a Study
- If a study registration needs to be deleted, contact Mike Feolo at feolo@ncbi.nlm.nih.gov.
Streamlined access project is missing in AA
- If the study has ever previously been Completed by GPA, then after adding new accounts with streamlined access you must again click the Apply button next to the ‘GPA review’ section. The new streamlined access projects will appear in AA the following day.
Registration Statuses
- Incomplete - This status denotes that the registration in the SS is missing key items, such as names, emails, Institutional Certifications (ICs), consents, acknowledgements, etc.
- Awaiting GPA's Approval - This status denotes that the registration in the SS has been filled out, but not yet approved by the GPA. When version 2 and later versions are created, the status is automatically set to "Awaiting GPA's Approval." The GPA can modify or accept existing entries. Changes to consents can significantly delay the study release.
- Review by PI - This status denotes that the registration in the SS is awaiting review by the PI.
- Completed by GPA - This status denotes that the registration in the SS has been completed by the GPA. For processing, the admin IC and consents must be finalized. Changes to the consents once processing begins can significantly delay the study release.
- Deleted - This status denotes that the study was once registered in the SS but now no longer should be processed for release.
- Release Postponed - Substudies that are registered but will not be released with the current version of a parent-child study are marked as postponed. We strongly suggest not registering substudies in advance; instead, register each substudy in the parent version with which it can be released.
- Released - A released study is available on public pages, through Authorized Access (AA), and public FTP sites.
- Suspended - A study may be suspended temporarily from AA after study release. The study is not available for PIs, but DAC members still can see existing DARs for this study and make approvals/rejections.
- Withdrawn - A study may be withdrawn permanently from AA after study release. The study and its Data Access Requests (DARs) are not available, except for admins in some interfaces.