dbGaP Automated Validation Error Codes

Severity Error Code Description Rule
Error E0001_FileNotFound The input file ({{ file_name }}) is not found.
Error E0002_File_Not_Readable Unable to open the input file {{ ds_file_name }}.
Error E0003_File_Empty File was found empty
Warning E0004_Unsupported_File_Extensions_Detected The uploaded file extension is not automatically processed and will require manual intervention. The supported file extensions for automated processing are: txt, csv, dat, xlsx
Warning E0005_Multiple Phenotype Files Detected Multiple Pheno files are detected. The uploaded files are not automatically processed and will require manual intervention. Processing of multiple submitted PHENO files requires manual intervention.
Error E0006_Incomplete_Plink_Set Description: PLINK set incomplete. All PLINK file submissions MUST include a complete set of .bim, .fam and .bed files. PLINK sets are processed as a triplet of files .bim, .fam and .bed.
Error E0101_Blank_Id_Value Missing IDs are not allowed in primary subject ID and sample ID columns. Add missing IDs into column or remove the row. Empty value is not allowed in the Subject ID and Sample ID columns.
Error E0102_Duplicated_Id IDs are duplicated. Each person should only have a single subject ID; each sample ID should be represented in a single row. Remove repeating IDs. Duplicate values are not allowed in the Sample ID or in the Subject ID columns of the DS files (except for longitudinal datasets). For parent SSM, the sample ID should be unique within each substudy.
Error E0103_Invalid_Id_Value Remove invalid characters. Only the following characters can be included in the ID: English letters, Arabic numerals, period (.), hyphen (-), underscore (_), at symbol (@), and the pound sign (#). Characters other than a-z, A-Z, 0-9, ., -, _, @, and # are not allowed in the Subject_ID and Sample_ID columns.
Error E0104_Invalid_Consent_Value The CONSENT column can only have integer values '>=0'. Invalid consent values: If the Consent column is numerical, then the Consent column is allowed to have only integer values.
Error E0105_StudyDirectoryNotFound Study directory not found. Error finding directory
Error E0106_MissingSubjectConsentFile Subject Consent file is missing. Error finding Subject Consent file
Error E0107_MissingSubjectSampleMappingFile Subject Sample Mapping file is missing. Error finding Subject Sample Mapping file
Error E0108_MissingMetaDataFile Metadata file is missing. Error finding Metadata file
Error E0109_Subject_IDs_Mismatch_SSM2SC The SSM subject IDs are not found in the Subject Consent DS. Therefore, the subjects in the SSM DS are not consented. Either remove the row with the unconsented subject from the SSM DS or add the unconsented subject to the Subject Consent DS with valid consent. SSM ids are not a subset of subject consent ids.
Error E0110_Column_Name_Mismatch_SSM2SC The Subject ID variable names are different in the Subject Consent and SSM files. They must be the same. SC subject id column and SSM subject id column must match.
Error E0111_Multiple_NonBlank_Sheets Excel file has multiple sheets. Fix by submitting one sheet per file. The Excel file should have only a single worksheet in a DD or DS file.
Error E0112_Incorrect_FileType_Detected_DS File is marked as a Dataset (DS), but does not look like a DS; there are no ID columns. Dataset file is expected to have correct file type.
Error E0113_Incorrect_FileType_Detected_DD File is marked as a Data Dictionary (DD), but does not look like a DD; there is no VARNAME. Data Dictionary file is expected to have correct file type.
Error E0114_Duplicate_Variable_Names Variable names must be unique. Change one of the variable names or remove the entire variable if redundant. Variable names should be unique in data dictionary file.
Error E0116_Duplicate_Rows_In_DD Duplicate rows found in data dictionary file. Each row in data dictionary should be unique. Remove duplicates and replace file. Duplicate rows not allowed in data dictionary file.
Error E0117_Duplicate_Rows_In_DS Duplicate rows found in dataset file. Each row in dataset should be unique. Remove duplicates and replace file. Duplicate rows not allowed in dataset file.
Error E0118_Missing_VARDESC_Column Data dictionary file missing required VARDESC column. VARDESC column is required in data dictionary file.
Error E0119_Blank_VARDESC All variables must have a value for VARDESC. VARDESC column value cannot be blank in data dictionary file.
Error E0120_Blank_VARNAME All variables must have a value for VARNAME. VARNAME column value cannot be blank in data dictionary file.
Error E0121_Non_ASCII_Characters Remove non-ASCII characters from dataset and data dictionary files. Non-ASCII characters are not allowed in dataset and data dictionary files.
Warning E0122_Unique_Keys_Detected More than one unique key detected. Found more than one unique key in data dictionary file.
Error E0123_Unmatched_Variables_In_DS All variables in DS must be found in DD. Either 1) Add variable in DD or 2) Remove the variable column in DS. Found unmatched variables in dataset file.
Warning E0124_Unmatched_Variables_In_DD DD has variables not found in DS. Add variables in DS, if needed. Found unmatched variables in data dictionary file.
Error E0125_Subject_ID_Mismatch Unconsented subjects or subjects with CONSENT=0 have been found in the Subject Phenotypes DS. Either 1) remove the subject IDs with invalid consent from the Subject Phenotypes DS or 2) add the subject IDs to the Subject Consent DS with valid consent. Subject IDs found in Subject Phenotype dataset file must be found in Subject Consent dataset file.
Error E0126_Sample_ID_Mismatch Sample IDs belonging to unconsented subjects or subjects with CONSENT=0 have been found in the Sample Attributes DS. Either 1) remove the sample IDs belonging to subjects with invalid consents or 2) add the sample ID to the Subject Sample Mapping (SSM) DS by mapping to a subject ID with valid consent. Sample IDs found in Sample Attribute dataset file must be found in Subject Sample Mapping dataset file.
Error E0127_Blank_Row Blank row detected in DD. Remove blank rows. Blank rows not allowed in middle of data in data dictionary files.
Error E0128_Invalid_Characters VARNAME column values were found with invalid characters. Invalid characters like '' or 'dbGaP' not allowed in data dictionary files.
Error E0129_Non_Unicode_Characters Non Unicode characters not allowed in data dictionary file. Non unicode characters not allowed in data dictionary files.
Error E0130_MissingSubjectSampleMappingDataDictionaryFile Subject Sample Mapping Data Dictionary file is missing. Error finding Subject Sample Mapping Data Dictionary file
Error E0131_MissingSubjectPhenotypeDataDictionaryFile Subject Phenotype Data Dictionary file is missing. Error finding Subject Phenotype Data Dictionary file
Error E0132_MissingSubjectPhenotypeDatasetFile Subject Phenotype Dataset file is missing. Error finding Subject Phenotype Dataset file
Error E0133_MissingSampleAttributeDatasetFile Sample Attribute Dataset file is missing. Error finding Sample Attribute Dataset file
Error E0134_MissingSampleAttributeDataDictionaryFile Sample Attribute Data Dictionary file is missing. Error finding Sample Attribute Data Dictionary file
Error E0135_MissingPedigreeDataDictionaryFile Pedigree Data Dictionary file is missing. Error finding Pedigree Data Dictionary file
Error E0136_Consent_Missing_In_SC_DS Consent code is found in the Subject Consent DD, but not used in the Subject Consent DS. Remove unused code from the DD OR add subjects with the missing code in the DS. All consents being used should be registered in the dbGaP Submission System. The consent codes listed in Subject Consent DS must match the consent codes in the VALUES columns of Subject Consent DD.
Error E0137_Consent_Missing_In_SC_DD Consent code is found in the Subject Consent DS, but is not coded in the Subject Consent DD. Add the missing consent codes and values to the DD OR remove the unconsented subjects OR change the consent in the DS to a valid value. All consents being used should be registered in the dbGaP Submission System. The consent codes listed in Subject Consent DS must match the consent codes in the VALUES columns of Subject Consent DD.
Error E0138_Consent_Mismatch_SC_DD_vs_RS Consents in the DD do not match the consents registered in the dbGaP Submission System (SS). Correct the DD or contact your GPA if the consents in the SS need to be updated. IMPORTANT: If the consent is not registered as you expect, please contact your Genomic Program Administrator (GPA) to change the consent. GPA: {{ gpa_email_list }}. The consents listed in Subject Consent DD must match the consents registered for the study.
Error E0139_Duplicated_ID_In_Subject_Consent IDs are duplicated. Each person should only have a single subject ID represented in a single row. Remove repeating IDs. Duplicate values are not allowed in the Subject ID column of the Subject Consent DS file.
Error E0202_Sample_Id_Mismatch_Geno2SSM Sample IDs in genotype dataset are not found in the SSM DS. These samples do not belong to consented subjects. Either 1) add sample IDs to the SSM DS and Sample Attributes DS or 2) remove sample IDs from genotype dataset. Sample IDs are used in the genotype file
Error E0203_Subject_Id_Used_In_Geno_Files The IDs in genotype dataset match the subject IDs and not sample IDs in the SSM DS. If subject ID and sample ID are same, repeat the subject ID in the sample ID column of the SSM DS. Subject IDs are used in the genotype file
Error E0208_MissingSubjectConsentDataDictionaryFile Subject Consent Data Dictionary file is missing. Error finding Subject Consent Data Dictionary file
Error E0521_Pedigree_Parent_Cross_Subject Parent(s) from the Pedigree file are missing from the Subject Consent. Pedigree's parent must be listed in the Subject Consent.
Error E0506_Pedigree_Subject_Duplicates Pedigree subject IDs are duplicated. Remove duplicated subject IDs so that each subject ID appears once. Values in SUBJECT_ID column are unique.
Error E0507_Pedigree_Parent_Not_Subject Father or mother IDs are not found in the subject ID column. Add missing father or mother IDs in the subject ID column of the Pedigree DS and include family ID and sex on the same row. Also, verify that the father and mother IDs are also included in the subject ID column of the Subject Consent DS with consent > 0 if consented or consent = 0 if parent is a linking member and not consented. All values in FATHER and MOTHER columns should be listed in SUBJECT_ID column.
Error E0508_Pedigree_Columns_Count Missing required column(s). The required columns in the Pedigree DS are family ID, subject ID, father, mother, and sex. MZ twin ID is required if there are monozygotic twins. Include the columns family ID, subject ID, father, mother, sex, and the optional MZ_TWIN_ID only if the file contains monozygotic twins.
Error E0509_Pedigree_Nulls Family IDs or subject IDs are missing. IDs are required. Columns 1 and 2 should not have null values. Other columns may.
Error E0510_Rectangular_File Dataset is not rectangular. Remove extra values. The numbers of cells on each row of the file should match the number of column headers in the file.
Error E0511_Ped_SC_Subject_Value Subject IDs in Pedigree file DS do not exist in Subject Consent DS. Add the subject ID to Subject Consent DS with valid consent value. Use consent 0 if this subject is for linking subjects in the pedigree. All values in <SUBJECT ID> column of Pedigree File should be listed in <SUBJECT_ID> column of Subject Consent File
Error E0512_Ped_SC_SubjectColumn_Name The Pedigree subject ID variable name in the Pedigree files does not match the Subject Consent files. Use the same variable name for subject IDs throughout all the datasets. The header name of the <SUBJECT ID> column of Pedigree File should match the header name of the <SUBJECT_ID> column of Subject Consent File.
Error E0513_Pedigree_Twin_Values MZ twin IDs are listed once. Add the same MZ twin ID for the related twin or remove the single MZ twin ID. Monozygotic twins and multiples should be assigned the same MZ_TWIN_ID but different SUBJECT_IDs. For dizygotic twins and all other individuals, this column should be left blank.
Error E0514_Pedigree_Twin_Parents Twins are listed with different parents. Change the parent IDs so that they are the same for MZ twins or remove the MZ twin ID designation. Monozygotic twins and multiples should be assigned the same MZ_TWIN_ID but different SUBJECT_IDs and have the same Mother and Father. Please correct and resubmit file.
Error E0515_Pedigree_Twin_Sex MZ twins are listed with different sex values. Change the sex value for one of the twins or remove the MZ twin ID designation. Monozygotic twins and multiples should be assigned the same MZ_TWIN_ID but different SUBJECT_IDs and have the same sex values. Please correct the file and resubmit.
Error E0516_Pedigree_Twin_Parents_Missing Parent IDs are missing for twins. These subjects have been marked as MZ twins. Mother and father IDs cannot be 0 or blank for siblings. Create dummy parent IDs if no ID is available. Also include mother and father IDs under the subject ID columns of both the Pedigree DS and Subject Consent DS. Monozygotic twins and multiples should be assigned the same MZ_TWIN_ID but different SUBJECT_IDs and have the same Mother and Father. The Mother and Father column cannot be 0 or blank for subjects that have a TWIN_ID. Please create Dummy Ids as necessary and resubmit file.
Error E0517_Pedigree_Sex_Value Acceptable sex values are M/Male/1 or F/Female/2 or UNK/Unknown/NULL. Correct sex values. The values of the sex column are expected to be M/Male/1 or F/Female/2 or UNK/Unknown/NULL. Please see pedigree section of submission guide.
Error E0518_Pedigree_Parent_Sex Male subjects are found in the mother column or Female subjects are found in the father column. Correct sex inconsistencies. Please enter the correct sex for all subjects listed in the parent columns or reassign relationships.
Error E0519_Sex_Value_Mismatch_Ped2SC The sex values between the Subject Consent DS and the Pedigree DS are not consistent. Make all sex values consistent between all datasets that have sex values for each subject ID. If there are a large number of discrepancies, this often indicates that the father and mother columns of the Pedigree DS are swapped. The Pedigree file and Subject Consent file have mismatched sex values.
Error E0520_Non_Rectangular_File Data Dictionary file is not rectangular and has no VALUES column. If the file is non-rectangular, it must have a VALUES column
Error E0601_Genotype_Discordance Sample IDs that belong to the same subject ID have different genotypes. Verify ID mapping in SSM DS with genotype sets. Either 1) update subject ID, or 2) remove inconsistent sample IDs, or 3) fix genotype file. EXPECTED DUPLICATES: Samples from the same subject or from MZ twins are expected to have identical or near-identical genotypes.
Error E0602_Genotype_Cryptic_Duplicate Samples with identical genotypes are linked to different subjects and are not declared to be MZ twins in the Pedigree DS. Either 1) remove unexpected duplicates from the SSM, or 2) link subjects with identical genotypes to the same MZ twin ID in the Pedigree DS if they are MZ twins, or 3) update the subject ID in the SSM so that the sample IDs point to the same subject. UNEXPECTED DUPLICATES: Genotypes from different subjects are expected to reflect subjects’ relatedness (unrelated/1st degree relations/etc.).
Report E0603_GRAF_Relationship_Report Relationship file with all pairwise relationships (first and second degree) computed from genotypes: {{ attachment_file }}. Graf Relationship file was created.
Error E0701_Subject_Consent_Sex_Value Sex column in the Subject Consent {{ file_name }} column {{ sex_col }} has values that are not consistent with the submission guide. Please correct this column to have the allowed values [M/Male/1 or F/Female/2 or UNK/Unknown/NULL] and resubmit. The values of the sex column are expected to be [M/Male/1 or F/Female/2 or UNK/Unknown/NULL]. Please see subject consent section of submission guide.
Warning E0702_Subject_Consent_Sex_Column_Missing The Subject Consent file {{ file_name }} does not have the sex column included as the third column. Please add this column with the values [M/Male/1 or F/Female/2 or UNK/Unknown/NULL]. The Subject Consent file does not have an included sex column
Report E0703_GRAF_Sex_Distribution GRAF Sex Distribution file for distribution of sex values in phenotype file(s) and genotype file(s): {{ attachment_file }}. GRAF Sex report file was created.
Error E0704_GRAF_Sex_Discordance GRAF Sex Discordance detected. If the samples are known to have chromosomal anomalies or you have information that this error is not consistent with your analysis, you may provide a README file explaining the sex errors. Upload the README under 'Other files' with type 'Molecular Data' in the Submission Portal. Otherwise, correct the reported sex so that it is consistent with the genotypes. Inconsistency between reported sex of the sample and sex inferred from the genotype data.
Warning E0705_GRAF_Insufficient_XMarkers Genotype files lack sufficient X chromosome markers for dbGaP to determine subject's sex or check concordance between phenotype file(s) and genotype file(s). dbGaP will continue processing unless you inform us that this is not expected. Not enough information to create the graf sex output file.
Error E0801_Design_Description_Too_Short Design Description requires a minimum of {{ MIN_DESCR_SIZE }} characters. Design description should be more than {{ MIN_DESCR_SIZE }} characters.
Error E0802_Not_In_Controlled_Vocab The column should have values from the controlled vocabulary list. Look at Terms tab in the template file for allowed controlled vocabulary terms. Values for this column should only contain allowed controlled vocabulary terms.
Error E0803_Duplicate_Rows_In_Sequence_Metadata Duplicate rows found in sequence metadata file. Each row in sequence metadata should be unique. Remove duplicates and replace file. Duplicate rows not allowed in sequence metadata file.
Error E0804_Invalid_Checksum_Format MD5 checksum either contains disallowed characters or has an incorrect length. MD5 checksum format should conform to a 32-character hexadecimal string.
Error E0805_Missing_Value All required columns in the template are REQUIRED and should have values. All required columns in the template are REQUIRED and should have values.
Error E0806_Missing_Column REQUIRED columns missing from the sequence metadata file. One or more REQUIRED columns are missing on the sequence_data sheet of the sequence metadata file.
Error E0807_Duplicate_Values Multiple rows found with repeated checksum, file name or library_id values. One way to make the library_id value unique is to add the sample_id value to the end of library_id. Each row must have a unique value for filename, checksum and library_id columns.
Error E0808_Sample_Not_Found Sample ID(s) in sequence metadata file were not found in Subject Sample Mapping file. Sample IDs in metadata file must be a subset of Sample IDs found in Subject Sample Mapping file.
Error E0809_Accession_Mismatch Study accession mismatch in submitted file. The study accession (phs######) in file must match exactly to the study for which the file is uploaded.
Warning E0810_Duplicate_Sample_Library_info Multiple rows found with same strategy, source and selection library info for the same sample ID. The study accession (phs######) in file must match exactly to the study for which the file is uploaded.
Report E0811_Sequence_Metadata_Load_Report Sequence Metadata file successfully loaded. Refer to attached file for more details: {{ attachment_file }}. SRA metadata telemetry report created.
Error E0812_Sequence_Metadata_Load_Error Loading of the sequence metadata file failed. Refer to attached file for more details: {{ attachment_file }}. Fix errors and resubmit. SRA metadata telemetry report created.
Error E0813_Invalid_Characters_In_Sequence_Metadata Invalid characters found in sequence metadata file. Invalid characters not allowed in sequence metadata file.
Error E0814_Sequence_Data_Files_Load_Error Loading of data files reported in sequence metadata failed. Fix errors and resubmit. Refer to attached file for more details: {{ attachment_file }}. Sequence data files are loaded.
Error E0815_Missing_Worksheet_From_Sequence_Metadata The submitted sequence metadata Excel file is missing the {{ sheet_name }} worksheet. Sequence metadata Excel file must contain a worksheet named '{{ sheet_name }}'.
Error E0816_DUPLICATED_SEQUENCE_DATA_FILES These files have been submitted previously. Each sequence data file (filename and md5) can only be submitted once for each study (including all versions). Fix errors and resubmit. The same sequence data file can be submitted only once for each study (including all versions).
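
Several of the rules above are simple enough to check locally before uploading. The following is a minimal Python sketch of pre-submission checks for E0101/E0103 (blank or invalid IDs), E0102 (duplicated IDs), E0109 (SSM subject IDs missing from Subject Consent), E0510 (non-rectangular dataset) and E0804 (MD5 checksum format). It is not the dbGaP validator. All function names are hypothetical, the allowed ID character set is taken from the E0103 description (including the underscore), and the MD5 check assumes the usual 32 hexadecimal characters.

```python
import re

# Assumption: allowed ID characters as listed in the E0103 description
# (letters, digits, period, hyphen, underscore, at symbol, pound sign).
ID_PATTERN = re.compile(r"[A-Za-z0-9.\-_@#]+")
# Assumption: an MD5 digest is written as exactly 32 hexadecimal characters.
MD5_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def invalid_ids(ids):
    """Return IDs that are blank (E0101) or contain disallowed characters (E0103)."""
    return [i for i in ids if not ID_PATTERN.fullmatch(i)]


def duplicated_ids(ids):
    """Return IDs that occur more than once (E0102), in first-seen order."""
    seen, dups = set(), []
    for i in ids:
        if i in seen and i not in dups:
            dups.append(i)
        seen.add(i)
    return dups


def unconsented_subjects(ssm_subject_ids, sc_subject_ids):
    """Return SSM subject IDs that are absent from the Subject Consent IDs (E0109)."""
    return sorted(set(ssm_subject_ids) - set(sc_subject_ids))


def non_rectangular_rows(header, rows):
    """Return 1-based data row numbers whose cell count differs from the header (E0510)."""
    return [n for n, row in enumerate(rows, start=1) if len(row) != len(header)]


def is_valid_md5(checksum):
    """Return True if the checksum looks like an MD5 digest (E0804)."""
    return MD5_PATTERN.fullmatch(checksum) is not None


if __name__ == "__main__":
    # Hypothetical example data, not taken from any real submission.
    print(invalid_ids(["S1", "S 2", ""]))
    print(duplicated_ids(["A", "B", "A"]))
    print(unconsented_subjects(["S1", "S2"], ["S1"]))
    print(non_rectangular_rows(["ID", "SEX"], [["S1", "M"], ["S2"]]))
    print(is_valid_md5("d41d8cd98f00b204e9800998ecf8427e"))
```

Each helper returns the offending values rather than raising, so a caller can collect every problem in one pass and report it against the matching error code. The duplicate check does not model the longitudinal-dataset exception noted for E0102.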