BioSample Validation Errors
BioSample submissions are subject to validation in order to ensure that the records contain accurate and useful information consistent with the FAIR principles. BioSample validations are performed automatically during the submission process and any identified errors will be displayed on the screen immediately upon detection. Errors are displayed with accompanying text to explain why the error occurred and how to correct it. BioSample accessions will not be assigned until all errors have been corrected. Please contact biosamplehelp@ncbi.nlm.nih.gov if you need assistance.
Error code | Error name | Message | Comment |
---|---|---|---|
1 | long_namespace | Value '$SPUID_NAMESPACE$' is too long, maximum length allowed is 100 characters. | |
2 | long_id | Value '$SPUID$' is too long, maximum length allowed is 128 characters. | |
3 | missing_sample_name | Required Sample Name is missing for $COUNT$ sample(s). | Applies to batch submissions only. |
4 | multiple_primary_ids | Sample has more than one primary Identifier. | Applies to XML deposit submissions only. |
5 | missing_organism | Required Sample organism is missing. | |
6 | identical_samples | These samples have the same Sample Names and identical attributes. If these are duplicates, please delete one. If they really are 2 different samples, provide a unique Sample Name for each. | New submission duplicates existing record from the same spuid_namespace by SPUID and Attributes. |
7 | identical_sample_names | These samples have the same Sample Names and different attributes. If they are different samples, please provide a unique Sample Name for each. If one is intended to be an update of the other, please stop this submission and contact biosamplehelp@ncbi.nlm.nih.gov. | New submission has the same SPUID as existing record from the same spuid_namespace, but has different Attributes. |
9 | large_submission | Large submission requires curator review. Please write to biosamplehelp@ncbi.nlm.nih.gov to request review of your submission. | Applies to batch submissions only. The error prevents batch submissions larger than 1000 samples. |
10 | empty_batch_submission | Batch table is empty. | Applies to batch submissions only. |
11 | identical_attributes | Your table upload failed because multiple BioSamples cannot have identical attributes. You should have one BioSample for each specimen, and each of your BioSamples must have differentiating information (excluding sample name, title, bioproject accession and description). This check was implemented to encourage submitters to include distinguishing information in their samples. If the distinguishing information is in the sample name, title or description, please recode it into an appropriate attribute, either one of the predefined attributes or a custom attribute you define. If it is necessary to represent true biological replicates as separate BioSamples, you might add an 'aliquot' or 'replicate' attribute, e.g., 'replicate = biological replicate 1', as appropriate. Note that multiple assay types, e.g., RNA-seq and ChIP-seq data may reference the same BioSample if appropriate. | Applies to batch submissions only. |
12 | missing_package | Package information is missing. | |
13 | parser_duplicate_column | Only one '$COLUMN_NAME$' column is allowed in the table. | Applies to batch submissions only. $COLUMN_NAME$ is one of sample_name, sample_title, or organism. |
14 | unknown_package | Sample refers to unknown Package '$PACKAGE$'. | |
15 | missing_mandatory_attribute | Sample has missing mandatory attribute(s). If you do not have information for the required field(s), please provide the value as either 'missing', 'not applicable', 'not collected', 'not provided' or 'restricted access'. | |
16 | unsupported_schema_version | BioSample xml refers to unsupported schema_version="$SCHEMA_VERSION$". | Applies to XML deposit submissions only. |
17 | duplicate_sample_names | The following Sample Names were used in submission more than one time: $NAMES$. Please provide unique Sample Names. | Applies to batch submissions only. Duplicate Sample Names were detected in batch submission file. |
18 | parser_cannot_tokenize | Unable to parse batch file: quoted text possibly split across multiple lines. | Applies to batch submissions only. Cannot parse batch submission file into proper tokens. One possible reason could be unclosed quotes. |
19 | parser_non_ascii_header | Unable to parse batch file: the header line contains non-ASCII characters. Please check that uploaded file is valid Excel or text tab-delimited. | Applies to batch submissions only. |
20 | invalid_attribute_value | These attribute values are not valid so please correct the data. Refer to https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ for required format descriptions. If you do not have information for the required field(s), please provide the value as either 'missing', 'not applicable', 'not collected', 'not provided' or 'restricted access'. | A few of the most important Attributes have specific requirements for their value format. These Attributes include biological material, collection date, culture collection, geographic location, host, host sex, latitude and longitude, sex, specimen-voucher, and some others. Please refer to https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ for required format descriptions. Examples of incorrectly formatted values: collection date = "01/04/06"; geographic location = "FL"; host = "Human B-Cells". |
21 | missing_namespace | Sample Identifier is missing namespace. | Applies to XML deposit submissions only. |
22 | missing_id | Empty Sample Identifier. | Applies to XML deposit submissions only. |
23 | missing_attribute_name | Attribute name is missing. | |
24 | missing_attribute_value | Empty attribute value for attribute '$ATTRIBUTE_NAME$'. | |
25 | missing_either_one_attribute | Sample has missing attribute(s), at least one of the following $GROUP_NAME$ attributes is required. If you do not have information for the required field(s), please provide the value as either 'missing', 'not applicable', 'not collected', 'not provided' or 'restricted access'. | Example: several BioSample Packages require at least one of the Organism Attributes, for example isolate or strain. This error occurs if none is given. |
26 | parser_empty_col_name | Unable to parse batch file: the header line has empty column name. | Applies to batch submissions only. |
27 | parser_non_utf8 | Unable to parse batch file: some values cannot be converted to UTF8 encoding. | Applies to batch submissions only. |
28 | autofix_attribute_value | We will automatically transform the attribute value(s) you provided as follows. | This is a warning that does not require submitter to fix submission data. For a few important Attributes mentioned in error 20 (invalid_attribute_value), we may be able to generate an automated correction that satisfies the format requirements. This "autofix" value will replace the submitted value. Examples of autofixes: submitted collection date = "August 23 2012" will be changed to "2012-08-23"; submitted host = "human" will be changed to "Homo sapiens"; submitted sex = "M" will be changed to "male". |
29 | collection_date_in_the_future | Sample collection date is a future date, please specify a date from the past. | |
30 | latlon_vs_country | Values provided for 'latitude and longitude' and 'geographic location' contradict each other: $MESSAGE$. | For example, the country is USA but the lat-lon coordinates point to a location in Europe. MESSAGE will provide additional custom details. |
31 | contaminated_cell_line_warning | $MESSAGE$ | This is a warning that does not require submitter to fix submission data. Custom MESSAGE will inform about potential contamination of particular cell line in given Organism. See http://iclac.org/databases/cross-contaminations/ for more information and references. |
32 | taxonomy_revised_warning | Provided taxonomy information was revised according to NCBI Taxonomy database rules. Please contact biosamplehelp@ncbi.nlm.nih.gov if you have any questions. | This is a warning that does not require submitter to fix submission data. |
33 | tax_consult_warning | Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information could not be resolved automatically and will require a taxonomy consult. For more information about providing a valid organism, including new species, metagenomes (microbiomes) and metagenome-assembled genomes, see https://www.ncbi.nlm.nih.gov/biosample/docs/organism/. | This is a warning that does not require submitter to fix submission data. |
34 | taxonomy_error_warning | Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information could not be resolved automatically and will require a taxonomy consult. For more information about providing a valid organism, including new species, metagenomes (microbiomes) and metagenome-assembled genomes, see https://www.ncbi.nlm.nih.gov/biosample/docs/organism/. | This is a warning that does not require submitter to fix submission data. |
35 | latlon_vs_country_warning | Values provided for 'latitude and longitude' and 'geographic location' appear to contradict each other: $MESSAGE$. | This is a warning that does not require submitter to fix submission data. MESSAGE will provide additional custom details. |
36 | taxonomy_service_failure_warning | Submission processing may be delayed due to necessary curator review. Please check spelling of organism, current information could not be resolved automatically and will require a taxonomy consult. For more information about providing a valid organism, including new species, metagenomes (microbiomes) and metagenome-assembled genomes, see https://www.ncbi.nlm.nih.gov/biosample/docs/organism/. | This is a warning that does not require submitter to fix submission data. Only issued during temporary outage of NCBI Taxonomy Service. |
37 | misplaced_bioproject | BioProject should be specified in the 'BioProject' tab or node, not in 'Attributes'. | |
38 | package_vs_organism | Organism is inappropriate for package. Please either specify a different sample type package or edit the organism according to the 'Appropriate organism and package' rules described at https://www.ncbi.nlm.nih.gov/biosample/docs/submission/validation/. | |
39 | antibiogram_invalid_antibiotic | The value provided for 'Antibiotic' is not recognized. Please use antibiotic names as listed at https://www.ncbi.nlm.nih.gov/biosample/docs/antibiogram/. If you are using a novel antibiotic, please contact NCBI at biosamplehelp@ncbi.nlm.nih.gov. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
40 | antibiogram_invalid_field | The value provided for '$FIELD$' is not valid. Please use one of the following values: $CHOICES$. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
41 | antibiogram_invalid_measurement | The value provided for 'Measurement' is not valid. Please provide a numerical value. The valid range for unit 'mg/L' is a positive number up to 1024; the valid range for unit 'mm' is between 6 and 150. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
42 | antibiogram_invalid_combination_measurement | The value provided for 'Measurement' is not valid for an antibiotic combination. The measurement value should be entered as 'X/Y' where X and Y describe the concentrations of antibiotics A and B from the antibiotic combination A-B (e.g., piperacillin-tazobactam). Alternatively, provide a single measurement value if the unit is 'mm'. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
43 | antibiogram_mismatched_units | The value provided for 'Measurement units' is not valid for the stated laboratory typing method. If the laboratory typing method is MIC, please use unit 'mg/L'. If the laboratory typing method is either agar diffusion or disk diffusion, please use unit 'mm'. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
44 | antibiogram_duplicate_rows | Duplicate rows identified in antibiogram, please remove the duplicate row. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
45 | antibiogram_conflicting_values | Multiple rows found for the same combination of 'Antibiotic', 'Laboratory typing method', 'Laboratory typing platform', and 'Vendor'. Please resolve the conflict. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
46 | antibiogram_invalid_concentration | The value provided for 'Critical Concentration' is not valid. Please provide a numerical value between 0 and 128. | Applies to mycobacterial antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
47 | antibiogram_invalid_combination_concentration | The value provided for 'Critical Concentration' is not valid for an antibiotic combination. The measurement value should be entered as 'X/Y' where X and Y describe the concentrations of antibiotics A and B from the antibiotic combination A-B (e.g., piperacillin-tazobactam). | Applies to mycobacterial antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
48 | antibiogram_method_media_restriction | Invalid DST media for selected DST method, '$DST_MEDIA$' is expected. | Applies to mycobacterial antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
49 | antibiogram_antibiotic_media_restriction | Invalid DST media for selected antibiotic, '$DST_MEDIA$' is expected. | Applies to mycobacterial antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
51 | antibiogram_missing_sample_name | Required Sample Name is missing for $COUNT$ antibiogram row(s). | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
52 | antibiogram_invalid_sample_name | The following Sample Names in the antibiogram file are not found in the sample submission: $NAMES$. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
53 | non_ascii_attribute_value | Non-ASCII format characters detected. Please check for non-standard characters in your attribute values and reformat as ASCII-only so that data can be properly consumed by dependent databases, including GenBank and Taxonomy. | Applies to Attributes such as strain, culture collection, geographic location, and a few others. |
54 | sex_for_bacteria | Attribute 'sex' is not appropriate $MESSAGE$ | When the Attribute sex is provided for a bacterial, viral or fungal Organism, the detailed MESSAGE will suggest changing it to host sex or mating type. |
55 | non_ascii_identifier | Non-ASCII format characters detected in Sample Name or 'SampleId'. Please format as ASCII. | |
56 | multiple_attribute_values | Multiple values detected for '$ATTRIBUTE_NAME$'. Only one $VALUE$ is allowed. | Some Attributes cannot have multiple values, for example strain, or latitude and longitude. Collection date may have a single value or a range. |
57 | multiple_vouchers | Multiple voucher attributes (specimen voucher, culture collection or biologic material) detected with the same $INSTITUTION_CODE$. Only one value is allowed. | These three Attributes may have multiple values, but they are subject to a more complicated rule described here. |
58 | parser_missing_sample_name | Required field 'sample_name' is missing from the header line of the file. | Applies to batch submissions only. Allows earlier detection of an incorrect batch file. |
59 | parser_missing_organism | Required field 'organism' is missing from the header line of the file. | Applies to batch submissions only. Allows earlier detection of an incorrect batch file. |
60 | non_ascii_organism | Non-ASCII format characters detected in organism. Please format as ASCII. | |
61 | invalid_primary_id | Primary identifier $PRIMARY_ID$ is not valid. | Applies to XML deposit submissions only. The error indicates that PrimaryId provided in 'SampleId' node is not valid. |
62 | invalid_bioproject_accession | Invalid BioProject accession: $VALUE$. Please provide a valid BioProject accession with format PRJxxxxx. | Accession string provided for BioProject has invalid format. |
63 | bioproject_not_found | BioProject accession $VALUE$ does not exist. Please provide a valid BioProject accession. | Provided BioProject accession was not found in NCBI BioProject database. |
64 | increment_bioproject | Consecutive BioProjects are referenced in this submission. This is often a mistake caused by incrementing autofill in Excel, please check your file. | Applies to batch submissions only. Example: submission file has a number of consecutive BioProject accessions, for example: PRJNA12345, PRJNA12346, PRJNA12348, PRJNA12350. |
65 | invalid_bioproject_type | BioProject $VALUE$ is a RefSeq or Umbrella project, not a primary data type of BioProject. Please provide the accession of the correct primary data BioProject or create a new BioProject, if necessary. | |
66 | antibiogram_missing_antibiotic | Missing required antibiotics. Please provide data for $ANTIBIOTICS$. | Applies to antibiogram (antimicrobial susceptibility and resistance data) submissions only. |
67 | invalid_isolate_value | ***New isolate validation will be installed July 1st, 2025***. The following values will not be allowed for isolate: 'bacteria', 'clinical isolate', 'environmental', 'isolate', 'microbial', 'no', 'soil', 'sp', 'sp.', 'strain', 'whole organism', 'yes'. Additionally, isolate name should not start with 'subsp.' or 'serovar'. All checks are case-insensitive. Provide a valid isolate name rather than a descriptive term. This is generally the identifier that you use in your lab work for this sample. If this information is descriptive, include it in isolation source. | This is a warning that does not require submitter to fix submission data. |
68 | redundant_taxonomy_attributes | Redundant values are detected in at least two of the following fields: organism; host; strain; isolate; isolation source; breed. For example, the value you supply for 'host' should not be identical to the value supplied for 'isolation source'. This check is case-insensitive and ignores white-space. | |
69 | invalid_xml | BioSample xml does not validate against schema $SCHEMA_FILE$, $ERROR_MESSAGES_FROM_XML_VALIDATOR$ | Applies to XML deposit submissions only. |
70 | usa_warning | For geographic location, if you know the USA State, please provide it formatted 'USA: State' or 'USA: State, Locality', eg, 'USA: Wisconsin' or 'USA: Wisconsin, Clark County'. | This is a warning that does not require submitter to fix submission data. |
71 | ambigous_usa_state_warning | USA State information cannot be resolved automatically from the information provided in geographic location. Please check the spelling and format of geographic location. If you know the USA State, please provide it formatted 'USA: State' or 'USA: State, Locality', eg, 'USA: Wisconsin' or 'USA: Wisconsin, Clark County'. If you do not know the State, just enter 'USA'. | This is a warning that does not require submitter to fix submission data. |
72 | long_attribute_name | Value '$ATTRIBUTE_NAME$' is too long, maximum length allowed is 256 characters. | |
73 | long_organism_name | Organism name exceeds the allowed number of 256 characters. This may be caused by including more than one organism name or including lineage information. For metagenomic or environmental samples please use source metagenome name (i.e. food metagenome, gut metagenome, plant metagenome). | |
74 | too_many_attributes | Your submission has exceeded the allowed number of 1000 attributes per sample. For more information, please write to biosamplehelp@ncbi.nlm.nih.gov. | |
75 | missing_spatio_temporal_attribute | Sample has missing geographic location and/or collection date attribute. If you do not have information for the required field(s), please provide the value as either 'missing', 'not applicable', 'not collected', 'not provided', 'restricted access', 'missing: control sample', 'missing: sample group', 'missing: synthetic construct', 'missing: lab stock', 'missing: third party data', 'missing: data agreement established pre-2023', 'missing: endangered species', or 'missing: human-identifiable'. For more information on these values see INSDC Missing Value Reporting Terms at https://www.insdc.org/submitting-standards/missing-value-reporting/. | |
76 | invalid_spatio_temporal_attribute_value | These attribute values are not valid so please correct the data. Refer to https://www.ncbi.nlm.nih.gov/biosample/docs/attributes/ for required format descriptions. If you do not have information for the required field(s), please provide the value as either 'missing', 'not applicable', 'not collected', 'not provided', 'restricted access', 'missing: control sample', 'missing: sample group', 'missing: synthetic construct', 'missing: lab stock', 'missing: third party data', 'missing: data agreement established pre-2023', 'missing: endangered species', or 'missing: human-identifiable'. For more information on these values see INSDC Missing Value Reporting Terms at https://www.insdc.org/submitting-standards/missing-value-reporting/. | |
77 | invalid_strain_value | The following values are not allowed for strain: 'bacteria', 'clinical isolate', 'environmental', 'isolate', 'microbial', 'no', 'soil', 'sp', 'sp.', 'strain', 'whole organism', 'yes'. Additionally, strain name should not start with 'subsp.' or 'serovar'. All checks are case-insensitive. Provide a valid strain name rather than a descriptive term. This is generally the identifier that you use in your lab work for this sample. | |
78 | null_term_organism | Organism name cannot include a null term value such as 'missing', 'not applicable', etc. Please see https://www.ncbi.nlm.nih.gov/biosample/docs/organism/ for more information on the requirements for the organism field and include a valid organism name based on those rules. | |
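
Several of the checks in the table can be approximated locally before uploading a batch file. The following is a minimal sketch, not NCBI's implementation: it assumes the batch table has already been read into a list of dictionaries (one per sample, keyed by column name), and the function names, the `bioproject_accession` and `description` column names, and the exact comparison rules are illustrative assumptions. It covers the duplicate Sample Name check (error 17), the identical-attributes check (error 11), the disallowed strain and isolate terms (errors 67 and 77), and the ASCII-only requirement (errors 53, 55 and 60).

```python
# Hypothetical pre-submission checks; NCBI's actual validator may differ in detail.

# Terms listed in the table as not allowed for 'strain' (error 77) or 'isolate' (error 67).
DISALLOWED_NAMES = {
    "bacteria", "clinical isolate", "environmental", "isolate", "microbial",
    "no", "soil", "sp", "sp.", "strain", "whole organism", "yes",
}
DISALLOWED_PREFIXES = ("subsp.", "serovar")

# Columns excluded when comparing attributes (error 11). Column names are assumed.
NON_DISTINGUISHING = {"sample_name", "sample_title", "bioproject_accession", "description"}


def is_valid_strain_or_isolate(value):
    """Case-insensitive check against the disallowed terms and prefixes."""
    v = value.strip().lower()
    return v not in DISALLOWED_NAMES and not v.startswith(DISALLOWED_PREFIXES)


def duplicate_sample_names(rows):
    """Return Sample Names used more than once, in order of first repeat (error 17)."""
    seen, dupes = set(), []
    for row in rows:
        name = row["sample_name"]
        if name in seen and name not in dupes:
            dupes.append(name)
        seen.add(name)
    return dupes


def identical_attribute_groups(rows):
    """Return lists of Sample Names whose distinguishing attributes all match (error 11)."""
    groups = {}
    for row in rows:
        key = tuple(sorted(
            (col, val) for col, val in row.items() if col not in NON_DISTINGUISHING
        ))
        groups.setdefault(key, []).append(row["sample_name"])
    return [names for names in groups.values() if len(names) > 1]


def non_ascii_fields(row):
    """Return the column names in one row whose values contain non-ASCII characters."""
    return [col for col, val in row.items() if not val.isascii()]


if __name__ == "__main__":
    rows = [
        {"sample_name": "S1", "organism": "Escherichia coli", "strain": "K-12"},
        {"sample_name": "S2", "organism": "Escherichia coli", "strain": "K-12"},
        {"sample_name": "S1", "organism": "Escherichia coli", "strain": "soil"},
    ]
    print("duplicate names:", duplicate_sample_names(rows))
    print("identical attributes:", identical_attribute_groups(rows))
    print("bad strain values:",
          [r["strain"] for r in rows if not is_valid_strain_or_isolate(r["strain"])])
```

Passing these checks does not guarantee acceptance; the submission portal remains the authority, and checks that depend on NCBI services (taxonomy lookup, BioProject existence) cannot be reproduced this way.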