
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>GEO Metadata Validation - GEO - NCBI</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta name="author" content="geo" />
    <meta name="keywords" content="NCBI, national institutes of health, nih, database, archive, central, bioinformatics,  biomedicine, geo, gene, expression, omnibus, chips, microarrays, oligonucleotide, array, sage, CGH" />
    <meta name="description" content="Gene Expression Omnibus (GEO) is a database repository of high throughput  gene expression data and hybridization arrays, chips, microarrays." />
    <meta name="ncbiaccordion" content="collapsible: true, active: false" />
    <meta name="ncbi_app" content="geo" />
    <meta name="ncbi_pdid" content="documentation" />
    <meta name="ncbi_page" content="GEO Metadata Validation" />
    <link rel="shortcut icon" href="/geo/img/OmixIconBare.ico" />
    <link rel="stylesheet" type="text/css" href="/geo/css/reset.css" />
    <link rel="stylesheet" type="text/css" href="/geo/css/nav.css" />
    <link rel="stylesheet" type="text/css" href="/geo/css/info.css" />
    <script type="text/javascript" src="/core/jig/1.15.10/js/jig.min.js"></script>
    <script type="text/javascript" src="/geo/js/dd_menu.js"></script>
    <script type="text/javascript" src="/geo/js/info.js"></script>
    <script type="text/javascript">
                    jQuery.getScript("/core/alerts/alerts.js", function () {
                        galert(['#crumbs_login_bar', 'body &gt; *:nth-child(1)'])
                    });
                </script>
    <script type="text/javascript">
                    var ncbi_startTime = new Date();
                </script>
  </head>
  <body id="info" class="validation">
    <div id="all">
      <div id="page">
        <div id="header">
    <div id="ncbi_logo">
        <a href="/">
            <img src="/geo/img/ncbi_logo.gif" alt="NCBI Logo" />
        </a>
    </div>
    <div id="geo_logo">
        <a href="/geo/"><img src="/geo/img/geo_main.gif" alt="GEO Logo" /></a>
    </div>
</div>
        <div id="nav_bar">
    <ul id="geo_nav_bar">
        <li><a href="#">GEO Publications</a>
            <ul class="sublist">
                <li><a href="/geo/info/GEOHandoutFinal.pdf">Handout</a></li>
                <li><a href="/pmc/articles/PMC10767856/">NAR 2024 (latest)</a></li>
                <li><a href="/pmc/articles/PMC99122/">NAR 2002 (original)</a></li>
                <li><a href="/pmc/?term=10767856,4944384,3531084,3341798,3013736,2686538,2270403,1669752,1619900,1619899,539976,99122">All publications</a></li>
            </ul>
        </li>
        <li><a href="/geo/info/faq.html">FAQ</a></li>
        <li><a href="/geo/info/MIAME.html" title="Minimum Information About a Microarray Experiment">MIAME</a></li>
        <li><a href="mailto:geo@ncbi.nlm.nih.gov">Email GEO</a></li>
    </ul>
</div>
        <div id="crumbs_login_bar"><a title="NCBI home page" href="/">NCBI</a> »
                            <a id="curr_page" title="GEO home page" href="/geo/">GEO</a> »
                            <a title="GEO documentation guide" href="/geo/info/">Info</a> »
                            <span>GEO Metadata Validation</span><span id="login_status"><a href="/geo/submitter/" title="Click here to login. You need to do this only if you want to edit the contact information, submit data, see your unreleased data, or work with data already submitted by you. You do not need to login if you are here just to browse through public holdings">Login</a></span></div>
        <div id="content">
		<a name="top" id="top"></a>
		<h1>GEO Metadata Validation</h1>

        <p>
            To improve submission processing rate and maintain a high standard of metadata collection, GEO has implemented
            an automated pre-checking service for metadata completeness, formatting and content in the metadata spreadsheet
            for submissions of high-throughput sequencing data.
            After completion of FTP transfer of raw and processed data files to your personalized upload space,
            the completed metadata file should be uploaded using the
            <a href="https://submit.ncbi.nlm.nih.gov/geo/submission/meta/">Upload Metadata page</a>.
            Please note that using older versions of the metadata template may result in unexpected validation errors.
            Please ensure that you are using the latest version of the <a href="/geo/info/seq.html#metadata">high-throughput sequencing metadata template</a>
            for your submission.
        </p>
        <p>
            Upon upload, the metadata file will be scanned and checked for formatting and content within seconds.
            For example, if a raw or processed data file listed in the metadata file is not located in your personalized upload space,
            you will receive an error message alerting you about the missing file(s).
            Please correct the issue causing the error and upload your metadata file again.
            Uploading a complete and correctly formatted metadata file will return the message "Your metadata file has been successfully uploaded".
            You will also receive an email notification with your submission summary.
            Successful uploading of the metadata file places your submission into GEO's processing queue.
        </p>
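        <p>
            Before uploading, it can help to compare the file names in your metadata file with the contents of your upload space.
            The following is a minimal Python sketch, not GEO's validator. The function name check_file_names and its two list arguments are hypothetical,
            and the rules it applies (exact, case-sensitive names without paths) are taken from the raw_file_not_found and no_paths_allowed messages listed on this page.
        </p>

```python
def check_file_names(listed_names, uploaded_names):
    """Split the names listed in the metadata into two problem groups.

    listed_names: file names copied from the metadata spreadsheet (hypothetical input).
    uploaded_names: file names present in the upload space (hypothetical input).
    """
    uploaded = set(uploaded_names)  # exact, case-sensitive comparison
    # A name containing a slash or backslash is treated as carrying a path.
    with_path = [n for n in listed_names if "/" in n or "\\" in n]
    # Remaining names must match an uploaded file exactly.
    not_found = [n for n in listed_names
                 if n not in with_path and n not in uploaded]
    return with_path, not_found
```

        <p>
            For example, listing "S1_R2.fastq.gz" when the uploaded file is named "s1_R2.fastq.gz" would be reported as not found, because the comparison is case-sensitive.
        </p>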
        <p>
            This page includes detailed information on specific metadata elements and provides all validation errors and explanations to assist
            with fixing any errors.
        </p>

        <ul class="doc_list">
            <li><a href="#samples">SAMPLES Section</a>
                <ul>
                    <li><a href="#molecule">molecule</a></li>
                    <li><a href="#single">single or paired-end</a></li>
                    <li><a href="#instrument">instrument model</a></li>
                    <li><a href="#library">library strategy</a></li>
                    <li><a href="#non_ascii">non-ASCII characters</a></li>
                </ul>
            </li>
            <li><a href="#errors">Validation Error Messages</a></li>
        </ul>

        <a name="samples" id="samples"></a>
        <h2>SAMPLES Section <a class="arrow" title="Back to top" href="#top">Back to top</a></h2>
        <p>
            The SAMPLES section of the metadata template is where information for each sample is provided. Some fields are required
            for every sample, such as library name, title, organism, molecule, and raw file. Descriptive attribute fields such as tissue,
            cell line, cell type, treatment, genotype, and disease state may not be appropriate for all samples.
            However, when these fields are relevant and completed with accurate information, the resulting samples facilitate data search,
            re-use, and discovery.
        </p>
        <p>
            GEO does require inclusion of a value for at least one of the following fields for each sample: tissue, cell line or cell type.
            Failure to provide information for at least one of these fields for each sample will result in the "insufficient biological information" error.
            Please do not include duplicate information in different fields for the same sample. For example, do not include "HEK293" for both cell line and cell type.
        </p>
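        <p>
            The rule above can be pre-checked on your own data before upload. The following Python sketch is an illustration only and does not reproduce GEO's implementation.
            It assumes each sample is available as a dictionary keyed by column name (a hypothetical representation), and the duplicate-value check with case-insensitive comparison is an assumption added for convenience.
        </p>

```python
BIO_FIELDS = ("tissue", "cell line", "cell type")

def check_biological_info(sample):
    """Return a list of problems for one sample (a dict of column name to value)."""
    values = {f: (sample.get(f) or "").strip() for f in BIO_FIELDS}
    filled = {f: v for f, v in values.items() if v}
    problems = []
    if not filled:
        # None of tissue, cell line, or cell type has a value.
        problems.append("insufficient biological information")
    seen = {}
    for field, value in filled.items():
        key = value.casefold()  # assumption: compare values ignoring case
        if key in seen:
            problems.append(f'"{value}" repeated in {seen[key]} and {field}')
        else:
            seen[key] = field
    return problems
```

        <p>
            A sample with "HEK293" in both cell line and cell type would be flagged as a repeat, while a sample with only tissue filled in would pass.
        </p>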
        <p>
            The metadata template contains drop-down menus with accepted values for the following fields: "molecule", "single or paired-end", "instrument model" and "library strategy".
            Only one value is allowed per cell for these fields.
        </p>

        <a name="molecule" id="molecule"></a>
        <h3>molecule</h3>
        <p>
            The "molecule" field must include the type of extracted molecule used to prepare the sequencing library.
            This field must have one of the following 7 values:
            <ul>
                <li>polyA RNA</li>
                <li>total RNA</li>
                <li>nuclear RNA</li>
                <li>cytoplasmic RNA</li>
                <li>genomic DNA</li>
                <li>protein</li>
                <li>other</li>
            </ul>
        </p>

        <a name="single" id="single"></a>
        <h3>single or paired-end</h3>
        <p>
            The "single or paired-end" field must include the library layout of the high-throughput raw data files. This field must be filled in with either "single" for single-end sequencing or "paired-end" for paired-end sequencing.
        </p>
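        <p>
            As an illustration, the accepted values for "molecule" and "single or paired-end" listed above can be checked with a short Python sketch.
            This is not GEO's code: the dictionary-per-sample representation and the function name invalid_values are hypothetical, and exact, case-sensitive matching is an assumption.
            An empty cell is also reported here, although GEO reports empty required fields under a separate error.
        </p>

```python
ALLOWED = {
    "molecule": {
        "polyA RNA", "total RNA", "nuclear RNA", "cytoplasmic RNA",
        "genomic DNA", "protein", "other",
    },
    "single or paired-end": {"single", "paired-end"},
}

def invalid_values(sample):
    """Return {field: value} for fields whose value is not an accepted value."""
    bad = {}
    for field, allowed in ALLOWED.items():
        value = (sample.get(field) or "").strip()
        # Only one value is allowed per cell, so the whole cell must match.
        if value not in allowed:
            bad[field] = value
    return bad
```

        <p>
            A cell such as "total RNA, polyA RNA" fails this check because it contains more than one value.
        </p>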

        <a name="instrument" id="instrument"></a>
        <h3>instrument model</h3>
        <p>
            The "instrument model" field must include the complete name of the instrument model used to produce the reads for each sample. If reads for a sample were produced on multiple instruments, include only one entry in the "instrument model" field and put additional instrument models in the "description" field.
            This field must have one of the following values:
            <div data-jig="ncbiexpander" data-jigconfig="auto:false, minHeight:'214px'">
                <ul>
                    <li>454 GS</li>
                    <li>454 GS 20</li>
                    <li>454 GS FLX</li>
                    <li>454 GS FLX+</li>
                    <li>454 GS FLX Titanium</li>
                    <li>454 GS Junior</li>
                    <li>AB 5500 Genetic Analyzer</li>
                    <li>AB 5500xl Genetic Analyzer</li>
                    <li>AB 5500xl-W Genetic Analysis System</li>
                    <li>AB SOLiD 3 Plus System</li>
                    <li>AB SOLiD 4hq System</li>
                    <li>AB SOLiD 4 System</li>
                    <li>AB SOLiD PI System</li>
                    <li>AB SOLiD System</li>
                    <li>AB SOLiD System 2.0</li>
                    <li>AB SOLiD System 3.0</li>
                    <li>BGISEQ-500</li>
                    <li>Complete Genomics</li>
                    <li>DNBSEQ-G400</li>
                    <li>DNBSEQ-G400 FAST</li>
                    <li>DNBSEQ-G50</li>
                    <li>DNBSEQ-T7</li>
                    <li>Element AVITI</li>
                    <li>FASTASeq 300</li>
                    <li>GenoCare 1600</li>
                    <li>GenoLab M</li>
                    <li>GridION</li>
                    <li>GS111</li>
                    <li>Helicos HeliScope</li>
                    <li>HiSeq X Five</li>
                    <li>HiSeq X Ten</li>
                    <li>Illumina Genome Analyzer</li>
                    <li>Illumina Genome Analyzer II</li>
                    <li>Illumina Genome Analyzer IIx</li>
                    <li>Illumina HiScanSQ</li>
                    <li>Illumina HiSeq 1000</li>
                    <li>Illumina HiSeq 1500</li>
                    <li>Illumina HiSeq 2000</li>
                    <li>Illumina HiSeq 2500</li>
                    <li>Illumina HiSeq 3000</li>
                    <li>Illumina HiSeq 4000</li>
                    <li>Illumina iSeq 100</li>
                    <li>Illumina MiniSeq</li>
                    <li>Illumina MiSeq</li>
                    <li>Illumina NextSeq 500</li>
                    <li>Illumina NovaSeq 6000</li>
                    <li>Illumina NovaSeq X</li>
                    <li>Illumina NovaSeq X Plus</li>
                    <li>Ion GeneStudio S5</li>
                    <li>Ion GeneStudio S5 plus</li>
                    <li>Ion GeneStudio S5 prime</li>
                    <li>Ion Torrent Genexus</li>
                    <li>Ion Torrent PGM</li>
                    <li>Ion Torrent Proton</li>
                    <li>Ion Torrent S5</li>
                    <li>Ion Torrent S5 XL</li>
                    <li>MGISEQ-2000RS</li>
                    <li>MinION</li>
                    <li>NextSeq 1000</li>
                    <li>NextSeq 2000</li>
                    <li>NextSeq 550</li>
                    <li>Onso</li>
                    <li>PacBio RS</li>
                    <li>PacBio RS II</li>
                    <li>PromethION</li>
                    <li>Revio</li>
                    <li>Sentosa SQ301</li>
                    <li>Sequel</li>
                    <li>Sequel II</li>
                    <li>Sequel IIe</li>
                    <li>Tapestri</li>
                    <li>UG 100</li>
                </ul>
            </div>
        </p>
        <p>
            If the instrument model that you have used is not available in the drop-down menu in the metadata template,
            please ensure that you are using the latest version of the <a href="/geo/info/seq.html#metadata">metadata template</a>.
            Older versions of the metadata template do not include the current full list of instrument models.
        </p>
        <p>
            If your instrument model is not included in the latest metadata template, please choose the closest option
            that exists and include the full name of the instrument that was used in the "description" field. The "instrument model" field can include only a single value.
        </p>

        <a name="library" id="library"></a>
        <h3>library strategy</h3>
        <p>
            The "library strategy" field is for providing the type of high-throughput sequencing for the sample, for example RNA-Seq, ChIP-Seq, ATAC-seq, or Bisulfite-Seq.
            The drop-down menu currently includes three accepted single-cell library strategy values (scRNA-seq, snRNA-Seq, and scATAC-seq). This field must have one of the following values:
            <div data-jig="ncbiexpander" data-jigconfig="auto:false, minHeight:'214px'">
                <ul>
                    <li>16S rRNA-seq</li>
                    <li>4C-Seq</li>
                    <li>ATAC-seq</li>
                    <li>BCR-Seq</li>
                    <li>Bisulfite-Seq</li>
                    <li>Bisulfite-Seq (reduced representation)</li>
                    <li>BRU-Seq</li>
                    <li>Capture-C</li>
                    <li>ChEC-seq</li>
                    <li>ChIA-PET</li>
                    <li>ChIP-Seq</li>
                    <li>ChIRP-seq</li>
                    <li>CITE-seq</li>
                    <li>CRISPR Screen</li>
                    <li>CUT&amp;Run</li>
                    <li>CUT&amp;Tag</li>
                    <li>DamID-Seq</li>
                    <li>DNase-Hypersensitivity</li>
                    <li>EM-seq</li>
                    <li>FAIRE-seq</li>
                    <li>GRO-Seq</li>
                    <li>Hi-C</li>
                    <li>HiChIP</li>
                    <li>iCLIP</li>
                    <li>MBD-Seq</li>
                    <li>MeDIP-Seq</li>
                    <li>MeRIP-Seq</li>
                    <li>miRNA-Seq</li>
                    <li>MNase-Seq</li>
                    <li>MRE-Seq</li>
                    <li>ncRNA-Seq</li>
                    <li>OTHER</li>
                    <li>PRO-Seq</li>
                    <li>Ribo-Seq</li>
                    <li>RIP-Seq</li>
                    <li>RNAmethylation</li>
                    <li>RNA-Seq</li>
                    <li>RNA-Seq (CAGE)</li>
                    <li>RNA-Seq (RACE)</li>
                    <li>scATAC-seq</li>
                    <li>scRNA-seq</li>
                    <li>SELEX</li>
                    <li>smallRNA-Seq</li>
                    <li>snRNA-Seq</li>
                    <li>Spatial Transcriptomics</li>
                    <li>ssRNA-Seq</li>
                    <li>TCR-Seq</li>
                    <li>Tn-Seq</li>
                </ul>
            </div>
        </p>
        <p>
            If you have used a sequencing strategy that is not included in the drop-down menu, choose "OTHER", which is the last term on the list.
            You can include a description of your strategy in the "experimental design" field in the STUDY section.
        </p>


        <a name="non_ascii" id="non_ascii"></a>
        <h3>non-ASCII characters</h3>
        <p>
            SAMPLES section columns such as tissue, cell line, cell type, and genotype cannot contain non-ASCII characters.
            ASCII characters include the digits 0 to 9, upper-case and lower-case letters A to Z, and some special characters
            (for example, { ^ / &lt; +). Non-ASCII characters include symbols and letters outside of the English alphabet (for example, Greek letters).
            If you need to include a genotype with Greek letters, please include it in the "description" field, which is not checked for non-ASCII characters.
            You can include an ASCII version in the "genotype" column.
        </p>
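        <p>
            Non-ASCII characters are often hard to spot by eye. The following Python sketch shows one way to locate them before upload.
            It is an illustration under stated assumptions: the column list contains only the four columns named above (GEO may check others), and the dictionary-per-sample representation and the function name find_non_ascii are hypothetical.
        </p>

```python
CHECKED_COLUMNS = ("tissue", "cell line", "cell type", "genotype")

def find_non_ascii(sample):
    """Return {column: [offending characters]} for the checked columns."""
    found = {}
    for column in CHECKED_COLUMNS:
        value = sample.get(column) or ""
        # str.isascii() is True only for code points 0 to 127.
        offenders = [ch for ch in value if not ch.isascii()]
        if offenders:
            found[column] = offenders
    return found
```

        <p>
            A genotype containing the Greek letter beta would be reported with that character listed, so that it can be replaced by an ASCII spelling such as "beta" in the "genotype" column.
        </p>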

        <a name="errors" id="errors"></a>
        <h2>Validation Error Messages <a class="arrow" title="Back to top" href="#top">Back to top</a></h2>
        <p>
            The table below lists all current validations, the error message associated with each, and an explanation of how to address the error.
        </p>

        <table class="overview">
            <thead>
                <tr>
                    <th>error name</th><th>error message that you will receive</th><th>explanation and how to fix</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>excel_parse_failure</td>
                    <td>Uploaded file cannot be read. The file must be in Excel version 2007 or higher with .xlsx extension.</td>
                    <td>The file is not an Excel version 2007 or higher file with .xlsx extension. GEO cannot process metadata files submitted with extension .txt, .csv, or .tsv. Do not compress the metadata Excel spreadsheet; a compressed spreadsheet cannot be read.</td>
                </tr>
                <tr>
                    <td>discontinued_template</td>
                    <td>It appears that you have used a discontinued version of the metadata spreadsheet. Please use the above link to download the newest version and resubmit.</td>
                    <td>Old versions of the metadata spreadsheet are not supported. Please download, complete, and submit the newest version of the <a href="/geo/info/examples/seq_template.xlsx">metadata spreadsheet</a>.</td>
                </tr>
                <tr>
                    <td>missing_worksheet</td>
                    <td>Uploaded file is missing required worksheet named "Metadata". Please make sure you are using our newest metadata template.</td>
                    <td>The Excel tab (also called a worksheet) containing the metadata information must be named "Metadata" or "2. Metadata Template". Any other tab name will produce the "missing_worksheet" error. For example, do not rename the tab "RNAseq" or "ChIPseq". Do not include multiple tabs with metadata for separate studies in the same file. GEO needs one metadata file per study.</td>
                </tr>
                <tr>
                    <td>missing_section</td>
                    <td>Uploaded file is missing mandatory section:</td>
                    <td>The metadata tab must have sections titled STUDY, SAMPLES and PROTOCOLS. If it is a paired-end sequencing study, the metadata file must also contain a PAIRED-END EXPERIMENTS section.</td>
                </tr>
                <tr>
                    <td>empty_samples_section</td>
                    <td>SAMPLES section does not list any samples. Please make sure that library names do not start with "#" symbol since such lines are treated as comments and ignored.</td>
                    <td>Samples must be listed in the SAMPLES section. </td>
                </tr>
                <tr>
                    <td>missing_mandatory_info</td>
                    <td>Uploaded file is missing mandatory information in the STUDY or PROTOCOLS sections:</td>
                    <td>Required fields in STUDY and PROTOCOLS sections are: title, summary (abstract), experimental design, extract protocol, library construction protocol, data processing description, assembly or genome build, and processed data files format and content. A table will be provided that lists the fields in STUDY and/or PROTOCOLS sections that are empty.</td>
                </tr>
                <tr>
                    <td>missing_sample_header</td>
                    <td>SAMPLES section is missing required headers for the table:</td>
                    <td>Deleting columns from the metadata template in the SAMPLES section is not allowed and will produce the "missing_sample_header" error.  A table will be provided which lists the missing headers in the SAMPLES section.  You can add columns to the SAMPLES section for additional characteristics appropriate for your samples. For example, you could use the header "overall survival" and provide survival data for each sample.</td>
                </tr>
                <tr>
                    <td>empty_library_name</td>
                    <td>At least one of the samples has empty library name.</td>
                    <td>At least one of the samples in the SAMPLES section has an empty library name. Sometimes this error is caused by non-empty cells in the SAMPLES section that are not associated with the included samples.</td>
                </tr>
                <tr>
                    <td>missing_sample_info</td>
                    <td>SAMPLES section is missing required information:</td>
                    <td>Every sample in the SAMPLES section must include information for library name, title, organism, library strategy, molecule, single or paired-end, instrument model, and raw file(s). A table will be provided which lists the missing field for each library name. Valid entries for library strategy, molecule, single or paired-end, and instrument model are available from the drop-down list in each of these columns in the metadata template.</td>
                </tr>
                <tr>
                    <td>duplicate_library_names</td>
                    <td>Identical library names were found. Library names must be unique. This check is case insensitive, meaning that "Control1" and "control1" will be considered identical. Identical names are:</td>
                    <td>Every library name in the SAMPLES section must be unique. A table will be provided which lists the non-unique library name and the number of times (occurrences) it was found in the SAMPLES section.</td>
                </tr>
                <tr>
                    <td>duplicate_sample_titles</td>
                    <td>Identical sample titles were found. Sample titles must be unique. This check is case insensitive, meaning that "Control1" and "control1" will be considered identical. Identical titles are:</td>
                    <td>Every title in the SAMPLES section must be unique.   A table will be provided which lists the non-unique title and the number of times (occurrences) it was found in the SAMPLES section.</td>
                </tr>
                <tr>
                    <td>invalid_contributor_format</td>
                    <td>The contributor name is not correctly formatted. The format is: 'Firstname, I, Lastname' or 'Firstname, Lastname'. First (given) name must be at least one character long. 'I' represents middle name initial and must be exactly one character. Last (family) name must be at least two characters long. List only one contributor name per row. Examples and guidance for contributor name format are available in the metadata template.</td>
                    <td>Contributor names must be provided in the accepted format of First, Last or First, I, Last.  I represents middle name initial, if present. A comma must separate the individual parts of the name.  List one contributor name per row. You can add as many extra rows with field name "contributor" as you need. </td>
                </tr>
                <tr>
                    <td>long_sample_title</td>
                    <td>Sample title is too long. Maximum length allowed is 120 characters.</td>
                    <td>Sample titles can be no longer than 120 characters.  A short sample title of 3-5 words is easy to read and displays clearly on the website.</td>
                </tr>
                <tr>
                    <td>empty_field_name</td>
                    <td>The following rows in STUDY and/or PROTOCOLS sections are missing the field name such as "contributor" or "data processing step". Add the correct field name in the cell to the left of the cell with text listed below.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>out_of_bound_text</td>
                    <td>Extra text was found beyond the first two columns in STUDY and/or PROTOCOLS sections. Please remove it. If you need to include different protocols for subsets of samples, please add all PROTOCOLS fields (extract protocol, library protocol, data processing step, etc) to the SAMPLES section as additional columns.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>raw_file_not_found</td>
                    <td>The metadata file lists raw files that are not found in your personalized upload space. Upload any missing files OR correct the metadata file by listing the exact file names (names are case-sensitive, cannot include paths, and must include file extensions such as ".gz" when compressed). The following raw files are not found in your personalized upload space:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>no_paths_allowed</td>
                    <td>A directory path to a file name has been found in the metadata file. All raw data, processed data, and supplementary files must be listed without a path. For example, use "data_matrix.txt" instead of "/Home/RNAseq/Data/Processed/data_matrix.txt". Please remove paths and resubmit.</td>
                    <td>Inclusion of a path in a file name prevents file detection on GEO's server. List the file name without path.</td>
                </tr>
                <tr>
                    <td>invalid_organism_name</td>
                    <td>Organism name(s) could not be resolved automatically in <a href="https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi">NCBI Taxonomy database</a>. The name was either not found, or it returned multiple entries. Please check spelling of organism name. Make sure you have provided a valid scientific name at species level (or lower rank, such as subspecies), e.g., Mus musculus. Do not include taxonomic authority in the name such as L. for Linnaeus. If the organism name is valid but not yet included in NCBI Taxonomy database, contact GEO using the "email us" link located above this message.</td>
                    <td>Make sure that the 'organism' field contains the scientific name of the organism at species level or below. The organism name cannot include additional text such as tissue information e.g., Mus musculus heart. List one name per column. Add extra 'organism' columns if the sample includes material from more than one organism.</td>
                </tr>
                <tr>
                    <td>missing_sample_column_name</td>
                    <td>Some columns in the SAMPLES section are not named. Add column names to the header row.</td>
                    <td>The header row in the SAMPLES section must have a name for each column for which there is sample information. Remove any unintentional text that you do not want on the sample record.</td>
                </tr>
                <tr>
                    <td>duplicate_raw_file_names</td>
                    <td>Identical raw data file names have been found in the SAMPLES section. All samples must be associated with unique raw data files. Please check raw file names for typos or inadvertent copy/paste errors. For single-cell studies with multiplexed raw data, please see the metadata template worksheet "scMulti-omics seq EXAMPLE" for guidance. If you have questions or need help, contact GEO using the "email us" link located above this message.</td>
                    <td>Each sample must be associated with independent raw data files. If your single-cell samples have been multiplexed, create one sample per sequencing library and <a href="/geo/info/seq.html#singlecell">create separate samples for individual library types such as GEX, HTO, ADT, TCR, etc</a>.</td>
                </tr>
                <tr>
                    <td>processed_data_file_not_found</td>
                    <td>The metadata file lists processed data files that are not found in your personalized upload space. Upload any missing files OR correct the metadata file by listing the exact file names (names are case-sensitive, cannot include paths, and must include file extensions such as ".gz" when compressed). List one processed data file per "processed data file" column in the SAMPLES section or "supplementary file" field in the STUDY section. If a sample (for example, input) does not have any associated processed data, leave the "processed data file" cell empty for that sample. The following processed data files are not found in your personalized upload space:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>processed_data_required</td>
                    <td>
                        Your submission does not contain any processed data file(s). Include a processed data file that contains data for all samples as a "supplementary file" in the STUDY section or provide sample-specific processed data file(s) listed in the "processed data file" field of the SAMPLES section. You can add as many "processed data file" columns as you need. Enter only one file per spreadsheet cell. If some samples (such as input) do not have associated processed data, leave the field empty for those samples.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_invalid_header</td>
                    <td>The PAIRED-END EXPERIMENTS section header row is not formatted correctly. There should be up to 4 columns, named as "file name 1", "file name 2", "file name 3" and "file name 4". All columns with file names must include a header.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_column_limit</td>
                    <td>Each row of the PAIRED-END EXPERIMENTS section can include a maximum of four files. Each row should include paired-end files from one run. The following file names were found beyond the fourth column:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_raw_file_omitted</td>
                    <td>Paired-end raw files must be listed in both sections of the metadata file. List one set of paired-end raw files (R1, R2 or I1, R1, R2, for example) per row in the PAIRED-END EXPERIMENTS section. The following raw files from SAMPLES section are not found in the PAIRED-END EXPERIMENTS section:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_with_non_paired_end_file</td>
                    <td>PAIRED-END EXPERIMENTS section includes raw files that are marked as "single" in SAMPLES section or files that are not included in "raw file" columns in SAMPLES section:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_library_mismatch</td>
                    <td>PAIRED-END EXPERIMENTS section contains at least one row with files from different libraries or samples.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>paired_end_section_duplicate_file_names</td>
                    <td>The PAIRED-END EXPERIMENTS section contains non-unique file names. Please correct all file names so that they are unique.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>duplicate_section</td>
                    <td>Each section (SERIES, SAMPLES, PROTOCOLS, PAIRED-END EXPERIMENTS) can only occur once in the "Metadata" worksheet. Upload one metadata file per data type (e.g., ChIP-seq, RNA-seq). Some sections were found more than once.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_molecule_value</td>
                    <td>Your submission contains an invalid value for "molecule". Choose an option from the dropdown list in the metadata template in the "molecule" column.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_single_paired_end_value</td>
                    <td>Your submission contains an invalid value for "single or paired-end". The value of this field must be either "single" or "paired-end".</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_instrument_model_value</td>
                    <td>Your submission contains an invalid value for "instrument model". Choose an option from the dropdown list in the metadata template in the "instrument model" column. Only one instrument model may be included per cell. If needed, enter the names of additional instrument models in the "description" column.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_library_strategy_value</td>
                    <td>Your submission contains an invalid value for "library strategy". Choose an option from the dropdown list in the metadata template in the "library strategy" column.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>insufficient_biological_information</td>
                    <td>The following samples are missing biological information. At least one of these fields is required: tissue, cell line or cell type.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>non_ascii_attribute_value</td>
                    <td>Non-ASCII characters were detected in some columns in the SAMPLES section. Please check for non-standard characters and replace them so that all values are ASCII-only.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>duplicate_sample_column</td>
                    <td>Some columns in the SAMPLES section cannot be repeated. The following columns were found more than once:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_attribute_value</td>
                    <td>Invalid values have been found in the SAMPLES section, as listed below. Please revise.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>repeated_raw_and_processed_file</td>
                    <td>The following files were found in both raw data and processed data fields:</td>
                    <td></td>
                </tr>
                <tr>
                    <td>invalid_processed_data_type</td>
                    <td>The following files do not meet GEO's definitions for <a href="https://www.ncbi.nlm.nih.gov/geo/info/seq.html#processed">processed data</a>. Therefore, they are not allowed in the "processed data file" or "supplementary file" fields. Processed data submitted to GEO must include the further-processed, quantified data used to draw conclusions for the study.</td>
                    <td></td>
                </tr>
                <tr>
                    <td>human_no_raw_data</td>
                    <td>The SAMPLES section is missing raw data for the following human samples. If raw data cannot be provided for these samples due to patient privacy concerns, you cannot submit your metadata file through this page, which checks for and requires raw data. Instead, <a href="https://www.ncbi.nlm.nih.gov/geo/info/submissionftp.html?type=privacy">submit your processed data and completed metadata file to GEO by FTP</a> and then <a href="https://submit.ncbi.nlm.nih.gov/geo/submission/">notify GEO</a> of your submission.</td>
                    <td></td>
                </tr>
            </tbody>
        </table>
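        <p>The PAIRED-END EXPERIMENTS messages above describe cross-checks between two sections of the metadata file. The following is a minimal Python sketch of how a submitter might pre-check a file for four of them before uploading. It is not GEO's validator: the function name, the "layout" and "raw_files" keys, and the in-memory data layout are all assumptions made for illustration, and only the error codes and the four-file limit come from the table.</p>

```python
from collections import Counter

# Limit stated in the paired_end_section_column_limit message.
MAX_FILES_PER_ROW = 4


def check_paired_end_section(samples, paired_rows):
    """Return a dict mapping error codes to the offending file names.

    samples: list of dicts with the hypothetical keys "layout"
        ("single" or "paired-end") and "raw_files" (list of names).
    paired_rows: list of lists, one list of file names per row of the
        PAIRED-END EXPERIMENTS section.
    """
    errors = {}

    # Files found beyond the fourth column of any row.
    overflow = [name for row in paired_rows for name in row[MAX_FILES_PER_ROW:]]
    if overflow:
        errors["paired_end_section_column_limit"] = overflow

    listed = [name for row in paired_rows for name in row]

    # File names that occur more than once in the section.
    duplicates = sorted(name for name, n in Counter(listed).items() if n > 1)
    if duplicates:
        errors["paired_end_section_duplicate_file_names"] = duplicates

    # Raw files that belong to samples marked "paired-end".
    paired_raw = {
        name
        for sample in samples
        if sample["layout"] == "paired-end"
        for name in sample["raw_files"]
    }

    # Paired-end raw files in SAMPLES that the section does not list.
    missing = sorted(paired_raw - set(listed))
    if missing:
        errors["paired_end_section_raw_file_omitted"] = missing

    # Listed files that are "single" or absent from the raw file columns.
    unexpected = sorted(set(listed) - paired_raw)
    if unexpected:
        errors["paired_end_section_with_non_paired_end_file"] = unexpected

    return errors


if __name__ == "__main__":
    samples = [
        {"layout": "paired-end", "raw_files": ["a_R1.fastq.gz", "a_R2.fastq.gz"]},
        {"layout": "single", "raw_files": ["b.fastq.gz"]},
    ]
    rows = [["a_R1.fastq.gz", "b.fastq.gz"]]
    for code, names in check_paired_end_section(samples, rows).items():
        print(code, names)
```

        <p>The sketch does not cover paired_end_section_library_mismatch or the header check, since both depend on details of the template that are not described in this table.</p>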

	</div>
      </div>
      <div id="last_mod">
                        Last modified: February 13, 2025</div>
      <div id="footer">
    <span class="helpbar">|<a href="https://www.nlm.nih.gov"> NLM </a>|<a href="https://www.nih.gov"> NIH </a>|<a href="mailto:geo@ncbi.nlm.nih.gov"> Email GEO </a>|<a href="/geo/info/disclaimer.html"> Disclaimer </a>|<a href="https://www.nlm.nih.gov/accessibility.html"> Accessibility </a>|<a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html"> HHS Vulnerability Disclosure </a>|
    </span>
</div>
    </div>
    <script type="text/javascript" src="https://www.ncbi.nlm.nih.gov/portal/portal3rc.fcgi/rlib/js/InstrumentOmnitureBaseJS/InstrumentNCBIBaseJS/InstrumentPageStarterJS.js"></script>
  </body>
</html>