<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>MINiML - GEO - NCBI</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="author" content="geo" />
<meta name="keywords" content="NCBI, national institutes of health, nih, database, archive, central, bioinformatics, biomedicine, geo, gene, expression, omnibus, chips, microarrays, oligonucleotide, array, sage, CGH" />
<meta name="description" content="Gene Expression Omnibus (GEO) is a database repository of high throughput gene expression data and hybridization arrays, chips, microarrays." />
<meta name="ncbiaccordion" content="collapsible: true, active: false" />
<meta name="ncbi_app" content="geo" />
<meta name="ncbi_pdid" content="documentation" />
<meta name="ncbi_page" content="MINiML" />
<link rel="shortcut icon" href="/geo/img/OmixIconBare.ico" />
<link rel="stylesheet" type="text/css" href="/geo/css/reset.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/nav.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/info.css" />
<script type="text/javascript" src="/core/jig/1.15.10/js/jig.min.js"></script>
<script type="text/javascript" src="/geo/js/dd_menu.js"></script>
<script type="text/javascript" src="/geo/js/info.js"></script>
<script type="text/javascript">
jQuery.getScript("/core/alerts/alerts.js", function () {
galert(['#crumbs_login_bar', 'body &gt; *:nth-child(1)'])
});
</script>
<script type="text/javascript">
var ncbi_startTime = new Date();
</script>
</head>
<body id="info" class="MINiML">
<div id="all">
<div id="page">
<div id="header">
<div id="ncbi_logo">
<a href="/">
<img src="/geo/img/ncbi_logo.gif" alt="NCBI Logo" />
</a>
</div>
<div id="geo_logo">
<a href="/geo/"><img src="/geo/img/geo_main.gif" alt="GEO Logo" /></a>
</div>
</div>
<div id="nav_bar">
<ul id="geo_nav_bar">
<li><a href="#">GEO Publications</a>
<ul class="sublist">
<li><a href="/geo/info/GEOHandoutFinal.pdf">Handout</a></li>
<li><a href="/pmc/articles/PMC10767856/">NAR 2024 (latest)</a></li>
<li><a href="/pmc/articles/PMC99122/">NAR 2002 (original)</a></li>
<li><a href="/pmc/?term=10767856,4944384,3531084,3341798,3013736,2686538,2270403,1669752,1619900,1619899,539976,99122">All publications</a></li>
</ul>
</li>
<li><a href="/geo/info/faq.html">FAQ</a></li>
<li><a href="/geo/info/MIAME.html" title="Minimum Information About a Microarray Experiment">MIAME</a></li>
<li><a href="mailto:geo@ncbi.nlm.nih.gov">Email GEO</a></li>
</ul>
</div>
<div id="crumbs_login_bar"><a title="NCBI home page" href="/">NCBI</a> »
<a id="curr_page" title="GEO home page" href="/geo/">GEO</a> »
<a title="GEO documentation guide" href="/geo/info/">Info</a> »
<span>MINiML</span><span id="login_status"><a href="/geo/submitter/" title="Click here to login. You need to do this only if you want to edit the contact information, submit data, see your unreleased data, or work with data already submitted by you. You do not need to login if you are here just to browse through public holdings">Login</a></span></div>
<div id="content">
<a name="top" id="top"></a>
<h1 title="MINiML">MINiML (MIAME Notation in Markup Language)</h1>
<ul class="doc_list">
<li><a href="#what">What is MINiML?</a></li>
<li><a href="#why">Why another data exchange format?</a> </li>
<li><a href="#guidelines">MINiML Elements and Content Guidelines </a></li>
</ul>
<a name="what" id="what"></a>
<h2>What is MINiML?<a class="arrow" href="#top" title="Back to top">Back to top</a></h2>
<p>
MINiML (<a href="/geo/info/MIAME.html">MIAME</a> Notation in Markup Language, pronounced 'minimal') is a
data exchange format optimized for microarray gene expression data, as well as many other types of
high-throughput molecular abundance data. MINiML assumes only very basic relations between objects:
Platform (e.g., array), Sample (e.g., hybridization), and Series (experiment). MINiML captures all
components of the <a href="/geo/info/MIAME.html">MIAME</a> checklist, as well as any additional
information that the submitter wants to provide. The MINiML syntax is defined using XML Schema.
</p>
<p>
The <a href="/geo/info/MINiML.xsd">MINiML XML Schema definition</a> is available.
</p>
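<p>
As a rough illustration of how the three object types sit together, a MINiML file might be laid out as sketched below.
This is a hand-written outline, not an excerpt from the schema: the namespace URI, the <code>version</code> value,
and all identifiers are assumptions that should be checked against the XML Schema definition. The ellipses stand for
the child elements described in the content guidelines.
</p>
<pre>
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!-- Assumed root element, namespace and version; verify against MINiML.xsd --&gt;
&lt;MINiML xmlns="http://www.ncbi.nlm.nih.gov/geo/info/MINiML" version="0.5.0"&gt;
  &lt;!-- "my_platform", "my_sample_1" and "my_series" are hypothetical local identifiers --&gt;
  &lt;Platform iid="my_platform"&gt; ... &lt;/Platform&gt;
  &lt;Sample iid="my_sample_1"&gt; ... &lt;/Sample&gt;
  &lt;Series iid="my_series"&gt; ... &lt;/Series&gt;
&lt;/MINiML&gt;
</pre>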
<a name="why" id="why"></a>
<h2>Why another data exchange format?<a class="arrow" href="#top" title="Back to top">Back to top</a></h2>
<p>
GEO has been using <a href="/geo/info/soft.html">SOFT</a> (Simple Omnibus Format in Text) as a data exchange format.
An advantage of SOFT is its simplicity, which makes it suitable for parsing and generation by virtually any
text-manipulating language. However, excellent tools now exist that support XML formats programmatically and provide
better document structure, syntax definitions, and data rendering. MINiML is effectively an XML rendering of SOFT.
</p>
<p>GEO fully supports both SOFT and MINiML.</p>
<a name="guidelines" id="guidelines"></a>
<h2>MINiML Elements and Content Guidelines<a class="arrow" href="#top" title="Back to top">Back to top</a></h2>
<p>The tables below provide content guidelines and constraints for most MINiML elements; they are not exhaustive.</p>
<div id="guidelines_tabs" class="jig-ncbitabs">
<ul>
<li><a href="#platform_tab">Platform</a></li>
<li><a href="#sample_tab">Sample</a></li>
<li><a href="#series_tab">Series</a></li>
</ul>
<div id="platform_tab">
<table class="overview">
<thead>
<tr>
<th>Element name</th><th>Number of allowed labels</th><th>Allowed values and constraints</th><th>Content Guidelines</th>
</tr>
</thead>
<tbody>
<tr>
<th>Title</th>
<td>required</td>
<td>string of length 1-120 characters, must be unique within local file and over all previously submitted Platforms for that submitter</td>
<td>Provide a unique title that describes your Platform. We suggest that you use the system '[institution/lab][species][number of features][version]', e.g., "FHCRC Mouse 15K v1.0".</td>
</tr>
<tr>
<th>Distribution</th>
<td>required</td>
<td>commercial, non-commercial, custom-commercial, or virtual</td>
<td>Microarrays are 'commercial', 'non-commercial', or 'custom-commercial' in accordance with how the array was manufactured. Use 'virtual' only if creating a virtual definition for MS, MPSS, SARST, or RT-PCR data.</td>
</tr>
<tr>
<th>Technology</th>
<td>required</td>
<td>spotted DNA/cDNA, spotted oligonucleotide, in situ oligonucleotide, antibody, tissue, SARST, RT-PCR, MS, or MPSS</td>
<td>Select the category that best describes the Platform technology.</td>
</tr>
<tr>
<th>Organism</th>
<td>required and unbounded</td>
<td>use standard <a href="/taxonomy/">NCBI Taxonomy</a> nomenclature</td>
<td>Identify the organism(s) from which the features on the Platform were designed or derived. </td>
</tr>
<tr>
<th>Manufacturer</th>
<td>required</td>
<td>any</td>
<td>Provide the name of the company, facility or laboratory where the array was manufactured or produced.</td>
</tr>
<tr>
<th>Manufacture-Protocol</th>
<td>required</td>
<td>any</td>
<td>Describe the array manufacture protocol. Include as much detail as possible, e.g., clone/primer set identification and preparation, strandedness/length, arrayer hardware/software, spotting protocols.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Catalog-Number</th>
<td>optional</td>
<td>any</td>
<td>Provide the manufacturer catalog number for commercially-available arrays.</td>
</tr>
<tr>
<th>Web-Link</th>
<td>optional and unbounded</td>
<td>valid URL</td>
<td>Specify a Web link that directs users to supplementary information about the array. Please restrict to Web sites that you know are stable. </td>
</tr>
<tr>
<th>Support</th>
<td>optional</td>
<td>any</td>
<td>Provide the surface type of the array, e.g., glass, nitrocellulose, nylon, silicon, unknown.</td>
</tr>
<tr>
<th>Coating</th>
<td>optional</td>
<td>any</td>
<td>Provide the coating of the array, e.g., aminosilane, quartz, polysine, unknown.</td>
</tr>
<tr>
<th>Description</th>
<td>optional</td>
<td>any</td>
<td>Provide any additional descriptive information not captured in another field, e.g., array and/or feature physical dimensions, element grid system.</td>
</tr>
<tr>
<th>Contributor-Ref</th>
<td>optional and unbounded</td>
<td></td>
<td>List all people associated with this array design.</td>
</tr>
<tr>
<th>Pubmed-ID</th>
<td>optional and unbounded</td>
<td>an integer</td>
<td>Specify a valid PubMed identifier (PMID) that references a published article that describes the array. </td>
</tr>
<tr>
<th>Data-Table</th>
<td>required</td>
<td>a plain text (ASCII) tab-delimited table</td>
<td>Data-Tables can be supplied either within the MINiML file (Internal-Data), or can be external files (External-Data).
External-Data files should be zipped or tarred together with the MINiML file at the time of submission.<br />
A full description of Platform data tables, required columns, content and restrictions is provided in the
<a href="/geo/info/platform.html">Platform data table guidelines</a>.
One difference to note is that data tables do not have headers in MINiML files - table columns are defined by position.
</td>
</tr>
<tr>
<th>Supplementary-Data</th>
<td>optional and unbounded</td>
<td>a link or path to supplementary data </td>
<td>Examples of Platform supplementary data include original GAL and CSV files. Supplementary files can be zipped or tarred together with the MINiML file at time of submission.</td>
</tr>
</tbody>
</table>
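<p>
A minimal Platform sketch using only the required elements from the table above. The element order, the
<code>Column</code> and <code>Name</code> children, the <code>position</code> attribute, the column names, and the
file name <code>my_platform_table.txt</code> are illustrative assumptions, not taken from the schema; consult the
MINiML XML Schema definition before submitting.
</p>
<pre>
&lt;Platform iid="my_platform"&gt;
  &lt;Title&gt;FHCRC Mouse 15K v1.0&lt;/Title&gt;
  &lt;Distribution&gt;non-commercial&lt;/Distribution&gt;
  &lt;Technology&gt;spotted DNA/cDNA&lt;/Technology&gt;
  &lt;Organism&gt;Mus musculus&lt;/Organism&gt;
  &lt;Manufacturer&gt;Placeholder: facility where the array was produced&lt;/Manufacturer&gt;
  &lt;Manufacture-Protocol&gt;Placeholder: clone set, arrayer hardware and spotting protocol&lt;/Manufacture-Protocol&gt;
  &lt;Data-Table&gt;
    &lt;!-- Columns are declared by position because the table file carries no header row --&gt;
    &lt;Column position="1"&gt;&lt;Name&gt;ID&lt;/Name&gt;&lt;/Column&gt;
    &lt;Column position="2"&gt;&lt;Name&gt;GB_ACC&lt;/Name&gt;&lt;/Column&gt;
    &lt;!-- Hypothetical tab-delimited file packaged alongside the MINiML file --&gt;
    &lt;External-Data&gt;my_platform_table.txt&lt;/External-Data&gt;
  &lt;/Data-Table&gt;
&lt;/Platform&gt;
</pre>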
</div>
<div id="sample_tab">
<table class="overview">
<thead>
<tr>
<th>Element name</th><th>Number of allowed labels</th><th>Allowed values and constraints</th><th>Content Guidelines</th>
</tr>
</thead>
<tbody>
<tr>
<th>Title</th>
<td>required</td>
<td>string of length 1-120 characters, must be unique within local file and over all previously submitted Samples for that submitter</td>
<td>Provide a unique title that describes this Sample. We suggest that you use the system [biomaterial]-[condition(s)]-[replicate number], e.g., Muscle_exercised_60min_rep2.</td>
</tr>
<tr>
<th>Channel-Count</th>
<td>required</td>
<td>an integer</td>
<td>State the number of channels in the experiment, e.g., two-color hybridizations are typically 2-channel, Affymetrix hybridizations are typically 1-channel.</td>
</tr>
<tr>
<th>Source</th>
<td>required per channel</td>
<td>any</td>
<td>Briefly identify the biological material and the experimental variable(s) for this Sample, e.g., vastus lateralis muscle, exercised, 60 min.</td>
</tr>
<tr>
<th>Organism</th>
<td>required and unbounded per channel</td>
<td>use standard <a href="/taxonomy/">NCBI Taxonomy</a> nomenclature</td>
<td>Identify the organism(s) from which the biological material was derived.</td>
</tr>
<tr>
<th>Characteristics</th>
<td>required per channel</td>
<td>any</td>
<td>List all available characteristics of the biological source, e.g.,<br />
Strain: C57BL/6 <br />
Gender: female <br />
Age: 45 days<br />
Tissue: bladder tumor<br />
Tumor stage: Ta<br />
</td>
</tr>
<tr>
<th>Biomaterial-Provider</th>
<td>optional per channel</td>
<td>any</td>
<td>Specify the name of the company, laboratory or person that provided the biological material.</td>
</tr>
<tr>
<th>Treatment-Protocol</th>
<td>optional per channel</td>
<td>any</td>
<td>Describe any treatments applied to the biological material prior to extract preparation.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Growth-Protocol</th>
<td>optional per channel</td>
<td>any</td>
<td>Describe the conditions that were used to grow or maintain organisms or cells prior to extract preparation.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Molecule</th>
<td>required per channel</td>
<td>total RNA, polyA RNA, cytoplasmic RNA, nuclear RNA, genomic DNA, protein, or other</td>
<td>Specify the type of molecule that was extracted from the biological material.</td>
</tr>
<tr>
<th>Extract-Protocol</th>
<td>optional per channel</td>
<td>any</td>
<td>Describe the protocol used to isolate the extract material.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Label</th>
<td>required per channel</td>
<td>any</td>
<td>Specify the compound used to label the extract, e.g., biotin, Cy3, Cy5, 33P.</td>
</tr>
<tr>
<th>Label-Protocol</th>
<td>optional per channel</td>
<td>any</td>
<td>Describe the protocol used to label the extract. Please provide complete protocol descriptions within your submission.</td>
</tr>
<tr>
<th>Hybridization-Protocol</th>
<td>optional</td>
<td>any</td>
<td>Describe the protocols used for hybridization, blocking and washing, and any post-processing steps such as staining.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Scan-Protocol</th>
<td>optional</td>
<td>any</td>
<td>Describe the scanning and image acquisition protocols, hardware, and software.
Please provide complete protocol descriptions within your submission.
</td>
</tr>
<tr>
<th>Data-Processing</th>
<td>required</td>
<td>any</td>
<td>Provide details of how data in the VALUE column of your table were generated and calculated, including normalization method,
data selection procedures and parameters, transformation algorithm, and scaling parameters (e.g., MAS5.0, scaled to 100). </td>
</tr>
<tr>
<th>Description</th>
<td>required</td>
<td>any</td>
<td>Include any additional information not provided in the other fields, or paste in broad descriptions that cannot be easily dissected into the other fields.</td>
</tr>
<tr>
<th>Platform-Ref</th>
<td>required</td>
<td>a valid Platform identifier</td>
<td>Reference the iid of the Platform on which this hybridization was performed.</td>
</tr>
<tr>
<th>Data-Table</th>
<td>required</td>
<td>a plain text (ASCII) tab-delimited table</td>
<td>Data-Tables can be supplied either within the MINiML file (Internal-Data), or can be external files (External-Data).
External-Data files should be zipped or tarred together with the MINiML file at the time of submission. <br />
One difference to note is that data tables do not have headers in MINiML files - table columns are defined by position.
</td>
</tr>
<tr>
<th>Supplementary-Data</th>
<td>required</td>
<td>a reference to supplementary data, or type="none" </td>
<td>Examples of Sample supplementary data include original GPR, CEL, EXP, RPT, CAB, and TIFF files. Supplementary files should be zipped or tarred together with the MINiML file at time of submission. Provision of supplementary raw data files facilitates the unambiguous interpretation of data and potential verification of conclusions as set forth in the MIAME guidelines.
</td>
</tr>
<tr>
<th>Anchor</th>
<td>required for SAGE Samples</td>
<td>NlaIII or Sau3A</td>
<td>Supply for SAGE submissions only. State the enzyme anchor.</td>
</tr>
<tr>
<th>Type</th>
<td>required for SAGE Samples</td>
<td>RNA, genomic, protein, SAGE, MPSS, SARST, mixed</td>
<td>Supply for SAGE submissions only (this field is derived automatically for other Sample types using the Molecule field).</td>
</tr>
<tr>
<th>Tag-Count</th>
<td>required for SAGE Samples</td>
<td>an integer</td>
<td>Supply for SAGE submissions only. State the total number of tags quantified in this Sample.</td>
</tr>
<tr>
<th>Tag-Length</th>
<td>required for SAGE Samples</td>
<td>an integer</td>
<td>Supply for SAGE submissions only. State the base pair length of the SAGE tags, excluding anchor sequence.</td>
</tr>
</tbody>
</table>
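<p>
A minimal single-channel Sample sketch covering the required elements from the table above. Grouping the per-channel
elements inside a <code>Channel</code> wrapper, the <code>position</code> and <code>ref</code> attributes, the element
order, the <code>ID_REF</code> column name, and the file name are assumptions made for illustration; the identifiers
<code>my_sample_1</code> and <code>my_platform</code> are hypothetical. Check the details against the MINiML XML Schema
definition.
</p>
<pre>
&lt;Sample iid="my_sample_1"&gt;
  &lt;Title&gt;Muscle_exercised_60min_rep2&lt;/Title&gt;
  &lt;Channel-Count&gt;1&lt;/Channel-Count&gt;
  &lt;!-- One Channel block per channel; a two-color hybridization would carry two --&gt;
  &lt;Channel position="1"&gt;
    &lt;Source&gt;vastus lateralis muscle, exercised, 60 min&lt;/Source&gt;
    &lt;Organism&gt;Homo sapiens&lt;/Organism&gt;
    &lt;Characteristics&gt;Gender: female&lt;/Characteristics&gt;
    &lt;Molecule&gt;total RNA&lt;/Molecule&gt;
    &lt;Label&gt;biotin&lt;/Label&gt;
  &lt;/Channel&gt;
  &lt;Data-Processing&gt;MAS5.0, scaled to 100&lt;/Data-Processing&gt;
  &lt;Description&gt;Placeholder: any information not captured in the other fields&lt;/Description&gt;
  &lt;!-- Points at the iid of the Platform this Sample was hybridized to --&gt;
  &lt;Platform-Ref ref="my_platform" /&gt;
  &lt;Data-Table&gt;
    &lt;Column position="1"&gt;&lt;Name&gt;ID_REF&lt;/Name&gt;&lt;/Column&gt;
    &lt;Column position="2"&gt;&lt;Name&gt;VALUE&lt;/Name&gt;&lt;/Column&gt;
    &lt;!-- Hypothetical tab-delimited file without a header row --&gt;
    &lt;External-Data&gt;my_sample_1_table.txt&lt;/External-Data&gt;
  &lt;/Data-Table&gt;
  &lt;!-- Use type="none" only when no raw data file accompanies the Sample --&gt;
  &lt;Supplementary-Data type="none" /&gt;
&lt;/Sample&gt;
</pre>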
</div>
<div id="series_tab">
<table class="overview">
<thead>
<tr>
<th>Element name</th><th>Number of allowed labels</th><th>Allowed values and constraints</th><th>Content Guidelines</th>
</tr>
</thead>
<tbody>
<tr>
<th>Title</th>
<td>required</td>
<td>string of length 1-120 characters, must be unique within local file and over all previously submitted Series for that submitter</td>
<td>Provide a unique title that describes the overall study.</td>
</tr>
<tr>
<th>Summary</th>
<td>required</td>
<td>any</td>
<td>Summarize the goals and objectives of this study. The abstract from the associated publication may be suitable.</td>
</tr>
<tr>
<th>Type</th>
<td>required</td>
<td>any</td>
<td>Enter keyword(s) that generally describe the type of study. Examples include: time course, dose response, comparative genomic hybridization, ChIP-chip, cell type comparison, disease state analysis, stress response, genetic modification, etc.</td>
</tr>
<tr>
<th>Overall-Design</th>
<td>required</td>
<td>any</td>
<td>Provide a brief description of the experimental design. Indicate how many Samples are analyzed, whether replicates are included, and whether there are control and/or reference Samples, dye-swaps, etc.</td>
</tr>
<tr>
<th>Pubmed-ID</th>
<td>optional and unbounded</td>
<td>an integer</td>
<td>Specify a valid PubMed identifier (PMID) that references a published article describing this study.
Most commonly, this information is not available at the time of submission - it can be added later once the data are published.</td>
</tr>
<tr>
<th>Web-Link</th>
<td>optional and unbounded</td>
<td>valid URL</td>
<td>Specify a Web link that directs users to supplementary information about the study. Please restrict to Web sites that you know are stable. </td>
</tr>
<tr>
<th>Contributor-Ref</th>
<td>optional and unbounded</td>
<td></td>
<td>List all people associated with this study.</td>
</tr>
<tr>
<th>Sample-Ref</th>
<td>required and unbounded</td>
<td>valid Sample identifiers</td>
<td>Reference the iid of each Sample that makes up this experiment.</td>
</tr>
<tr>
<th>Variable<br />
<span class="margin_left">Factor</span>
<span class="margin_left">Description</span>
<span class="margin_left">Sample-Ref</span>
</th>
<td>optional and unbounded</td>
<td>Allowed 'Factors' include: <br />
dose, time, tissue, strain, gender, cell line, development stage, age, agent, cell type, infection, isolate, metabolism, shock, stress, temperature, specimen, disease state, protocol, growth protocol, genotype/variation, species, individual, or other</td>
<td>Indicate and describe the variable type(s) investigated in this study.
NOTE - this information does not appear in Series records or downloads, but will be used to assemble corresponding GEO DataSet records.
</td>
</tr>
<tr>
<th>Repeats<br />
<span class="margin_left">Factor</span>
<span class="margin_left">Sample-Ref</span></th>
<td>optional and unbounded</td>
<td>Allowed 'Factors' include:<br />
biological replicate <br />
technical replicate - extract <br />
technical replicate - labeled-extract</td>
<td>Indicate the repeat type(s).
NOTE - this information does not appear in Series records or downloads, but will be used to assemble corresponding GEO DataSet records.
</td>
</tr>
</tbody>
</table>
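<p>
A minimal Series sketch with the required elements from the table above. The <code>ref</code> attribute, the element
order, and the identifiers <code>my_series</code>, <code>my_sample_1</code> and <code>my_sample_2</code> are
assumptions for illustration only and should be verified against the MINiML XML Schema definition.
</p>
<pre>
&lt;Series iid="my_series"&gt;
  &lt;Title&gt;Placeholder: unique title describing the overall study&lt;/Title&gt;
  &lt;Summary&gt;Placeholder: goals and objectives of the study&lt;/Summary&gt;
  &lt;Type&gt;time course&lt;/Type&gt;
  &lt;Overall-Design&gt;Placeholder: number of Samples, replicates, controls, dye-swaps&lt;/Overall-Design&gt;
  &lt;!-- One Sample-Ref per Sample in the study, each pointing at a Sample iid --&gt;
  &lt;Sample-Ref ref="my_sample_1" /&gt;
  &lt;Sample-Ref ref="my_sample_2" /&gt;
&lt;/Series&gt;
</pre>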
</div>
</div>
</div>
</div>
<div id="last_mod">
Last modified: July 16, 2024</div>
<div id="footer">
<span class="helpbar">|<a href="https://www.nlm.nih.gov"> NLM </a>|<a href="https://www.nih.gov"> NIH </a>|<a href="mailto:geo@ncbi.nlm.nih.gov"> Email GEO </a>|<a href="/geo/info/disclaimer.html"> Disclaimer </a>|<a href="https://www.nlm.nih.gov/accessibility.html"> Accessibility </a>|<a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html"> HHS Vulnerability Disclosure </a>|
</span>
</div>
</div>
<script type="text/javascript" src="https://www.ncbi.nlm.nih.gov/portal/portal3rc.fcgi/rlib/js/InstrumentOmnitureBaseJS/InstrumentNCBIBaseJS/InstrumentPageStarterJS.js"></script>
</body>
</html>