<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>GEO Overview - GEO - NCBI</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="author" content="geo" />
<meta name="keywords" content="NCBI, national institutes of health, nih, database, archive, central, bioinformatics, biomedicine, geo, gene, expression, omnibus, chips, microarrays, oligonucleotide, array, sage, CGH" />
<meta name="description" content="Gene Expression Omnibus (GEO) is a database repository of high throughput gene expression data and hybridization arrays, chips, microarrays." />
<meta name="ncbiaccordion" content="collapsible: true, active: false" />
<meta name="ncbi_app" content="geo" />
<meta name="ncbi_pdid" content="documentation" />
<meta name="ncbi_page" content="GEO Overview" />
<link rel="shortcut icon" href="/geo/img/OmixIconBare.ico" />
<link rel="stylesheet" type="text/css" href="/geo/css/reset.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/nav.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/info.css" />
<script type="text/javascript" src="/core/jig/1.15.10/js/jig.min.js"></script>
<script type="text/javascript" src="/geo/js/dd_menu.js"></script>
<script type="text/javascript" src="/geo/js/info.js"></script>
<script type="text/javascript">
jQuery.getScript("/core/alerts/alerts.js", function () {
galert(['#crumbs_login_bar', 'body > *:nth-child(1)'])
});
</script>
<script type="text/javascript">
var ncbi_startTime = new Date();
</script>
</head>
<body id="info" class="overview">
<div id="all">
<div id="page">
<div id="header">
<div id="ncbi_logo">
<a href="/">
<img src="/geo/img/ncbi_logo.gif" alt="NCBI Logo" />
</a>
</div>
<div id="geo_logo">
<a href="/geo/"><img src="/geo/img/geo_main.gif" alt="GEO Logo" /></a>
</div>
</div>
<div id="nav_bar">
<ul id="geo_nav_bar">
<li><a href="#">GEO Publications</a>
<ul class="sublist">
<li><a href="/geo/info/GEOHandoutFinal.pdf">Handout</a></li>
<li><a href="/pmc/articles/PMC10767856/">NAR 2024 (latest)</a></li>
<li><a href="/pmc/articles/PMC99122/">NAR 2002 (original)</a></li>
<li><a href="/pmc/?term=10767856,4944384,3531084,3341798,3013736,2686538,2270403,1669752,1619900,1619899,539976,99122">All publications</a></li>
</ul>
</li>
<li><a href="/geo/info/faq.html">FAQ</a></li>
<li><a href="/geo/info/MIAME.html" title="Minimum Information About a Microarray Experiment">MIAME</a></li>
<li><a href="mailto:geo@ncbi.nlm.nih.gov">Email GEO</a></li>
</ul>
</div>
<div id="crumbs_login_bar"><a title="NCBI home page" href="/">NCBI</a> »
<a id="curr_page" title="GEO home page" href="/geo/">GEO</a> »
<a title="GEO documentation guide" href="/geo/info/">Info</a> »
<span>GEO Overview</span><span id="login_status"><a href="/geo/submitter/" title="Click here to login. You need to do this only if you want to edit the contact information, submit data, see your unreleased data, or work with data already submitted by you. You do not need to login if you are here just to browse through public holdings">Login</a></span></div>
<div id="content">
<a name="top" id="top"></a>
<h1>GEO Overview</h1>
<ul class="page_menu">
<li><a href="#general">General overview</a></li>
<li><a href="#org">Data organization</a></li>
<li><a href="#query">Query and analysis</a></li>
</ul>
<a name="general" id="general"></a>
<h2>General overview</h2>
<p>
GEO is an international public repository that archives and freely distributes microarray,
next-generation sequencing, and other forms of high-throughput
functional genomics data submitted by the research community.
</p>
<p>
The three main goals of GEO are to:
</p>
<ol>
<li>
Provide a robust, versatile database in which to efficiently store high-throughput
functional genomics data (see <a href="#org">Data organization</a>)
</li>
<li>
Offer simple submission procedures and formats that support complete and well-annotated
data deposits from the research community (see <a href="/geo/info/submission.html">Submission guide</a>)
</li>
<li>
Provide user-friendly mechanisms that allow users to query, locate, review and download
studies and gene expression profiles of interest (see <a href="#query">Query and analysis</a>)
</li>
</ol>
<p>
Please see the <a href="/geo/info/">GEO Documentation</a> listings to find more information about various aspects of GEO.
</p>
<a name="org" id="org"></a>
<h2>Data organization</h2>
<p>
GEO records are organized as follows:
</p>
<img src="/geo/img/geo_overview.jpg" usemap="#overview-map" alt="Schematic overview of GEO data submission" />
<map id="overview-map" name="overview-map">
<area href="#a" alt="Text description of the array" title="Text description of the array" shape="rect" coords="76,79, 256,175" />
<area href="#b" alt="Text tab-delimited table of the array template" title="Text tab-delimited table of the array template" shape="rect" coords="76,177, 257,297" />
<area href="#c" alt="Text description of a biological sample" title="Text description of a biological sample" shape="rect" coords="303,78, 488,175" />
<area href="#d" alt="Text tab-delimited table of processed hybridization result" title="Text tab-delimited table of processed hybridization result" shape="rect" coords="307,176, 495,280" />
<area href="#e" alt="Original raw data file" title="Original raw data file" shape="rect" coords="312,281, 479,300" />
<area href="#f" alt="Text description of the overall experiment" title="Text description of the overall experiment" shape="rect" coords="563,81, 750,275" />
<area href="#g" alt="Original raw data file" title="Original raw data file" shape="rect" coords="565,276, 736,302" />
<area href="#h" alt="DataSet" title="DataSet" shape="rect" coords="142,394, 398,549" />
<area href="#i" alt="Profile" title="Profile" shape="rect" coords="421,395, 678,552" />
</map>
<table class="overview">
<tbody>
<tr>
<th rowspan="2">Platform</th>
<td rowspan="2">
<a id="a"></a>
<a id="b"></a>
<h3>Platform records are supplied by submitters</h3>
A Platform record is composed of a summary description of the array or sequencer and, for array-based Platforms,
a data table defining the array template. Each Platform record is assigned a unique and stable
GEO accession number (GPLxxx). A Platform may reference many Samples
that have been submitted by multiple submitters.
<p class="example"><a href="/geo/query/acc.cgi?acc=GPL341">Example Platform record »</a></p>
</td>
<td class="letter">A</td>
<td><b>Text description of the array or sequencer</b></td>
</tr>
<tr>
<td class="letter">B</td>
<td><b>Text tab-delimited table of the array template</b></td>
</tr>
<tr>
<th rowspan="3">Sample</th>
<td rowspan="3">
<a id="c"></a>
<a id="d"></a>
<a id="e"></a>
<h3>Sample records are supplied by submitters</h3>
A Sample record describes the conditions under which an individual Sample was handled,
the manipulations it underwent, and the abundance measurement of each
element derived from it. Each Sample record is assigned a unique and
stable GEO accession number (GSMxxx). A Sample must reference
exactly one Platform and may be included in multiple Series.
<p class="example"><a class="example" href="/geo/query/acc.cgi?acc=GSM81022">Example Sample record »</a></p>
</td>
<td class="letter">C</td>
<td><b>Text description of the biological sample and protocols to which it was subjected</b></td>
</tr>
<tr>
<td class="letter">D</td>
<td>
<b>Text tab-delimited table of processed hybridization result<br /></b>
<span>(may optionally include raw data columns)</span>
</td>
</tr>
<tr>
<td class="letter">E</td>
<td><b>Original raw data file, or processed sequence data file</b></td>
</tr>
<tr>
<th rowspan="2">Series</th>
<td rowspan="2">
<a id="f"></a>
<a id="g"></a>
<h3>Series records are supplied by submitters</h3>
A Series record links together a group of related Samples and provides a focal point and description of the whole study.
Series records may also contain tables describing extracted data,
summary conclusions, or analyses. Each Series record is assigned a
unique and stable GEO accession number (GSExxx).
<p class="example"><a href="/geo/query/acc.cgi?acc=GSE3541">Example Series record »</a></p>
</td>
<td class="letter">F</td>
<td><b>Text description of the overall experiment</b></td>
</tr>
<tr>
<td class="letter">G</td>
<td><b>Tar archive of original raw data files, or processed sequence data files</b></td>
</tr>
</tbody>
</table>
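<p>
The relationships between the three submitter-supplied record types can be pictured with a small sketch.
This is an illustration only, not GEO's actual schema: it assumes that the "xxx" in GPLxxx, GSMxxx and GSExxx
stands for one or more digits, and the names <code>record_type</code> and <code>Sample</code> are hypothetical.
</p>

```python
import re
from dataclasses import dataclass, field

# Assumption: "xxx" in GPLxxx / GSMxxx / GSExxx means one or more digits.
ACCESSION_PATTERN = re.compile(r"(GPL|GSM|GSE)(\d+)")

# Prefix-to-record-type mapping, taken from the table above.
RECORD_TYPES = {"GPL": "Platform", "GSM": "Sample", "GSE": "Series"}


def record_type(accession: str) -> str:
    """Return the record type implied by the accession prefix."""
    match = ACCESSION_PATTERN.fullmatch(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised primary accession: {accession!r}")
    return RECORD_TYPES[match.group(1)]


@dataclass
class Sample:
    """Hypothetical in-memory view of a Sample and its links."""

    accession: str                                   # GSMxxx
    platform: str                                    # exactly one GPLxxx
    series: list[str] = field(default_factory=list)  # zero or more GSExxx

    def __post_init__(self) -> None:
        # Enforce the linking rules described in the table.
        if record_type(self.accession) != "Sample":
            raise ValueError("a Sample accession must start with GSM")
        if record_type(self.platform) != "Platform":
            raise ValueError("a Sample must reference a GPL Platform")
        for acc in self.series:
            if record_type(acc) != "Series":
                raise ValueError("Series accessions must start with GSE")


# Placeholder accessions; they do not describe real links between records.
example = Sample("GSM100", "GPL10", ["GSE5", "GSE6"])
```

<p>
Here <code>record_type("GPL341")</code> returns <code>"Platform"</code>, and constructing a
<code>Sample</code> whose <code>platform</code> is not a GPL accession raises <code>ValueError</code>.
</p>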
<p>
Selected primary records are further processed into higher-level DataSet and gene Profile records:
</p>
<table class="overview">
<tbody>
<tr>
<th>DataSet</th>
<td>
<a id="h"></a>
<h3>DataSet records are assembled by GEO curators</h3>
<p>
As explained above, a GEO Series record is an original
submitter-supplied record that summarizes an experiment.
These data are reassembled by GEO staff into GEO DataSet records (GDSxxx).
</p>
<p>
A DataSet represents a curated collection of biologically
and statistically comparable GEO Samples and forms the basis of GEO's
suite of data display and analysis tools.
</p>
<p>
Samples within a DataSet refer to the same Platform, that is, they share a
common set of array elements. Value measurements for each Sample within
a DataSet are assumed to be calculated in an equivalent manner, that is,
considerations such as background processing and normalization are
consistent across the DataSet. Information reflecting experimental
factors is provided through DataSet subsets.
</p>
<p>
Both Series and DataSets are searchable using the <a href="/gds/">GEO DataSets</a>
interface, but only DataSets form the basis of GEO's advanced data display and analysis tools,
including gene expression profile charts and DataSet clusters.
Not all submitted data are suitable for DataSet assembly and we are experiencing a backlog in DataSet creation,
so not all Series have corresponding DataSet record(s).
</p>
<p>
For more information, see the <a href="/geo/info/datasets.html">About GEO DataSets</a> page.
</p>
<p class="example"><a href="/sites/GDSbrowser?acc=GDS2225">Example DataSet record »</a></p>
</td>
<td class="letter">H</td>
<td class="center"><img alt="Cluster image" src="/geo/img/thumbcluster.png" /></td>
</tr>
<tr>
<th>Profile</th>
<td>
<a id="i"></a>
<h3>Profiles are derived from DataSets</h3>
<p>
A Profile consists of the expression measurements for an individual gene across all Samples in a DataSet.
Profiles can be searched using the <a href="/geoprofiles/">GEO Profiles</a> interface.
</p>
<p>
For more information, see the <a href="/geo/info/profiles.html">About GEO Profiles</a> page.
</p>
<p class="example"><a class="example" href="/geoprofiles?term=GDS2225[ACCN]">Example Profile records »</a></p>
</td>
<td class="letter">I</td>
<td><img alt="Profile image" src="/geo/img/profileIcon.png" /></td>
</tr>
</tbody>
</table>
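<p>
As a rough illustration of how a Profile relates to a DataSet, the sketch below treats a DataSet as a value table
with one row per gene and one column per Sample, plus subsets that group Samples by experimental factor.
All identifiers, values and function names are invented for this example and do not reflect GEO's internal storage.
</p>

```python
# Hypothetical DataSet: every Sample shares the same Platform, so each
# gene has exactly one value per Sample. All numbers are made up.
samples = ["GSM_A", "GSM_B", "GSM_C", "GSM_D"]
values = {
    "geneX": [4.0, 6.0, 10.0, 12.0],
    "geneY": [7.0, 7.0, 8.0, 8.0],
}

# DataSet subsets: Samples grouped by an experimental factor.
subsets = {
    "control": ["GSM_A", "GSM_B"],
    "treated": ["GSM_C", "GSM_D"],
}


def profile(gene: str) -> dict[str, float]:
    """Expression measurements for one gene across all Samples."""
    return dict(zip(samples, values[gene]))


def subset_means(gene: str) -> dict[str, float]:
    """Mean value of a gene's Profile within each DataSet subset."""
    measurements = profile(gene)
    return {
        name: sum(measurements[s] for s in members) / len(members)
        for name, members in subsets.items()
    }
```

<p>
With these made-up numbers, <code>subset_means("geneX")</code> gives 5.0 for the control subset and 11.0 for the treated subset.
</p>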
<a name="query" id="query"></a>
<h2>Query and analysis <a href="#top" class="arrow" title="Back to top"></a></h2>
<p>GEO data can be retrieved and analyzed in several ways:</p>
<ul class="geo_doc_list">
<li>
<span>
To look at a particular GEO record for which you have the accession number,
use the <i>GEO accession box</i> located on the <a href="/geo/">GEO homepage</a> or at the top of each GEO record.
</span>
</li>
<li>
<span>
To download data, see the various options described on the <a href="/geo/info/download.html">Download GEO data</a> page.
</span>
</li>
<li>
<span>
To quickly locate data relevant to your interests, search
<a href="/gds/">GEO DataSets</a> and
<a href="/geoprofiles/">GEO Profiles</a>:
</span>
<ul>
<li>
<p class="last">
<a href="/gds/">GEO DataSets</a> is a <em>study-level</em>
database which users can search for studies relevant to their interests.
The database stores descriptions of all original submitter-supplied records, as well as curated DataSets.
More information about GEO DataSets and how to interpret GEO DataSets results pages
can be found on the <a href="/geo/info/datasets.html">About GEO DataSets</a> page.
</p>
</li>
<li>
<p class="last">
<a href="/geoprofiles/">GEO Profiles</a> is a <em>gene-level</em>
database which users can search for gene expression profiles relevant to their interests.
More information about GEO Profiles and how to interpret GEO Profiles results pages
can be found on the <a href="/geo/info/profiles.html">About GEO Profiles</a> page.
</p>
</li>
</ul>
<p>
GEO DataSets and GEO Profiles searches may be effectively performed by simply entering appropriate keywords and phrases into the search box.
However, given the large volumes of data stored in these databases,
it is often useful to perform more refined queries in order to filter down to the most relevant data.
Examples and full details about how to perform sophisticated queries are provided on the
<a href="/geo/info/qqtutorial.html">Querying GEO DataSets and GEO Profiles</a> page. Additionally,
the <em>Advanced Search</em> tool,
linked at the head of the GEO DataSets and GEO Profiles pages, assists greatly in the construction of complex queries:
</p>
<ul>
<li><a href="/gds/advanced">GEO DataSets Advanced Search</a></li>
<li><a href="/geoprofiles/advanced">GEO Profiles Advanced Search</a></li>
</ul>
</li>
<li>
<span>
Once you have identified a DataSet of interest, there are several features on the DataSet record that help
identify interesting gene expression profiles within that study, including a t-test tool and clusters.
Full information about these features is provided on the <a href="/geo/info/datasets.html">About GEO DataSets</a> page.
</span>
</li>
<li class="last">
<span>
Once you have identified gene expression profiles of interest,
there are several links on the Profile records that help identify additional genes of interest,
including similarly expressed genes or genes within close proximity on the chromosome.
Full information about these links is provided on the <a href="/geo/info/profiles.html">About GEO Profiles</a> page.
</span>
</li>
</ul>
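<p>
The example links on this page follow two URL patterns: <code>/geo/query/acc.cgi?acc=...</code> for a single record
and <code>/geoprofiles?term=...[ACCN]</code> for the Profiles of a DataSet. A minimal sketch of building such links
is shown below. It assumes the relative links resolve against <code>https://www.ncbi.nlm.nih.gov</code> (the host used
by this page's script URL); the function names are hypothetical, and no request is sent.
</p>

```python
from urllib.parse import urlencode, urljoin

# Assumption: relative links on this page resolve against this host.
BASE = "https://www.ncbi.nlm.nih.gov"


def accession_url(accession: str) -> str:
    """Link to one GEO record, mirroring the page's acc.cgi example links."""
    return urljoin(BASE, "/geo/query/acc.cgi") + "?" + urlencode({"acc": accession})


def profiles_for_dataset_url(gds_accession: str) -> str:
    """Link to the Profiles of a DataSet, mirroring the [ACCN] example link."""
    term = f"{gds_accession}[ACCN]"
    # urlencode percent-encodes the square brackets in the search term.
    return urljoin(BASE, "/geoprofiles") + "?" + urlencode({"term": term})


if __name__ == "__main__":
    print(accession_url("GPL341"))
    print(profiles_for_dataset_url("GDS2225"))
```

<p>
The functions only build strings, so they can be checked without network access before being passed to a browser or an HTTP client.
</p>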
</div>
</div>
<div id="last_mod">
Last modified: July 16, 2024</div>
<div id="footer">
<span class="helpbar">|<a href="https://www.nlm.nih.gov"> NLM </a>|<a href="https://www.nih.gov"> NIH </a>|<a href="mailto:geo@ncbi.nlm.nih.gov"> Email GEO </a>|<a href="/geo/info/disclaimer.html"> Disclaimer </a>|<a href="https://www.nlm.nih.gov/accessibility.html"> Accessibility </a>|<a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html"> HHS Vulnerability Disclosure </a>|
</span>
</div>
</div>
<script type="text/javascript" src="https://www.ncbi.nlm.nih.gov/portal/portal3rc.fcgi/rlib/js/InstrumentOmnitureBaseJS/InstrumentNCBIBaseJS/InstrumentPageStarterJS.js"></script>
</body>
</html>