<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Download GEO data - GEO - NCBI</title>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="author" content="geo" />
<meta name="keywords" content="NCBI, national institutes of health, nih, database, archive, central, bioinformatics, biomedicine, geo, gene, expression, omnibus, chips, microarrays, oligonucleotide, array, sage, CGH" />
<meta name="description" content="Gene Expression Omnibus (GEO) is a database repository of high throughput gene expression data and hybridization arrays, chips, microarrays." />
<meta name="ncbiaccordion" content="collapsible: true, active: false" />
<meta name="ncbi_app" content="geo" />
<meta name="ncbi_pdid" content="documentation" />
<meta name="ncbi_page" content="Download GEO data" />
<link rel="shortcut icon" href="/geo/img/OmixIconBare.ico" />
<link rel="stylesheet" type="text/css" href="/geo/css/reset.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/nav.css" />
<link rel="stylesheet" type="text/css" href="/geo/css/info.css" />
<script type="text/javascript" src="/core/jig/1.15.10/js/jig.min.js"></script>
<script type="text/javascript" src="/geo/js/dd_menu.js"></script>
<script type="text/javascript" src="/geo/js/info.js"></script>
<script type="text/javascript">
jQuery.getScript("/core/alerts/alerts.js", function () {
galert(['#crumbs_login_bar', 'body > *:nth-child(1)'])
});
</script>
<script type="text/javascript">
var ncbi_startTime = new Date();
</script>
</head>
<body id="info" class="download">
<div id="all">
<div id="page">
<div id="header">
<div id="ncbi_logo">
<a href="/">
<img src="/geo/img/ncbi_logo.gif" alt="NCBI Logo" />
</a>
</div>
<div id="geo_logo">
<a href="/geo/"><img src="/geo/img/geo_main.gif" alt="GEO Logo" /></a>
</div>
</div>
<div id="nav_bar">
<ul id="geo_nav_bar">
<li><a href="#">GEO Publications</a>
<ul class="sublist">
<li><a href="/geo/info/GEOHandoutFinal.pdf">Handout</a></li>
<li><a href="/pmc/articles/PMC10767856/">NAR 2024 (latest)</a></li>
<li><a href="/pmc/articles/PMC99122/">NAR 2002 (original)</a></li>
<li><a href="/pmc/?term=10767856,4944384,3531084,3341798,3013736,2686538,2270403,1669752,1619900,1619899,539976,99122">All publications</a></li>
</ul>
</li>
<li><a href="/geo/info/faq.html">FAQ</a></li>
<li><a href="/geo/info/MIAME.html" title="Minimum Information About a Microarray Experiment">MIAME</a></li>
<li><a href="mailto:geo@ncbi.nlm.nih.gov">Email GEO</a></li>
</ul>
</div>
<div id="crumbs_login_bar"><a title="NCBI home page" href="/">NCBI</a> »
<a id="curr_page" title="GEO home page" href="/geo/">GEO</a> »
<a title="GEO documentation guide" href="/geo/info/">Info</a> »
<span>Download GEO data</span><span id="login_status"><a href="/geo/submitter/" title="Click here to login. You need to do this only if you want to edit the contact information, submit data, see your unreleased data, or work with data already submitted by you. You do not need to login if you are here just to browse through public holdings">Login</a></span></div>
<div id="content">

<h1>Download GEO data</h1>

<p>
All data in GEO can be downloaded in a variety of formats through several mechanisms.
The following sections list the available download options and formats.
</p>

<h2>Download original GEO records <span class="toggle">Expand all</span></h2>

<div class="jig-ncbiaccordion">
<h4><a href="#">Links on Series records</a></h4>
<div>
<p>
Links to experiment family downloads in various formats and supplementary files
are provided at the foot of each GEO Series record.
These files are compressed using gzip (.gz or .tgz extension).
To unzip and read these files, please use a utility such as WinZip or
<a href="http://www.7-zip.org/">7-Zip</a>.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">FTP download</a></h4>
<div>
<p>
All GEO records and raw data files are freely available for bulk download from our
<a href="ftp://ftp.ncbi.nlm.nih.gov/geo/">FTP site</a>. Please see our
<a href="ftp://ftp.ncbi.nlm.nih.gov/geo/README.txt">README</a>
for details on directory structure and file formats.
GEO now holds so many submissions that some parent directories
can no longer be listed in a web browser without timing out.
In such cases, bypass the parent directory and go directly to the target directory,
e.g., for Series GSE1000:
</p>
<p>
<a href="ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE1nnn/GSE1000/matrix/">
ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE1nnn/GSE1000/matrix/
</a>
</p>
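<p>
The target directory can also be derived from the accession itself. The following Python sketch is
illustrative only: the function name <code>geo_series_ftp_url</code> is hypothetical, and the rule it applies
(replace the last three digits of the accession with "nnn", and place accessions with three or fewer digits under "GSEnnn")
is an assumption generalized from the GSE1000 example above. Please check the README for the authoritative layout.
</p>

```python
def geo_series_ftp_url(accession, subdir="matrix"):
    """Build the direct FTP URL for a GEO Series directory.

    Assumption: the range directory is the accession with its last three
    digits replaced by "nnn" (GSE1000 gives GSE1nnn), and accessions with
    three or fewer digits fall under "GSEnnn".
    """
    acc = accession.strip().upper()
    digits = acc[3:]
    if not acc.startswith("GSE") or not digits.isdigit():
        raise ValueError("expected a Series accession such as GSE1000")
    # Drop the last three digits to get the range prefix (may be empty).
    range_dir = "GSE" + digits[:-3] + "nnn"
    return "ftp://ftp.ncbi.nlm.nih.gov/geo/series/{}/{}/{}/".format(
        range_dir, acc, subdir
    )


print(geo_series_ftp_url("GSE1000"))
```

<p>
For GSE1000 this prints the same address as the link shown above.
</p>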
<p>
Note that most files on the FTP site are compressed using gzip (.gz or .tgz extension).
To unzip and read these files, please use a utility such as WinZip or
<a href="http://www.7-zip.org/">7-Zip</a>.
Alternatively, on a UNIX-like system, use the tar and gunzip commands to extract the files, e.g.,
</p>
<em>Command line:</em>
<code>
$ tar -xf GSExxxx_RAW.tar<br />
$ gunzip *.gz
</code>
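<p>
Where the tar and gunzip commands are not available, the same two steps can be approximated with the Python
standard library. This is a minimal sketch rather than a supported tool: the function name
<code>extract_raw_archive</code> and its arguments are hypothetical, and it assumes the archive holds
gzip-compressed files at its top level.
</p>

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def extract_raw_archive(tar_path, out_dir):
    """Unpack a tar archive, then decompress every .gz file it contained."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Equivalent of: tar -xf GSExxxx_RAW.tar
    # Only unpack archives obtained from a source you trust.
    with tarfile.open(tar_path) as archive:
        archive.extractall(out_dir)
    unpacked = []
    # Equivalent of: gunzip *.gz (the compressed copy is removed afterwards)
    for gz_path in sorted(out_dir.glob("*.gz")):
        target = gz_path.with_suffix("")
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()
        unpacked.append(target)
    return unpacked
```

<p>
The function returns the paths of the decompressed files, so a caller can pass them straight to whatever reads the raw data.
</p>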
<p>
More general information about accessing NCBI's FTP server and optimizing bulk FTP transfers is provided
<a href="ftp://ftp.ncbi.nlm.nih.gov/pub/README.ftp">here</a>.
</p>
<p>
If you plan to perform a very large number or volume of downloads, you might consider high-throughput file transfer
using Aspera Connect; please contact us at <a href="mailto:geo@ncbi.nlm.nih.gov?subject=Aspera instructions">geo@ncbi.nlm.nih.gov</a> for details.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Accession Display Bar</a></h4>
<div>
<p>
The <a href="/geo/query/acc.cgi">Accession Display Bar</a> is found at the top of each GEO record
and can be used to download or view complete or partial records, or related Platform,
Sample and Series records. <em>Scope</em> selects which records are displayed: the accession itself (Self),
related records of one type (Platform, Sample, or Series), or all related records (Family).
<em>Amount</em> dictates the quantity of data displayed, with choices including metadata only (Brief),
metadata and the first 20 rows of the data table (Quick), data table only (Data),
or full metadata/data table records (Full).
<em>Format</em> controls whether records are displayed in HTML, SOFT (plain text) or MINiML (XML) format.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Construct a URL</a></h4>
<div>
<p>
An alternative to using the Accession Display Bar described above is to construct a URL to retrieve data.
URLs are formatted as in the following example:
</p>
<p>
<a href="/geo/query/acc.cgi?acc=gpl96&targ=self&view=brief&form=text">
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?<b>acc</b>=gpl96&<b>targ</b>=self&<b>view</b>=brief&<b>form</b>=text
</a>
</p>
<p>
This URL will retrieve a text file containing the 'brief' view of accession GPL96.
</p>
<p class="last">
The possible values for each component are:
</p>
<ul class="info-list">
<li>
<span>
<em>acc</em> = a valid GEO accession, i.e., gplxxx, gsmxxx or gsexxx
</span>
</li>
<li>
<span>
<em>targ</em> = self, gsm, gpl, gse or all
</span>
</li>
<li>
<span>
<em>view</em> = brief, quick, data or full
</span>
</li>
<li>
<span>
<em>form</em> = text, html or xml
</span>
</li>
</ul>

<p>
Note that your browser may time out when HTML format is selected for particularly large retrievals.
Alternatively, if you have Perl, you can use this mechanism to retrieve data as follows:
</p>

<code>
$ perl -MLWP::Simple -e "getprint 'https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM313800&targ=self&view=full&form=text'"
</code>
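<p>
The URL can also be assembled in a script from the four components listed above. The Python sketch below is
one possible way to do this; the names <code>build_geo_url</code>, <code>BASE_URL</code> and <code>ALLOWED</code>
are hypothetical, and the permitted values are simply copied from the list above.
</p>

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Permitted values, copied from the component list in this section.
ALLOWED = {
    "targ": {"self", "gsm", "gpl", "gse", "all"},
    "view": {"brief", "quick", "data", "full"},
    "form": {"text", "html", "xml"},
}


def build_geo_url(acc, targ="self", view="brief", form="text"):
    """Assemble an acc.cgi URL from the acc, targ, view and form components."""
    chosen = {"targ": targ, "view": view, "form": form}
    for name, value in chosen.items():
        if value not in ALLOWED[name]:
            raise ValueError(
                "{}={!r} is not one of {}".format(name, value, sorted(ALLOWED[name]))
            )
    # Dictionary order is preserved, so the query reads acc, targ, view, form.
    return BASE_URL + "?" + urlencode({"acc": acc, **chosen})


print(build_geo_url("gpl96"))
```

<p>
With the default arguments this reproduces the GPL96 example URL shown earlier in this section.
The resulting string can then be fetched with any HTTP client, for example <code>urllib.request.urlopen</code>.
</p>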
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Programmatic access</a></h4>
<div>
<p>
GEO record metadata can be accessed and retrieved programmatically using a suite of programs called the
Entrez Programming Utilities (E-Utils); see <a href="/geo/info/geo_paccess.html">more information</a>.
</p>
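<p>
As a rough illustration of what such a request looks like, the Python sketch below builds an ESearch URL for the
GEO DataSets database (<code>db=gds</code>). The function name <code>gds_esearch_url</code> and the example search
term are assumptions made for illustration; please consult the linked page for the supported fields and usage guidelines.
</p>

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gds_esearch_url(term, retmax=20):
    """Build an ESearch URL against the GEO DataSets ("gds") database."""
    params = {"db": "gds", "term": term, "retmax": retmax}
    # urlencode percent-encodes the brackets and turns spaces into "+".
    return EUTILS_BASE + "/esearch.fcgi?" + urlencode(params)


# Example term (an assumption for illustration): Series on platform GPL96.
print(gds_esearch_url("GPL96[ACCN] AND gse[ETYP]"))
```

<p>
The sketch only constructs the URL; sending the request and parsing the returned record identifiers are left to the caller.
</p>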
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Entrez GEO DataSets query downloads</a></h4>
<div>
<p>
All original records can be searched and retrieved using the <a href="/gds/">Entrez GEO DataSets</a> interface.
Results can be exported by choosing 'Send to: File' in the toolbar at the top of the page.
</p>
</div>
</div>

<h2>Download curated DataSets and Profiles <span class="toggle">Expand all</span></h2>

<div class="jig-ncbiaccordion">
<h4><a href="#">Links on DataSet records</a></h4>
<div>
<p>
Links to <em>DataSet SOFT files</em> are available under the 'download' button on each DataSet record.
These files are compressed using gzip (.gz or .tgz extension).
To unzip and read these files, please use a utility such as WinZip or <a href="http://www.7-zip.org/">7-Zip</a>.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">FTP download</a></h4>
<div>
<p>
All GEO DataSet records are freely available for bulk download from our
<a href="ftp://ftp.ncbi.nlm.nih.gov/geo/datasets/">FTP site</a>.
These files are compressed using gzip (.gz extension).
To unzip and read these files, please use a utility such as WinZip or <a href="http://www.7-zip.org/">7-Zip</a>.
Alternatively, on a UNIX-like system, use the gunzip command to uncompress the files, e.g.,
</p>
<em>Command line:</em>
<code>
$ gunzip *.gz
</code>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Programmatic access</a></h4>
<div>
<p>
GEO record metadata can be accessed and retrieved programmatically using a suite of programs called the
Entrez Programming Utilities (E-Utils); see <a href="/geo/info/geo_paccess.html">more information</a>.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Profile values downloads</a></h4>
<div>
<p>
Use the 'Download profile data' button at the top of <a href="/geoprofiles/">Entrez GEO Profiles</a> retrieval pages
to download the expression values of genes found in your query.
</p>
</div>
</div>

<div class="jig-ncbiaccordion">
<h4><a href="#">Entrez GEO DataSets and Entrez GEO Profiles query downloads</a></h4>
<div>
<p>
<a href="/gds/">Entrez GEO DataSets</a> and
<a href="/geoprofiles/">Entrez GEO Profiles</a> document summaries can be exported by choosing
'Send to: File' in the toolbar at the top of the page.
</p>
</div>
</div>

</div>
</div>
<div id="last_mod">
Last modified: July 16, 2024</div>
<div id="footer">
<span class="helpbar">|<a href="https://www.nlm.nih.gov"> NLM </a>|<a href="https://www.nih.gov"> NIH </a>|<a href="mailto:geo@ncbi.nlm.nih.gov"> Email GEO </a>|<a href="/geo/info/disclaimer.html"> Disclaimer </a>|<a href="https://www.nlm.nih.gov/accessibility.html"> Accessibility </a>|<a href="https://www.hhs.gov/vulnerability-disclosure-policy/index.html"> HHS Vulnerability Disclosure </a>|
</span>
</div>
</div>
<script type="text/javascript" src="https://www.ncbi.nlm.nih.gov/portal/portal3rc.fcgi/rlib/js/InstrumentOmnitureBaseJS/InstrumentNCBIBaseJS/InstrumentPageStarterJS.js"></script>
</body>
</html>