<html>
<head>
<title>NCBI Resource Guide</title>
<!--<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">-->
<meta http-equiv="Refresh" content="0; URL=/guide/sitemap/">
<meta name="author" content="NCBI_user_services">
<meta name="keywords" content="national center for biotechnology
information, ncbi, national library of medicine, nlm, national institutes of
health, nih, database, archive, pubmed, bookshelf, bioinformatics,
biomedicine, chromosome, genetics, genome, molecular modeling, taxonomy,
homology, phylogeny, malaria, sky/cgh, refseq, retrovirus, map viewer,
science primer, site map, sitemap, glossary, dictionary, definition, resource
descriptions, summary, guide, overview, list, listing, directory, index,
catalog,
education, teach, teacher, student, introduction, abbreviation, acronym">
<meta name="description" content="The NCBI Resource Guide is a catalog of
resources that allows browsing either by category or by an alphabetical
listing.">
<style TYPE="text/css">
A {color: #336699; text-decoration:none;}
A:hover {text-decoration:underline;}
I { font-family: Arial, Helvetica, sans-serif;}
P { margin-top: 4pt; margin-bottom: 6pt}
.BAR {font-family: Arial, Helvetica, sans-serif; font-size: 10pt; color:
#FFFFFF;}
.HEAD1 {font-family: Arial, Helvetica, sans-serif; font-size: 14pt; color:
#336699; font-weight: bold}
.HEAD2 {font-family: Arial, Helvetica, sans-serif; font-size: 12pt; color:
#336699; font-weight: bold}
.HEAD3 {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#336699; font-weight: bold}
.HEAD3a {font-family: Arial, Helvetica, sans-serif; font-size: 11pt;
color:
#FFFFFF; font-weight: bold}
.HEAD3b {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#336699}
.HEAD4 {font-family: Arial, Helvetica, sans-serif; font-size: 10pt; color:
#336699; font-weight: bold}
.HEAD4a {font-family: Arial, Helvetica, sans-serif; font-size: 10pt;
color:
#FFFFFF; font-weight: bold}
.H1 {font-family: Arial, Helvetica, sans-serif; font-size: 14pt; color:
#336699; font-weight: bold}
.H2 {font-family: Arial, Helvetica, sans-serif; font-size: 12pt; color:
#336699; font-weight: bold}
.H2a {font-family: Arial, Helvetica, sans-serif; font-size: 12pt; color:
#FFFFFF; font-weight: bold}
.H3 {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#336699; font-weight: bold}
.H3a {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#FFFFFF; font-weight: bold}
.H3b {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#336699}
.H4 {font-family: Arial, Helvetica, sans-serif; font-size: 10pt; color:
#336699; font-weight: bold}
.H4a {font-family: Arial, Helvetica, sans-serif; font-size: 10pt; color:
#FFFFFF; font-weight: bold}
.LARGE {font-family: Arial, Helvetica, sans-serif; font-size: 14pt; color:
#000000;}
.MEDIUM {font-family: Arial, Helvetica, sans-serif; font-size: 12pt; color:
#000000;}
.NORMAL {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#000000;}
.SMALL {font-family: Arial, Helvetica, sans-serif; font-size: 10pt; color:
#000000;}
.TEXT {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#000000;}
.TEXT2 {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#336699;}
.TEXT2a {font-family: Arial, Helvetica, sans-serif; font-size: 11pt; color:
#FFFFFF;}
</style>
</head>
<body text="#000000" bgcolor="#FFFFFF" link="#CC6600" vlink="#CC6600">
<a NAME="Top"></a>
<!-- ========== PAGE HEADER ============= -->
<table BORDER="0" CELLSPACING=0 CELLPADDING=0 WIDTH="600" >
<tr>
<td WIDTH="140"><a href="http://www.ncbi.nlm.nih.gov/"><img SRC="left.GIF"
BORDER="0" height=45 width=130></a></td>
<td VALIGN=BOTTOM WIDTH="460" class="H1">NCBI Resource Guide</td>
</tr>
</table>
<!-- ============= PAGE_STRUCTURE ==================
HORIZONTAL_NAVIGATION_BAR
TABLE_OF_CONTENTS_AND_ALPHA_TABLE
LEFT_SIDE_OF_TABLE
RIGHT_SIDE_OF_TABLE
LEGEND_TABLE
ABOUT_NCBI
WhatsNew
NCBINews
Programs
Fellows
OrganizationalStructure
Contact
EmailLists
GENBANK
GENBANK OVERVIEW
GENBANK SUBMISSIONS
SPECIAL SUBMISSIONS TO GENBANK
OTHER (NON-GENBANK) TYPES OF DATA SUBMISSIONS
COLLABORATION
FTP GENBANK
MOLECULAR DATABASES
MOLECULAR DATABASES: NUCLEOTIDES
MOLECULAR DATABASES: PROTEINS
MOLECULAR DATABASES: STRUCTURES
MOLECULAR DATABASES: GENES
MOLECULAR DATABASES: EXPRESSION
MOLECULAR DATABASES: TAXONOMY
LITERATURE
GENOMES_AND_MAPS
: MULTIPLE ORGANISMS
: HUMAN
: GUIDE
: CHROMOSOMES
: SEQUENCES
: GENES
: BLAST
: CLONES
: MAPS
: MAPPED MARKERS
: CYTOGENETICS
: GENE EXPRESSION
: GENETIC VARIATION
: DISORDERS
: CANCER RESEARCH
: FTP
: MOUSE
MOUSE GENOME SUB-CATEGORY: GUIDE
MOUSE GENOME SUB-CATEGORY: CHROMOSOMES
MOUSE GENOME SUB-CATEGORY: SEQUENCES
MOUSE GENOME SUB-CATEGORY: GENES
MOUSE GENOME SUB-CATEGORY: CLONES
MOUSE GENOME SUB-CATEGORY: MAPS
MOUSE GENOME SUB-CATEGORY: CYTOGENETICS
MOUSE GENOME SUB-CATEGORY: BLAST
MOUSE GENOME SUB-CATEGORY: FTP
: RAT
: COW
: ZEBRAFISH
: DROSOPHILA
: NEMATODE
: PLANTS
: YEAST
: MALARIA
: MICROBIAL GENOMES
: VIRUSES
: VIROIDS
: PLASMIDS
: EUKARYOTIC_ORGANELLES
TOOLS
: TEXT SEARCHING
: SEQUENCE SIMILARITY SEARCHING
: NUCLEOTIDE SEQUENCE ANALYSIS
: PROTEIN SEQUENCE ANALYSIS
: 3-D STRUCTURE DISPLAY AND SIMILARITY SEARCHING
: GENOME ANALYSIS
: GENE EXPRESSION
RESEARCH
SOFTWARE ENGINEERING
EDUCATION
: NEWS
: BOOKS
: GLOSSARIES
: TUTORIALS
: COURSES
: ADDITIONAL RESOURCES
FTP_SITE
CATEGORY WITHIN FTP_SITE: DATABASES
CATEGORY WITHIN FTP_SITE: GENOMES
HUMAN_GENOME_PROJECT_FTP_DATA
OTHER_GENOMES_FTP_DATA
CATEGORY WITHIN FTP_SITE: SOFTWARE
Revised
================ END_PAGE_STRUCTURE ============ -->
<!-- ================= HORIZONTAL_NAVIGATION_BAR ================= -->
<table CLASS="BAR" border="0" width="100%" cellspacing="0" cellpadding="3"
bgcolor="#003366">
<tr CLASS="BAR" align="CENTER">
<td width="16%"><a href="/entrez/" class="BAR">PubMed</a></td>
<td width="16%"><a href="/Entrez/" class="BAR">Entrez</a></td>
<td width="16%"><a href="/BLAST/" class="BAR">BLAST</a></td>
<td width="16%"><a href="/entrez/query.fcgi?db=OMIM"
class="BAR">OMIM</a></td>
<td width="16%"><a href="/Taxonomy/taxonomyhome.html"
class="BAR">Taxonomy</a></td>
<td width="16%"><a href="/Structure/" class="BAR">Structure</a></td>
</tr>
</table>
<!-- ======================= the contents ========================== -->
<!-- ================ INTRODUCTORY_NOTE ================= -->
<p>
<table BGCOLOR="#FFFFFF" BORDER="0" CELLSPACING=0 CELLPADDING=3 WIDTH="100%">
<tr>
<td align="center" class="TEXT2">
<i>Each link in this <b>Resource Guide</b> leads to a <b>brief description of
the
resource</b> on this page, then to the resource itself. A graphical <a
href="index.html"><b><FONT color="#CD5555">Site Map</FONT></b></a> and an <a
href="AlphaList.html"><b><FONT color="#CD5555">Alphabetical Quicklinks
Table</FONT></b></a> provide direct links to resources and bypass the
descriptions.</i>
</td>
</tr>
</table>
</p>
<!-- OLD TEXT: Each link in this <b>Resource Guide</b> leads to a <b>brief
description of the resource</b> on this page, then to the resource itself. This
guide allows browsing either by category or by an alphabetical listing.
Separate
files provide a graphical <a href="index.html"><b><FONT color="CD5555">Site
Map</FONT></b></a> and an <a href="AlphaList.html"><b><FONT
color="CD5555">Alphabetical Quicklinks Table</FONT></b></a>, which have
<b>direct
links</b> to resources and bypasses the descriptions -->
<!-- ===========TABLE_OF_CONTENTS_AND_ALPHA_TABLE============== -->
<table WIDTH="100%" BORDER="0" CELLSPACING=0 CELLPADDING=0>
<!-- ============== LEFT_SIDE_OF_TABLE================ -->
<tr>
<td WIDTH="40%" BGCOLOR="#FFFFFF" valign="top">
<img SRC="spacer10.GIF" height="3" width="10" border=0><BR>
<table WIDTH="100%" BGCOLOR="#FFFFFF" BORDER="0" CELLSPACING=0 CELLPADDING=0>
<tr><td BGCOLOR="#e0eee0" CLASS="H2">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<img SRC="spacer10.GIF" height="5" width="5" border=0>
RESOURCES BY CATEGORY<BR>
<img SRC="spacer10.GIF" height="5" width="10" border=0>
</td>
</tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="7" width="10" border=0><BR>
<b><a href="#AboutNCBI">About NCBI</a></b></td>
</tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#Programs">programs and services</a>,
<a href="#Contact">contact information</a>,
<a href="#NCBIHandbook">NCBI handbook</a>,
<a href="#News"><b>news</b></a> (<a href="#WhatsNew">what's new</a>, <a
href="#NCBINews">NCBI News</a>, <a href="#EmailLists">announcements
e-mail lists</a>, <a href="#RSSfeeds">RSS feeds</a>),
<a href="#ExhibitSchedule">exhibit schedule</a>,
<a href="#Fellows">postdoctoral fellowships</a>,
<a href="#OrganizationalStructure">organizational structure</a>,
<a href="#Statistics">resource statistics</a>,
<a href="#SiteSearch">site search</a></blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#GenBank">GenBank</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#Overview">overview</a>, <a href="#Submissions">submit
sequences</a>, <a href="#SubmitGenomes">submit genomes</a>, <a
href="#SampleRecord">sample record</a>, <a href="#GenBankDivisions">GenBank
divisions</a>, <a href="#GenBankStatistics">statistics</a>, <a
href="#GenBankReleaseNotes">release notes</a>, <a
href="#Collaboration">international collaboration</a>, <a href="#FTPGenBank">FTP
GenBank</a></blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Databases">Molecular Databases</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#Nucleotides">nucleotides</a>, <a
href="#Proteins">proteins</a>, <a href="#Structures">structures</a>, <a
href="#Genes">genes</a>, <a href="#Expression">gene expression</a>, <a
href="#Taxonomy">taxonomy</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Literature">Literature Databases</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#PubMed">PubMed</a>, <a
href="#PubMedCentral">PubMed Central</a>, <a href="#Journals">Journals</a>, <a
href="#OMIM">OMIM</a>, <a href="#Books">Books</a>, <a
href="#CitationMatcher">Citation Matcher</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Genomes">Genomes and Maps</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote>
<a href="#MultipleOrganisms">organism collections</a> (including
<a href="#EntrezGenome">Entrez Genome</a>,
<a href="#EntrezGenomeProject">Entrez Genome Project</a>,
<a href="#MapViewer">Map Viewer</a>,
<a href="#GenomesEntrezGene">Entrez Gene</a>,
<a href="#GenomesUniGene">UniGene</a>,
<a href="#GenomesHomoloGene">HomoloGene</a>, and
<a href="#COGs">COGs</a>), and organism-specific resources, such as:
<a href="#HumanGenome">human</a>,
<a href="#MouseGenome">mouse</a>,
<a href="#RatGenome">rat</a>,
<a href="#CowGenome">cow</a>,
<a href="#ZebrafishGenome">zebrafish</a>,
<a href="#DrosophilaGenome"><i>Drosophila</i></a>,
<a href="#NematodeGenome">nematode</a>,
<a href="#PlantGenomes">plant genomes</a>,
<a href="#YeastGenome">yeast</a>,
<a href="#MalariaGenome">malaria</a>,
<a href="#MicrobialGenomes">microbial genomes</a>,
<a href="#ViralGenomes">viruses</a>,
<a href="#ViroidGenomes">viroids</a>,
<a href="#Plasmids">plasmids</a>,
<a href="#EukaryoticOrganelles">eukaryotic organelles</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Tools">Tools</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#Entrez">Entrez</a>, <a href="#LinkOut">LinkOut</a>,
<a href="#MyNCBI">My NCBI</a>, <a href="#BLAST">BLAST</a>,
<a href="#NucleotideSequenceAnalysis">nucleotide sequence analysis</a>,
<a href="#ProteinSequenceAnalysis">protein sequence analysis</a>,
<a href="#StructureTools">3-D structure display and similarity searching</a>,
<a href="#GenomeAnalysisTools">genome analysis</a>,
<a href="#GeneExpressionTools">gene expression</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Research">Research at NCBI</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#CBB">Computational Biology Branch (CBB)</a>, <a
href="#SeniorInvestigatorsInPubMed">senior investigators in PubMed</a>, <a
href="#SeminarSchedule">seminar schedule</a>, <a
href="#PostdoctoralFellows">postdoctoral fellowships</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#SoftwareEngineering">Software Engineering</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#IEB">IEB home page</a>, <a href="#ToolBox">NCBI
ToolBox</a>,
<a href="#IEB_Research">R&amp;D projects</a>, <a href="#ASN.1">ASN.1</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#Education">Education</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote>
<a href="#News">news</a>,
<a href="#SciencePrimer">science primer</a>,
<a href="#EducationBooks">books</a>,
<a href="#Glossaries">glossaries</a>,
<a href="#Tutorials">tutorials</a>,
<a href="#Courses">courses</a>,
<a href="#AdditionalResources">additional resources</a>
</blockquote></td></tr>
<tr><td BGCOLOR="#FFFFFF" class="H3">
<img SRC="spacer10.GIF" height="5" width="10" border=0><BR>
<b><a href="#FTPSite">FTP Site</a></b></td></tr>
<tr><td BGCOLOR="#FFFFFF" CLASS="TEXT">
<blockquote><a href="#FTPDatabases">download databases</a>, <a
href="#FTP_Genomes">genomes</a>, and <a href="#FTPSoftware">software</a>, <a
href="#FTP_ToolBox">NCBI Software ToolBox</a>
</blockquote></td></tr>
</table>
</td>
<!-- ============== END_LEFT_SIDE_OF_TABLE================ -->
<!-- ============== RIGHT_SIDE_OF_TABLE=================== -->
<td WIDTH="60%" BGCOLOR="#FFFFFF" valign="top">
<table WIDTH="100%" BGCOLOR="#FFFFFF" BORDER="0" CELLSPACING=3 CELLPADDING=4>
<!-- ================ ROW 0 (HEADER) ================ -->
<tr>
<td BGCOLOR="#e0eee0" COLSPAN="3" ALIGN="CENTER">
<FONT CLASS="H2"><b>ALPHABETICAL INDEX</b></FONT>
<br><FONT CLASS="TEXT2"><b>with links to resource descriptions</b></FONT>
<br><FONT CLASS="TEXT2"><i>(To bypass descriptions, use the <a
href="AlphaList.html"><FONT color="#CD5555"><b>Alphabetical Quicklinks
Table</b></FONT></a>.)</i></FONT></td>
</tr>
<!-- ================ ROW 1 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#AboutNCBI">About NCBI</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SampleRecord">GenBank sample record</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#PlantGenomes">Plant Genomes</a></td>
</tr>
<!-- ================ ROW 2 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#EmailLists">Announcements</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Genes">Genes</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Proteins">Protein Sequences</a></td>
</tr>
<!-- ================ ROW 3 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ASN.1">ASN.1</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#GenesAndDisease">Genes and Disease</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#PubChem">PubChem</a><!-- img SRC="new.gif" height="12" width="31"
border=0 --></td>
</tr>
<!-- ================ ROW 4 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#BankIt">BankIt</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Genomes">Genomes</a> (<a href="#EntrezGenome">data</a>, <a
href="#EntrezGenomeProject">projects</a>, <a
href="#SubmitGenomes">submissions</a>)</td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#PubMed">PubMed</a></td>
</tr>
<!-- ================ ROW 5 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#BLAST">BLAST</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#GENSAT">GENSAT</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#PubMedCentral">PubMed Central</a></td>
</tr>
<!-- ================ ROW 6 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#BLink">BLink</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#GEO">GEO</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#RefSeq">RefSeq</a></td>
</tr>
<!-- ================ ROW 7 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Books">Books</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Glossaries">Glossaries</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Research">Research at NCBI</a></td>
</tr>
<!-- ================ ROW 8 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CancerChromosomes">Cancer Chromosomes</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#NCBIHandbook">Handbook</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ViralGenomes">Retroviruses</a></td>
</tr>
<!-- ================ ROW 9 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CCDS">CCDS</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#HIVInteractions">HIV Interactions</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SAGEmap">SAGEmap</a></td>
</tr>
<!-- ================ ROW 10 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CDART">CDART</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#HTG">HTGs</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SciencePrimer">Science Primer</a></td>
</tr>
<!-- ================ ROW 11 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CDD">CDD</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#HomoloGene">HomoloGene</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SeminarSchedule">Seminars</a></td>
</tr>
<!-- ================ ROW 12 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CGAP">CGAP</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#HumanGenome">Human Genome Resources</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Sequin">Sequin</a></td>
</tr>
<!-- ================ ROW 13 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Clones">Clones</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#HumanMouseMap">Human-Mouse Homology Maps</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ShortReadArchive">Short Read<br> Archive</a>
<img SRC="new.gif" height="12" width="31" border=0></td>
</tr>
<!-- ================ ROW 14 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Cn3D">Cn3D</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Journals">Journals</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SiteSearch">Site Search</a></td>
</tr>
<!-- ================ ROW 15 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CoffeeBreak">Coffee Break</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#LinkOut">LinkOut</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SKY_CGH">SKY/M-FISH &amp; CGH Database</a></td>
</tr>
<!-- ================ ROW 16 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#COGs">COGs</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MalariaGenome">Malaria</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#SoftwareEngineering">Software Engineering</a></td>
</tr>
<!-- ================ ROW 17 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#CBB">Computational Biology Branch</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MapViewer">Map Viewer</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Splign">Splign</a></td>
</tr>
<!-- ================ ROW 18 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Submissions">Data Submissions</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MeSH">MeSH</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Statistics">Statistics</a></td>
</tr>
<!-- ================ ROW 19 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#dbEST">dbEST</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MGC">MGC</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Structures">Structures</a></td>
</tr>
<!-- ================ ROW 20 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#dbGSS">dbGSS</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MicrobialGenomes">Microbial Genomes</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Submissions">Submit Data</a></td>
</tr>
<!-- ================ ROW 21 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#dbMHC">dbMHC</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MMDB">MMDB</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Taxonomy">Taxonomy</a></td>
</tr>
<!-- ================ ROW 22 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#dbSNP">dbSNP</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ModelMaker">Model Maker</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Tools">Tools</a></td>
</tr>
<!-- ================ ROW 23 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#dbSTS">dbSTS</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MutationDatabases">Mutation Databases</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#TPA">TPA</a></td>
</tr>
<!-- ================ ROW 24 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Education">Education</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#MyNCBI">My NCBI</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#TraceArchive">Trace Archives</a></td>
</tr>
<!-- ================ ROW 25 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ePCR">e-PCR</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="http://www.ncbi.nlm.nih.gov/">NCBI Home</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#UniGene">UniGene</a></td>
</tr>
<!-- ================ ROW 26 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Entrez">Entrez</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#NCBINews">NCBI News</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#UniSTS">UniSTS</a></td>
</tr>
<!-- ================ ROW 27 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#EntrezUtilities">Entrez Utilities</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Nucleotides">Nucleotide Sequences</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#VAST">VAST</a></td>
</tr>
<!-- ================ ROW 28 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#Expression">Expression</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#OMIM">OMIM</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#VecScreen">VecScreen</a></td>
</tr>
<!-- ================ ROW 29 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#FTPSite">FTP</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#OMSSA">OMSSA</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ViralGenomes">Viruses</a></td>
</tr>
<!-- ================ ROW 30 ================ -->
<tr>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#GenBank">GenBank</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#ORFFinder">ORF Finder</a></td>
<td WIDTH="33%" BGCOLOR="#e0eeee" CLASS="TEXT">
<a href="#WGS">WGS</a></td>
</tr>
<!-- ================ ROW 31 (SPACER ROW) ================ -->
<!-- tr>
<td BGCOLOR="#FFFFFF" COLSPAN="3" ALIGN="CENTER">
<IMG SRC="spacer10.GIF" width="15" height=1 border=0>
</td>
</tr -->
<!-- tr>
<td BGCOLOR="#FFFFFF" COLSPAN="3" ALIGN="CENTER">
<FONT size="2" color="0033CC"><i>Questions about NCBI resources to</i> <a
href="mailto:info@ncbi.nlm.nih.gov">info@ncbi.nlm.nih.gov</a></FONT><br>
<FONT size="2" color="0033CC"><i>Comments about site map to Renata Geer</i> <a
href="mailto:renata@ncbi.nlm.nih.gov">renata@ncbi.nlm.nih.gov</a></FONT></td>
</tr -->
</table>
</td></tr>
<!-- ============== END_RIGHT_SIDE_OF_TABLE=============== -->
</table>
<p></p>
<!-- =======END_TABLE_OF_CONTENTS_AND_ALPHA_TABLE================ -->
<!-- ================= "NEW" STARBURST LEGEND ===================== -->
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="10%" BGCOLOR="#FFFFFF">&nbsp;</td>
<td WIDTH="80%" BGCOLOR="#FFFFFF" ALIGN="center" CLASS="TEXT2"><img
SRC="new.gif"
height="12" width="31" border=0> &nbsp;<FONT size="-1"><i>indicates a resource
that has become available in the last 12 months.</i></FONT></td>
<td WIDTH="10%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
</p>
<!-- =============== END "NEW" STARBURST LEGEND ==================== -->
<!-- ==========================ABOUT_NCBI========================== -->
<a NAME="AboutNCBI"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" class="H3a">About NCBI</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/About/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/">About NCBI</a> - The science behind our
resources. An introduction for researchers, educators and the public. Includes
a <a
href="/About/primer/index.html">Science Primer</a>, with plain-language
introductions to bioinformatics, genome mapping, molecular modeling, SNPs, ESTs,
microarray technology, molecular genetics, pharmacogenomics, and
phylogenetics.</td>
</tr>
</table>
<a NAME="Programs"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/glance/programs.html">Programs and Services</a>
-
basic research, databases and software, outreach and education</td>
</tr>
</table>
<a NAME="Contact"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/glance/contact_info.html">Contact
Information</a> -
postal address, phone, e-mail addresses for various services</td>
</tr>
</table>
<a NAME="ExhibitSchedule"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/outreach/exhibitsched.html">Exhibit
Schedule</a> -
NCBI exhibits at upcoming conferences</td>
</tr>
</table>
<a NAME="NCBIHandbook"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?call=bv.View..ShowTOC&amp;rid=handbook.TOC&amp;depth=2">NCBI
Handbook</a> - an online book, written by NCBI staff, that discusses
the many resources available at NCBI. Each chapter is
devoted to one service; after a brief overview on using
the resource, there is an account of how the resource works,
including topics such as how data are included in a database,
database design, query processing, and how the different
resources relate to each other.</td>
</tr>
</table>
<a NAME="OrganizationalStructure"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/glance/organizational.html">Organizational
Structure</a> - functions of the three NCBI branches: Computational Biology
Branch
(CBB), Information Engineering Branch (IEB), and Information Resources Branch
(IRB)</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/About/glance/science_counselors.html">Board of
Scientific
Counselors</a> - advises the NIH Director, the Deputy Director for Intramural
Research, the NLM Director, and the NCBI Director about the intramural research
and development programs of the NCBI.</td>
</tr>
</table>
<a NAME="Fellows"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="Summary/postdoc.html">Postdoctoral Fellowships</a> -
general information, application procedure</td>
</tr>
</table>
<a NAME="Statistics"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="Summary/statistics.html">Statistics for NCBI
Resources</a>
-
Statistics available for selected NCBI resources, including the number of
records in various databases, the number of genomes available at NCBI,
statistics for individual genomes, and server usage.
</td>
</tr>
</table>
<a NAME="SiteSearch"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=ncbisearch">Site Search</a> -
Search the NCBI web site and display results in various formats. The default
Homepage view sorts NCBI pages based on the number of other NCBI pages that link
to
them. The NCBI Site Search function is part of the Entrez system (described <a
href="#Entrez">below</a>). Therefore, the search features described in the <a
href="/bookshelf/br.fcgi?book=helpentrez&amp;part=EntrezHelp">Entrez help document</a> also
apply to
the site search function.
</td>
</tr>
</table>
<a NAME="News"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">News and Announcements</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="WhatsNew"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/About/whatsnew.html">What's New</a> - recently
released
resources and enhancements to existing resources.</li></ul></td>
</tr>
</table>
<a NAME="NCBINews"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/bookshelf/br.fcgi?book=newsncbi">NCBI News</a> - announcements
about new resources, enhancements to existing resources, staff publications,
tutorials, FAQs.</li></ul></td>
</tr>
</table>
<a NAME="EmailLists"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="Summary/email_lists.html">NCBI Announcements Email
Lists</a> -
Receive announcements about changes and updates to a variety of NCBI services.
In
addition to a general NCBI-announce list, topic-specific e-mail lists are
available
for BLAST, GenBank, dbSNP, Genomes, LinkOut, RefSeq, Sequin, and Entrez
Utilities
(for making WWW Links to Entrez). Follow the link to the NCBI Announcements Email
Lists page to see a complete list of available topics. Information on <a
href="Summary/email_lists.html">how to subscribe</a> is provided.</li></ul>
</td>
</tr>
</table>
<a NAME="RSSfeeds"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/feed/">NCBI RSS Feeds</a> -
Receive announcements about various NCBI services using an RSS (Really Simple Syndication)
feed reader. RSS feeds are available for resources such as Bookshelf, HomoloGene,
PubMed Central, PubMed New and Noteworthy, Probe Database, and UniGene.
Follow the link to the NCBI RSS Feeds page to see a complete list of available topics.
Additional information about RSS is provided in a short series of
<a href="/feed/styles/help.html">FAQs</a>.</li></ul>
</td>
</tr>
</table>
<!-- =======================END_ABOUT_NCBI========================== -->
<!-- =======================GENBANK=========================== -->
<a NAME="GenBank"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" class="H3a">GenBank</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/Genbank/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<table BORDER="0" WIDTH="98%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#Overview">General Information</a>
(<a href="#SampleRecord">sample record</a>,
<a href="#GenBankReleaseNotes">release notes</a>,
<a href="#GenBankDivisions">GenBank divisions</a>,
<a href="#GenBankStatistics">statistics</a>), &nbsp;
<a href="#Submissions">Submissions</a>
(<a href="#SpecialSubmissionsToGenBank">general</a>,
<a href="#SpecialSubmissionsToGenBank">special categories</a>,
<a href="#SubmittingOtherTypesOfData">other data types</a>), &nbsp;
<a href="#Collaboration">International Collaboration</a>, &nbsp;
<a href="#FTPGenBank">FTP GenBank</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<br>
<!-- ================GENBANK OVERVIEW================ -->
<a NAME="Overview"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">General Information</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genbank/index.html">What is GenBank?</a> - a
database of nucleotide sequences from >160,000 organisms. Records that are
annotated with coding region (CDS) features also include amino acid
translations.
GenBank belongs to an <b>international collaboration</b> of sequence databases
(described <a href="#Collaboration">below</a>), which also includes EMBL and
DDBJ.
&nbsp;GenBank is <b>updated daily</b> in NCBI search systems, and a <b>full
release</b> is issued on the FTP site on approximately the 15th of every February,
April, June, August, October, and December. Each full release contains all the data present in
GenBank as of the cutoff date specified in the <b>release notes</b> (described
<a
href="#GenBankReleaseNotes">below</a>). The FTP site also provides daily
cumulative
and non-cumulative update files (more about the FTP site <a
href="#FTPGenBank">below</a>).</td>
</tr>
</table>
<a NAME="SampleRecord"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="samplerecord.html">Sample Record</a> - detailed
description of each field in a GenBank record. <br>Includes, for example,
information about accession number formats, sequence identifiers (GI number and
accession.version), a listing of GenBank divisions, and more. Describes some
commonly annotated biological features, such as CDS, and provides links to
documents
that list and define the complete set of biological features that can be
annotated
on sequence records. Includes a link to a <a
href="/entrez/sutils/girevhist.cgi">sequence revision history tool</a> that can
be
used to track changes that have occurred to the sequence data in a record.
&nbsp;Also lists the Entrez search field(s) that can be used to search each part
of
a sequence record.</td>
</tr>
</table>
<a NAME="GenBankDivisions"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="samplerecord.html#GenBankDivisionB">GenBank
Divisions</a>
- summary of GenBank divisions, including abbreviations, full spellings,
information
about what the GenBank divisions are, and what they are <i>not</i>. (This
information is part of the GenBank sample record, described above.)</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">Access GenBank - through <a
href="/entrez/query.fcgi?db=Nucleotide">Entrez Nucleotides</a>. Search by
accession
number, author name, organism, gene/protein name, and a variety of other text
terms.
Additional information about Entrez is <a href="#Entrez">below</a>. Use <a
href="#BLAST">BLAST</a> for sequence similarity searches against GenBank and
other
databases. An option to download the GenBank full
release and updates via <a href="#FTPGenBank">FTP</a> is also available.</td>
</tr>
</table>
<a NAME="GenBankStatistics"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genbank/genbankstats.html">Growth Statistics
(graph)</a>
- see also <a href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">Release Notes</a>
sections 2.2.6 (per division statistics), 2.2.7 (per organism statistics), 2.2.8
(growth of GenBank). For statistics on other NCBI databases, please see the page
that summarizes sources of <a href="Summary/statistics.html">Statistics for NCBI
Resources</a>.</td>
</tr>
</table>
<a NAME="GenBankReleaseNotes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">GenBank
Release
Notes</a> - A document that accompanies each full release (described in "<a
href="#Overview">What is GenBank?</a>", above) of the GenBank database. The release notes describe
the
format and content of the flat files that comprise the release. They also
include
notices of recent and upcoming changes, information about GenBank divisions,
growth
statistics, citing GenBank, and more.
<UL>
<li><a
href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">Current Release Notes</a></li>
<li><a
href="ftp://ftp.ncbi.nih.gov/genbank/release.notes/">Past Release Notes</a></li>
</UL>
</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/Taxonomy/Utils/wprintgc.cgi?mode=c">Genetic
Codes</a> -
synopsis of 17 genetic codes; used to ensure correct translation of coding
sequences
in GenBank records.</td>
</tr>
</table>
<a NAME="GenBankBionetNewsgroup"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://www.bio.net/biomail/listinfo/genbankb/">GenBank Bionet
Newsgroup</a> - A moderated list that includes announcements of new GenBank releases, recent and upcoming changes, and discussion among subscribers. For information on how to subscribe by e-mail, see the <a href="Summary/email_lists.html">NCBI Announcements Email Lists</a> page.</td>
</tr>
</table>
<br>
<!-- ================GENBANK SUBMISSIONS================ -->
<a NAME="Submissions"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">GenBank Submissions</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">General Information</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Genbank/submit.html"><b>General information</b>
about
submitting nucleotide sequence data</a> to GenBank, receiving <a
href="/Genbank/submit.html#ref1"><b>accession numbers</b></a>, and making <a
href="/Genbank/submit.html#ref11"><b>updates</b> to records</a>. <a
href="#SpecialSubmissionsToGenBank"><b>Special types of submissions</b></a> to GenBank,
such as genomes, alignments, ESTs, GSSs, HTGs, STSs, and WGS are discussed
below.</li></ul>
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td width="5%">&nbsp;</td>
<td width="90%" BGCOLOR="#FFFFCC" CLASS="TEXT">
<blockquote><i><!-- FONT color="CD5555" -->In addition to GenBank, there are <a
href="#SubmittingOtherTypesOfData"><b>other databases at NCBI</b></a> to which a
variety of data types can be submitted
(<a href="#SubmitTPA">third party annotations (TPA)</a>,
<a href="#SubmitVariation">variation</a>,
<a href="#SubmitExpression">expression</a>,
<a href="#SubmitMHC">MHC data</a>,
<a href="#SubmitSKY_MFISH_CGH">SKY/M-FISH/CGH data</a>,
<a href="#SubmitTraceData">traces</a>).<!-- /FONT --></i></blockquote>
<!-- blockquote>
Examples of other types of data include:
<ul>
<li><a href="#SubmitTPA"><i>Third Party Annotations (TPA) for GenBank
records</i></a></li>
<li><a href="#SubmitVariation"><i>Variation</i></a></li>
<li><a href="#SubmitExpression"><i>Expression</i></a></li>
<li><a href="#SubmitMHC"><i>Major Histocompatibility Complex (MHC)
data</i></a></li>
<li><a href="#SubmitSKY_MFISH_CGH"><i>Spectral Karyotyping (SKY), Multiple
Fluorescence In Situ Hybridization (M-FISH) and
Comparative Genomic Hybridization (CGH) data</i></a></li>
<li><a href="#SubmitTraceData"><i>Traces</i></a></i></li>
</ul>
</blockquote -->
</td>
<td width="5%">&nbsp;</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Submission Software Programs</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="BankIt"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/BankIt/">BankIt</a> - WWW submission tool for
one
or few submissions, designed to make the submission process quick and easy.
&nbsp;(BankIt also automatically uses <a href="#VecScreen">VecScreen</a> to
identify
segments of nucleic acid sequence which may be of vector, adapter, or linker
origin
to combat the problem of vector contamination in GenBank.)</li></ul></td>
</tr>
</table>
<a NAME="Sequin"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Sequin/">Sequin</a> - submission software
program
for one or many submissions, long sequences, complete genomes, alignments,
population/phylogenetic/mutation studies. Can be used as a stand-alone
application
or in a TCP/IP-based "network aware" mode, with links to other NCBI resources
and
software such as <a href="#Entrez">Entrez</a>. &nbsp;(Use <a
href="#VecScreen">VecScreen</a> prior to submission). &nbsp;To receive
announcements
about updates to the Sequin submission software, see the <a
href="Summary/email_lists.html">NCBI Announcements Email Lists</a>
page.</li></ul></td>
</tr>
</table>
<p></p>
<!-- ===========SPECIAL SUBMISSIONS TO GENBANK=========== -->
<a NAME="SpecialSubmissionsToGenBank"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Special Types of Submissions to GenBank</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF" ALIGN="CENTER">
<a href="#SubmitGenomes">Genomes</a>, &nbsp;
<a href="#SubmitAlignments">Alignments</a>, &nbsp;
<a href="#SubmitESTs">ESTs</a>, &nbsp;
<a href="#SubmitGSSs">GSSs</a>, &nbsp;
<a href="#HTG">HTGs</a>, &nbsp;
<a href="#SubmitSTSs">STSs</a>, &nbsp;
<a href="#WGS">WGS</a> <br>
<!-- (<i><a href="#SubmittingOtherTypesOfData">See also submissions to other
(non-GenBank) databases</a>: &nbsp;
<a href="#SubmitTPA">TPA</a>, &nbsp;
<a href="#SubmitVariation">Variation</a>, &nbsp;
<a href="#SubmitExpression">Expression</a>, &nbsp;
<a href="#SubmitMHC">MHC</a>, &nbsp;
<a href="#SubmitSKY_MFISH_CGH">SKY, MFISH & CGH</a>, &nbsp;
<a href="#SubmitTraceData">Traces</a></i>) -->
</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<a NAME="SubmitGenomes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><b>Submission of complete genomes and other large
sequence
records</b> - Recent enhancements to Sequin make it convenient for genome
sequencing
centers to annotate their records with
Sequin and submit the resulting ASN.1 file to GenBank. After the Sequin files
are
prepared, large genomes should be submitted by FTP; write to
genomes@ncbi.nlm.nih.gov to
obtain an FTP account. Smaller records (less than 350 kb) can be sent by email to
gb-sub@ncbi.nlm.nih.gov. <br><br>
More information about submitting genomes and other large sequence records is
provided on the following pages: <a href="/Genbank/submit.html#ref6">GenBank
submissions</a>, <a href="/Sequin/">Sequin</a>, <a
href="/Sequin/table.html">tabular
layout for submitting annotated features</a>, <a
href="/Genbank/genomesubmit.html">bacterial genome submission
guidelines</a>.
<!-- old url is <a href="/genomes/static/instructions.html">bacterial genome submission
guidelines</a> --><br><br>
In addition, sequencing centers can register a sequencing project with NCBI
prior to
the submission of any data. This can be done through a <a
href="http://www.ncbi.nih.gov/genomes/mpfsubmission.cgi">Genome project
submission
form</a>. For each registered project, NCBI will create a sequencing project
page
that describes the project, links out to genome-specific resources, and provides
a
focal point for the addition of links to NCBI resources such as Map Viewer and
genomic BLAST. Projects can be listed publicly or remain unlisted, and
sequences
may be held until publication (the default), released immediately, or made
available
for BLAST searches only. The form can also be used to set up an FTP site for
the
upload of data to NCBI, or to specify a URL to be used by NCBI for download of
project or sequence data. (See Fall 2003/Winter 2004 issue of <a
href="http://www.ncbi.nlm.nih.gov/About/newsletter.html">NCBI News</a> for more
information.)</li></ul></td>
</tr>
</table>
<a NAME="SubmitAlignments"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Genbank/submit.html#ref6">Alignments</a> -
submission of
aligned sequences from population, phylogenetic, or mutation studies. Several
sections of the <a href="/Sequin/sequin.hlp.html">Sequin Help Documentation</a>
also
include information on how to submit alignments, such as <a
href="/Sequin/sequin.hlp.html#SubmissionType">Submission Type</a>, <a
href="/Sequin/sequin.hlp.html#FASTA+GAPFormatforAlignedNucleotideSequences">FASTA+GAP
Format for Aligned Nucleotide Sequences</a>, <a
href="/Sequin/sequin.hlp.html#PHYLIPFormatforAlignedNucleotideSequences">PHYLIP
Format for Aligned Nucleotide Sequences</a>, <a
href="/Sequin/sequin.hlp.html#NEXUSFormatforAlignedNucleotideSequences">NEXUS
Format
for Aligned Nucleotide Sequences</a>, <a
href="/Sequin/sequin.hlp.html#SourceModifiersforPHYLIPandNEXUS">Source Modifiers
for
PHYLIP and NEXUS</a>, <a
href="/Sequin/sequin.hlp.html#ImportingAlignedSetsofSegmentedSequences">Importing
Aligned Sets of Segmented Sequences</a>. Check the Sequin help documentation
for
other relevant sections. (See Fall 2003/Winter 2004 issue of <a
href="http://www.ncbi.nlm.nih.gov/About/newsletter.html">NCBI News</a> for an
example in the "Submissions Corner".)</li></ul></td>
</tr>
</table>
<a NAME="SubmitESTs"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/dbEST/how_to_submit.html">ESTs</a> -
expressed
sequence tags; short, single pass read cDNA (mRNA) sequences. Also includes
cDNA
sequences from differential display experiments and RACE
experiments.</li></ul></td>
</tr>
</table>
<a NAME="SubmitGSSs"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/dbGSS/how_to_submit.html">GSSs</a> - genome
survey sequences; short, single pass read genomic sequences, exon trapped
sequences,
cosmid/BAC/YAC ends, others.</li></ul></td>
</tr>
</table>
<a NAME="HTG"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/HTGS/subinfo.html">HTGs</a> - high throughput
genome sequences from large scale genome sequencing centers; unfinished (phase
0, 1,
2) and finished (phase 3) sequences. (Note that contigs assembled from draft
and
finished human HTG sequences are accessible from the Map Viewer, described <a
href="#HumanChromosomeMapViews">below</a>.) <!-- and contigs assembled from
finished
mouse HTG sequences are accessible from the <a
href="#MouseGenomeSequencing">Mouse
Genome Sequencing</a> page.) --></li></ul></td>
</tr>
</table>
<a NAME="SubmitSTSs"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/dbSTS/how_to_submit.html">STSs</a> - sequence
tagged sites; short sequences that are operationally unique in the genome, used
to
generate mapping reagents.</li></ul></td>
</tr>
</table>
<a NAME="WGS"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Genbank/wgs.html">WGS</a> - data from Whole
Genome Shotgun <a href="/projects/WGS/WGSprojectlist.cgi">(WGS) sequencing
projects</a> can be submitted to GenBank. The data can
contain annotations and an entire project is updated as sequencing progresses.
WGS
submissions are given accession numbers in the format of four letters followed
by
eight digits, e.g., XXXX00000000. The four letters are a stable project_ID,
which
does not change as the project is updated. The first two digits represent the
version number, which corresponds to a particular project update. The last six
digits represent an individual contig within the WGS project. For example, if a
project's assigned accession number is XXXX00000000, then that project's first
assembly version would be XXXX01000000, and the first contig of that version
would
be XXXX01000001. (<a href="/Genbank/wgs.html">more...</a>)<br>
The nucleotide data from WGS projects go into the appropriate organismal <a
href="samplerecord.html#GenBankDivisionB">GenBank Divisions</a> and the BLAST
wgs
database. The protein translations of annotated coding sequences go into the
BLAST
protein nr database. In addition, quality data from many WGS projects are
submitted
to the <a href="/Traces/home/?cmd=show&f=overview&m=main&s=overview">Trace Archives</a> (<a
href="#TraceArchive">described</a>
in the ResourceGuide section on <a href="#Nucleotides">Nucleotide Sequence
Databases</a>).
</li></ul></td>
</tr>
</table>
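<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">The WGS accession layout described above (four letters for the project, two digits for the version, six digits for the contig) can be illustrated with a short Python sketch. The function name parse_wgs_accession and the returned field names are hypothetical and chosen only for this example; they are not part of any NCBI tool. Treating a contig number of 000000 as the project-level record is an interpretation of the example given above.
<pre>
```python
import re

# Layout as described in this guide: 4 letters + 2-digit version + 6-digit contig.
WGS_PATTERN = re.compile(r"([A-Z]{4})(\d{2})(\d{6})")


def parse_wgs_accession(accession):
    """Split a WGS accession into project ID, version, and contig number."""
    match = WGS_PATTERN.fullmatch(accession.strip().upper())
    if match is None:
        raise ValueError("not a 4-letter + 8-digit WGS accession: " + repr(accession))
    project, version, contig = match.groups()
    return {
        "project": project,        # stable across project updates
        "version": int(version),   # changes with each project update
        "contig": int(contig),     # individual contig within the project
        # Assumption of this sketch: contig 000000 denotes the project-level record.
        "is_project_level": int(contig) == 0,
    }
```
</pre>
With the placeholder accession used above, parse_wgs_accession("XXXX01000001") gives project "XXXX", version 1, and contig 1.</td>
</tr>
</table>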
<!-- ===========OTHER (NON-GENBANK) TYPES OF DATA SUBMISSIONS=========== -->
<a NAME="SubmittingOtherTypesOfData"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Other Types of Data Submissions<br>
<FONT color="CD5555">(<b>Other NCBI databases, separate from GenBank, to which
data
can be submitted</b>)</FONT></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="SubmitTPA"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Genbank/tpa.html">Third Party Annotations
(TPA)</a> - a database of experimentally supported annotations on assemblies of
sequences already present in DDBJ/EMBL/GenBank. Whereas DDBJ/EMBL/GenBank
contains
primary sequence data and corresponding annotations submitted by the
laboratories
that did the sequencing, the TPA database contains third-party assemblies of
primary
data with experimentally supported annotation that has been published in a
peer-reviewed scientific journal. Details about how to submit data, as well as
examples of what can and cannot be submitted to TPA, are provided on the <a
href="/Genbank/tpa.html">TPA</a> home page. Additional information about the
TPA
database is provided <a href="#TPA">below</a>.</li></ul></td>
</tr>
</table>
<a NAME="SubmitVariation"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/SNP/">Genetic Variation</a> - in humans and
other
organisms can be submitted to the <a href="/SNP/">NCBI Database of Single
Nucleotide
Polymorphisms (dbSNP)</a>. &nbsp;Although dbSNP is a separate database from
GenBank,
as noted above, SNP records include cross-references to GenBank records.
&nbsp;(<a
href="/SNP/get_html.cgi?whichHtml=how_to_submit"><b>submission
instructions</b></a>)
&nbsp;Additional information about dbSNP is <a
href="#dbSNP">below</a>.</li></ul></td>
</tr>
</table>
<a NAME="SubmitExpression"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/geo/">Gene Expression</a> - data can be
submitted
to <a href="/geo/">Gene Expression Omnibus (GEO)</a> &nbsp;(<a
href="/geo/info/overview.html#deposit"><b>submission instructions</b></a>).
&nbsp;Additional information about GEO is <a
href="#GEO">below</a>.</li></ul></td>
</tr>
</table>
<a NAME="SubmitMHC"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/mhc/">Major Histocompatibility Complex
(MHC)</a>
- DNA sequence and clinical data can be submitted to <a href="/mhc/">dbMHC</a>.
&nbsp; A <a
href="#dbMHC">brief description</a> of dbMHC is provided in the Molecular
Databases/Nucleotide Sequences section of this guide, and additional details are
available in the <a
href="/books/bv.fcgi?rid=handbook.chapter.ch11">NCBI
Handbook</a>.</li></ul></td>
</tr>
</table>
<a NAME="SubmitSKY_MFISH_CGH"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/sky/">Spectral Karyotyping (SKY), Multiplex
Fluorescence In Situ Hybridization (M-FISH), and Comparative Genomic
Hybridization
(CGH)</a> - data can be submitted to the <a href="/sky/">NCI and NCBI SKY/M-FISH
&
CGH Database</a> &nbsp;(<a
href="http://www.ncbi.nlm.nih.gov/sky/ccap_helper.cgi?tsc=0"><b>submission instructions</b></a>) &nbsp;Additional information
about
the NCI and NCBI SKY/M-FISH & CGH Database is <a
href="#SKY_CGH">below</a>.</li></ul></td>
</tr>
</table>
<a NAME="SubmitTraceData"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Traces/home/?cmd=show&f=overview&m=main&s=overview">Trace Data</a> - can be
submitted to the <a href="/Traces/home/?cmd=show&f=overview&m=main&s=overview">Trace Archive</a>, a repository of
the
raw sequence traces generated by large sequencing projects. &nbsp;(<a
href="/Traces/trace.fcgi?cmd=show&f=rfc&m=main&s=rfc"><b>submission
instructions</b></a>) &nbsp;Additional information about the Trace Archives is
<a
href="#TraceArchive">below</a>.</li></ul></td>
</tr>
</table>
<br>
<!-- ================COLLABORATION================ -->
<a NAME="Collaboration"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">International Nucleotide Sequence
Database Collaboration</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/projects/collab/">GenBank, DDBJ, EMBL</a> - Overview
of
collaborative projects and links to home pages. The GenBank, DDBJ (DNA Data
Bank of
Japan), and EMBL (European Molecular Biology Laboratory) databases share data on
a
daily basis and are therefore equivalent. The record formats and search systems
might differ among the databases, but the accession numbers, sequence data, and
annotations are the same in all of them. E.g., you can retrieve the record with
accession number U12345 from GenBank, DDBJ, or EMBL and it will contain the same
sequence data, references, etc. in all three databases.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/projects/collab/FT/index.html">DDBJ/EMBL/GenBank
Feature
Table</a> - feature table formats and standards used in the annotation of
sequence
records by the collaborating databases; makes possible sharing of data; includes
detailed appendices such as:
<ul>
<li><a href="/projects/collab/FT/index.html#7.3">biological features reference
key</a> (<a href="/projects/collab/FT/index.html#7.3.2">alphabetical list</a>
also
available)</li>
<li><a href="/projects/collab/FT/index.html#7.4">feature qualifiers</a></li>
<li>IUPAC abbreviations for <a
href="/projects/collab/FT/index.html#7.5.1">nucleotides</a></li>
<li>IUPAC abbreviations for <a href="/projects/collab/FT/index.html#7.5.3">amino
acids</a></li>
</ul></td>
</tr>
</table>
<p></p>
<!-- ================FTP GENBANK================ -->
<a NAME="FTPGenBank"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">FTP GenBank and Daily Updates</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genbank/">GenBank flat file
format</a> - see <a href="samplerecord.html">sample GenBank record</a> and
detailed
description in <a href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">GenBank release
notes</a>; download most recent full release (described <a
href="#Overview">above</a>) and daily cumulative or non-cumulative update
files.</td>
</tr>
</table>
<a NAME="ASN1"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/ncbi-asn1/">ASN.1 format</a> -
Abstract Syntax Notation 1, an International Organization for Standardization (ISO) data
representation format; download most recent full release (described <a
href="#Overview">above</a>) and daily cumulative or non-cumulative update files.
&nbsp;(<a href="Summary/asn1.html">more on ASN.1</a>)</td>
</tr>
</table>
<a NAME="FASTA"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/blast/db/">FASTA format</a> -
definition line followed by sequence data only (<a
href="/BLAST/fasta.html">example</a>); see <a
href="ftp://ftp.ncbi.nih.gov/blast/db/README">readme</a> file for database
descriptions, including <b>nt.Z</b> (daily updated non-redundant BLAST
nucleotide
database, contains GenBank+EMBL+DDBJ+PDB sequences, but no EST, STS, GSS, or
HTGS
sequences), <b>nr.Z</b> (daily updated non-redundant proteins), <b>est.Z</b>,
<b>gss.Z</b>, <b>htg.Z</b>, <b>sts.Z</b>, and others.</td>
</tr>
</table>
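<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">As a rough illustration of the FASTA layout described above (a definition line followed by sequence data only), the following Python sketch reads records from lines of text. The function name read_fasta is hypothetical, and the sketch assumes the common convention that each definition line begins with a greater-than sign; it is not NCBI software.
<pre>
```python
def read_fasta(lines):
    """Yield (definition_line, sequence) pairs from an iterable of text lines."""
    definition, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if definition is not None:
                yield definition, "".join(chunks)
            definition, chunks = line[1:], []
        elif definition is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if definition is not None:
        yield definition, "".join(chunks)
```
</pre>
The function accepts any iterable of lines, so an open, uncompressed file can be passed directly.</td>
</tr>
</table>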
<p></p>
<br>
<!-- ======================END_ABOUT_GENBANK======================== -->
<!-- ====================MOLECULAR DATABASES========================== -->
<a NAME="Databases"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" class="H3a">Molecular Databases</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/Database/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<table BORDER="0" WIDTH="98%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#Nucleotides">Nucleotide Sequences</a>, &nbsp;
<a href="#Proteins">Protein Sequences</a>, &nbsp;
<a href="#Structures">Structures</a>, &nbsp;
<a href="#Genes">Genes</a>, &nbsp;
<a href="#Expression">Expression</a>, &nbsp;
<a href="#Taxonomy">Taxonomy</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<br>
<!-- ================MOLECULAR DATABASES: NUCLEOTIDES================ -->
<a NAME="Nucleotides"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Nucleotide Sequence Databases</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="EntrezNucleotides"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Nucleotide">Entrez
Nucleotides</a> -
combines data from a number of source databases, including GenBank, RefSeq, TPA,
and
PDB. Data can be searched by accession number, author name, organism,
gene/protein
name, and a variety of other text terms. Additional information about Entrez is <a
href="#Entrez">below</a>. For retrieval of large data sets, Batch Entrez
(described
<a href="#BatchEntrez">below</a>) is available.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genbank/index.html">GenBank</a> - a database of
nucleotide
sequences from >160,000 organisms. Records that are annotated with coding region
(CDS) features also include amino acid translations. GenBank belongs to an
international collaboration of sequence databases (described <a
href="#Collaboration">above</a>), which also includes EMBL and DDBJ. A <a
href="samplerecord.html"><b>sample record</b></a>, which provides a detailed
description of each field in a GenBank record, is also available. A variety of
sequence records exist in GenBank, such as characterized genes that have been
well-studied and annotated, batch produced sequences (ESTs, GSSs, STSs), high
throughput genomic sequences, complete genomes, and more. Additional information
about GenBank is given in the <a href="#GenBank">GenBank Overview</a> section of
this guide.</td>
</tr>
</table>
<a NAME="RefSeq"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/RefSeq/">RefSeq</a> - NCBI database of Reference
Sequences. Curated, non-redundant set including genomic DNA contigs, mRNAs and
proteins for known genes, mRNAs and proteins for gene models, and entire
chromosomes. Accession numbers have the format of two letters, an underscore
bar,
and six digits. Nucleotide sequence records have accessions: NT_123456,
NM_123456,
NC_123456, NG_123456, XM_123456, XR_123456 (more info about <a
href="/RefSeq/key.html#accessions">accession numbers</a> and <a
href="/RefSeq/key.html#query">access</a>). Additional details about RefSeq are
provided in the <a
href="/books/bv.fcgi?rid=handbook.chapter.ch18">NCBI
Handbook</a>, which is available online in the <a
href="/entrez/query.fcgi?db=Books">Entrez Books</a> database.</td>
</tr>
</table>
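<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">The RefSeq accession format described above (two letters, an underscore, and six digits) can be checked with a short Python sketch. The function name is_refseq_nucleotide is hypothetical, the prefix set contains only the nucleotide prefixes listed in this guide, and the optional ".version" suffix is an assumption of the sketch.
<pre>
```python
import re

# Nucleotide prefixes listed in this guide; other RefSeq prefixes exist.
NUCLEOTIDE_PREFIXES = {"NT", "NM", "NC", "NG", "XM", "XR"}

# Two letters, an underscore, six digits, then an optional ".version" suffix.
REFSEQ_PATTERN = re.compile(r"([A-Z]{2})_(\d{6})(?:\.(\d+))?")


def is_refseq_nucleotide(accession):
    """Return True if the accession matches a listed nucleotide prefix and format."""
    match = REFSEQ_PATTERN.fullmatch(accession.strip())
    if match is None:
        return False
    return match.group(1) in NUCLEOTIDE_PREFIXES
```
</pre>
For example, is_refseq_nucleotide("NM_123456") is true, while a GenBank-style accession such as U12345 does not match the pattern.</td>
</tr>
</table>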
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><a href="/projects/CCDS/">Consensus CoDing Sequence
(CCDS) Database</a> - The CCDS project is a collaborative effort to identify a
core set of <b>human protein coding regions</b> that are consistently annotated
and of high quality. The long term goal is to support convergence towards a
standard set of gene annotations on the human genome. The collaborators include
the
<a href="http://www.ncbi.nlm.nih.gov/">National Center for Biotechnology
Information</a> (NCBI, <a href="http://www.ncbi.nlm.nih.gov/mapview/">Map
Viewer</a>),
<a href="http://www.ebi.ac.uk/">European Bioinformatics Institute</a> (EBI,
<a href="http://www.ensembl.org/">Ensembl</a>),
<a href="http://www.cbse.ucsc.edu/">University of California, Santa Cruz</a>
(UCSC, <a href="http://genome.ucsc.edu/cgi-bin/hgGateway">Genome Browser</a>), and
<a href="http://www.sanger.ac.uk/">Wellcome Trust Sanger Institute</a> (WTSI, <a
href="http://vega.sanger.ac.uk/">Vega</a>).
They identify the position of protein-coding regions of genes that are (1)
annotated consistently on the human genome by all of the participating centers and
(2) supported by transcript evidence, use of canonical splice sites, and other
quality assurance measures. Additional information about the curation, process
flow, and quality testing is available on the CCDS web site.</blockquote></td>
</tr>
</table>
<a NAME="TPA"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genbank/TPA.html">Third Party Annotation (TPA)
database</a> - a database of experimentally supported annotations on assemblies
of
sequences already present in DDBJ/EMBL/GenBank. Whereas DDBJ/EMBL/GenBank
contains
primary sequence data and corresponding annotations submitted by the
laboratories
that did the sequencing, the TPA database contains third-party assemblies of
primary
data with experimentally supported annotation that has been published in a
peer-reviewed scientific journal.
<blockquote><i>Note:</i> &nbsp;Although TPA records are derived from
DDBJ/EMBL/GenBank, TPA is actually a <b>separate database</b>. Therefore, TPA
records are not present in the GenBank <b>FTP</b> files, but will be available
in
separate FTP files.<br><br>
The TPA database uses an <b>accession format</b> similar to GenBank records
(e.g.,
two letters followed by six digits) and is organized into similar
<b>divisions</b>.
(A list of <a href="samplerecord.html#GenBankDivisionB">GenBank divisions</a> is
given in the <a href="samplerecord.html">GenBank Sample Record</a>. Some
divisions, such as EST, GSS, and HTG, are present in GenBank but will not be
present in TPA.)<br><br>
TPA records can be <b>easily recognized</b> because the definition lines begin
with the letters "TPA", and they contain "Third Party Annotation; TPA" in the
Keywords field. This is illustrated in a <b>sample TPA record</b>, <a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Search&db=Nucleotide&term=BK000627[pacc]&doptcmdl=GenBank">BK000627</a>.<br><br>
TPA records can be <b>retrieved</b> from <a
href="/entrez/query.fcgi?db=Nucleotide">Entrez Nucleotides</a> (described <a
href="#EntrezNucleotides">above</a>). To see only data from TPA, use the "Index"
mode to select "tpa" from the Properties search field, or simply add the command
<b>AND tpa[prop]</b> to your query.<br><br>
<!-- TPA records can be retrieved from <a
href="/entrez/query.fcgi?db=Nucleotide">Entrez Nucleotides</a> (described <a
href="#EntrezNucleotides">above</a>). To only see data from TPA, you can use
the
Limits option (select "Only from TPA"), or select "srcdb_tpa" in the Properties
search field, or simply add the command <b>AND srcdb_tpa[prop]</b> to your
query.<br><br -->
Details about how to <b>submit data</b>, as well as examples of what can and
cannot be submitted to TPA, are provided on the <a href="/Genbank/TPA.html">TPA home
page</a>. An announcement and additional information about the TPA database are
provided in section 1.4.5, "Third-Party Annotation and Consensus Sequences
(TPA)," of the <a href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">GenBank 133.0 release
notes</a>.</blockquote>
</td>
</tr>
</table>
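<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i>Example:</i> The <b>AND tpa[prop]</b> restriction described above can be appended to any Entrez Nucleotides query string. The Python sketch below only builds the query text; it does not contact NCBI. The function name <b>restrict_to_tpa</b> is hypothetical, and wrapping the original query in parentheses is an assumption made here so that the filter applies to the whole query rather than to its last term.
<pre>
```python
def restrict_to_tpa(query):
    """Return an Entrez query string limited to TPA records.

    Uses the "tpa[prop]" filter named in this guide. An empty query
    yields the filter alone.
    """
    query = query.strip()
    if not query:
        return "tpa[prop]"
    # Parentheses keep the filter from binding only to the final term.
    return "(" + query + ") AND tpa[prop]"
```
</pre>
For instance, restrict_to_tpa("kinase OR phosphatase") produces the string (kinase OR phosphatase) AND tpa[prop], which can then be pasted into the Entrez Nucleotides search box.</blockquote></td>
</tr>
</table>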
<a NAME="dbEST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/dbEST/index.html">dbEST</a> - database of expressed
sequence tags; short, single pass read cDNA (mRNA) sequences. Also includes
cDNA
sequences from differential display experiments and RACE experiments.<br>
<i>Note:</i> EST sequences are available from two sources: dbEST and the EST
division of GenBank. The sequences and accession numbers in both sources are the
same but the record formats differ. &nbsp;(<a
href="/dbEST/how_to_submit.html">data
submission instructions...</a>)</td>
</tr>
</table>
<a NAME="dbGSS"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/dbGSS/index.html">dbGSS</a> - database of genome
survey
sequences; short, single pass read genomic sequences, exon trapped sequences,
cosmid/BAC/YAC ends, others.<br> <i>Note:</i> GSS sequences are available from
two
sources: dbGSS and the GSS division of GenBank. The sequences and accession
numbers
in both sources are the same but the record formats differ. &nbsp;(<a
href="/dbGSS/how_to_submit.html">data submission instructions...</a>)</td>
</tr>
</table>
<a NAME="dbMHC"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mhc/">dbMHC</a> - Provides a platform where the human
leukocyte antigen (HLA) community can submit, edit, view, and exchange Major
Histocompatibility Complex (MHC) data. The MHC database is fully integrated with
other NCBI resources, as well as with the International Histocompatibility Working
Group (<a href="http://www.ihwg.org/">IHWG</a>) Web site, and provides links to
the IMmunoGeneTics HLA (<a href="http://www.ebi.ac.uk/imgt/hla/">IMGT/HLA</a>)
database. Additional details are available in the <a
href="http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=handbook.chapter.1776">NCBI
Handbook</a>. <!-- Provides a
platform for genetic and clinical data related to the human Major
Histocompatibility Complex (MHC) where the human leukocyte antigen (HLA) community
can submit, edit, view, exchange, and analyze MHC data. -->
</td>
</tr>
</table>
<a NAME="dbSNP"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/SNP/">dbSNP</a> - database of single nucleotide
polymorphisms, small-scale insertions/deletions, polymorphic repetitive
elements,
and microsatellite variation. &nbsp;dbSNP includes polymorphism data that are
experimentally derived or computationally derived, as well as hybrid data
determined by aligning an experimentally derived molecule to genomic
sequence data. &nbsp;Currently, dbSNP comprises four general classes of
submissions: (a) The SNP Consortium (TSC) - candidate SNPs identified by
sequencing, using either the reduced representation shotgun strategy or alignment
of random reads to genomic sequence; &nbsp;(b)
Overlaps - candidate SNPs identified in sequence overlaps between individual
BACs or PACs; &nbsp;(c) ESTs - SNPs identified in EST clusters, including those
identified by the Cancer Genome Anatomy Project (described <a
href="#CGAP">below</a>); &nbsp;(d) Other - SNPs identified after screening larger
numbers of chromosomes, including many with alleles of lower frequency (1%-20%).
&nbsp;(<a href="/SNP/get_html.cgi?whichHtml=how_to_submit">data submission
instructions</a>) &nbsp;&nbsp;To receive announcements about updates and new
features in dbSNP, see the <a href="Summary/email_lists.html">NCBI Announcements
Email Lists</a>
page.<br>
<i>Note:</i> Although dbSNP is a separate database from GenBank, SNP records
include
cross-references to GenBank records. &nbsp;</td>
</tr>
</table>
<a NAME="dbSTS"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/dbSTS/index.html">dbSTS</a> - database of sequence
tagged
sites; short sequences that are operationally unique in the genome, used to
generate
mapping reagents.<br> <i>Note:</i> STS sequences are available from two
sources:
dbSTS and the STS division of GenBank. The sequences and accession numbers in
both
sources are the same but the record formats differ. &nbsp;(<a
href="/dbSTS/how_to_submit.html">data submission instructions...</a>)</td>
</tr>
</table>
<a NAME="UniSTS"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/unists/">UniSTS</a> - a unified, non-redundant
view
of sequence tagged sites (STSs). UniSTS integrates marker and mapping data from
a
variety of public resources. If two or more markers have different names but
the
same primer pair, a single STS record is presented for the primer pair and all
the
marker names are shown. Each UniSTS record displays the primer sequences,
product
size, mapping information, and cross references to Entrez Gene, dbSNP, RHdb, GDB,
MGD,
and the Map Viewer. The marker report also lists GenBank and RefSeq records that
contain the primer sequences, as determined by <a href="#ePCR">Electronic PCR
(e-PCR)</a>. Data sources include dbSTS, RHdb, GDB, various human maps
(Genethon genetic map, Marshfield genetic map, Whitehead RH map, Whitehead YAC
map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX physical map), and
various mouse maps (Whitehead RH map, Whitehead YAC map, Jackson Laboratory's
MGD map).</td>
</tr>
</table>
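<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i>Example:</i> The merging rule described for UniSTS (markers with different names but the same primer pair are presented as a single record) can be pictured as grouping by primer pair. The Python sketch below is a simplified illustration, not the UniSTS implementation. The function name <b>merge_markers_by_primers</b> and the input layout are hypothetical, and comparing primers case-insensitively is an assumption made here.
<pre>
```python
from collections import defaultdict


def merge_markers_by_primers(markers):
    """Group marker names that share the same primer pair.

    `markers` is an iterable of (name, forward_primer, reverse_primer)
    tuples. Returns a dict that maps each (forward, reverse) pair to a
    sorted list of the marker names using that pair.
    """
    merged = defaultdict(set)
    for name, forward, reverse in markers:
        # Assumption: primer sequences are compared without regard to case.
        key = (forward.upper(), reverse.upper())
        merged[key].add(name)
    return {pair: sorted(names) for pair, names in merged.items()}
```
</pre>
Given two made-up markers that share one primer pair and a third with a different pair, the function returns two entries: one listing both shared names and one listing the third name alone.</blockquote></td>
</tr>
</table>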
<p></p>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">Complete Genomes</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li>see the <a href="#Genomes">Genomes and Maps</a> section
below; includes resources for
<a href="#MultipleOrganisms">multiple organisms (<i>Entrez Genome</i>)</a>,
<a href="#HumanGenome">human</a>,
<a href="#MouseGenome">mouse</a>,
<a href="#RatGenome">rat</a>,
<a href="#ZebrafishGenome">zebrafish</a>,
<a href="#DrosophilaGenome"><i>Drosophila</i></a>,
<a href="#NematodeGenome">nematode</a>,
<a href="#PlantGenomes">plant genomes</a>,
<a href="#YeastGenome">yeast</a>,
<a href="#MalariaGenome">malaria</a>,
<a href="#MicrobialGenomes">microbial genomes</a>,
<a href="#ViralGenomes">viruses</a>,
<a href="#ViroidGenomes">viroids</a>,
<a href="#Plasmids">plasmids</a>,
<a href="#EukaryoticOrganelles">eukaryotic organelles</a>
</ul></td>
</tr>
</table -->
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/">UniGene</a> - ESTs and full-length mRNA
sequences organized into clusters that each represent a unique known or putative
gene within the organism from which the sequences were obtained. UniGene
clusters
are annotated with mapping and expression information when possible (e.g., for
human), and include cross-references to other resources. Sequence data can be
downloaded by cluster through the UniGene web pages, or the complete data set
can be
downloaded from the <a
href="ftp://ftp.ncbi.nih.gov/repository/UniGene/">repository/UniGene</a>
directory
of the FTP site. In addition, <b>UniGene DDD</b> (described <a
href="#UniGeneDDD">below</a>) can be used to show differential expression of
genes
between cDNA libraries. The <b>organisms represented</b> in UniGene are listed
on
the <a href="/UniGene/">UniGene home page</a>.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms in order to identify
putative orthologs. Curated orthologs are incorporated from a variety of
sources
via Entrez Gene. Organisms represented are listed on the HomoloGene home
page.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="http://mgc.nci.nih.gov/">Mammalian Gene Collection
(MGC)</a> - The NIH Mammalian Gene Collection (MGC) is a trans-NIH initiative
that
seeks to identify and sequence a representative full open reading frame (FL-ORF)
clone for each human, mouse, and rat gene. The MGC project entails the
production
of cDNA libraries and sequences, database and repository development, as well as
the
support of research for improved library construction, sequencing, and analytic
technologies. All the resources generated by the MGC are publicly accessible to
the
biomedical research community.</td>
</tr>
</table>
<a NAME="TraceArchive"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Traces/home/?cmd=show&f=overview&m=main&s=overview">Trace Archives</a> - a repository
of
the raw sequence traces generated by large sequencing projects. It allows
retrieval
of both the sequence file and the underlying data which generated the file. In
the
case of projects that rely on a Whole Genome Shotgun (WGS) strategy, the Trace
Archive will be the sole source of raw sequence data. (More information about
WGS projects
is provided in the Resource Guide section on <a
href="#SpecialSubmissionsToGenBank">special
types of submissions to GenBank</a>/<a href="#WGS">WGS</a>.)
NCBI will be exchanging data regularly with the
<a href="http://trace.ensembl.org/">Ensembl Trace Server</a>.
The Trace Archive can be searched by using <a
href="http://blast.ncbi.nlm.nih.gov/BLAST.cgi?PAGE=Nucleotide&PROGRAM=blastn&BLAST_PROGRAMS=megaBLAST&PAGE_TYPE=BlastSearch">
Trace BLAST</a>
(described <a
href="#TraceBLAST">below</a>), or by entering a term in the search
box at
the top of the Trace Archives Page. (<a
href="/Traces/trace.fcgi?cmd=show&f=rfc&m=main&s=rfc">data submission
instructions...</a>)</td>
</tr>
</table>
<a NAME="ShortReadArchive"></a>
<table BORDER="0" CELLSPACING="5" width="90%" BGCOLOR="#FFFFFF">
<tr>
<td class="text"><blockquote><a href="/Traces/sra/sra.cgi?cmd=show&f=main&m=main&s=main">Short Read Archive</a> -
houses sequencing data generated by new sequencing platforms.</blockquote></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><a href="/Traces/assembly/assmbrowser.cgi">Assembly
Archive</a> - links the raw sequence information found in the
<a href="/Traces/home/">Trace Archives</a> with assembly information
found in publicly available sequence repositories (<a
href="/Genbank/index.html">GenBank/EMBL/DDBJ</a>).
The Assembly Viewer allows a user to see the multiple sequence alignments as
well as
the actual sequence chromatogram.</blockquote></td>
</tr>
</table>
<a NAME="UniVec"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/VecScreen/UniVec.html">UniVec</a> - a database that
can
be used to quickly identify segments within nucleic acid sequences which may be
of
vector origin. Screening using UniVec is efficient because a large number of
redundant sub-sequences have been eliminated to create a database that contains
only
one copy of every unique sequence segment from a large number of vectors. The
<b>VecScreen</b> tool, described <a href="#VecScreen">below</a> (under sequence
analysis tools), can be used to compare a query sequence against the UniVec
database
in order to identify possible <a href="/VecScreen/contam.html">vector
contamination</a>.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">Genomes - Resources in the <a href="#Genomes">Genomes and
Maps</a>
section contain the nucleotide sequences for a variety of genomes. Examples of
the
genomes available include: &nbsp;
<a href="#MultipleOrganisms">&gt;1000 organisms in <i>Entrez Genome</i></a>,
<a href="#HumanGenome">human</a>,
<a href="#MouseGenome">mouse</a>,
<a href="#RatGenome">rat</a>,
<a href="#ZebrafishGenome">zebrafish</a>,
<a href="#DrosophilaGenome"><i>Drosophila</i></a>,
<a href="#NematodeGenome">nematode</a>,
<a href="#PlantGenomes">plant genomes</a>,
<a href="#YeastGenome">yeast</a>,
<a href="#MalariaGenome">malaria</a>,
<a href="#MicrobialGenomes">microbial genomes</a>,
<a href="#ViralGenomes">viruses</a>,
<a href="#ViroidGenomes">viroids</a>,
<a href="#Plasmids">plasmids</a>,
<a href="#EukaryoticOrganelles">eukaryotic organelles</a>.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="#NucleotideSequenceAnalysis">Nucleotide Sequence
Analysis</a> - various tools are available for analyzing nucleotide sequences
and
are described <a href="#NucleotideSequenceAnalysis">below</a>.</td>
</tr>
</table>
<p></p>
<!-- ================MOLECULAR DATABASES: PROTEINS================ -->
<a NAME="Proteins"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Protein Sequence Databases</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Protein">Entrez Proteins</a> -
search protein sequence records (from GenPept + RefSeq + Swiss-Prot + PIR + PRF +
PDB) by accession number, author name, organism, gene/protein name, and a
variety of other text terms. Additional information about Entrez is provided <a
href="#Entrez">below</a>.
For retrieval of large data sets, <b>Batch Entrez</b> (described <a
href="#BatchEntrez">below</a>) is available. Entrez Proteins also includes
<b>BLink</b> ("BLAST Link"), a feature that displays the results of BLAST
searches that have been done for every protein sequence in the Entrez Proteins
data domain. To access it, follow the BLink link displayed beside any hit in the
results of an Entrez Proteins search. More information about BLink is provided <a
href="#BLink">below</a>.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/RefSeq/">RefSeq</a> - NCBI database of Reference
Sequences. Curated, non-redundant set including genomic DNA contigs, mRNAs and
proteins for known genes, mRNAs and proteins for gene models, and entire
chromosomes. Accession numbers have the format of two letters, an underscore
bar,
and six digits. Protein sequence records have accessions: NP_123456 or
XP_123456
(more info about <a href="/RefSeq/key.html#accessions">accession numbers</a> and
<a
href="/RefSeq/RSfaq.html#access">access</a>).</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genbank/">FTP GenPept</a> -
download the "relxxx.fsa_aa.gz" file. The filename stands for "Release number
XXX
FASTA formatted amino acid translations". The translations are extracted from
GenBank/EMBL/DDBJ records that are annotated with one or more CDS features.</td>
</tr>
</table>
<a NAME="CDD"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/cdd/cdd.shtml">Conserved Domain Database
(CDD)</a> - a collection of
sequence alignments and profiles representing protein domains
conserved in molecular evolution. It includes domains
from <a href="http://smart.embl-heidelberg.de/">SMART</a> and
<a href="http://pfam.wustl.edu/">Pfam</a>, as well as domains
contributed by NCBI researchers. It also includes alignments
of the domains to known 3-dimensional protein structures in the
MMDB database (described <a href="#MMDB">below</a>).
CDD can be used to identify conserved domains in a protein query
sequence, using the <b>CD-Search</b> service (described
<a href="#CD-Search">below</a>). In addition, the <b>CDART</b> tool
(described <a href="#CDART">below</a>) uses CDD and RPS-BLAST (described <a
href="#RPS-BLAST">below</a>) to retrieve proteins with similar domain
architectures.
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/RefSeq/HIVInteractions/">HIV Interactions</a> - The
HIV-1, Human Protein Interaction Database contains information about known
interactions of HIV-1 proteins with proteins from human hosts. It provides
annotated bibliographies of published reports of protein interactions, with links
to
the corresponding PubMed records and sequence data. <a
href="#HIVInteractions">More
information</a> about this database is provided under "Literature Databases".
</td>
</tr>
</table>
<a NAME="PROW"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/prow/">PROW</a> - Protein Resources on the Web -
short
authoritative guides on the approximately 200 human CD cell-surface molecules.
Peer-reviewed; provides approximately 20 standardized categories of information
(biochemical function, ligands, etc.) for each CD antigen.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="#ProteinSequenceAnalysis">Protein Sequence
Analysis</a> -
various tools are available for analyzing protein sequences and are described <a
href="#ProteinSequenceAnalysis">below</a>.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">Proteomes</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li>Resources in the <a href="#Genomes">Genomes and
Maps</a>
section contain annotations for the proteins encoded by a variety of genomes.
Examples of the genomes available include: &nbsp;
<a href="#MultipleOrganisms">&gt;1000 organisms in <i>Entrez Genome</i></a>,
<a href="#HumanGenome">human</a>,
<a href="#MouseGenome">mouse</a>,
<a href="#RatGenome">rat</a>,
<a href="#ZebrafishGenome">zebrafish</a>,
<a href="#DrosophilaGenome"><i>Drosophila</i></a>,
<a href="#NematodeGenome">nematode</a>,
<a href="#PlantGenomes">plant genomes</a>,
<a href="#YeastGenome">yeast</a>,
<a href="#MalariaGenome">malaria</a>,
<a href="#MicrobialGenomes">microbial genomes</a>,
<a href="#ViralGenomes">viruses</a>,
<a href="#ViroidGenomes">viroids</a>,
<a href="#Plasmids">plasmids</a>,
<a href="#EukaryoticOrganelles">eukaryotic organelles</a>.
The proteomes of human, mouse, rat, and a growing number of other eukaryotes are
also shown as annotations on the genomes of those organisms in <b>Map Viewer</b>
(<a
href="#MapViewer">described</a> in the <a href="#Genomes">Genomes and Maps</a>
section). Map Viewer can display the data graphically or in tabular format, and
also
provides links to corresponding data on the FTP site.
</ul>
</td>
</tr>
</table>
<a NAME="ProtTaxTable"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=Genome">Entrez
Genome</a> -
provides ProtTable and TaxTable for various organisms. The <b>ProtTable</b>
provides a summary of protein coding regions in a genome, and provides links to
the
corresponding nucleotide and protein sequences in FASTA format. The
<b>TaxTable</b>, also referred to as the "<b>distribution of BLAST protein
homologs
by taxa</b>," summarizes the results of BLAST analyses done for the proteins,
and
displays the relationship of the organism to others through a color-coded
graphical
summary. (Additional information about Entrez Genome is provided <a
href="#EntrezGenome">below</a>.)</ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/genbank/genomes/">FTP
Genome Proteins</a> - download an *.faa file (FASTA formatted amino acid
sequences)
and *.ptt file (protein table) for various organisms from the genbank/genomes
directory of the FTP site; see <a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/README">readme</a> file for more
information. Protein tables can also be viewed in Entrez Genome, as noted
above.</li></ul></td>
</tr>
</table>
<p></p>
<!-- ================MOLECULAR DATABASES: STRUCTURES================ -->
<a NAME="Structures"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Structure Databases</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/">Structure Home</a> - general information
about the NCBI Structure Group and its research projects, as well as access to
the
Molecular Modeling Database (MMDB) and related tools to search and display
structures.</td>
</tr>
</table>
<a NAME="MMDB"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/MMDB/mmdb.shtml">MMDB: Molecular Modeling
Database</a> - a database of three-dimensional biomolecular structures derived
from X-ray crystallography and NMR spectroscopy. MMDB is a subset of
three-dimensional structures obtained from the Brookhaven Protein Data Bank
(PDB), excluding
theoretical models. MMDB reorganizes and validates the information in a way
that
enables cross-referencing between the chemistry and the three-dimensional
structure
of macromolecules. Its data specification includes a description of a
biopolymer's
spatial structure, a description of how it is organized chemically, and a set of
pointers linking the two. By integrating chemical, sequence, and structure
information, MMDB is designed to serve as a resource for structure-based
homology
modeling and protein structure prediction. MMDB records are stored in <a
href="#ASN1">ASN.1</a> format and can be displayed with the <a
href="#Cn3D">Cn3D</a>, RasMol, or Kinemage viewers. In addition, similar
structures within the database have been identified using <a
href="#VAST">VAST</a>, and new structures can be compared against the database
using <a href="#VASTSearch">VASTsearch</a>.</td>
</tr>
</table>
<a NAME="3D_Domains"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Domains">3D Domains Database</a>
-
compact structural domains identified automatically in MMDB, Entrez's
macromolecular
three-dimensional structure database. These domains are identified by searching
for
breakpoints in the structure between major secondary structure elements so that
the
ratio of intra- to inter-domain contacts falls above a set threshold. 3D
Domains
are the units of comparison for structure neighbor ("related structures")
calculations using the VAST algorithm.
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/cdd/cdd.shtml">Conserved Domain Database
(CDD)</a> - a collection of
sequence alignments and profiles representing protein domains
conserved in molecular evolution. It includes domains
from <a href="http://smart.embl-heidelberg.de/">SMART</a> and
<a href="http://pfam.wustl.edu/">Pfam</a>, as well as domains
contributed by NCBI researchers. It also includes alignments
of the domains to known 3-dimensional protein structures in the
MMDB database (described <a href="#MMDB">above</a>).
CDD can be used to identify conserved domains in a protein query
sequence, using the <b>CD-Search</b> service (described
<a href="#CD-Search">below</a>). In addition, the <b>CDART</b> tool
(described <a href="#CDART">below</a>) uses CDD and RPS-BLAST (described <a
href="#RPS-BLAST">below</a>) to retrieve proteins with similar domain
architectures.
</td>
</tr>
</table>
<a NAME="PubChem"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://pubchem.ncbi.nlm.nih.gov/">PubChem</a> -
contains
the chemical structures of small organic molecules and information on their
biological activities.
It is intended to support the Molecular Libraries and Imaging component of the
<a
href="http://nihroadmap.nih.gov/">NIH Roadmap Initiative</a>.
PubChem's chemical structure database may be searched on the basis of
descriptive
terms, chemical properties, and structural similarity.
When possible, PubChem's chemical structure records are linked to other NCBI
databases, including the <a href="/entrez/query.fcgi?db=PubMed">PubMed</a>
scientific literature database and NCBI's <a
href="/entrez/query.fcgi?db=Structure">protein 3D structure database</a>.
PubChem also contains the results of high-throughput biological screening
experiments. PubChem is organized as three linked databases within the
<a href="/Entrez/">Entrez/PubMed</a> information retrieval system. <!--
Background
information about the project is provided in a summer 2004 article in the <a
href="http://www.nih.gov/catalyst/2004/04.05.01/page1a.html">NIH Catalyst</a>.
-->
</td>
</tr>
</table>
<a NAME="PubChemSubstance"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pcsubstance">PubChem
Substance</a> - Primary data NCBI obtains from the various public depositories.
The PubChem Substance database contains approximately <!-- database statistic -->13 million
records as of October 2006, provided by various sources, including DTP/NCI, NIAID,
ChemIDplus, NIST, NIST webbook, MOLI/NCI, ChemBank, MMDB, and KEGG.
Substance information includes chemical structures, synonyms,
registration IDs, descriptions, related URLs, and database cross-reference links
to PubMed, protein 3D structures, and biological screening results.</li></ul></td>
<!-- OLD DESCRIPTION, THROUGH 10/18/06:
td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pcsubstance">PubChem
Substance</a> - Primary data NCBI obtains from the various public depositories.
The substance database currently contains approximately <database statistic>18 million
records (as of October 2006) provided by various sources, DTP/NCI, NIAID, ChemIDplus,
NIST, NIST webbook, MOLI/NCI, ChemBank, MMDB, KEGG, and more.
Substance information includes chemical structures, synonyms,
vregistration IDs, description, related urls, database
cross-reference links, etc.</li></ul></td -->
</tr>
</table>
<a NAME="PubChemCompound"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pccompound">PubChem
Compound</a> - A database made by NCBI and derived from PCSubstance.
It is a non-redundant view of the chemically validated substances in PubChem Substance.
There is one PubChem Compound record for each unique substance, and for each
unique substance component. There can be multiple PubChem Substance records
associated with one PubChem Compound record.
PubChem Compound contains all standardized structures, mixture components,
and precalculated structure neighboring links.
Compound information includes structure, compound property
information (molecular weight, formula, xLogP, rotatable bond count,
H-bond donor and acceptor counts, etc.), and
structure description (SMILES, IUPAC name, InChI).
</li></ul></td>
<!-- OLD DESCRIPTION, THROUGH 10/18/06:
td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pccompound">PubChem
Compound</a> - A database made by NCBI and derived from PCSubstance.
It includes all the defined chemical components in PCSubstance.
It contains all standardized structures, mixture components,
and precalculated structure neighboring links.
Compound information includes structure, compound property
information (molecular weight, formula, xLogP, count of the
rotatable bonds, H bond donor, H bond acceptor, etc.), and
structure description (SMILES, IUPAC name, INCHI.).
</li></ul></td -->
</tr>
</table>
<a NAME="PubChemBioAssay"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pcassay">PubChem
BioAssay</a> - The assay database consists of deposited bioactivity data
and descriptions of bioactivity assays used for screening of the chemical
substances contained in PubChem Substance, including descriptions of the
conditions and the readouts (bioactivity levels) specific to the screening procedure.
The assay database includes DTP/NCI's 710 million lines of in vitro
and in vivo data covering fields that range from cancer and HIV to many others.
</li></ul></td>
<!-- OLD DESCRIPTION, THROUGH 10/18/06:
td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=pcassay">PubChem
BioAssay</a> - The assay database consists of deposited bioactivity data,
and assay description. The current assay database contains
DTP/NCI's 710 million lines of in vitro and in vivo data
covering from cancer, HIV, to many other fields.
</li></ul></td -->
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">Structure-Related Tools - in addition to the structure databases
described above, NCBI offers several tools:
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Structure/CN3D/cn3d.shtml">Cn3D</a> - "See in
3-D," a structure and sequence alignment viewer for NCBI databases. It allows
viewing of 3-D structures and sequence-structure or structure-structure
alignments.
Cn3D can work as a helper application to your browser, or as a client-server
application that retrieves structure records from MMDB (described <a
href="#MMDB">above</a>) directly over the internet. The <a
href="/Structure/CN3D/cn3d.shtml">Cn3D home page</a> provides access to
information
on how to <a href="/Structure/CN3D/cn3dinstall.shtml">install</a> the program, a
<a
href="/Structure/CN3D/cn3dtut.shtml">tutorial</a> to get started, and a
comprehensive <a href="/Structure/CN3D/cn3dhelp.shtml">help
document</a>.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Structure/cdd/wrpsb.cgi">CD-Search</a> -
The Conserved Domain Search Service (CD-Search) can be used to identify
the conserved domains present in a protein sequence. CD-Search
uses RPS-BLAST (described <a href="#RPS-BLAST">above</a>) to compare
a query sequence against position-specific score matrices that
have been prepared from conserved domain alignments present in
the Conserved Domain Database (CDD) (described <a href="#CDD">above</a>).
Hits can be displayed as a pairwise alignment of the query sequence
with a representative domain sequence, or as a multiple alignment.
Alignments are also mapped to known 3-dimensional structures,
and can be displayed using Cn3D (described <a href="#Cn3D">above</a>).
In the Cn3D display, residues in sequence alignments are variously colored,
based on their degree of conservation.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Structure/VAST/vast.html">VAST</a> - Vector
Alignment Search Tool - a computer algorithm developed at NCBI and used to
identify
similar protein 3-dimensional structures. The "structure neighbors" for every
structure in MMDB are pre-computed and accessible via links on the MMDB
Structure
Summary pages. These neighbors can be used to identify distant homologs that
cannot
be recognized by sequence comparison alone.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Structure/VAST/vastsearch.html">VAST
Search</a> -
structure-structure similarity search service. Compares 3D coordinates of a
newly
determined protein structure to those in the MMDB/PDB database. VAST Search
computes
a list of structure neighbors that you may browse interactively, viewing
superpositions and alignments by molecular graphics.</li></ul></td>
</tr>
</table>
<p></p>
<!-- ================MOLECULAR DATABASES: GENES================ -->
<a NAME="Genes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Genes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="EntrezGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - Entrez
Gene provides a gene-based view of the data from a wide range of genomes. It
supplies key connections in the nexus of map, sequence, expression, structure,
functional, and homology data. Each record represents a single gene from a given
organism. The minimum set of data in a gene record includes a unique identifier
(GeneID) assigned by NCBI, a preferred symbol, and at least one of the following:
sequence information, map information, or official nomenclature from an authority
list. A gene record can also include expression, structure, functional, and homology
data, when available. Entrez Gene includes data from all organisms that have RefSeq
genome records (with NC_* accessions, see more info <a href="#RefSeq">above</a>),
and can also include data from recognized genome-specific databases that provide
NCBI with information about genes (preferably with defining sequence) or mapped
phenotypes. Entrez Gene is the successor to LocusLink (described <a
href="#LocusLink">below</a>).</td>
</tr>
</table>
<a NAME="GeneRIF"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><a
href="/projects/GeneRIF/GeneRIFhelp.html">GeneRIF</a> -
Gene References into Function (GeneRIFs) provide a simple mechanism to allow
scientists to add to the functional annotation of loci described in <a
href="/entrez/query.fcgi?db=gene">Entrez Gene</a>. They appear as annotated
bibliographies in Entrez Gene records, and consist of brief statements on gene
function with links to the corresponding PubMed records (<a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&dopt=Graphics&list_uids=4292">example:
human MLH1</a>). The <a
href="/projects/GeneRIF/GeneRIFhelp.html">GeneRIF help page</a> describes the
simple steps
needed to submit information. GeneRIFs are also added to the Entrez Gene
records by
the MEDLINE Indexing Staff of the National Library of Medicine. GeneRIFs are
currently available for a subset of organisms in Entrez Gene, and will be
provided
for the loci of other organisms as the development of Entrez Gene
continues.</blockquote></td>
</tr>
</table>
<a NAME="LocusLink"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><a href="/LocusLink/">LocusLink</a> - <b>LocusLink
was discontinued as of March 1, 2005.</b> It provided a foundation for what is
now Entrez Gene and was described in several articles (<a
href="/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11125071&dopt=Abstract">
Pruitt KD, Maglott DR (2001)</a>, <a
href="/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10637631&dopt=Abstract">
Pruitt KD, Katz KS, Sicotte H, Maglott DR (2000)</a>). It contained data for a
number of species such as human, mouse, rat, zebrafish, nematode, fruit fly, cow,
sea urchin, African clawed frog, HIV-1, and a few other model and commonly studied
organisms.
Data for these organisms (and from the ongoing collaboration among the groups
listed above) are now available in the <a
href="/entrez/query.fcgi?db=gene">Entrez Gene</a> database (described <a
href="#EntrezGene">above</a>), which is the successor to LocusLink.
The <b>major differences between LocusLink and Entrez Gene</b> are scope of data
and
search interface. Entrez Gene contains data from all organisms with RefSeq genome
records. (RefSeq is <a href="#RefSeq">described</a> in the Molecular
Databases/Nucleotide Sequences section of this guide.) Entrez Gene also uses the
Entrez search system, and therefore offers helpful functions such as
Preview/Index, History, and LinkOut that are available for other Entrez databases.
The <a href="/entrez/query/static/help/genehelp.html">Entrez Gene help
document</a> includes numerous tips for previous users of LocusLink.
</blockquote>
</td>
</tr>
</table>
<a NAME="CCDS"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/projects/CCDS/">Consensus CoDing Sequence (CCDS)
Database</a> - The CCDS project is a collaborative effort to identify a core set
of <b>human protein coding regions</b> that are consistently annotated and of high
quality. The long-term goal is to support convergence towards a standard set of
gene annotations on the human genome. The collaborators include the
<a href="http://www.ncbi.nlm.nih.gov/">National Center for Biotechnology
Information</a> (NCBI, <a href="http://www.ncbi.nlm.nih.gov/mapview/">Map
Viewer</a>),
<a href="http://www.ebi.ac.uk/">European Bioinformatics Institute</a> (EBI,
<a href="http://www.ensembl.org/">Ensembl</a>),
<a href="http://www.cbse.ucsc.edu/">University of California, Santa Cruz</a>
(UCSC, <a href="http://genome.ucsc.edu/cgi-bin/hgGateway">Genome Browser</a>), and
<a href="http://www.sanger.ac.uk/">Wellcome Trust Sanger Institute</a> (WTSI, <a
href="http://vega.sanger.ac.uk/">Vega</a>).
They identify the position of protein-coding regions of genes that are (1)
annotated consistently on the human genome by all of the participating centers and
(2) supported by transcript evidence, use of canonical splice sites, and other
quality assurance measures. Additional information about the curation, process
flow, and quality testing is available on the CCDS web site.</td>
</tr>
</table>
<a NAME="UniGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/">UniGene</a> - ESTs and full-length mRNA
sequences organized into clusters that each represent a unique known or putative
gene within the organism from which the sequences were obtained. UniGene
clusters
are annotated with mapping and expression information when possible (e.g., for
human), and include cross-references to other resources. Sequence data can be
downloaded by cluster through the UniGene web pages, or the complete data set
can be
downloaded from the <a
href="ftp://ftp.ncbi.nih.gov/repository/UniGene/">repository/UniGene</a>
directory
of the FTP site. In addition, <b>UniGene DDD</b> (described <a
href="#UniGeneDDD">below</a>) can be used to show differential expression of
genes
between cDNA libraries. The <b>organisms represented</b> in UniGene are listed
on
the <a href="/UniGene/">UniGene home page</a>.</td>
</tr>
</table>
<a NAME="HomoloGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms in order to identify
putative orthologs. Curated orthologs are incorporated from a variety of
sources
via Entrez Gene. Organisms represented are listed on the HomoloGene home
page.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://mgc.nci.nih.gov/">Mammalian Gene Collection
(MGC)</a> - The NIH Mammalian Gene Collection (MGC) is a trans-NIH initiative
that
seeks to identify and sequence a representative full open reading frame (FL-ORF)
clone for each human, mouse, and rat gene. The MGC project entails the
production
of cDNA libraries and sequences, database and repository development, as well as
the
support of research for improved library construction, sequencing, and analytic
technologies. All the resources generated by the MGC are publicly accessible to
the
biomedical research community.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/RefSeq/HIVInteractions/">HIV Interactions</a> - The
HIV-1, Human Protein Interaction Database contains information about known
interactions of HIV-1 proteins with proteins from human hosts. It provides
annotated bibliographies of published reports of protein interactions, with links
to
the corresponding PubMed records and sequence data. <a
href="#HIVInteractions">More
information</a> about this database is provided under "Literature Databases".
</td>
</tr>
</table>
<a NAME="AceView"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/IEB/Research/Acembly/">AceView (Acembly)</a> - AceView
offers an integrated view of the human, nematode and Arabidopsis genes
reconstructed by co-alignment of all publicly available mRNAs and ESTs on the
genome sequence. The goals are to offer a reliable up-to-date resource on the
genes and their functions and to stimulate further validating experiments at the
bench. AceView carefully computes co-alignment and clustering of experimental
cDNA sequences; no prediction is involved. The resulting AceView genes and their
alternative variants are analyzed in terms of expression, intron-exon structure,
alternative features, regulation and neighbor relationships; the protein products
are analyzed for completeness, their best covering clones are identified, the
proteins are searched for motifs, membership in a protein family, conservation in
evolution, closest homologues in other species and signals for subcellular
localization. The genes are presented in the context of biological annotations
gathered from various sources. AceView can be queried by meaningful words or
sentences as well as by most standard identifiers.</td>
</tr>
</table>
<br>
<!-- ================MOLECULAR DATABASES: EXPRESSION================ -->
<a NAME="Expression"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Expression</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="GEO"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/geo/">Gene Expression Omnibus (GEO)</a> - a gene
expression and hybridization array data repository, as well as a curated, online
resource for gene expression data browsing, query and retrieval. GEO was the
first
fully public high-throughput gene expression data repository, and became
operational
in July 2000. Many types of gene expression data, from platforms such as spotted
microarray (microarray), high-density oligonucleotide array (HDA), hybridization
filter (filter), and serial analysis of gene expression (SAGE), are accepted,
accessioned, and archived as public data sets. GEO data can be accessed
through
several search and browsing tools on the <a href="/geo/">GEO home page</a>, <a
href="/Entrez/">Entrez</a> (via <a href="/entrez/query.fcgi?db=geo">Entrez GEO
Profiles</a> and <a href="/entrez/query.fcgi?db=gds">Entrez GDS (GEO
DataSets)</a>),
and the <a href="ftp://ftp.ncbi.nih.gov/pub/geo/">FTP site</a>. The Tools/Gene
Expression section of this file provides information about <a
href="#GeneExpressionTools">data visualization and exploration capabilities</a>
available in GEO.<br></td>
</tr>
</table>
<a NAME="GENSAT"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gensat">GENSAT</a> - The Gene
Expression Nervous System Atlas, or GENSAT, project aims to map the expression
of
genes in the central nervous system of the mouse, using both in situ
hybridization
and transgenic mouse techniques. The GENSAT database contains a series of
images
related to gene expression experiments. The images are indexed on a number of
fields
relevant to biological discovery. Search criteria include gene names, gene
symbols,
gene aliases and synonyms, mouse ages, and imaging protocols. The GENSAT
project is
a collaboration among the <a href="http://www.ninds.nih.gov/">National Institute
of
Neurological Disorders and Stroke (NINDS)</a>, <a
href="http://www.gensat.org/index.html">Rockefeller University</a>, <a
href="http://www.stjudebgem.org/web/mainPage/mainPage.php">St. Jude Children's
Research Hospital</a>, and <a href="http://www.ncbi.nlm.nih.gov/">NCBI</a>.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><FONT CLASS="Text">Expression-Related Tools</FONT> - in
addition to
the GEO database, described above, NCBI offers several tools:
</td>
</tr>
</table>
<a NAME="SAGEmap"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/SAGE/">SAGEmap</a> - Serial Analysis of Gene
Expression, or SAGE, is an experimental technique designed to quantitatively
measure
gene expression. SAGEmap is an online tool to compare computed gene expression
profiles between SAGE libraries generated by the Cancer Genome Anatomy Project
(<b>CGAP</b>, <a href="#CGAP">described</a> under <a
href="#CancerResearch">human
genome/cancer research</a>) and submitted by others through the Gene Expression
Omnibus (<b>GEO</b>, described <a href="#GEO">above</a>). SAGEmap also includes
a
comprehensive analysis of SAGE tags in human GenBank records, in which a UniGene
identifier is assigned to each human sequence that contains a SAGE tag. Data can
be
retrieved by tag, by sequence, by UniGene cluster ID and by library name. When
retrieving data by sequence or UniGene cluster ID, follow a SAGE tag's hotlink
to
find out its expression level in different SAGE libraries, and how it is
represented
in the rest of the sequences in GenBank. Retrieving data by library name takes
one
to GEO, where all SAGEmap data has been stored by library. Analytical tools
include
<a href="/SAGE/index.cgi?cmd=expsetup">xProfiler</a>, which compares gene
expression
between SAGE libraries of your choice as well as uploaded data. More
information
about the additional analytical capabilities of the SAGEmap resource is provided
in
the <a href="#GeneExpressionTools">tools/gene expression</a> section of this
file.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/ncicgap/">CGAP</a> - Cancer Genome Anatomy
Project - interdisciplinary program to identify the human genes expressed in
different cancerous states, based on cDNA (EST) libraries, and to determine the
molecular profiles of normal, precancerous, and malignant cells. Collaboration
among the National Cancer Institute, the NCBI, and numerous research labs.
Additional information about CGAP is provided in the <a
href="#GeneExpressionTools">tools/gene expression</a> section of this file.
Related
resources are described in the <a href="#CancerResearch">human genome/cancer
research</a> section.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/UniGene/info_ddd.shtml">UniGene DDD</a> -
Digital
Differential Display - an online tool to compare computed gene expression
profiles
between selected cDNA libraries. Using a statistical test, genes whose
expression
levels differ significantly from one tissue to the next are identified and shown
to
the user. <a href="#UniGene">Additional information</a> about UniGene is in the
<a
href="#Genes">molecular databases/genes</a> section.</li></ul></td>
</tr>
</table>
<br>
<!-- ================MOLECULAR DATABASES: TAXONOMY================ -->
<a NAME="Taxonomy"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Taxonomy</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Taxonomy/">NCBI Taxonomy Database Home</a> - general
information about the Taxonomy project, including taxonomic resources and a list
of
outside curators collaborating with NCBI taxonomists. The NCBI Taxonomy
Database
contains the names and lineages of more than 160,000 organisms, both living and extinct,
that
are represented in the genetic databases with at least one nucleotide or protein
sequence. New organisms are added to the database as sequence data are
deposited
for them. The purpose of the taxonomy project at NCBI is to build a consistent
phylogenetic taxonomy for the sequence databases.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/Taxonomy/">Taxonomy Browser</a> - The search bar on
the
Taxonomy home page allows you to browse the NCBI taxonomy database. Enter the
scientific or common name of a species (e.g., <i>Canis familiaris</i> or dog) or
a
higher taxon (e.g., Canidae) to view that organism or taxon's lineage; retrieve
the
available nucleotide, protein, structure, and genome records; and browse up and
down
the taxonomic tree. (<i>Tip</i>: &nbsp; For the broadest search results, select
the
"token set" option in the search bar, which searches for any string, whether in
the
beginning, middle, or end of a word.) &nbsp;<a
href="/entrez/query.fcgi?db=Taxonomy">Entrez</a> also provides an interface for
browsing the taxonomy database, and offers features such as the <a
href="/Taxonomy/CommonTree/wwwcmt.cgi">Common Tree</a> function, which allows
you to
build a tree for your own selection of organisms or taxa (<a
href="/Taxonomy/CommonTree/cmthelp.html">more...</a>).</td>
</tr>
</table>
<a NAME="TaxonomyBLASTb"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/blast/taxblasthelp.html">Taxonomy BLAST</a> - an
implementation of Gapped BLAST (2.x) that groups hits by source organism,
according
to information in NCBI's Taxonomy database. Species are listed in order of
sequence
similarity to the query sequence, with the strongest match listed first. Three report
views are available:
<ul>
<li><i>organism report</i> - sorts the BLAST hits according to species, so that
all of the hits to the same organism will appear together</li>
<li><i>lineage report</i> - gives a simplified view of the relationships between
the organisms, according to their classification in the taxonomy database. This
report is "focused" on the organism which yielded the strongest BLAST hit. It answers
the question, "how closely are the organisms in the BLAST hit list related to the
query sequence according to the taxonomy database?"</li>
<li><i>taxonomy report</i> - provides a more detailed report about the
relationships among all of the organisms found in the BLAST hit list, including
a summary of the taxa that are represented, the number of species and subspecies,
and the number of BLAST hits at each node in the taxonomic hierarchy.</li>
</ul>
</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/sutils/taxik2.cgi">TaxPlot</a> - a tool for 3-way
comparisons of genomes on the basis of the protein sequences they encode. To use
TaxPlot, one selects a reference genome to which two other genomes are compared.
Pre-computed BLAST results are then used to plot a point for each predicted
protein
in the reference genome, based on the best alignment with proteins in each of
the
two genomes being compared.</td>
</tr>
</table>
<p></p>
<!-- ==========================END_DATABASES======================== -->
<!-- ===========================LITERATURE========================== -->
<a NAME="Literature"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" class="H3a">Literature Databases</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/Literature/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<a NAME="PubMed"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/">PubMed</a> - A database of citations and
abstracts for biomedical literature. These citations are from MEDLINE and
additional life science journals. PubMed also includes links to many sites
providing
full text articles and other related resources. PubMed is accessible through
the <a
href="/Entrez/">Entrez</a> search and retrieval system (<a
href="#Entrez">described
below</a>).</td>
</tr>
</table>
<a NAME="Journals"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=journals">Journals
Database</a> - allows you to look up journals that are cited in any of the Entrez
databases, including PubMed. Journals can be searched using the journal title,
MEDLINE or ISO abbreviation, ISSN, or the NLM Catalog ID.</li></ul></td>
</tr>
</table>
<a NAME="MeSH"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=mesh">MeSH</a> - The
Medical
Subject Headings (MeSH) database is NLM's controlled vocabulary used for
indexing
articles for MEDLINE/PubMed. MeSH terminology provides a consistent way to
retrieve
information that may use different terminology for the same
concepts.</li></ul></td>
</tr>
</table>
<a NAME="CitationMatcher"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">
<ul><li><a href="/entrez/query/static/overview.html#Citation Matcher">Citation
Matcher</a> - allows you to find the PubMed ID of any article in the PubMed
database, given its bibliographic information (journal, volume, page,
etc.).</li>
<ul>
<li><a href="/entrez/query/static/citmatch.html">Citation Matcher for single
articles</a></li>
<li><a href="/entrez/getids.cgi">Batch Citation Matcher for many
articles</a></li>
<!-- li><FONT color="0033CC">E-Mail Citation Matcher</FONT> is also
available,
and can be used for or one or many articles. To obtain the help documentation,
send
the word HELP in the body of a message to the server address: <a
href="mailto:citation_matcher@ncbi.nlm.nih.gov">citation_matcher@ncbi.nlm.nih.go
v</a
></i></li -->
</ul>
</ul>
</td>
</tr>
</table>
<a NAME="PubRef"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/entrez/query/static/help/pmhelp.html#Subsets">PubRef</a>
- a database of bibliographic records from a broad range of scientific journals,
and
links (when available) to full text on publisher web sites. PubRef includes
PubMed,
plus publisher supplied citations and abstracts from journals of other
scientific
disciplines. It is therefore is a superset of PubMed, and can be searched
through
the Entrez/PubMed system. The <a
href="/entrez/query/static/help/pmhelp.html#Subsets">PubMed Help document</a>
provides information on how to select PubRef for searching.</td>
</tr>
</table -->
<a NAME="PubMedCentral"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://www.pubmedcentral.nih.gov/">PubMed Central</a>
- a digital archive of biomedical and life sciences journal literature, including
clinical medicine and public health, managed by the National Center for
Biotechnology Information (NCBI) at the U.S. National Library of Medicine (NLM).
It is not a journal publisher. Access to PubMed Central (PMC) is free and unrestricted.</td>
</tr>
</table>
<a NAME="OMIM"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=OMIM">OMIM - Online <i>Mendelian
Inheritance in Man</i></a> - continuously updated catalog of human genes and
genetic
disorders, with links to associated literature references, sequence records,
maps,
and related databases.</td>
</tr>
</table>
<a NAME="Books"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=books">Entrez Books</a> - In
collaboration with book publishers, the NCBI is adapting textbooks for the web
and
linking them to PubMed, the biomedical bibliographic database. The idea is to
provide background information to PubMed, so that users can explore unfamiliar
concepts found in PubMed search results.</td>
</tr>
</table>
<a NAME="HIVInteractions"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/RefSeq/HIVInteractions/">HIV Interactions</a> - The
HIV-1, Human Protein Interaction Database contains information about known
interactions of HIV-1 proteins with proteins from human hosts. <a
href="/RefSeq/">RefSeq</a> protein sequence records serve as anchors for
collecting
published information about interactions between HIV-1 and human proteins. Each
HIV
Interactions database record lists an HIV protein and the human proteins with
which
it has been found to interact. In turn, the Entrez Gene record
for
each human protein contains annotated HIV-1 Interactions bibliographies, which
consist of brief statements on protein interactions with links to the
corresponding
PubMed records and sequence data. The HIV Interactions database is a
collaborative
project among the developers of <a href="/RefSeq/">RefSeq</a> (<a
href="#RefSeq">description</a>) and <a href="/entrez/query.fcgi?db=gene">Entrez
Gene</a> (<a href="#EntrezGene">description</a>), and is similar in concept to
<a
href="/projects/GeneRIF/GeneRIFhelp.html">GeneRIF</a> (<a
href="#GeneRIF">description</a>).
In contrast to GeneRIFs for single genes, however, the publications cited in
the
HIV Interactions Database contain statements about binding between two proteins
rather than statements about the function of a single gene.</td>
</tr>
</table>
<p></p>
<!-- ============================END_LITERATURE======================= -->
<!-- ======================GENOMES_AND_MAPS======================== -->
<a NAME="Genomes"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" CLASS="H3a">Genomes and Maps</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a" valign="top"><a
href="/Genomes/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td CLASS="TEXT" WIDTH="88%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#MultipleOrganisms">organism collections</a> (including
<a href="#EntrezGenome">Entrez Genome</a>,
<a href="#EntrezGenomeProject">Entrez Genome Project</a>,
<a href="#MapViewer">Map Viewer</a>,
<a href="#GenomesEntrezGene">Entrez Gene</a>,
<a href="#GenomesUniGene">UniGene</a>,
<a href="#GenomesHomoloGene">HomoloGene</a>, and
<a href="#COGs">COGs</a>), &nbsp; and organism-specific resources, such as:
<a href="#HumanGenome">human</a>, &nbsp;
<a href="#MouseGenome">mouse</a>, &nbsp;
<a href="#RatGenome">rat</a>, &nbsp;
<a href="#ZebrafishGenome">zebrafish</a>, &nbsp;
<a href="#DrosophilaGenome"><i>Drosophila</i></a>, &nbsp;
<a href="#NematodeGenome">nematode</a>, &nbsp;
<a href="#PlantGenomes">plant genomes</a>, &nbsp;
<a href="#YeastGenome">yeast</a>, &nbsp;
<a href="#MalariaGenome">malaria</a>, &nbsp;
<a href="#MicrobialGenomes">microbial genomes</a>, &nbsp;
<a href="#ViralGenomes">viruses</a>, &nbsp;
<a href="#ViroidGenomes">viroids</a>, &nbsp;
<a href="#Plasmids">plasmids</a>, &nbsp;
<a href="#EukaryoticOrganelles">eukaryotic organelles</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="12%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<p></p>
<!-- =========CATEGORY WITHIN GENOMES_AND_MAPS: MULTIPLE ORGANISMS======= -->
<a NAME="MultipleOrganisms"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Organism Collections</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="GenomicBiology"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genomes/">Genomic Biology</a> -
An introduction to the field of genomic biology, with links to the genome
resources pages for major organisms and organism groups, as well as links to
additional NCBI genome resources.</td>
</tr>
</table>
<a NAME="EntrezGenome"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Genome">Entrez Genome</a> -
sequence and map data from the whole
genomes of over 1000 organisms. The genomes represent both completely sequenced
organisms and those for which sequencing is in progress. All three main domains
of
life - <a
HREF="/genomes/static/eub_g.html">bacteria</a>,
<a
HREF="/genomes/static/a_g.html">archaea</a>,
and <a
HREF="/genomes/static/euk_g.html">eukaryota</a>
- are represented, as well as many <a
HREF="/genomes/VIRUSES/viruses.html">viruses</a>,
<a
HREF="/genomes/static/phg.html">phages</a>,
<a
HREF="/genomes/static/vid.html">viroids</a>,
<a
HREF="/genomes/static/o.html">plasmids</a>,
and <a
HREF="/genomes/ORGANELLES/organelles.html">organelles</a>. Entrez Genome
provides
graphical overviews of complete genomes/chromosomes, and the ability to explore
regions of interest in progressively greater detail. <a
href="#ProtTaxTable">ProtTables and TaxTables</a> are provided for organisms on
which analyses have been done by NCBI staff. In addition, the <a
href="/mapview/">Map Viewer</a>, a software component of Entrez Genome, provides
views of integrated chromosome maps for a variety of organisms (see additional
information about the Map Viewer <a href="#MapViewer">below</a>).
<blockquote>Information about submitting genome data from complete genomes is
provided in the Resource Guide section on <a href="#SubmitGenomes">Submission of
complete genomes</a>. After data from complete genomes are submitted, they are
made available in Entrez Genome (as complete genomes or chromosomes) and Entrez
Nucleotide (as chromosome or genome fragments such as contigs). Entrez
Nucleotide also provides access to the records for complete genomes/chromosomes,
but the default view of those records in the Nucleotide database is GenBank
format, whereas the default view in Entrez Genome is a graphical overview. A
companion database, Entrez Genome Project, is <a
href="#EntrezGenomeProject">described</a> below.</blockquote></td>
</tr>
</table>
<a NAME="EntrezGenomeProject"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?DB=genomeprj">Entrez Genome
Project</a> -
a companion database to Entrez Genome (<a href="#EntrezGenome">described</a>
above). The actual data from genome sequencing projects are contained in Entrez
Genome (as complete genomes or chromosomes) and Entrez Nucleotide (as chromosome or
genome fragments such as contigs). The Genome Project database, on the other
hand, provides an umbrella view of the status of each genome project, links to
project data in the other Entrez databases, and links to a variety of other NCBI
and external resources associated with a given genome project. A genome
project's status can be complete or in-progress, and the project can include
large-scale sequencing, assembly, annotation, and mapping efforts. New genome
sequencing projects can be registered through the <a
href="/genomes/mpfsubmission.cgi">Genome project submission form</a>. More
information about the submission of data from complete genomes is provided in
the Resource Guide section on <a href="#SubmitGenomes">Submission of complete
genomes</a>. (Although the Entrez Genome Project database does not include
viral genome sequencing projects, data from those projects are submitted to
GenBank and are available in the Entrez Nucleotide and Entrez Genome databases.
There is also a special set of resources at NCBI dedicated to <a
href="#ViralGenomes">Viral Genomes</a>.)</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mailman/listinfo/genomes-announce">Genomes
Announcements</a> - To receive announcements about recently completed genomes,
see
the <a href="Summary/email_lists.html">NCBI Announcements Email Lists</a>
page.</td>
</tr>
</table>
<a NAME="MapViewer"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/">Map Viewer</a> - The Map Viewer is a
software
component of Entrez Genome (described <a href="#EntrezGenome">above</a>) that
provides special browsing capabilities for a subset of organisms. It allows you
to
view and search an organism's complete genome, display chromosome maps, and zoom
into progressively greater levels of detail, down to the sequence data for a
region
of interest. If multiple maps are available for a chromosome, it displays them
aligned to each other based on shared marker and gene names, and, for the
sequence
maps, based on a common sequence coordinate system. The organisms currently
represented in the Map Viewer are listed on the Map Viewer home page and in the
<a
href="/mapview/static/MapViewerHelp.html">Map Viewer help document</a>, which
provides general information on how to use that tool. The number and types of
available maps vary by organism, and are described in the "data and search tips"
file provided for each organism.</td>
</tr>
</table>
<a NAME="GenomesEntrezGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - Entrez
Gene
provides a gene-based view of the data from a wide range of genomes. It
supplies
key connections in the nexus of map, sequence, expression, structure,
functional,
and homology data. Each record represents a single gene from a given organism.
The
minimum set of data in a gene record includes a unique identifier or GeneID
assigned
by NCBI, a preferred symbol, and at least one of the following: sequence
information, map information, or
official nomenclature from an authority list. In addition, a gene record can
also
include expression, structure, functional, and homology data, when available.
Entrez Gene includes data from all organisms that have RefSeq genome records
(with
NC_* accessions, see more info <a href="#RefSeq">above</a>), and can also
include
data from recognized genome-specific databases that provide NCBI with
information
about genes (preferably with defining sequence) or mapped phenotypes.</td>
</tr>
</table>
<a NAME="GenomesUniGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/">UniGene</a> - ESTs and full-length mRNA
sequences organized into clusters that each represent a unique known or putative
gene within the organism from which the sequences were obtained. UniGene
clusters
are annotated with mapping and expression information when possible (e.g., for
human), and include cross-references to other resources. Sequence data can be
downloaded by cluster through the UniGene web pages, or the complete data set
can be
downloaded from the <a
href="ftp://ftp.ncbi.nih.gov/repository/UniGene/">repository/UniGene</a>
directory
of the FTP site. In addition, UniGene DDD (described <a
href="#UniGeneDDD">below</a>) can be used to show differential expression of
genes
between cDNA libraries. The organisms represented in UniGene are listed on the
<a
href="/UniGene/">UniGene home page</a>.</td>
</tr>
</table>
<a NAME="GenomesHomoloGene"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms in order to identify
putative orthologs. Curated orthologs are incorporated from a variety of
sources
via Entrez Gene. Organisms represented are listed on the HomoloGene home
page.</td>
</tr>
</table>
<a NAME="COGs"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/COG/">COGs - Clusters of Orthologous Groups</a> -
natural
system of gene families from complete genomes. Clusters of Orthologous Groups
(COGs) were delineated by comparing protein sequences encoded in complete
unicellular genomes representing 30 major phylogenetic lineages. Each COG
consists
of individual proteins or groups of paralogs from at least 3 lineages and thus
corresponds to an ancient conserved domain. The <a href="/COG/old/">Initial
Version</a> of COGs includes 44 organisms. The <a href="/COG/new/">Updated
Version</a> of COGs includes 66 organisms in the <a
href="/COG/new/release/phylox.cgi">Unicellular Clusters</a>, plus <a
href="/COG/new/shokog.cgi">Eukaryotic Clusters</a> (called KOGs). More
organisms
will be added in the future.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Entrez/Genome/org.html">Download Genomes &lt;350 KB</a>
via
Entrez Genome pages for individual organisms</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genbank/genomes/">Download
Genomes
&gt;350 KB from the NCBI ftp site</a> - see FTP information <a
href="#FTP_HumanGenome">below</a>; ftp links are also available from Entrez
Genome
pages for individual organisms</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/genomes/static/links.html">Genome Sequencing
Centers</a>
- list of genome sequencing centers and the organisms on which they work</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS TABLE: HUMAN======== -->
<a NAME="HumanGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Human Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#HumanGenomeGuide">Guide</a>, &nbsp;
<a href="#HumanChromosomes">Chromosomes</a>, &nbsp;
<a href="#HumanSequences">Sequences</a>, &nbsp;
<a href="#HumanGenes">Genes</a>, &nbsp;
<a href="#HumanGenomeBLAST">BLAST</a>, &nbsp;
<a href="#Clones">Clones</a>, &nbsp;
<a href="#HumanMaps">Genome Maps</a>, &nbsp;
<a href="#HumanMappedMarkers">Mapped Markers</a>, &nbsp;
<a href="#HumanCytogenetics">Cytogenetics</a>, &nbsp;
<a href="#HumanGeneExpression">Gene Expression</a>, &nbsp;
<a href="#HumanGeneticVariation">Genetic Variation</a>, &nbsp;
<a href="#HumanDisorders">Disorders</a>, &nbsp;
<a href="#CancerResearch">Cancer Research</a>, &nbsp;
<a href="#HumanGenomeFTP">FTP</a>
</blockquote>
</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: GUIDE======== -->
<a NAME="HumanGenomeGuide"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Guide</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/guide/">Human Genome Resources
Guide</a> -
overview of available human genome data resources. Includes bulletins and
progress
reports concerning the Human Genome Project and
provides centralized access to previously disparate data.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/About/Doc/hs_genomeintro.html">Introduction
to
NCBI's Genome Resource</a> - overview of the nature of data generated by the
human
genome project, the processes used to assemble and annotate the data, and to
integrate it with a wide range of information from other
resources.</li></ul></td>
</tr>
</table>
<a NAME="HumanContigAssembly"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/guide/build.html">NCBI Contig Assembly
and
Annotation Process</a> - describes the processes used to assemble contigs from
the
high throughput genomic sequences (HTGs, described <a href="#HTG">above</a>),
and to
annotate the contigs with features. It also describes the various resources
that
can be used to access the human genome data.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Tour/">Tour of the Draft Human Genome
Sequence</a> - provides an introduction to how the
draft sequence of the human genome can be used by biologists, and includes
examples
of the types of questions that can be answered with the data.</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: CHROMOSOMES======== -->
<a NAME="HumanChromosomes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Chromosomes</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="HumanChromosomeMapViews"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/mapview/map_search.cgi?taxid=9606">Map
Viewer</a>
- <b>integrated views of chromosome maps</b> - The Map Viewer (described <a
href="#MapViewer">above</a>) displays one or more maps which have been aligned
to
each other based on shared marker and gene names, and, for the sequence maps,
based
on a common sequence coordinate system. For human, the Map Viewer includes >20
sequence, cytogenetic, genetic linkage, radiation hybrid, and other maps.
&nbsp;(When viewing a chromosome, use the "Maps & Options" dialog box to display
the
map(s) of interest.) &nbsp;&nbsp;The sequence maps are based on the contigs
built
from the draft and finished sequence data generated by the Human Genome Project.
A
<a href="/mapview/static/humansearch.html">list of available human maps</a> and
their descriptions is provided. The <a
href="/mapview/static/MapViewerHelp.html">Map Viewer help document</a> provides
general information on how to use that tool. Information about the <a
href="/genome/guide/build.html">NCBI Contig Assembly and Annotation Process</a>
is
also available.<p></p></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="#HumanGenomeFTP">FTP human chromosome data</a>
-
see the FTP information, <a href="#HumanGenomeFTP">below</a></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: SEQUENCES======== -->
<a NAME="HumanSequences"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Sequences</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="HumanGenomeSequencing"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/seq/">Human Genome Sequencing</a> -
sequencing progress of the Human Genome Project; links to chromosome views in
the
<b>Map Viewer</b> (described <a href="#HumanChromosomeMapViews">above</a>),
including <a href="/mapview/static/humansearch.html#SequenceMaps">sequence
maps</a>
assembled from contigs that were constructed (<a
href="/genome/guide/build.html">more...</a>) from international sequencing
center
data; list of <a href="/genome/seq/HsCenters.html">genome sequencing
centers</a>.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/RefSeq/">RefSeq</a> - NCBI database of
Reference
Sequences. Curated, non-redundant set including genomic DNA contigs, mRNAs and
proteins for known genes, mRNAs and proteins for gene models, and entire
chromosomes. Accession numbers have the format of two letters, an underscore
bar,
and six digits, for example, NC_123456, NT_123456, NM_123456, NP_123456 (more
info
about <a href="/RefSeq/key.html#accession">accession numbers</a> and <a
href="/RefSeq/RSfaq.html#access">access</a>).</li></ul></td>
</tr>
</table>
<a NAME="EntrezHuman"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Entrez/">Entrez</a> - provides integrated
access
to nucleotide and protein sequence data in GenBank, EMBL, DDBJ, RefSeq,
PIR-International, PRF, Swiss-Prot, and PDB, along with
3D protein structures, genomic mapping information, and PubMed MEDLINE. Entrez
contains pre-computed similarity searches for each database record, producing a
list of related sequences, structures, and MEDLINE records. Includes sequence
data
from >160,000 species; use the organism field to limit searches to human
records.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/UniGene/Hs.Home.html">UniGene</a> - ESTs and
full-length mRNA sequences organized into clusters that each represent a unique
known or putative gene within the organism from which the sequences were
obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/dbEST/index.html">dbEST</a> - Database of
Expressed Sequence Tags - short (about 300-500 bp) cDNA sequences representing
single-pass reads from mRNA. ESTs are usually produced in large numbers and represent a
snapshot of the genes expressed in a given tissue, and/or at a given
developmental
stage. Also includes ESTs generated by the <a href="#CGAP">CGAP</a> project
(see <a
href="#CancerResearch">Cancer Research</a>, below), and sequences from
differential
display and RACE experiments.</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: GENES======== -->
<a NAME="HumanGenes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Genes</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/projects/CCDS/">Consensus CoDing Sequence
(CCDS)
Database</a> - The CCDS project is a collaborative effort to identify a core set
of <b>human protein coding regions</b> that are consistently annotated and of high
quality. The long term goal is to support convergence towards a standard set of
gene annotations on the human genome. The collaborators include the
<a href="http://www.ncbi.nlm.nih.gov/">National Center for Biotechnology
Information</a> (NCBI, <a href="http://www.ncbi.nlm.nih.gov/mapview/">Map
Viewer</a>),
<a href="http://www.ebi.ac.uk/">European Bioinformatics Institute</a> (EBI,
<a href="http://www.ensembl.org/">Ensembl</a>),
<a href="http://www.cbse.ucsc.edu/">University of California, Santa Cruz</a>
(UCSC, <a href="http://genome.ucsc.edu/cgi-bin/hgGateway">Genome Browser</a>), and
<a href="http://www.sanger.ac.uk/">Wellcome Trust Sanger Institute</a> (WTSI, <a
href="http://vega.sanger.ac.uk/">Vega</a>).
They identify the position of protein-coding regions of genes that are (1)
annotated consistently on the human genome by all of the participating centers and
(2) supported by transcript evidence, use of canonical splice sites, and other
quality assurance measures. Additional information about the curation, process
flow, and quality testing is available on the CCDS web site.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> -
a gene-based view of the data from a wide range of genomes, including human. It
supplies key connections in the nexus of map, sequence, expression, structure,
functional, and homology data. <a href="#EntrezGene">More information about
Entrez Gene</a> is provided above, in the Molecular Databases/Genes section.
</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=OMIM">OMIM</a> - Online
Mendelian Inheritance in Man - continuously updated catalog of human genes and
genetic disorders, with links to associated literature references, sequence
records,
maps, and related databases.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/RefSeq/">RefSeq</a> - NCBI database of
Reference
Sequences. Curated, non-redundant set including genomic DNA contigs, mRNAs and
proteins for known genes, mRNAs and proteins for gene models, and entire
chromosomes. Accession numbers have the format of two letters, an underscore
bar,
and six digits, for example, NC_123456, NT_123456, NM_123456, NP_123456 (more
info
about <a href="/RefSeq/key.html#accession">accession numbers</a> and <a
href="/RefSeq/RSfaq.html#access">access</a>).</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/UniGene/Hs.Home.html">UniGene</a> - ESTs and
full-length mRNA sequences organized into clusters that each represent a unique
known or putative gene within the organism from which the sequences were
obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/HomoloGene/">HomoloGene</a> - a gene homology
tool that compares nucleotide sequences between pairs of organisms, including
human,
mouse, rat, zebrafish, and fruit fly, in order to identify putative orthologs.
Curated orthologs are incorporated from a variety of sources via
Entrez Gene.</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: BLAST======== -->
<a NAME="HumanGenomeBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">BLAST against human genomic sequence data</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/seq/HsBlast.html">BLAST against the
draft
human genome sequence</a> - Compare your nucleotide or protein query sequence to
the
working draft sequence of the human genome, its mRNA and protein products, or
the
other data sets described below. The genome sequence has been assembled from
GenBank
sequence records (primarily HTGs) using the process described in <a
href="/genome/guide/build.html">NCBI Contig Assembly and Annotation Process</a>.
The contigs assembled by this process have been given NT_* accession numbers as
part
of the RefSeq project (described <a href="#RefSeq">above</a>).
<p>A <a href="/genome/seq/Database.html">variety of database choices</a> are
provided on the <a href="/genome/seq/HsBlast.html">Human Genome BLAST
page</a>.</p>
</ul></td>
</tr>
<!-- tr>
<td CLASS="TEXT"><ul><li><a href="/BLAST/">BLAST against human ESTs</a> -
compare a
nucleotide or protein sequence against the human ESTs by choosing
<b>est_human</b>
as the database to be searched when using Nucleotide BLAST or Translated
BLAST.</li></ul></td>
</tr -->
<!-- tr>
<td CLASS="TEXT"><ul><li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&HITLIST_SIZE=100&NCB
I_GI
=on&PAGE=Nucleotides&PROGRAM=blastn&SERVICE=plain&SET_DEFAULTS.x=34&SET_DEFAULTS
.y=8
&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">BLAST against gss database</a> - compare a
nucleotide sequence against random "single pass read" <b>genome survey
sequences</b>
such as cosmid/BAC/YAC end sequences,
exon trapped genomic sequences, and Alu PCR sequences. Although the gss
database
contains sequences from many organisms, you can limit search results to
human.</li></ul></td>
</tr -->
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: CLONES======== -->
<a NAME="Clones"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Clones<img SRC="spacer10.GIF" width="440"
height="1" border="0"></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT2">
<blockquote><i>NCBI does not distribute clones. However, some NCBI resources
contain information about clones and the sources from which they can be
obtained.</i></blockquote>
</td>
</tr>
</table>
<a NAME="CloneMaps"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li>Clone Maps -
various clone maps for human have been included in the <b>Map Viewer</b>,
described
<a href="#MapViewerHuman">below</a>. The document that describes the various <a
href="/mapview/static/humansearch.html">maps available for human</a> includes a
section listing the <a href="/mapview/static/humansearch.html#ObjectTypes">maps
that
contain clone information</a>. To select those maps for display, use the
"Maps & Options" dialog box when viewing any human chromosome. (Several other <a
href="/mapview/static/MapViewerHelp.html#OrganismList">organisms accessible</a>
through the <a href="/mapview/static/MVstart.html">Map Viewer</a> are
represented by
maps that contain clone information. The organism-specific "data and search
tips"
files provide additional detail about the maps available for each organism.)<!--
A
<a href="/mapview/static/humansearch.html">list of available maps</a> is
provided in
the associated help document, including those that show the positions of <a
href="/mapview/static/humansearch.html#ObjectTypeClone">clones</a>. When
viewing a
chromosome, use the "Maps & Options" dialog box to select the map(s) of
interest.
-->
</ul></td>
</tr>
</table>
<a NAME="CloneRegistry"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/clone/">Clone Registry</a> - a
database
used by participating <a href="/genome/seq/HsCenters.html">human genome
sequencing
centers</a> and <a href="/genome/seq/MmSeqCenters.html">mouse genome sequencing
centers</a> to record which clones have been selected for sequencing, which are
currently in the pipeline, and which are finished and represented by sequence
entries in GenBank. Includes BACs, PACs, cosmids, and fosmids. Uses <a
href="/genome/clone/nomenclature.shtml">standardized clone names</a> that
represent
a clone's microtitre plate address (plate number, row, and column) prefixed by a
library abbreviation, to produce unique names. Includes <a
href="/genome/clone/ordering.html">clone ordering
information</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/cyto/hbrc.shtml">Human BAC
Resource</a> -
A cytogenetic resource of large-insert,
FISH-mapped clones containing sequence-tagged sites. Will help
integrate cytogenetic, radiation-hybrid, linkage, and sequence maps of
the human genome. Includes links to clone distributors.</li></ul></td>
</tr>
</table>
<a NAME="MGC"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="http://mgc.nci.nih.gov/">Mammalian Gene
Collection
(MGC)</a> - The NIH Mammalian Gene Collection (MGC) is a trans-NIH initiative
that
seeks to identify and sequence a representative full open reading frame (FL-ORF)
clone for each human, mouse, and rat gene. The MGC project entails the
production
of cDNA libraries and sequences, database and repository development, as well as
the
support of research for improved library construction, sequencing, and analytic
technologies. All the resources generated by the MGC are publicly accessible to
the
biomedical research community.</li></ul></td>
</tr>
</table>
<a NAME="ClonesNonHuman"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i><b>Clone Information for Other (Non-human)
Organisms</b> - Some organisms have additional clone information resources. For
example, the resources available for the <a href="#MouseGenome"><b>mouse
genome</b></a> include several items mentioned above, plus a <b>CloneFinder</b>,
described <a href="#MouseGenomeCloneFinder">below</a>. In addition, many
records in
dbEST (described <a href="#dbEST">above</a>) include information about clone
sources
such as the <a href="http://image.llnl.gov/">I.M.A.G.E.
consortium</a>.</i></blockquote></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: MAPS======== -->
<a NAME="HumanMaps"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Genome Maps<img SRC="spacer10.GIF" width="440"
height="1" border="0"></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="/entrez/query.fcgi?cmd=Search&db=Genome&term=human[orgn]&dispmax=50&doptcmdl
=Summary">Entrez Genome</a> - links to the human chromosome views in
the
Map Viewer (details <a href="#MapViewerHuman">below</a>). Entrez Genome also
includes a view of the human mitochondrion (accessible under eukaryotic
organelles),
which can be viewed in its entirety or explored in progressively greater detail
(additional information about Entrez Genome <a
href="#EntrezGenome">above</a>).</li></ul></td>
</tr>
</table>
<a NAME="MapViewerHuman"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/mapview/map_search.cgi?taxid=9606">Map
Viewer</a>
- <b>integrated views of chromosome maps</b> - The Map Viewer is a software
component of Entrez Genome that displays one or more maps which have been
aligned
to each other based on shared marker and gene names, and, for the sequence maps,
based on a common sequence coordinate system. For human, the Map Viewer includes
>20
sequence, cytogenetic, genetic linkage, radiation hybrid, and other maps.
&nbsp;(When viewing a chromosome, use the "Maps & Options" dialog box to display
the
map(s) of interest.) &nbsp;&nbsp;The sequence maps are based on the contigs
built
from the draft and finished sequence data generated by the Human Genome Project.
A
<a href="/mapview/static/humansearch.html">list of available human maps</a> and
their descriptions is provided. The <a
href="/mapview/static/MapViewerHelp.html">Map Viewer help document</a> provides
general information on how to use that tool. Information about the <a
href="/genome/guide/build.html">NCBI Contig Assembly and Annotation Process</a>
is
also available.</li></ul></td>
</tr>
</table>
<a NAME="GeneMap"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genemap/">GeneMap'99</a> - physical map of
>35,000 human gene-based markers, constructed by the International Radiation
Hybrid
Mapping Consortium using a consistent set of RH reagents and methodologies.
Provides a framework for accelerated sequencing efforts by highlighting key
landmarks (gene-rich regions) of the chromosomes, and represents the cooperative
efforts of more than one hundred scientists throughout the world.<br>
<font size="-1"><i>Note:</i> The GeneMap'99 data have also been included in the
<a
href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
</table>
<a NAME="NCBI_RH_map"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/rhmap/">NCBI RH Map</a> - The NCBI
Integrated
Radiation Hybrid Map contains 23,723 markers from both the G3 and GB4
RH panels of GeneMap'99. Those markers were mapped with respect to 1084
framework markers (a subset of markers common to the G3 and GB4 panels).
All markers from both panels were interpolated onto the GB4 scale.
The article by <a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_
uids
=10720576&dopt=Abstract">R. Agarwala et al.</a> provides detail about
the integration strategy, as well as the methods used to evaluate the
quality of the integrated map.<br>
<font size="-1"><i>Note:</i> The NCBI RH Map data have also been included in the
<a
href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman">Mitelman Database of
Chromosome
Aberrations in Cancer</a> - genome-wide map of chromosomal breakpoints in human
cancer, by Drs. Mitelman, Mertens, and Johansson (eds),
http://cgap.nci.nih.gov/Chromosomes/Mitelman. This resource is associated with
the
<a href="/CGAP/">Cancer Genome Anatomy Project (CGAP)</a>. <!-- Original
version of
the aberration summary was published in a special issue of Nature Genetics, Vol.
15(Spec. No.):417-74 (April 1997). --><br>
<font size="-1"><i>Note:</i> The Mitelman data have also been included in the <a
href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Omim/getmap.cgi?">OMIM Gene Map</a> -
cytogenetic locations of genes that have been reported in the literature and
determined by a variety of mapping methods. Can be searched by gene symbol or
cytogenetic chromosomal location. Accessible from the OMIM page (described <a
href="#OMIM">above</a>).<br>
<font size="-1"><i>Note:</i> The OMIM Gene Map data have also been included in
the
<a href="/mapview/static/humansearch.html#Genes_Cytogenetic">Genes_Cytogenetic
Map</a> of the <a href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>
(described <a href="#EntrezMapViewer">above</a>). When viewing a chromosome,
use
the "Maps & Options" dialog box to select the map(s) of
interest.</font></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Omim/getmorbid.cgi">OMIM Morbid
Map</a> -
alphabetical listing of diseases and corresponding cytogenetic map locations,
with
links to OMIM entries. Accessible from the OMIM page (described <a
href="#OMIM">above</a>).<br>
<font size="-1"><i>Note:</i> The OMIM Morbid Map data have also been included in
the
<a href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
</table>
<a NAME="HumanMouseMap"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Homology/">Human-Mouse Homology Maps</a> - a
table comparing genes in homologous segments of DNA from human and mouse, sorted
by
position in each genome. Computed by integrating orthologs identified at The
Jackson Laboratory with putative orthologs identified by sequence homology.
The <a href="/Homology/Davis/">original maps</a> by M. F. Seldin
of the University of California at Davis are also available.</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: MAPPED MARKERS======== -->
<a NAME="HumanMappedMarkers"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Mapped Markers<img SRC="spacer10.GIF"
width="440"
height="1" border="0"></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/dbSTS/index.html">dbSTS</a> - Database of
Sequence Tagged Sites - short (about 200-500 bp) genomic sequences that are
thought
to be operationally unique in a genome, and therefore define a specific position
on
the physical map.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/unists/">UniSTS</a> - a unified,
non-redundant view of sequence tagged sites (STSs). UniSTS integrates marker
and
mapping data from a variety of public resources. If two or more markers have
different names but the same primer pair, a single STS record is presented for
the
primer pair and all the marker names are shown. Each UniSTS record displays the
primer sequences, product size, mapping information, and cross references to
Entrez Gene, dbSNP, RHdb, GDB, MGD, and the Map Viewer. The marker report also
lists
GenBank and RefSeq records that contain the primer sequences, as determined by
<a
href="#ePCR">Electronic PCR (e-PCR)</a>. Data sources include dbSTS, RHdb, GDB,
various human maps (Genethon genetic map, Marshfield genetic map, Whitehead RH
map, Whitehead YAC map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX
physical map), and various mouse maps (Whitehead RH map, Whitehead YAC map,
Jackson Laboratory's MGD map).</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/STS/">e-PCR (Electronic PCR)</a> - find
putative
map location of a query sequence. Computational procedure for finding sequence
tagged sites in DNA sequences. (See <a
href="#ePCR">additional information</a> in the Tools/Nucleotide Sequence Analysis
Section.)</ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genemap/">GeneMap'99</a> - physical map of
>35,000 human gene-based markers, constructed by the International Radiation
Hybrid
Mapping Consortium using a consistent set of RH reagents and methodologies.
Provides a framework for accelerated sequencing efforts by highlighting key
landmarks (gene-rich regions) of the chromosomes, and represents the cooperative
efforts of more than one hundred scientists throughout the world.<br>
<font size="-1"><i>Note:</i> The GeneMap'99 data have also been included in the
<a
href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/mapview/static/MVstart.html">Map Viewer</a> -
graphical views of <a href="/mapview/static/humansearch.html#ObjectTypeSTS">STS
markers placed on a variety of maps</a>, including sequence, genetic linkage,
and
radiation hybrid maps. (See additional information about Map Viewer, <a
href="#HumanChromosomeMapViews">above</a>.)</ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Omim/getmap.cgi?">OMIM Gene Map</a> -
cytogenetic locations of genes that have been reported in the literature and
determined by a variety of mapping methods. Can be searched by gene symbol or
cytogenetic chromosomal location. Accessible from the OMIM page (see Genes,
above).<br>
<font size="-1"><i>Note:</i> The OMIM Gene Map data have also been included in
the
<a href="/mapview/static/humansearch.html#Genes_Cytogenetic">Genes_Cytogenetic
Map</a> of the <a href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>
(described <a href="#EntrezMapViewer">above</a>). When viewing a chromosome,
use
the "Maps & Options" dialog box to select the map(s) of
interest.</font></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: CYTOGENETICS======== -->
<a NAME="HumanCytogenetics"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Cytogenetics<img SRC="spacer10.GIF"
width="440"
height="1" border="0"></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/cyto/hbrc.shtml">Human BAC
Resource</a> -
A cytogenetic resource of large-insert,
FISH-mapped clones containing sequence-tagged sites. Will help
integrate cytogenetic, radiation-hybrid, linkage, and sequence maps of
the human genome. Includes links to clone distributors.</li></ul></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="/entrez/query.fcgi?db=cancerchromosomes">Cancer
Chromosomes</a> - An Entrez database that integrates data from three sources:
the <a
href="/sky/">NCI/NCBI SKY/M-FISH & CGH Database</a>, the NCI <a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman/">Mitelman Database of
Chromosome
Aberrations in Cancer</a>, and the NCI <a
href="http://cgap.nci.nih.gov/Chromosomes/RecurrentAberrations/">Recurrent
Aberrations in Cancer</a>. Provides the ability to search for cytogenetic,
clinical, and/or reference information.</li></ul></td>
</tr>
</table>
<a NAME="SKY_CGH"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/sky/">SKY/M-FISH & CGH Database</a> - The NCI
and
NCBI SKY/M-FISH and CGH Database is a repository of publicly submitted data from
Spectral Karyotyping (SKY), Multiplex Fluorescence In Situ Hybridization
(M-FISH),
and Comparative Genomic Hybridization (CGH),
which are complementary fluorescent molecular cytogenetic techniques.
SKY/M-FISH permits the simultaneous visualization of each human
or mouse chromosome in a different color, facilitating the identification of
chromosomal aberrations; CGH can
be used to generate a map of DNA copy number changes in tumor genomes.
Collaborative
project with the National Cancer Institute. &nbsp;(<a
href="http://www.ncbi.nlm.nih.gov/sky/ccap_helper.cgi?tsc=0">data
submission instructions...</a>)</ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman">Mitelman Database of
Chromosome
Aberrations in Cancer</a> - genome-wide map of chromosomal breakpoints in human
cancer, by Drs. Mitelman, Mertens, and Johansson (eds),
http://cgap.nci.nih.gov/Chromosomes/Mitelman. This resource is associated with
the
<a href="/CGAP/">Cancer Genome Anatomy Project (CGAP)</a>. <!-- Original
version of
the aberration summary was published in a special issue of Nature Genetics, Vol.
15(Spec. No.):417-74 (April 1997). --><br>
<font size="-1"><i>Note:</i> The Mitelman data have also been included in the <a
href="/mapview/map_search.cgi?taxid=9606">Map Viewer</a>, described <a
href="#EntrezMapViewer">above</a>. When viewing a chromosome, use the "Maps &
Options" dialog box to select the map(s) of interest.</font></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: GENE EXPRESSION======== -->
<a NAME="HumanGeneExpression"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Gene Expression<img SRC="spacer10.GIF"
width="440"
height="1" border="0"></td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="GEO"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/geo/">Gene Expression Omnibus (GEO)</a> - a
gene
expression and hybridization array data repository, as well as a curated, online
resource for gene expression data browsing, query and retrieval. GEO was the
first
fully public high-throughput gene expression data repository, and became
operational
in July 2000. Many types of gene expression data, from platforms such as spotted
microarray (microarray), high-density oligonucleotide array (HDA), hybridization
filter (filter), and serial analysis of gene expression (SAGE), are accepted,
accessioned, and archived as public data sets. GEO data can be accessed through
several search and browsing tools on the <a href="/geo/">GEO home page</a>, <a
href="/Entrez/">Entrez</a> (via <a href="/entrez/query.fcgi?db=geo">Entrez GEO
Profiles</a> and <a href="/entrez/query.fcgi?db=gds">Entrez GDS (GEO
DataSets)</a>),
and the <a href="ftp://ftp.ncbi.nih.gov/pub/geo/">FTP site</a>. The Tools/Gene
Expression section of this file provides information about <a
href="#GeneExpressionTools">data visualization and exploration capabilities</a>
available in GEO.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/ncicgap/">CGAP</a> - Cancer Genome Anatomy
Project - interdisciplinary program to identify the human genes expressed in
different cancerous states, based on cDNA (EST) libraries, and to determine the
molecular profiles of normal, precancerous, and malignant cells. Collaboration
among the National Cancer Institute, the NCBI, and numerous research labs.
Additional information about CGAP is provided in the <a
href="#GeneExpressionTools">Tools/Gene Expression</a> section of this file.
Related
resources are described in the <a href="#CancerResearch">Human Genome/Cancer
Research</a> section.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/SAGE/">SAGEmap</a> - Serial Analysis of Gene
Expression, or SAGE, is an experimental technique designed to quantitatively
measure
gene expression. SAGEmap is an online tool to compare computed gene expression
profiles between SAGE libraries generated by the Cancer Genome Anatomy Project
(<b>CGAP</b>, <a href="#CGAP">described</a> under <a
href="#CancerResearch">human
genome/cancer research</a>) and submitted by others through the Gene Expression
Omnibus (<b>GEO</b>, <a href="#GEO">described</a> under <a
href="#Expression">molecular databases/gene expression</a>). SAGEmap also
includes a
comprehensive analysis of SAGE tags in human GenBank records, in which a UniGene
identifier is assigned to each human sequence that contains a SAGE tag. Data can
be
retrieved by tag, by sequence, by UniGene cluster ID and by library name. When
retrieving data by sequence or UniGene cluster ID, follow a SAGE tag's hotlink
to
find out its expression level in different SAGE libraries, and how it is
represented
in the rest of the sequences in GenBank. Retrieving data by library name takes
one
to GEO, where all SAGEmap data has been stored by library. Analytical tools
include
<a href="/SAGE/index.cgi?cmd=expsetup">xProfiler</a>, which compares gene
expression
between SAGE libraries of your choice as well as uploaded data. More
information
about the additional analytical capabilities of the SAGEmap resource is provided
in
the <a href="#GeneExpressionTools">tools/gene expression</a> section of this
file.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/UniGene/ddd.cgi?ORG=Hs">UniGene DDD</a> -
Digital
Differential Display - an online tool to compare computed gene expression
profiles
between selected cDNA libraries. Using a statistical test, genes whose
expression
levels differ significantly from one tissue to the next are identified and shown
to
the user. Additional information about UniGene is provided <a
href="#UniGene">above</a>.</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: GENETIC VARIATION======== -->
<a NAME="HumanGeneticVariation"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Genetic Variation</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/SNP/">dbSNP</a> - Database of Single
Nucleotide
Polymorphisms - NCBI database of single nucleotide polymorphisms,
microsatellites,
and small-scale insertions and deletions. dbSNP contains population-specific
frequency and genotype data, experimental conditions, molecular context, and
mapping
information for both neutral polymorphisms and clinical
mutations.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=OMIM">OMIM</a> - Online
Mendelian Inheritance in Man - allelic variants in ~900 (9%) of OMIM records.
To
view a list of those OMIM records, use the <a
href="/entrez/query.fcgi?CMD=Limits&DB=omim">OMIM Limits page in Entrez</a> and
check the box for "Allelic Variants" under the section titled "Only records
with."
(For more information about OMIM, see <a href="#HumanGenes">Genes</a>,
above.)</ul></td>
</tr>
</table>
<a NAME="MutationDatabases"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/Omim/allresources.html">Locus Specific
Mutation
Databases</a> - links to numerous external mutation databases are provided on
the
OMIM allied resources page and from related <a href="#OMIM">OMIM</a> entries.
When
an individual OMIM entry contains links to locus-specific mutation databases,
the
links are shown under the "LinkOut" header in the blue sidebar. (The LinkOut
header
appears only in entries that have links to resources outside of the Entrez
system.)</ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: DISORDERS======== -->
<a NAME="HumanDisorders"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Disorders</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="GenesAndDisease"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="/books/bv.fcgi?rid=gnd">Genes and Disease</a> - introduction to the
relationship between genetic factors and human disease. Summary information for
~60 genetic diseases with links to related databases and
organizations.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman">Mitelman Database of
Chromosome
Aberrations in Cancer</a> - genome-wide map of chromosomal breakpoints in human
cancer, by Drs. Mitelman, Mertens, and Johansson (eds),
http://cgap.nci.nih.gov/Chromosomes/Mitelman. This resource is associated with
the
<a href="/CGAP/">Cancer Genome Anatomy Project (CGAP)</a>. <!-- Original
version of
the aberration summary was published in a special issue of Nature Genetics, Vol.
15(Spec. No.):417-74 (April 1997). --></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=OMIM">OMIM</a> - Online
Mendelian Inheritance in Man - continuously updated catalog of human genes and
genetic disorders, with links to associated literature references, sequence
records,
maps, and related databases.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Omim/getmorbid.cgi">OMIM Morbid
Map</a> -
alphabetical listing of diseases and corresponding cytogenetic map locations,
with
links to OMIM entries. Accessible from the OMIM page (see Genes, above).</li></ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: CANCER RESEARCH======== -->
<a NAME="CancerResearch"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">Cancer Research</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="CancerChromosomes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="/entrez/query.fcgi?db=cancerchromosomes">Cancer
Chromosomes</a> - An Entrez database that integrates data from three sources:
the <a
href="/sky/">NCI/NCBI SKY/M-FISH & CGH Database</a>, the NCI <a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman/">Mitelman Database of
Chromosome
Aberrations in Cancer</a>, and the NCI <a
href="http://cgap.nci.nih.gov/Chromosomes/RecurrentAberrations/">Recurrent
Aberrations in Cancer</a>. Provides the ability to search for cytogenetic,
clinical, and/or reference information.</li></ul></td>
</tr>
</table>
<a NAME="CCAP"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://cgap.nci.nih.gov/Chromosomes/CCAP">CCAP</a>
- Cancer Chromosome Aberration Project - designed to expedite the definition and
detailed characterization of the distinct chromosomal alterations that are
associated with malignant transformation. Collaboration among the National
Cancer
Institute, the NCBI, and numerous research labs.</li></ul></td>
</tr>
</table>
<a NAME="CGAP"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/ncicgap/">CGAP</a> - Cancer Genome Anatomy
Project - interdisciplinary program to identify the human genes expressed in
different cancerous states, based on cDNA (EST) libraries, and to determine the
molecular profiles of normal, precancerous, and malignant cells. Collaboration
among the National Cancer Institute, the NCBI, and numerous research labs.
Additional information about CGAP is provided in the <a
href="#GeneExpressionTools">Tools/Gene Expression</a> section of this
file.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://cgap.nci.nih.gov/Chromosomes/Mitelman">Mitelman Database of
Chromosome
Aberrations in Cancer</a> - genome-wide map of chromosomal breakpoints in human
cancer, by Drs. Mitelman, Mertens, and Johansson (eds),
http://cgap.nci.nih.gov/Chromosomes/Mitelman. This resource is associated with
the
<a href="/CGAP/">Cancer Genome Anatomy Project (CGAP)</a>. <!-- Original
version of
the aberration summary was published in a special issue of Nature Genetics, Vol.
15(Spec. No.):417-74 (April 1997). --></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/sky/">SKY/M-FISH & CGH Database</a> - The NCI
and
NCBI SKY/M-FISH and CGH Database is a repository of publicly submitted data from
Spectral Karyotyping (SKY), Multiplex Fluorescence In Situ Hybridization
(M-FISH),
and Comparative Genomic Hybridization (CGH),
which are complementary fluorescent molecular cytogenetic techniques. SKY/M-FISH
permits the simultaneous visualization of each human
or mouse chromosome in a different color, facilitating the identification of
chromosomal aberrations; CGH can
be used to generate a map of DNA copy number changes in tumor genomes.
Collaborative
project with the National Cancer Institute. &nbsp;(<a
href="http://www.ncbi.nlm.nih.gov/sky/ccap_helper.cgi?tsc=0">data submission instructions...</a>)</ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/SAGE/">SAGE Analysis</a> - differential
expression of SAGE tags in cancer libraries. (See additional information about
SAGEmap, <a href="#SAGEmap">below</a>.) </ul></td>
</tr>
</table>
<!-- ========HUMAN GENOME SUB-CATEGORY: FTP======== -->
<a NAME="HumanGenomeFTP"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">FTP</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/">Human
chromosome data</a> - the ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/ directory
contains one folder for each chromosome, which includes genomic contigs (NT_*
records) built from finished and unfinished sequence data. The contigs are
available in various formats, described below. The <a
href="/genome/guide/build.html">contig assembly and annotation process</a> is
described in a separate document.
<table border="0" cellpadding="0">
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.asn</td>
<td CLASS="TEXT" width="79%" valign="top">ASN.1 format (description <a
href="#ASN1">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.fa.gz</td>
<td CLASS="TEXT" width="79%" valign="top">FASTA format (description <a
href="#FASTA">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.gbk.gz</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank flat file format<br>
(annotations currently include STS markers; known and
predicted genes will be added in coming months)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.gbs</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank summary format<br>
(this format does not contain sequence data, but instead
contains a "CONTIG" field, showing how the contig is assembled
from individual GenBank accessions)</td>
</tr>
</table>
Data from the Map Viewer (described <a
href="#HumanChromosomeMapViews">above</a>) are available in the <a
href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/maps/mapview/">ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/maps/mapview/</a> subdirectory.
</ul></td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: MOUSE======== -->
<a NAME="MouseGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Mouse Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#MouseGenomeGuide">Guide</a>, &nbsp;
<a href="#MouseGenomeChromosomes">Chromosomes</a>, &nbsp;
<a href="#MouseGenomeSequences">Sequences</a>, &nbsp;
<a href="#MouseGenomeGenes">Genes</a>, &nbsp;
<a href="#MouseGenomeClones">Clones</a>, &nbsp;
<a href="#MouseGenomeMaps">Maps and Mapped Markers</a>, &nbsp;
<a href="#MouseGenomeCytogenetics">Cytogenetics</a>, &nbsp;
<a href="#MouseGenomeBLAST">BLAST</a>, &nbsp;
<a href="#MouseGenomeFTP">FTP</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: GUIDE======== -->
<a NAME="MouseGenomeGuide"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Guide</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/guide/mouse/">Mouse Genome Resources
Guide</a> - brings together information on diverse mouse-related resources from
multiple centers: sequence, mapping, and clone information as well as pointers
to
strain and mutant resources.</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: CHROMOSOMES======== -->
<a NAME="MouseGenomeChromosomes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Chromosomes</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="MouseGenomeSequencing"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/seq/MmHome.html">Mouse Genome
Sequencing</a> - sequencing progress of the Mouse Genome Project;
high-throughput
genomic sequence contigs assembled from finished (phase 3) data; view by
chromosome
number, size and physical position; download sequence data by contig or by
chromosome; BLAST against contigs.</li></ul></td>
</tr>
</table>
<a NAME="MapViewerMouse"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/mapview/map_search.cgi?taxid=10090">Map
Viewer</a> - <b>integrated chromosome maps</b> - The Map Viewer is a software
component of Entrez Genome that displays one or more maps which have been
aligned
to each other based on shared marker and gene names,
and, for the sequence maps, based on a common sequence coordinate system.
The maps that are currently available for <i>Mus musculus</i> are described in
the
<a href="/mapview/static/mousesearch.html"><i>Mus musculus</i> data and search
tips</a> document. The <a href="/mapview/static/MapViewerHelp.html">Map Viewer
help
document</a> provides general information on how to use that
tool.</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: SEQUENCES======== -->
<a NAME="MouseGenomeSequences"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Sequences</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/seq/MmHome.html">Mouse Genome
Sequencing</a> - sequencing progress of the Mouse Genome Project;
high-throughput
genomic sequence contigs assembled from finished (phase 3) data; view by
chromosome
number, size and physical position; download sequence data by contig or by
chromosome; BLAST against contigs.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Entrez/">Entrez</a> - includes sequence data
from
>160,000 species; use the organism field to limit searches to mouse records.
See additional information about Entrez <a href="#Entrez">above</a>, and Batch
Entrez, <a href="#BatchEntrez">below</a>.</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: GENES======== -->
<a NAME="MouseGenomeGenes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Genes</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> -
a gene-based view of the data from a wide range of genomes, including mouse. It
supplies key connections in the nexus of map, sequence, expression, structure,
functional, and homology data. <a href="#EntrezGene">More information about
Entrez Gene</a> is provided above, in the Molecular Databases/Genes section.
</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/UniGene/Mm.Home.html">UniGene</a> - ESTs and
full-length mRNA sequences organized into clusters that each represent a unique
known or putative gene within the organism from which the sequences were
obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/HomoloGene/">HomoloGene</a> - a gene homology
tool that compares nucleotide sequences between pairs of organisms, including
human,
mouse, rat, zebrafish, and fruit fly, in order to identify putative orthologs.
Curated orthologs are incorporated from a variety of sources via
Entrez Gene.</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: CLONES======== -->
<a NAME="MouseGenomeClones"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Clones</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/clone/">CloneRegistry</a> - a database
used by participating <a href="/genome/seq/HsCenters.html">human genome
sequencing
centers</a> and <a href="/genome/seq/MmSeqCenters.html">mouse genome sequencing
centers</a> to record which clones have been selected for sequencing, which are
currently in the pipeline, and which are finished and represented by sequence
entries in GenBank. Includes BACs, PACs, cosmids, fosmids. Uses <a
href="/genome/clone/nomenclature.shtml">standardized clone names</a> that
represent
a clone's microtitre plate address (plate number, row, and column) prefixed by a
library abbreviation, to produce unique names. Includes <a
href="/genome/clone/ordering.html">clone ordering
information</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="http://mgc.nci.nih.gov/">Mammalian Gene
Collection
(MGC)</a> - The NIH Mammalian Gene Collection (MGC) is a trans-NIH initiative
that
seeks to identify and sequence a representative full open reading frame (FL-ORF)
clone for each human, mouse, and rat gene. Additional information is provided
<a
href="#MGC">above</a>.</li></ul></td>
</tr>
</table>
<a NAME="MouseGenomeCloneFinder"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="/genome/clone/clonefinder/CloneFinder.html">CloneFinder</a> - allows users
to
identify clones that contain an object, or that are contained within a
particular
genomic region. Clone placement is based on the alignment of BAC end sequences
(BES)
to the current genome assembly. Currently, CloneFinder is available for mouse
only.</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: MAPS======== -->
<a NAME="MouseGenomeMaps"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Maps and Mapped Markers</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="MapViewerMouse"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/mapview/map_search.cgi?taxid=10090">Map
Viewer</a> - <b>integrated chromosome maps</b> - The Map Viewer is a software
component of Entrez Genome that displays one or more maps which have been
aligned
to each other based on shared marker and gene names,
and, for the sequence maps, based on a common sequence coordinate system.
The maps that are currently available for <i>Mus musculus</i> are described in
the
<a href="/mapview/static/mousesearch.html"><i>Mus musculus</i> data and search
tips</a> document. The <a href="/mapview/static/MapViewerHelp.html">Map Viewer
help
document</a> provides general information on how to use that
tool.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/Homology/">Human-Mouse Homology Maps</a> - a
table comparing genes in homologous segments of DNA from human and mouse, sorted
by
position in each genome. Computed by integrating orthologs identified at The
Jackson Laboratory with putative orthologs identified by sequence homology.
The <a href="/Homology/Davis/">original maps</a> by M. F. Seldin
of the University of California at Davis are also available.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/unists/">UniSTS</a> - a unified,
non-redundant view of sequence tagged sites (STSs). UniSTS integrates marker
and
mapping data from a variety of public resources. If two or more markers have
different names but the same primer pair, a single STS record is presented for
the
primer pair and all the marker names are shown. Each UniSTS record displays the
primer sequences, product size, mapping information, and cross references to
Entrez Gene, dbSNP, RHdb, GDB, MGD, and the Map Viewer. The marker report also
lists
GenBank and RefSeq records that contain the primer sequences, as determined by
<a
href="#ePCR">Electronic PCR (e-PCR)</a>. Data sources include dbSTS, RHdb, GDB,
various human maps (Genethon genetic map, Marshfield genetic map, Whitehead RH
map,
Whitehead YAC map, Stanford RH map, NHGRI chr 7 physical map, WashU chrX
physical
map), and various mouse maps (Whitehead RH map, Whitehead YAC map, Jackson
Laboratory's
MGD map).</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: CYTOGENETICS======== -->
<a NAME="MouseGenomeCytogenetics"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">Cytogenetics</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/sky/">SKY/M-FISH &amp; CGH Database</a> - The NCI
and
NCBI SKY/M-FISH and CGH Database is a repository of publicly submitted data from
Spectral Karyotyping (SKY), Multiplex Fluorescence In Situ Hybridization
(M-FISH),
and Comparative Genomic Hybridization (CGH),
which are complementary fluorescent molecular cytogenetic techniques.
SKY/M-FISH permits the simultaneous visualization of each human
or mouse chromosome in a different color, facilitating the identification of
chromosomal aberrations; CGH can
be used to generate a map of DNA copy number changes in tumor genomes.
Collaborative
project with the National Cancer Institute. &nbsp;(<a
href="http://www.ncbi.nlm.nih.gov/sky/ccap_helper.cgi?tsc=0">data
submission instructions...</a>)</li></ul></td>
</tr>
</table>
<!-- ========MOUSE GENOME SUB-CATEGORY: BLAST======== -->
<a NAME="MouseGenomeBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3" WIDTH="95%">BLAST</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/seq/MmBlast.html">BLAST against the
mouse
genome</a> - &nbsp; Nucleotide or protein query sequences can be used. A <a
href="/genome/seq/Database.html">variety of database choices</a> are provided.
</li></ul></td>
</tr>
<!-- tr>
<td CLASS="TEXT"><ul><li><a href="/BLAST/">BLAST against mouse ESTs</a> -
compare a
nucleotide or protein sequence against the human ESTs by choosing
<b>est_mouse</b>
as the database to be searched when using Nucleotide BLAST or Translated
BLAST</ul></td>
</tr -->
</table>
<p></p>
<!-- ========MOUSE GENOME SUB-CATEGORY: FTP======== -->
<a NAME="MouseGenomeFTP"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">FTP</td>
<td WIDTH="5%" BGCOLOR="#FFFFFF" VALIGN="top">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nih.gov/genomes/M_musculus/">Mouse
chromosome data</a> - the ftp://ftp.ncbi.nih.gov/genomes/M_musculus/ directory
contains one folder for each chromosome, which includes genomic contigs (NT_*
records) built from finished sequence data. The contigs are available in
various
formats: <!-- The <a href="/genome/guide/build.html">contig assembly and
annotation
process</a> is described in a separate document. -->
<table border="0" cellpadding="0">
<tr>
<td CLASS="TEXT" width="21%" valign="top">mm_chr*.asn</td>
<td CLASS="TEXT" width="79%" valign="top">ASN.1 format (description <a
href="#ASN1">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">mm_chr*.fa.gz</td>
<td CLASS="TEXT" width="79%" valign="top">FASTA format (description <a
href="#FASTA">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">mm_chr*.gbk.gz</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank flat file format<br>
(annotations currently include STS markers; known and
predicted genes will be added in coming months)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">mm_chr*.gbs</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank summary format<br>
(this format does not contain sequence data, but instead
contains a "CONTIG" field, showing how the contig is assembled
from individual GenBank accessions)</td>
</tr>
</table>
See additional information about the genomes FTP directories, <a
href="#FTP_OtherGenomes">below</a>.</li>
</ul></td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: RAT======== -->
<a NAME="RatGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Rat Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="RatGenomeGuide"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genome/guide/rat/">Rat Genome Resources Guide</a> -
brings together information on diverse rat-related resources from multiple
centers:
sequence, mapping, and clone information as well as pointers to strain and
mutant
resources.</td>
</tr>
</table>
<a NAME="MapViewerRat"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=10116">Map Viewer</a> -
<b>integrated chromosome maps</b> - The Map Viewer is a software component of
Entrez
Genomes that displays one or more maps which have been aligned to each other
based
on shared marker and gene names,
and, for the sequence maps, based on a common sequence coordinate system.
The maps that are currently available for rat are described in the <a
href="/mapview/static/ratsearch.html"><i>Rattus norvegicus</i> data and search
tips</a> document. The <a href="/mapview/static/MapViewerHelp.html">Map Viewer
help
document</a> provides general information on how to use that tool.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - a
gene-based view of the data from a wide range of genomes, including rat. It
supplies key connections in the nexus of map, sequence, expression, structure,
functional, and homology data. <a href="#EntrezGene">More information about
Entrez Gene</a> is provided above, in the Molecular Databases/Genes section.
</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/genome/seq/RnBlast.html">BLAST against the rat
genome</a>
- &nbsp; Nucleotide or protein query sequences can be used. A <a
href="/genome/seq/Database.html">variety of database choices</a> are provided.
</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/UniGene/Rn.Home.html">UniGene</a> - ESTs and
full-length
mRNA sequences organized into clusters that each represent a unique known or
putative gene within the organism from which the sequences were obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms, including human,
mouse,
rat, zebrafish, and fruit fly, in order to identify putative orthologs. Curated
orthologs are incorporated from a variety of sources via Entrez Gene.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: COW======== -->
<a NAME="CowGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Cow Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/Bt.Home.html">UniGene</a> - ESTs and
full-length
mRNA sequences organized into clusters that each represent a unique known or
putative gene within the organism from which the sequences were obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: ZEBRAFISH======== -->
<a NAME="ZebrafishGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Zebrafish Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="ZebrafishGenomeGuide"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genome/guide/zebrafish/">Zebrafish Genome Resources
Guide</a> - brings together information on diverse zebrafish-related resources
from
multiple centers: sequence, mapping, and clone information as well as pointers
to
strain and mutant resources.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - a
gene-based view of the data from a wide range of genomes, including zebrafish.
It supplies key connections in the nexus of map, sequence, expression,
structure, functional, and homology data. <a href="#EntrezGene">More
information about Entrez Gene</a> is provided above, in the Molecular
Databases/Genes section.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=7955">Map Viewer</a> -
<b>integrated chromosome maps</b> - The Map Viewer is a software component of
Entrez
Genomes that displays one or more maps which have been aligned to each other
based
on shared marker and gene names,
and, for the sequence maps, based on a common sequence coordinate system.
The maps that are currently available for <i>Danio rerio</i> are described in
the <a
href="/mapview/static/daniosearch.html"><i>Danio rerio</i> genome data and
search
tips</a> document. The <a href="/mapview/static/MapViewerHelp.html">Map Viewer
help
document</a> provides general information on how to use that tool.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/UniGene/Dr.Home.html">UniGene</a> - ESTs and
full-length
mRNA sequences organized into clusters that each represent a unique known or
putative gene within the organism from which the sequences were obtained.
Additional information about UniGene is provided <a
href="#UniGene">above</a><!--
(in both the "<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes and Maps</a>/<a href="#MultipleOrganisms">Organism
Collections</a>" sections) -->.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms, including human,
mouse,
rat, zebrafish, and fruit fly, in order to identify putative orthologs. Curated
orthologs are incorporated from a variety of sources via Entrez Gene.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: DROSOPHILA======== -->
<a NAME="DrosophilaGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3"><i>Drosophila</i>
Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=7227"><i>Drosophila
melanogaster</i> Home Page</a> - provides an overview of available resources for
that organism, graphically displays all the chromosomes (to scale), and allows
you to
search both cytogenetic and sequence data across the whole genome through the
Entrez
Genomes browser. Entrez Genome presents a unified graphical
view of maps (genetic and physical) and sequence data for an organism. After
you
search for a term such as a gene symbol, it presents a graphic Genome View of
search
results, from which you can zoom into progressively more detailed Map Views of
the
region of interest, and link to sequence data and associated resources that
contain
additional detail.</td>
</tr>
</table>
<a NAME="MapViewerDrosophila"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=7227">Map Viewer</a> -
<b>integrated chromosome maps</b> - The Map Viewer is a software component of
Entrez
Genomes that displays one or more maps which have been aligned to each other
based
on shared marker and gene names,
and, for the sequence maps, based on a common sequence coordinate system.
The sequence and cytogenetic maps that are currently available for
<i>Drosophila</i>
are described in the <a
href="/mapview/static/drosophilasearch.html"><i>Drosophila
melanogaster</i> genome data and search tips</a> document. The <a
href="/mapview/static/MapViewerHelp.html">Map Viewer help document</a> provides
general information on how to use that tool.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - a
gene-based view of the data from a wide range of genomes, including
<i>Drosophila</i>. It
supplies key connections in the nexus of map, sequence, expression, structure,
functional, and homology data. <a href="#EntrezGene">More information about
Entrez Gene</a> is provided above, in the Molecular Databases/Genes section.
</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/HomoloGene/">HomoloGene</a> - a gene homology tool
that
compares nucleotide sequences between pairs of organisms, including human,
mouse,
rat, zebrafish, and fruit fly, in order to identify putative orthologs. Curated
orthologs are incorporated from a variety of sources via Entrez Gene.</td>
</tr>
</table>
<a NAME="DrosophilaGenomeBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/">BLAST against <i>Drosophila melanogaster</i>
genome sequence</a>
<ul>
<li>select <b><i>Drosophila</i> genome</b> as the target database when using the
nucleotide BLAST, protein BLAST, or translated BLAST search pages. &nbsp;Or,
<li>check the box for <i>Drosophila melanogaster</i> in the list of organisms on
the
<a href="/cgi-bin/Entrez/genom_table_cgi?organism=euk">BLAST with Eukaryotic
genomes</a> page.
</ul>
</td>
</tr>
<tr>
<td CLASS="TEXT"><a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/D_melanogaster/">FTP Site</a> - see
additional information about the genomes FTP directories, <a
href="#FTP_OtherGenomes">below</a></td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: NEMATODE======== -->
<a NAME="NematodeGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Nematode Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=6239"><i>Caenorhabditis
elegans</i> Home Page</a> - Graphical representation of chromosomes that can be
viewed in their entirety or explored in progressively greater detail in the
<b>Map
Viewer</b> (described <a href="#MapViewer">above</a>). Home page also includes
links to many related resources, such as sequencing centers, other nematode
sequencing projects, related databases, etc.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genomes/C_elegans/">FTP
Site</a> -
the chromosome data sets are available for ftp in a variety of formats,
including
GenBank, FASTA, ASN.1, and others, in the <a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/C_elegans/">genbank/genomes/C_elegans/</a>
directory of the NCBI FTP site (ftp://ftp.ncbi.nih.gov/). An NCBI-curated
version of the data is available in the <a
href="ftp://ftp.ncbi.nih.gov/genomes/C_elegans/">genomes/C_elegans/</a>
directory.
&nbsp;(See additional note in the FTP section, <a
href="#FTP_OtherGenomes">below</a>, about the two different FTP directories.)</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: PLANTS======== -->
<a NAME="PlantGenomes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Plant Genomes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/PLANTS/PlantList.html">Plant Genomes
Central</a>
- provides access to data from large-scale sequencing projects, genetic maps,
and
large-scale EST sequencing projects. All organism names on the page are linked
to
the corresponding taxonomic information in NCBI's <b>Taxonomy database</b>
(described <a href="#Taxonomy">above</a>). In addition, organisms listed under
"large-scale sequencing projects" and "genetic maps" are represented in the
<b>Map
Viewer</b> (described <a href="#MapViewer">above</a>). Organisms listed under
"large-scale EST sequencing projects" are linked to their EST sequences in
<b>Entrez</b> (described <a href="#Entrez">above</a>).</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/UniGene/">UniGene</a> - ESTs and full-length mRNA
sequences organized into clusters that each represent a unique known or putative
gene within the organism from which the sequences were obtained. Additional
information about UniGene is provided <a href="#UniGene">above</a><!-- (in both
the
"<a href="#Nucleotides">Nucleotide Sequences</a>" and "<a
href="#Genomes">Genomes
and Maps</a>/<a href="#MultipleOrganisms">Organism Collections</a>" sections)
-->.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: YEAST======== -->
<a NAME="YeastGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Yeast Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/map_search.cgi?taxid=4932"><i>Saccharomyces
cerevisiae</i> Home Page</a> - baker's yeast - graphical representation of
chromosomes that can be viewed in their entirety or explored in progressively
greater detail in <b>Entrez Genome</b> (described <a
href="#EntrezGenome">above</a>), with links to associated sequence data. Home
page
also includes links to many related resources, such as sequencing centers, other
fungi sequencing projects, related databases, etc.</td>
</tr>
<tr>
<td CLASS="TEXT"><a
href="/mapview/map_search.cgi?taxid=4896"><i>Schizosaccharomyces
pombe</i> Home Page</a> - fission yeast - similar to the home page for
<i>Saccharomyces cerevisiae</i>, described above.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/COG/">COGs - Clusters of Orthologous Groups</a> -
natural
system of gene families from complete genomes. Clusters of Orthologous Groups
(COGs) were delineated by comparing protein sequences encoded in complete
unicellular genomes representing 30 major phylogenetic lineages. Each COG
consists
of individual proteins or groups of paralogs from at least 3 lineages and thus
corresponds to an ancient conserved domain. The <a href="/COG/old/">Initial
Version</a> of COGs includes 44 organisms. The <a href="/COG/new/">Updated
Version</a> of COGs includes 66 organisms in the <a
href="/COG/new/release/phylox.cgi">Unicellular Clusters</a>, plus <a
href="/COG/new/shokog.cgi">Eukaryotic Clusters</a> (called KOGs). More
organisms
will be added in the future.</td>
</tr>
</table>
<a NAME="YeastGenomeBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/">BLAST against the <i>Saccharomyces
cerevisiae</i>
or <i>Schizosaccharomyces pombe</i> genome sequences</a>
<ul>
<li>check the box for <i>Saccharomyces cerevisiae</i> and/or
<i>Schizosaccharomyces
pombe</i> in the list of organisms on the <a
href="/cgi-bin/Entrez/genom_table_cgi?organism=euk">BLAST with Eukaryotic
genomes</a> page. &nbsp;&nbsp;OR
<li>select <b>yeast</b> as the target database when using the nucleotide BLAST,
protein BLAST, or translated BLAST search pages; this searches only
<i>Saccharomyces
cerevisiae</i> data, however.
</ul></td>
</tr>
<tr>
<td CLASS="TEXT"><a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/S_cerevisiae/">FTP
<i>Saccharomyces cerevisiae</i> Chromosomes</a></td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: MALARIA======== -->
<a NAME="MalariaGenome"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Malaria Genome</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/projects/Malaria/">Malaria Genetics &amp; Genomics</a> -
provides data and information relevant to malaria genetics and genomics.
Resources
include organism-specific sequence BLAST databases (<i>Plasmodium falciparum</i>
only, all <i>Plasmodium</i>, and all <i>Toxoplasma</i>), genome maps, linkage
markers, and information about genetic studies. Links are provided for other
malaria
web sites and genetic data on related apicomplexan parasites, including
<i>Toxoplasma
gondii</i>.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/mapview/">Map Viewer</a> - The Map Viewer (described <a
href="#MapViewer">above</a>) provides graphical views and search capabilities
for
both <a href="/mapview/map_search.cgi?taxid=5833"><i>Plasmodium
falciparum</i></a>
and <a href="/mapview/map_search.cgi?taxid=7165"><i>Anopheles gambiae</i>
(malaria
mosquito)</a>.</td>
</tr>
</table>
<a NAME="MalariaGenomeBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/projects/Malaria/blastindex.html">BLAST against
Malaria
sequences</a>
<ul>
<li>The malaria genome sequence (both <i>Plasmodium yoelii</i> shotgun
sequences and <i>Plasmodium falciparum</i> complete finished sequence)
as submitted to GenBank is available at the <a
href="/BLAST/Genome/plasmodium.html">Entrez
Genome Malaria BLAST</a> page.</li>
<li>Sequence data that have not yet been released or submitted
to GenBank are available from the <a
href="/projects/Malaria/plasmodiumblcus.html">NCBI
Malaria Genetics &amp; Genomics Custom BLAST</a> page.</li>
<!-- old wording: li><a href="/projects/Malaria/plasmodiumblcus.html">Custom
BLAST</a> provides access to unfinished sequences "pulled" from the respective
P.
falciparum 3D7 Genome Sequencing Center's FTP sites: Sanger Centre, Stanford,
and
TIGR or the Gene Sequencing Tag Project at the Univ of Florida. </li -->
</ul>
</td>
</tr>
<tr>
<td CLASS="TEXT"><a
href="ftp://ftp.ncbi.nih.gov/genomes/Plasmodium_falciparum/">FTP</a>
<ul>
<li><a href="ftp://ftp.ncbi.nih.gov/pub/Malaria/">Electronic PCR (e-PCR)
program
for finding STSs in DNA sequences, Malaria Packaging Version</a> - Standalone
Program. (See <a
href="#ePCR">additional information</a> in the Tools/Nucleotide Sequence Analysis
Section.)</li>
<li><a href="ftp://ftp.ncbi.nih.gov/genomes/Plasmodium_falciparum/">download
completed chromosome sequence data</a></li>
<!-- old: li><a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/P_falciparum/">download sequence
data
from completed chromosomes</a> (currently chromosomes 2 and 3) in a variety of
formats, including GenBank flat file (*.gbk), GenBank summary file (*.gbs),
FASTA
Nucleic Acid file (*.fna), FASTA Amino Acid file (*.faa), Protein Table (*.ptt),
and
others.</li -->
</ul>
</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: MICROBIAL GENOMES======== -->
<a NAME="MicrobialGenomes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Microbial Genomes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Genome">Entrez Genome</a> -
Graphical representation of complete bacterial genomes that can be viewed in
their
entirety or explored in progressively greater detail; links to associated
sequence
data. A "ProtTable" of protein coding genes is provided for each bacterium.
There
are also links to a "TaxTable," showing the distribution of BLAST protein
homologs
by taxa (sequences grouped by superkingdom), and to a distribution of BLAST
protein
homologs by 3-D structure (sequences with known structure). Additional
information
about Entrez Genome is also provided <a href="#EntrezGenome">above</a>.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?DB=genomeprj">Entrez Genome
Project</a> -
provides an umbrella view of the status of a wide range of genome projects,
and includes information about <a href="/genomes/MICROBES/microbial_taxtree.html">microbial
genome sequencing projects</a>. Tabs allow you to switch between lists of
completed
and in-progress microbial genome projects. The list of completed genomes
includes
links to NCBI graphical views of the data (in Entrez Genome), sequencing
centers, and
the results of various analyses that have been done on the genomes at NCBI
(e.g., TaxTable, COG Table,
3-D Neighbors, and more). The list of in-progress sequencing projects includes
links
to sequencing centers and, when available, to BLASTable data. A
<a href="#EntrezGenomeProject">more detailed description of the Entrez Genome
Project
database</a> is provided in the section on <a href="#Genomes">Genomes and
Maps/Organism Collections</a>.
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/COG/">COGs - Clusters of Orthologous Groups</a> -
natural
system of gene families from complete genomes. Clusters of Orthologous Groups
(COGs) were delineated by comparing protein sequences encoded in complete
unicellular genomes representing 30 major phylogenetic lineages. Each COG
consists
of individual proteins or groups of paralogs from at least 3 lineages and thus
corresponds to an ancient conserved domain. The <a href="/COG/old/">Initial
Version</a> of COGs includes 44 organisms. The <a href="/COG/new/">Updated
Version</a> of COGs includes 66 organisms in the <a
href="/COG/new/release/phylox.cgi">Unicellular Clusters</a>, plus <a
href="/COG/new/shokog.cgi">Eukaryotic Clusters</a> (called KOGs). More
organisms
will be added in the future.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/sutils/genom_table.cgi">BLAST against Microbial
Genomes</a> - sequences from selected completed and unfinished eukaryotic and
prokaryotic genomes; partial genomic sequences have been graciously provided by
the
sequencing centers or extracted from GenBank. NCBI encourages sequencing
centers to
submit partially sequenced genomes to be included in this BLAST page. Data can
be
submitted via ftp, after contacting genomes@ncbi.nlm.nih.gov to set up an
account.</td>
</tr>
<tr>
<td CLASS="TEXT">FTP - download complete bacterial genomes in a variety of
formats,
including GenBank flat file (*.gbk), GenBank summary file (*.gbs), FASTA Nucleic
Acid file (*.fna), FASTA Amino Acid file (*.faa), Protein Table (*.ptt), and
others.
<ul>
<li><a href="ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/">complete
bacterial
genomes as submitted to <b>GenBank</b></a>.
<li><a href="ftp://ftp.ncbi.nih.gov/genomes/Bacteria/">complete bacterial
genomes
from the <b>RefSeq</b> database</a> (described <a href="#RefSeq">above</a>).
</ul>
<blockquote>(See additional note in the FTP section, <a
href="#FTP_OtherGenomes">below</a>, about the two different FTP directories)
</blockquote>
</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: VIRUSES======== -->
<a NAME="ViralGenomes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Viral Genomes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="ViralReferenceSequences"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/VIRUSES/viruses.html">Viruses Home Page</a>
provides brief background information on the biology of viruses, links to
viral genome sequences in Entrez Genome (described below), and a wide range of
related resources. It also includes information about <a
href="/genomes/VIRUSES/viroabout.html">Viral Reference
Sequences</a>, a collection of reference sequences for more than 1000 viral
genomes.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/static/vis.html">Entrez Genome</a> -
Graphical
representation of complete viral genomes that can be viewed in their entirety or
explored in progressively greater detail; links to associated sequence data. A
summary of Coding Regions (described <a href="#ProtTaxTable">above</a>) is
provided
for each virus. Additional information about Entrez Genome is also provided <a
href="#EntrezGenome">above</a>.</td>
</tr>
</table>
<a NAME="FluVirus"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/FLU/FLU.html">Influenza Virus Resource</a> -
A collection of
resources specifically designed to support research on the influenza virus.
Includes
links to genome sequence data, analytical tools, epidemiological information,
and the
<a href="http://www.nih.gov/news/pr/nov2004/niaid-15.htm">Influenza Genome
Sequencing Project</a>, funded by the National Institute of Allergy and
Infectious Diseases (<a href="http://www.niaid.nih.gov/">NIAID</a>).
</td>
</tr>
</table>
<a NAME="Retroviruses"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/retroviruses/">Retrovirus Resources</a> - A
collection of
resources specifically designed to support research on retroviruses.
Resources
include a genotyping tool that uses the BLAST algorithm to identify the genotype
of
a query sequence; an alignment tool for global alignment of multiple sequences;
an
HIV-1 automatic sequence annotation tool; and annotated maps of 16 retroviruses
viewable in GenBank, FASTA, and graphic formats, with links to associated
sequence
records.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/RefSeq/HIVInteractions/">HIV Interactions</a> - The
HIV-1, Human Protein Interaction Database contains information about known
interactions of HIV-1 proteins with proteins from human hosts. It provides
annotated bibliographies of published reports of protein interactions, with links
to
the corresponding PubMed records and sequence data. <a
href="#HIVInteractions">More
information</a> about this database is provided under "Literature Databases".
</td>
</tr>
</table>
<a NAME="PASC"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/pasc/viridty.cgi?textpage=overview">PASC (PAirwise Sequence Comparison)</a> - a web tool for analyzing the distribution of pairwise sequence identities within viral families. Identities are pre-computed for every pair of genomes within each family, and the distribution is plotted as a histogram in which each bar corresponds to an interval of identities. Only complete genomes should be used as query sequences; results from partial sequences are not suitable for the purpose of this tool. After you submit a sequence, PASC computes pairwise identities between your genome and the existing genome sequences of the family, and then presents a list of the 15 closest matches within the family. The <a href="/sutils/pasc/viridty.cgi?textpage=documentation">documentation</a> provides more details about using PASC.</td>
</tr>
</table>
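<p CLASS="TEXT">The kind of histogram described for PASC can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and counts simple position-by-position matches; the alignment method PASC actually uses is not described here, and the function names and the 10% bin width are hypothetical choices made only for illustration.</p>

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent of positions at which two pre-aligned, equal-length sequences match."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def identity_histogram(seqs, bin_width=10):
    """Count sequence pairs per identity interval (0-10%, 10-20%, ...).

    Illustrative assumption: bin_width divides 100 evenly, and a pair with
    100% identity is placed in the last bin.
    """
    if 100 % bin_width != 0:
        raise ValueError("bin_width must divide 100 evenly")
    n_bins = 100 // bin_width
    bins = [0] * n_bins
    for a, b in combinations(seqs, 2):
        pid = percent_identity(a, b)
        index = min(int(pid // bin_width), n_bins - 1)
        bins[index] += 1
    return bins


if __name__ == "__main__":
    # Toy sequences, not real viral genomes.
    toy_family = ["AAAA", "AAAT", "TTTT"]
    for i, count in enumerate(identity_histogram(toy_family)):
        print(f"{i * 10:3d}-{(i + 1) * 10:3d}%: {'#' * count}")
```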
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: VIROIDS======== -->
<a NAME="ViroidGenomes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Viroid Genomes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/static/vid.html">Entrez Genome</a> -
Graphical
representation of complete viroid genomes with links to corresponding sequence records. Additional information about Entrez Genome is also provided <a
href="#EntrezGenome">above</a>.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: PLASMIDS======== -->
<a NAME="Plasmids"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Plasmids</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/static/o.html">Entrez Genome</a> - Graphical
representation of complete plasmids that can be viewed in their entirety or
explored
in progressively greater detail; links to associated sequence data. A summary
of
Coding Regions (described <a href="#ProtTaxTable">above</a>) is provided for
each
plasmid. Additional information about Entrez Genome is also provided <a
href="#EntrezGenome">above</a>.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN GENOMES_AND_MAPS: EUKARYOTIC_ORGANELLES======== -->
<a NAME="EukaryoticOrganelles"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Eukaryotic
Organelles</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/genomes/ORGANELLES/organelles.html">Eukaryotic
Organelles
Home Page</a> - Provides an overview of eukaryotic organelles; a description of
the Organelle Reference Sequences
project (part of RefSeq, see <a href="#RefSeq">above</a>); and links to (a)
lists of
completely sequenced organelles shown in taxonomic hierarchy and alphabetically
by
organism, (b) gene and RNA order in metazoan mitochondria, and (c) related web
sites.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/genomes/static/euk_o.html">Entrez Genome</a> -
Graphical
representation of complete eukaryotic organelles that can be viewed in their
entirety or explored in progressively greater detail; links to associated
sequence
data. A summary of Coding Regions (described <a href="#ProtTaxTable">above</a>)
is
provided for each organelle. Additional information about Entrez Genome is also
provided <a href="#EntrezGenome">above</a>.</td>
</tr>
</table>
<p></p>
<!-- =========================END_GENOMES_AND_MAPS==================== -->
<!-- ===========================TOOLS============================== -->
<a NAME="Tools"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" CLASS="H3a">Tools</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="../Tools/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<table BORDER="0" WIDTH="98%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#Entrez">Text Term Searching (Entrez)</a>, &nbsp;
<a href="#BLAST">Sequence Similarity Searching (BLAST)</a>, &nbsp;
<a href="#NucleotideSequenceAnalysis">Nucleotide Sequence Analysis</a>, &nbsp;
<a href="#ProteinSequenceAnalysis">Protein Sequence Analysis and Proteomics</a>,
&nbsp;
<a href="#StructureTools">3-D Structure Display and Similarity Searching</a>,
&nbsp;
<a href="#GenomeAnalysisTools">Genome Analysis</a>, &nbsp;
<a href="#GeneExpressionTools">Gene Expression</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<br>
<!-- ========CATEGORY WITHIN TOOLS: TEXT SEARCHING======== -->
<a NAME="Entrez"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Data Retrieval - Text Term
Searching</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Entrez/">Entrez</a> - provides integrated access to
nucleotide and protein sequence data from more than 160,000 organisms, along with 3D
protein
structures, genomic mapping information, PubMed MEDLINE, and more. Sequence data
are
combined from various sources, including GenBank, EMBL, DDBJ, RefSeq,
PIR-International, PRF, Swiss-Prot, and PDB. A <a
href="/Database/datamodel/index.html"><b>Data Model</b></a> provides a schematic
illustration of the connections between the many data types in Entrez.
<ul>
<li><b>Two unique features</b> of Entrez are:
<ol>
<li><b>pre-computed similarity searches</b> for each database record,
identifying the <b>related records ("neighbors")</b> within that database. The
algorithm used to identify related records depends upon the database.</li>
<li><b>links</b> from a record in one database to associated records in the
other Entrez databases, providing <b>integrated access across the various
databases</b>. For example, if a MEDLINE record cites a GenBank nucleotide
sequence
record, which in turn is linked to a protein translation, there will be links
among those three records. The <a
href="/Database/datamodel/index.html"><b>Entrez
Data Model</b></a> illustrates the links that exist among the various Entrez
Databases.</li>
</ol>
</li>
<li><b>Entrez can be searched</b> with a wide variety of text terms such as
author
name, journal name, gene or protein name, organism, unique identifier (e.g.,
accession number, sequence ID, PubMed ID), and other terms, depending on the
database being searched.</li>
<li>The <a href="/entrez/query/static/help/helpdoc.html"><b>help
document</b></a>
provides more information about the databases available in Entrez as well as
search
tips. External resources can be linked to Entrez records using the new
<b>LinkOut</b> service (described <a href="#LinkOut">below</a>). Entrez also
allows
users to store search strategies and select a customized subset of LinkOut links
through the <b>My NCBI</b> service (described <a
href="#MyNCBI">below</a>).</li>
</ul>
</td>
</tr>
</table>
<a NAME="AdvancedEntrez"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><b><a href="/entrez/query/static/advancedentrez.html">Advanced
Entrez Tools</a></b> - A number of additional Entrez tools provide advanced
searching capabilities (including Batch Entrez, described below); the ability to
set
preferences, store search strategies, and select resources for LinkOut
(described <a
href="#LinkOut">above</a>) display; and the ability to access Entrez data
through
programming tools (including Entrez Utilities, described below).</td>
</tr>
</table -->
<a NAME="BatchEntrez"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><b>Batch Entrez</b> - allows you to retrieve a large number of
<a
href="/entrez/batchentrez.cgi?db=Nucleotide">nucleotide sequences</a> or <a
href="/entrez/batchentrez.cgi?db=Protein">protein sequences</a> from Entrez in
batch mode by importing a file containing a list of the desired <a
href="samplerecord.html#GInB">GI</a> or <a
href="samplerecord.html#AccessionB">accession numbers</a>. Search results are
saved
directly to a local disk file on your computer.</td>
</tr>
</table>
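<p CLASS="TEXT">Preparing the identifier file for a batch retrieval can be sketched as follows. This is only an illustration: it assumes one identifier per line, treats all-digit entries as GI numbers and everything else as accession numbers, and uses hypothetical function names and made-up identifiers. Check the Batch Entrez page for the file format it actually expects.</p>

```python
def clean_id_list(lines):
    """Strip whitespace, skip blank lines and '#' comments, and drop duplicates.

    The order of first appearance is preserved.
    """
    seen = set()
    ids = []
    for raw in lines:
        ident = raw.strip()
        if not ident or ident.startswith("#"):
            continue
        if ident not in seen:
            seen.add(ident)
            ids.append(ident)
    return ids


def split_gi_and_accessions(ids):
    """Separate all-digit identifiers (assumed GIs) from the rest (assumed accessions)."""
    gis = [i for i in ids if i.isdigit()]
    accessions = [i for i in ids if not i.isdigit()]
    return gis, accessions


def write_id_file(path, ids):
    """Write one identifier per line to a plain-text file."""
    with open(path, "w", encoding="ascii") as handle:
        for ident in ids:
            handle.write(ident + "\n")


if __name__ == "__main__":
    # Made-up identifiers for illustration only.
    raw_lines = ["12345\n", "  ABC00001 \n", "\n", "# a comment\n", "12345\n"]
    ids = clean_id_list(raw_lines)
    print(split_gi_and_accessions(ids))
    write_id_file("batch_ids.txt", ids)
```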
<a NAME="EntrezUtilities"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query/static/eutils_help.html">Entrez
Utilities</a> - Entrez Programming Utilities, also called E-Utilities, are tools
that provide access to Entrez data outside of the regular web query interface.
They
represent a method of <a href="/entrez/query/static/linking.html">making WWW
links
to Entrez</a>.
Each utility performs a specialized retrieval task, and can be used simply by
writing a specially formatted URL. For example, EFetch retrieves records in the
requested format from a list of one or more primary IDs or from the user's
environment. The E-Utilities web page describes the available utilities and
links
to a brief help document for each one. E-Utilities can be helpful for
retrieving
search results for future use in another environment. To receive announcements
about Entrez Utilities, see the <a href="Summary/email_lists.html">NCBI
Email
Lists</a> page.</td>
</tr>
</table>
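<p CLASS="TEXT">A "specially formatted URL" for EFetch might be assembled as in the sketch below. The base address and the parameter names (db, id, rettype, retmode) follow the publicly documented E-Utilities conventions as understood at the time of writing and should be treated as assumptions to verify against the E-Utilities help document. The identifiers are placeholders, and the sketch only builds the URL; it does not contact NCBI.</p>

```python
from urllib.parse import urlencode

# Assumed base address for the E-Utilities; confirm against the current help document.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_efetch_url(db, ids, rettype, retmode):
    """Return an EFetch URL requesting the given primary IDs in the given format.

    db      -- Entrez database name, e.g. "nucleotide" or "protein"
    ids     -- iterable of primary IDs (GI or accession numbers)
    rettype -- record format, e.g. "fasta"
    retmode -- output mode, e.g. "text"
    """
    params = {
        "db": db,
        "id": ",".join(str(i) for i in ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    # safe="," keeps the comma-separated ID list readable in the URL.
    return EUTILS_BASE + "efetch.fcgi?" + urlencode(params, safe=",")


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    print(build_efetch_url("nucleotide", ["12345", "67890"], "fasta", "text"))
```

<p CLASS="TEXT">Building the query string with a library function rather than by string concatenation avoids errors when a parameter value contains characters that must be escaped.</p>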
<a NAME="LinkOut"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/linkout/doc/linkoutoverview.html">LinkOut</a>
- a
registry service to create links from specific articles, journals, or biological
data in Entrez (described <a href="#Entrez">above</a>) to resources on external
web
sites. Third parties can provide a URL, resource name, brief description of
their
web site, and specification of the NCBI data from which they would like to
establish
links. The specification can be written as a valid Boolean query to Entrez, or
as a
list of identifiers for specific articles or sequences. Entrez PubMed users can
then
select which external links are visible in their searches, through the
<b>My NCBI</b> service (described <a href="#MyNCBI">below</a>). &nbsp;&nbsp;To
receive announcements about updates and new features in LinkOut, see the <a
href="Summary/email_lists.html">NCBI Announcements Email Lists</a> page.</td>
</tr>
</table>
<a NAME="MyNCBI"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/cubby.fcgi?call=QueryExt.CubbyQuery..Show
All">My NCBI</a> -
Formerly known as "Cubby", My NCBI allows Entrez users to store and update
searches, receive automatic e-mails of search updates, select the Filter folder
tabs shown by default for any Entrez database, and customize their LinkOut
(described <a href="#LinkOut">above</a>) display to include or exclude links to
providers. My NCBI requires that your system accept <a
href="/entrez/query/static/faq.html#Acceptscookies">cookies</a>. You must also
complete a brief registration form in which you select the username and password
that you will use to access your My NCBI account. There is also an
option to remain logged into My NCBI, if desired. For additional information, see
the <a
href="http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helppubmed.section.pubmedhelp.My_NCBI">help document</a> and <a
href="http://www.nlm.nih.gov/pubs/techbull/jf05/jf05_myncbi.html">tutorial</a>.</td>
</tr>
</table>
<a NAME="QueryEmailServer"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="../Genbank/GenBankEmail.html">Query E-mail Server</a>
-
The Query server, which provided e-mail access to a subset of Entrez databases,
was
<b>discontinued on April 15, 2002</b> because of limited usage. Almost all
Entrez
searchers now use the WWW Entrez interface, described <a
href="#Entrez">above</a>,
which provides access to more databases and more features than were possible
through the e-mail interface.</td>
</tr>
<!-- tr>
<td CLASS="TEXT"><a href="/Entrez/Network/nentrez.overview.html">Network
Entrez</a>
- a TCP/IP-based client-server version of WWW Entrez. Makes a direct connection
with
the NCBI databases over the Internet to retrieve data. The data comes in a
binary
form taking up less network bandwidth for transmittance. Client software is
available for PC, Mac, and Unix.</td>
</tr -->
</table>
<a NAME="IRX"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/irx/dbST/dbest_query.html">dbEST</a>, <a
href="/dbST/dbgss_query.html">dbGSS</a>, <a
href="/dbST/dbsts_query.html">dbSTS</a>
search pages - EST, GSS, and STS sequences are available from two sources: the
EST/GSS/STS divisions of GenBank (via Entrez), and separate but related
databases
called dbEST/dbGSS/dbSTS. The sequences and accession numbers in both sources
are
the same but the record formats differ. The search bar at the top of the dbEST
and
dbGSS home pages searches the GenBank EST and GSS divisions through Entrez. The
records are displayed in dbEST/dbGSS format by default, and can be displayed in
GenBank flat file format, if desired.</td>
</tr>
</table -->
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query/static/overview.html#Citation
Matcher">Citation Matcher</a> - allows you to find the PubMed ID of any article
in
the PubMed database, given its bibliographic information (journal, volume, page,
etc.). </td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/query/static/citmatch.html">Citation
Matcher for single articles</a></li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/entrez/getids.cgi">Batch Citation Matcher for
many articles</a></li></ul></td>
</tr>
<!-- tr>
<td CLASS="TEXT"><ul><li>E-Mail Citation Matcher is also available, and can be
used
for or one or many articles. To obtain the help documentation, send the word
HELP
in the body of a message to the server address: <a
href="mailto:citation_matcher@ncbi.nlm.nih.gov">citation_matcher@ncbi.nlm.nih.go
v</a
></i></ul></td>
</tr -->
</table>
<p></p>
<!-- ========CATEGORY WITHIN TOOLS: SEQUENCE SIMILARITY SEARCHING======== -->
<a NAME="BLAST"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Sequence Similarity Searching</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<br>
<!-- ======== Top of Border around BLAST home page categories ======== -->
<table BGCOLOR="#006699" BORDER="0" CELLSPACING=0 CELLPADDING=3 WIDTH="90%">
<tr>
<td CLASS="TEXT">
<!-- ======== Top of Border around BLAST home page categories ======== -->
<table BGCOLOR="#FFFFFF" BORDER="0" CELLSPACING=0 CELLPADDING=5 WIDTH="100%">
<tr>
<td CLASS="TEXT" BGCOLOR="#FFFFFF"><a href="/BLAST/">BLAST Home Page</a> -
provides access to BLAST (Basic Local Alignment Search Tool) programs, an overview,
help documentation, and FAQs. &nbsp;&nbsp;BLAST programs include:
<ul>
<li>Nucleotide BLAST
<ul>
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&HITLIST_SIZE=100&NCB
I_GI
=on&PAGE=Nucleotides&PROGRAM=blastn&SERVICE=plain&SET_DEFAULTS.x=34&SET_DEFAULTS
.y=8
&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">Standard nucleotide-nucleotide BLAST
[blastn]</a> (more about <a href="#BLAST2.x">BLAST 2.x</a> and <a
href="/blast/html/BLASThomehelp.html#NTBLAST">nucleotide BLAST</a>)
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&HITLIST_SIZE=100&NCB
I_GI
=on&PAGE=MegaBlast&SERVICE=plain&SET_DEFAULTS.x=34&SET_DEFAULTS.y=8&SHOW_OVERVIE
W=on
&END_OF_HTTPGET=Yes">MegaBLAST</a> (described <a href="#MegaBLAST">below</a>)
<ul>
<li><a href="/blast/mmtrace.html">MegaBLAST against the Trace
Archives</a>
(described <a href="#MegaBLAST">below</a>)
<li><a href="/BLAST/tracemb.shtml">Discontiguous MegaBLAST against the
Trace
Archives</a> (described <a href="#MegaBLAST">below</a>)
</ul>
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=1000&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&HITLIST_SIZE=100&NCBI_GI=on
&PAG
E=Nucleotides&PROGRAM=blastn&SERVICE=plain&SET_DEFAULTS.x=29&SET_DEFAULTS.y=6&SH
OW_O
VERVIEW=on&WORD_SIZE=7&END_OF_HTTPGET=Yes">Search for short nearly exact
matches</a>
(<a href="/blast/html/BLASThomehelp.html#NTBLAST">more...</a>)
</ul>
<li>Protein BLAST
<ul>
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CDD_SEARCH=on&CLIENT=web&COMPOSITION_BASED_STATISTICS=on
&DAT
ABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(none)&EXPECT=10&FILTER=L&FORMAT_OBJECT=A
lign
ment&FORMAT_TYPE=HTML&I_THRESH=0.005&MATRIX_NAME=BLOSUM62&NCBI_GI=on&PAGE=Protei
ns&P
ROGRAM=blastp&SERVICE=plain&SET_DEFAULTS.x=41&SET_DEFAULTS.y=5&SHOW_OVERVIEW=on&
END_
OF_HTTPGET=Yes">Standard protein-protein BLAST [blastp]</a> (more about <a
href="#BLAST2.x">BLAST 2.x</a> and <a
href="/blast/html/BLASThomehelp.html#AABLAST">protein BLAST</a>)
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=250
&ALIGNMENT_VIEW=Pairwise&CLIENT=web&COMPOSITION_BASED_STATISTICS=on&DATABASE=nr&
CDD_
SEARCH=on&DESCRIPTIONS=100&ENTREZ_QUERY=(none)&EXPECT=10&FORMAT_OBJECT=Alignment
&FOR
MAT_TYPE=HTML&HITLIST_SIZE=100&I_THRESH=0.005&MATRIX_NAME=BLOSUM62&NCBI_GI=on&PA
GE=P
roteins&PROGRAM=blastp&RUN_PSIBLAST=on&SERVICE=plain&SET_DEFAULTS.x=36&SET_DEFAU
LTS.
y=5&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">PSI-BLAST and PHI-BLAST</a> (described
<a
href="#PHI-BLAST">below</a>)
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=20000&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&GAPCOSTS=9+1&HITLIST_SIZE=
100&
I_THRESH=0.005&MATRIX_NAME=PAM30&NCBI_GI=on&PAGE=Proteins&PROGRAM=blastp&SERVICE
=pla
in&SET_DEFAULTS.x=24&SET_DEFAULTS.y=10&SHOW_OVERVIEW=on&WORD_SIZE=2&END_OF_HTTPG
ET=Y
es">Search for short nearly exact matches</a> (<a
href="/blast/html/BLASThomehelp.html#AABLAST">more...</a>)
</ul>
<li>Translated BLAST Searches
<ul>
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&GENETIC_CODE=1&HITLI
ST_S
IZE=100&NCBI_GI=on&PAGE=Translations&PROGRAM=blastx&SERVICE=plain&SET_DEFAULTS.x
=37&
SET_DEFAULTS.y=5&SHOW_OVERVIEW=on&UNGAPPED_ALIGNMENT=no&END_OF_HTTPGET=Yes">Nucl
eoti
de query - Protein db [blastx]</a> (<a
href="/blast/html/BLASThomehelp.html#TRBLAST">more...</a>)
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&GENETIC_CODE=0&HITLI
ST_S
IZE=100&NCBI_GI=on&PAGE=Translations&PROGRAM=tblastn&SERVICE=plain&SET_DEFAULTS.
x=23
&SET_DEFAULTS.y=10&SHOW_OVERVIEW=on&UNGAPPED_ALIGNMENT=no&END_OF_HTTPGET=Yes">Pr
otei
n query - Translated nucleotide db [tblastn]</a> (<a
href="/blast/html/BLASThomehelp.html#TRBLAST">more...</a>)
<li><a
href="/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=50&
ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRIPTIONS=100&ENTREZ_QUERY=(no
ne)&
EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&GENETIC_CODE=1&HITLI
ST_S
IZE=100&NCBI_GI=on&PAGE=Translations&PROGRAM=tblastx&SERVICE=plain&SET_DEFAULTS.
x=21
&SET_DEFAULTS.y=9&SHOW_OVERVIEW=on&UNGAPPED_ALIGNMENT=yes&END_OF_HTTPGET=Yes">Nu
cleo
tide query - Translated nucleotide db [tblastx]</a> (<a
href="/blast/html/BLASThomehelp.html#TRBLAST">more...</a>)
</ul>
<li>Search for conserved domains
<ul>
<li><a href="/Structure/cdd/wrpsb.cgi">Search the Conserved Domain Database
using RPS-BLAST</a> (see CD-Search, described <a href="#CD-Search">below</a>)
<li><a href="/Structure/lexington/lexington.cgi?cmd=rps">Search by domain
architecture [CDART]</a> (described <a href="#CDART">below</a>)
</ul>
<li>Pairwise BLAST
<ul>
<li><a href="/blast/bl2seq/bl2.html">BLAST 2 Sequences</a> (described <a
href="#BLAST2Sequences">below</a>)
</ul>
<li>Genomic BLAST pages
<ul>
<li>Mammals</li>
<ul>
<li><a href="/genome/seq/HsBlast.html">Human Genome</a> (additional
information <a href="#HumanGenomeBLAST">above</a>)</li>
<li><a href="/genome/seq/MmBlast.html">Mouse Genome</a> (additional
information <a href="#MouseGenomeBLAST">above</a>)</li>
<li><a href="/genome/seq/RnBlast.html">Rat Genome</a> (additional
information <a href="#RatGenomeBLAST">above</a>)</li>
<li><a href="/genome/seq/BtaBlast.html">Cow Genome</a> (additional
information <a href="#RatGenomeBLAST">above</a>)</li>
<li><a href="/genome/seq/SscBlast.html">Pig Genome</a> (additional
information <a href="#RatGenomeBLAST">above</a>)</li>
<li><a href="/genome/seq/CfaBlast.html">Dog Genome</a> (additional
information <a href="#RatGenomeBLAST">above</a>)</li>
</ul>
<li>Other Vertebrates</li>
<ul>
<li><a href="/genome/seq/DrBlast.html">Zebrafish Genome</a></li>
<li><a href="/BLAST/Genome/fugu.html"><i>Fugu rubripes</i>
Genome</a></li>
</ul>
<li>Insects</li>
<ul>
<li><a href="/BLAST/Genome/Insects.html"><i>Drosophila
melanogaster</i></a></li>
<li><a href="/BLAST/Genome/Insects.html"><i>Anopheles
gambiae</i></a></li>
</ul>
<li>Nematodes</li>
<ul>
<li><a href="/BLAST/Genome/NematodeBlast.html"><i>Caenorhabditis
elegans</i></a></li>
</ul>
<li><a href="/BLAST/Genome/PlantBlast.shtml">Plants</a></li>
<li><a href="/BLAST/Genome/FungiBlast.html">Fungi</a></li>
<!-- ul>
<li><a href="/BLAST/Genome/FungiBlast.html"><i>Saccharomyces
cerevisiae</i></a></li>
<li><a href="/BLAST/Genome/FungiBlast.html"><i>Schizosaccharomyces
pombe</i></a></li>
<li><a href="/BLAST/Genome/FungiBlast.html"><i>Neurospora
crassa</i></a></li>
<li><a href="/BLAST/Genome/FungiBlast.html"><i>Magnaporthe
grisea</i></a></li>
</ul -->
<li>Protozoa</li>
<ul>
<li><a href="/BLAST/Genome/plasmodium.html"><i>Plasmodium
falciparum</i></a>
(malaria)</li>
</ul>
<li><a href="/sutils/genom_tree.cgi?organism=euk">Other Eukaryotic
Genomes</a></li>
<li><a href="/sutils/genom_table.cgi">Bacteria</a></li>
<li><a href="/ORGANELLES/mblast.cgi?gene=COX1&tax=33208">Organelles</a></li>
<li><a
href="/GENOMES/Bitor.cgi?db=VOG&data=vog&gdata=dsdna.defl">Viruses</a></li>
</ul>
<li>Specialized BLAST pages
<ul>
<li><a href="/SNP/snpblastByChr.html">BLAST against dbSNP</a> (additional
information about dbSNP is <a href="#dbSNP">above</a>)
<li><a href="/igblast/">IgBLAST</a> - Analysis of immunoglobulin sequences
in
GenBank (described <a href="#IgBLAST">below</a>)
<li><a href="/VecScreen/VecScreen.html">VecScreen</a> - BLAST-based
detection of
vector contamination (described <a href="#VecScreen">below</a>)
</ul>
<li>Retrieve results for an existing Request ID (RID)
<ul>
<li><a
href="/blast/Blast.cgi?LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&CMD=Web&PAGE=Forma
ting
&NCBI_GI=yes&SHOW_OVERVIEW=on">Retrieve results for an existing Request ID</a>
for
up to 24 hours after receiving the RID. (<a
href="/blast/html/BLASThomehelp.html#EXISTINGRID">more...</a>)
</ul>
<li>JavaScript free BLAST pages
<ul>
<li><a href="/blast/index.nojs.cgi">Get the BLAST home page with JavaScript
free
links</a>
</ul>
</ul>
</td>
</tr>
</table>
<!-- ======== Bottom of Border around BLAST home page categories ======== -->
</td>
</tr>
</table>
<br>
<!-- ======= END Bottom of Border around BLAST home page categories ======= -->
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/blastannounce.html">BLAST Announcements</a> -
To
receive announcements about updates and new features, and advance notices about
upcoming changes in the NCBI BLAST service, see the <a
href="Summary/email_lists.html">NCBI Announcements Email Lists</a> page.</td>
</tr>
</table>
<a NAME="BLAST2.x"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/qblast.html">BLAST 2.x</a> - A version of BLAST
(<a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_
uids
=9254694&dopt=Abstract">Altschul, et al., 1997</a>) that permits gaps in the
alignments it produces. Assessments of statistical significance are based upon
prior simulations using random sequences. (<a
href="/blast/html/BLASThomehelp.html#NTBLAST">more...</a>)</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">QBLAST - A queuing system that allows users to retrieve Gapped
BLAST results at their convenience and format their results multiple times with
different formatting options. This system also allows NCBI to use
computational resources more efficiently, better serving the community. Since
fall 1999, the
QBLAST system has been used for all BLAST searches. (<a
href="/blast/html/BLASThomehelp.html#NTBLAST">more...</a>)</td>
</tr>
</table>
<a NAME="MegaBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO
_FOR
MAT=Semiauto&ALIGNMENTS=50&ALIGNMENT_VIEW=Pairwise&CLIENT=web&DATABASE=nr&DESCRI
PTIO
NS=100&ENTREZ_QUERY=(none)&EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYP
E=HT
ML&HITLIST_SIZE=100&NCBI_GI=on&PAGE=MegaBlast&SERVICE=plain&SET_DEFAULTS.x=34&SE
T_DE
FAULTS.y=8&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">MegaBLAST</a> - permits
searching
with batches of ESTs or with large cDNA or genomic sequences. (<a
href="/blast/html/BLASThomehelp.html#NTBLAST">more...</a>)</td>
</tr>
</table>
<a NAME="TraceBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><ul><li><a
href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Nucleotides&PROGRAM=blastn&BLAST_SPEC=TraceArchive&BLAST_PROGRAMS=megaBlast&PAGE_TYPE=BlastSearch">BLAST
against the
Trace
Archives</a> - compare nucleotide sequence data against the raw data underlying
all of the sequences generated by various genome projects. Additional information
about
the Trace Archive is <a href="#TraceArchive">above</a>.</li></ul></td>
</tr>
</table>
<a NAME="PHI-BLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://blast.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=250
&ALIGNMENT_VIEW=Pairwise&CLIENT=web&COMPOSITION_BASED_STATISTICS=on&DATABASE=nr&
CDD_
SEARCH=on&DESCRIPTIONS=100&ENTREZ_QUERY=(none)&EXPECT=10&FORMAT_OBJECT=Alignment
&FOR
MAT_TYPE=HTML&HITLIST_SIZE=100&I_THRESH=0.005&MATRIX_NAME=BLOSUM62&NCBI_GI=on&PA
GE=P
roteins&PROGRAM=blastp&RUN_PSIBLAST=on&SERVICE=plain&SET_DEFAULTS.x=36&SET_DEFAU
LTS.
y=5&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">PHI-BLAST</a> - Pattern Hit Initiated
BLAST
(<a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_
uids
=9705509&dopt=Abstract">Zhang, et al., 1998</a>) - A program to search a protein
database using a protein query, seeking only alignments that preserve a
specified
pattern contained within the query. (<a
href="/blast/html/BLASThomehelp.html#AABLAST">more...</a>)</td>
</tr>
</table>
<a NAME="PSI-BLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://blast.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS
=250
&ALIGNMENT_VIEW=Pairwise&CLIENT=web&COMPOSITION_BASED_STATISTICS=on&DATABASE=nr&
CDD_
SEARCH=on&DESCRIPTIONS=100&ENTREZ_QUERY=(none)&EXPECT=10&FORMAT_OBJECT=Alignment
&FOR
MAT_TYPE=HTML&HITLIST_SIZE=100&I_THRESH=0.005&MATRIX_NAME=BLOSUM62&NCBI_GI=on&PA
GE=P
roteins&PROGRAM=blastp&RUN_PSIBLAST=on&SERVICE=plain&SET_DEFAULTS.x=36&SET_DEFAU
LTS.
y=5&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes">PSI-BLAST</a> - Position-Specific
Iterated
BLAST (<a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_
uids
=9254694&dopt=Abstract">Altschul, et al., 1997</a>) - A program for searching
protein databases using protein queries, in order to find other members of the
same
protein family. All statistically significant alignments found by BLAST are
combined into a multiple alignment, from which a position-specific score matrix
is
constructed. This matrix is used to search the database for additional
significant
alignments, and the process may be iterated until no new alignments are found.
(<a
href="/blast/html/BLASThomehelp.html#AABLAST">more...</a>)</td>
</tr>
</table>
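<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i><b>Example:</b></i> The matrix-building step in the
PSI-BLAST description above can be illustrated with a toy sketch in Python. It is
not the PSI-BLAST implementation: the names <i>build_pssm</i> and
<i>score_segment</i>, the gap-free alignment, the uniform background frequencies,
and the fixed pseudocount are all assumptions made for illustration. The real
program also weights sequences and repeats the database search.
<pre>
```python
import math

# Illustrative toy only; every name below is hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per column of a gap-free alignment."""
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background frequency
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        scores = {}
        for residue in AMINO_ACIDS:
            # Pseudocounts keep unseen residues from getting a score of minus infinity.
            frequency = (column.count(residue) + pseudocount) / total
            scores[residue] = math.log2(frequency / background)
        pssm.append(scores)
    return pssm


def score_segment(pssm, segment):
    """Sum the position-specific scores of an ungapped segment."""
    return sum(column[residue] for column, residue in zip(pssm, segment))


family = ["ACD", "ACE", "ACD"]  # made-up aligned sequences
pssm = build_pssm(family)
print(score_segment(pssm, "ACD"), score_segment(pssm, "WWW"))
```
</pre>
A segment that resembles the aligned family scores higher against the matrix than
an unrelated one, which is why such a matrix can pick up additional family members
in a database search.</blockquote></td>
</tr>
</table>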
<a NAME="RPS-BLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/cdd/wrpsb.cgi">RPS-BLAST</a> - Reverse
Position-Specific BLAST - A program used to identify conserved
domains in a protein query sequence. It does this by comparing a query
protein sequence to position-specific score matrices that have been prepared
from conserved domain alignments. The service is accessible through
<a href="/Structure/cdd/wrpsb.cgi"><b>Conserved Domain Search
(CD-Search)</b></a>,
described <a href="#CD-Search">below</a>. A <a
href="/blast/documents/README.rps">readme</a> file provides additional detail
about
the RPS-BLAST program.
<br>
<blockquote><i><b>Note:</b></i> RPS-BLAST is a "reverse" version of
position-specific iterated BLAST (PSI-BLAST), described above. Both RPS-BLAST
and
PSI-BLAST use multiple alignments and position-specific score matrices
(PSSMs) to derive conserved features of a protein family. However,
RPS-BLAST compares a query sequence against a database of profiles prepared
from ready-made alignments, while PSI-BLAST builds alignments starting
from a single protein sequence. The programs also differ in purpose:
RPS-BLAST is used to identify conserved domains in a query sequence,
while PSI-BLAST is used to identify other members of the protein family
to which a query sequence belongs.</blockquote></td>
</tr>
</table>
<a NAME="TaxonomyBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://blast.ncbi.nlm.nih.gov/blast/taxblasthelp.html">Taxonomy BLAST</a> - an
implementation of Gapped BLAST (2.x) that groups hits by source organism,
according
to information in NCBI's Taxonomy database. Species are listed in order of
sequence
similarity to the query sequence, with the strongest match listed first. Three report
views are available:
<ul>
<li><i>organism report</i> - sorts the BLAST hits according to species, so that
all
of the hits to the same organism will appear together
<li><i>lineage report</i> - gives a simplified view of the relationships between
the
organisms, according to their classification in the taxonomy database. This
report
is "focused" on the organism which yielded the strongest BLAST hit. It answers
the
question, "how closely are the organisms in the BLAST hit list related to the
query
sequence according to the taxonomy database?"
<li><i>taxonomy report</i> - provides a more detailed report about the
relationships among all of the organisms found in the BLAST hit list, including
a
summary of the taxa that are represented, the number of species and subspecies,
and
the number of BLAST hits at each node in the taxonomic hierarchy.
</ul>
</td>
</tr>
</table>
<a NAME="PrimerBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/tools/primer-blast/index.cgi?LINK_LOC=BlastHome">Primer-BLAST</a> - find primers
specific to a PCR template.</td>
</tr>
</table>
<a NAME="BLAST2Sequences"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&PROG_DEF=blastn&BLAST_PROG_DEF=megaBlast&SHOW_DEFAULTS=on&BLAST_SPEC=blast2seq&LINK_LOC=align2seq">BLAST
2 Sequences</a> - A BLAST-based
tool
for aligning two nucleotide or protein sequences, producing a pairwise DNA-DNA
or
protein-protein sequence comparison. (<a
href="/blast/html/BLASThomehelp.html#BLAST2SEQ">more...</a>)</td>
</tr>
</table>
<a NAME="IgBLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/igblast/">IgBLAST</a> - IgBLAST was developed to
facilitate analysis of immunoglobulin sequences in GenBank. It allows
blastp or blastn searches of either the nr database or a special database
of Immunoglobulin (Ig) germline V (variable region) genes. Searches may
be limited to either human or mouse genes. IgBLAST performs three main
functions: (1) reports the variable, D, or J regions that most closely
match the query sequence; (2) annotates the immunoglobulin domains
(FWR1 through FWR3) according to Kabat et al.; and (3) for searches
against the nucleotide nr or protein nr database, simplifies the
process of identifying related sequences by matching the IgBLAST hits
to the closest germline V genes. (<a
href="/blast/html/BLASThomehelp.html#SPECBLAST">more...</a>)</td>
</tr>
</table>
<a NAME="BLink"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/static/blinkhelp.html">BLink</a> - BLink
("BLAST
Link") displays the results of BLAST searches that have been done for every
protein
sequence in the Entrez Proteins data domain. <b>To access it</b>, follow the
BLink
link displayed beside any hit in the results of an Entrez Proteins search. In
contrast to Entrez's "Related Sequences" feature, which lists the titles of
similar
sequences, BLink displays the graphical output of pre-computed blastp results
against the protein non-redundant (nr) database. The output includes the
positions
of up to 200 BLAST hits on the query sequence, scores, and alignments. (View <a
href="http://www.ncbi.nlm.nih.gov/sutils/blink.cgi?pid=4557757">sample BLink
output
for human MLH1 protein</a>.) BLink offers a variety of display options,
including
the distribution of hits by taxonomic grouping, the best hit to each organism,
the
protein domains in the query sequence, similar sequences that have known 3-D
structures, and more.
Additional options allow you to specify which taxa you would like to exclude,
increase or decrease the BLAST cutoff score, or filter the BLAST hits to show
only
those from a specific source database, such as RefSeq or Swiss-Prot. &nbsp;See
the
<a href="/sutils/static/blinkhelp.html">BLink help document</a> for additional
information.</td>
</tr>
</table>
<a NAME="BLASTEmailServer"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Genbank/GenBankEmail.html">BLAST E-mail server</a> -
an
e-mail-based sequence similarity search service; this was <b>discontinued on
June
17, 2002</b> because of limited usage. Most BLAST searches are now done
through the <a
href="/BLAST/">BLAST web page</a>.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/blast/blastcl3/">Network
BLAST</a>
- a TCP/IP-based client-server version of WWW BLAST. Makes a direct connection
with
the NCBI databases over the Internet to retrieve data. No web browser is
required.
Client software is available for PC, Mac, and Unix on the FTP site at
ftp://ftp.ncbi.nih.gov/blast/blastcl3/</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/blast/executables/">Stand-alone
BLAST</a> - download BLAST executables for local use from
ftp://ftp.ncbi.nih.gov/blast/executables/. Binaries are provided for IRIX 6.2,
Solaris 2.6, DEC OSF1 (ver. 4.0d), LINUX, and Win32 systems. Please read the <a
href="/blast/executables/README.bls">README file</a> in the ftp directory for
more
information. <a href="ftp://ftp.ncbi.nih.gov/blast/db/">BLAST databases</a> are also
available for downloading. Information on setting up stand-alone
BLAST is also available at the NHGRI site at <a
href="http://genome.nhgri.nih.gov/blastall/blast_install/">http://genome.nhgri.nih.gov/blastall/blast_install</a>.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN TOOLS: NUCLEOTIDE SEQUENCE ANALYSIS======== -->
<a NAME="NucleotideSequenceAnalysis"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Nucleotide Sequence Analysis</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/">BLAST</a> - see sequence similarity
searching, <a
href="#BLAST">above</a>, for a complete list of BLAST programs.</td>
</tr>
</table>
<a NAME="ePCR"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/e-pcr/">e-PCR - Electronic PCR</a> - compare a
query sequence to a database of mapped sequence-tagged sites (STSs) to find a
possible map location for the query sequence, or compare a query STS to a database
of nucleotide sequences to identify the sequences that contain the STS.
<!-- E-PCR finds STSs in DNA sequences by searching
for
subsequences that closely match the PCR primers present in mapped markers. The
subsequences must have the correct order, orientation, and spacing that they
could
plausibly prime the amplification of a PCR product of the correct molecular
weight.
&nbsp;e-PCR searches against data in NCBI's <b>UniSTS</b>, <a
href="#UniSTS">described</a> in the Molecular Databases/Nucleotide Sequences
section of this guide. (The <a href="/STS/">original version</a> of
e-PCR
searches only against dbSTS.) --> &nbsp;e-PCR can be used on the WWW, or the
software can be downloaded from the <a
href="ftp://ftp.ncbi.nih.gov/pub/schuler/e-PCR/">/pub/schuler/e-PCR
directory</a> of the NCBI ftp site. Additional information is provided by <a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=9149949">Schuler, G.D.</a> There are two versions of e-PCR:
<ul>
<li><a href="http://www.ncbi.nlm.nih.gov/sutils/e-pcr/forward.cgi">Forward
e-PCR</a> - Search an STS database with a query sequence. Electronic PCR (e-PCR) is
a computational procedure used to identify sequence-tagged sites (STSs)
within DNA sequences. e-PCR looks for potential STSs in DNA sequences by searching
for subsequences that closely match the PCR primers used to generate known STSs
and that have the correct order, orientation, and spacing.</li>
<li><a href="http://www.ncbi.nlm.nih.gov/sutils/e-pcr/reverse.cgi">Reverse
e-PCR</a> - Search a sequence database with an STS. The main motivation for
implementing reverse searching (called Reverse e-PCR) was to make it feasible to
search the human genome sequence and other large genomes. The new version of e-PCR
provides a search mode that runs a query STS against a sequence database.</li>
</ul>
</td>
</tr>
</table>
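<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i><b>Example:</b></i> The primer test in the e-PCR
description above (correct order, orientation, and spacing) can be sketched in a few
lines of Python. This is a simplified illustration, not the e-PCR program: the
function names, the <i>margin</i> parameter, and the example sequence are
hypothetical, only one strand is scanned, and primers must match exactly, whereas
e-PCR accepts subsequences that closely match.
<pre>
```python
# Illustrative sketch only; all names are hypothetical and matching is exact.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(text, pattern):
    """Return every start index of pattern in text (overlaps allowed)."""
    positions = []
    pos = text.find(pattern)
    while pos != -1:
        positions.append(pos)
        pos = text.find(pattern, pos + 1)
    return positions


def find_sts(seq, left_primer, right_primer, expected_size, margin=50):
    """Return (start, end, size) for each plausible PCR product on the given strand."""
    seq = seq.upper()
    left_site = left_primer.upper()
    # The right primer anneals to the opposite strand, so search for its reverse complement.
    right_site = reverse_complement(right_primer.upper())
    hits = []
    for start in find_all(seq, left_site):
        for right_pos in find_all(seq, right_site):
            end = right_pos + len(right_site)
            size = end - start
            in_order = right_pos >= start + len(left_site)      # order and orientation
            well_spaced = margin >= abs(size - expected_size)  # spacing
            if in_order and well_spaced:
                hits.append((start, end, size))
    return hits


template = "TTACGTACGGGGGGGGCCTAGAAA"  # made-up sequence
print(find_sts(template, "ACGTAC", "TCTAGG", expected_size=20, margin=2))
```
</pre>
Each hit gives the start and end of a candidate product and its size. A hit whose
size is far from the expected product size is rejected even when both primers
match.</blockquote></td>
</tr>
</table>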
<a NAME="EntrezGeneNucSeqAnalysis"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=gene">Entrez Gene</a> - as <a
href="#EntrezGene">described</a> in the Molecular Databases/Genes section of this
guide, each Entrez Gene record encapsulates a wide range of information for a
given gene and organism. When possible, the information includes results of
analyses that have been done on the sequence data. The amount and type of
information presented depend on what is available for a particular gene and
organism and can include:
(1) graphic summary of the genomic context, intron/exon structure, and
flanking genes,
(2) link to a graphic view of the mRNA sequence, which in turn shows
biological features such as CDS, SNPs, etc.,
(3) links to gene ontology and phenotypic information,
(4) links to corresponding protein sequence data and conserved domains,
(5) links to related resources, such as mutation databases.
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/projects/Malaria/">Malaria Genetics and Genomics</a>
-
provides data and information relevant to malaria genetics and genomics.
Resources
include organism-specific sequence BLAST databases (<i>Plasmodium falciparum</i>
only, all <i>Plasmodium</i>, and all <i>Toxoplasma</i>). More about the Malaria
genome resources <a href="#MalariaGenome">below</a>.</td>
</tr>
</table>
<a NAME="ModelMaker"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/static/ModelMakerHelp.html">Model Maker</a> -
allows you to view the evidence (mRNAs, ESTs, and gene predictions) that was
aligned
to assembled genomic sequence in order to build a gene model, and to edit the
model
by selecting or removing putative exons. You can then view the mRNA sequence
and
potential ORFs for the edited model, and save the mRNA sequence data for use in
other programs. Model Maker is accessible from sequence maps that were analyzed
at
NCBI and displayed in <b>Map Viewer</b> (described <a
href="#MapViewer">above</a>).
&nbsp;To see an <b>example</b>, follow the <b>"mm" link</b> beside any gene
annotated on the <a
href="http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?org=hum&chr=1&maps=ideogr,loc&verbose=on">human "Gene_Sequence" map</a> in the Map Viewer. (More info about
human
data in Map Viewer is given <a href="#MapViewerHuman">above</a>.)</td>
</tr>
</table>
<a NAME="ORFFinder"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/gorf/gorf.html">ORF Finder</a> - a graphical analysis
tool that finds all open reading frames of a selected minimum size in a user's
sequence or in a sequence already in the database. It is designed for prokaryotic
sequences and identifies all open reading frames using the standard or alternative
genetic codes. The deduced amino acid sequence can be saved in various formats and
searched against the sequence database using the WWW BLAST server. ORF Finder is
also packaged with the <a href="/Sequin/">Sequin</a> sequence submission software. The <a
href="ftp://ftp.ncbi.nih.gov/pub/tatiana/orf/">stand-alone program</a> can be
downloaded from the NCBI FTP site.</td>
</tr>
</table>
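<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><blockquote><i><b>Example:</b></i> The kind of search ORF Finder
performs can be sketched in Python as a scan of all six reading frames for
stretches that run from a start codon to a stop codon. This is a minimal sketch
under stated assumptions, not the ORF Finder code: the function names and the
<i>min_codons</i> parameter are hypothetical, only ATG is treated as a start codon,
only the standard genetic code's stop codons are used, and the length counts the
stop codon.
<pre>
```python
# Illustrative sketch only; names and parameters are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, frame, start, end) for each ATG..stop ORF of at least min_codons codons.

    Coordinates are 0-based, end-exclusive, and relative to the scanned strand.
    """
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first start codon opens the reading frame
                elif start is not None and codon in STOP_CODONS:
                    end = pos + 3
                    if (end - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, end))
                    start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))
```
</pre>
Raising <i>min_codons</i> filters out short frames, which mirrors the "selected
minimum size" option described above. Supporting an alternative genetic code would
mean swapping in a different set of start and stop codons.</blockquote></td>
</tr>
</table>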
<a NAME="ProtEST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/ProtEST/">ProtEST</a> - a tool that presents
a graphical view of matches between nucleotide sequences in UniGene and possible
translational products. To generate the alignments, the 6-frame translations
of mRNA and EST sequences in UniGene are compared to protein sequences using BLASTX
with -e 1e-6. The translated nucleotide sequences are compared with proteins
from a number of model organisms and the best match in each organism is recorded.
ProtEST links are displayed in UniGene (<a href="#UniGene">description</a>)
reports in the section on model organism protein similarities.
<!-- [Text from 03/2003 version of "Genomic Biology Resources" fact sheet:
Pre-computed BLAST alignments between protein sequences from eight model organisms,
including H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elgans, S. cerevisiae,
A. thaliana, and E. coli, and the six-frame translations of UniGene nucleotide sequences.
ProtEST links are displayed in UniGene reports in the section on model organism protein
similarities.] --></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/pasc/viridty.cgi?textpage=overview">PASC (PAirwise Sequence Comparison)</a> - a web tool for analysis of pairwise identity distribution within <b>viral families</b>. The identities are pre-computed for every pair of genomes within a family, and the distribution is plotted as a histogram in which each bar corresponds to an interval of identities. Only complete genomes should be used as query sequences; results from partial sequences are not suitable for the purpose of this tool. After you submit your sequence, PASC computes pairwise identities between the external genome and the existing genome sequences of the family. At the end of the process, you are presented with a list of the 15 closest matches to the genome within the family. The <a href="/sutils/pasc/viridty.cgi?textpage=documentation">documentation</a> provides more details about using PASC.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/retroviruses/">Retroviruses Resources</a> - A
collection
of resources specifically designed to support the research of retroviruses.
Resources include a genotyping tool that uses the BLAST algorithm to identify
the
genotype of a query sequence; an alignment tool for global alignment of multiple
sequences; an HIV-1 automatic sequence annotation tool; and annotated maps of 16
retroviruses viewable in GenBank, FASTA, and graphic formats, with links to
associated sequence records.</td>
</tr>
</table>
<a NAME="SAGEmap"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/SAGE/">SAGEmap</a> - SAGEmap provides a tool for
performing statistical tests designed specifically for differential-type
analyses of
SAGE (Serial Analysis of Gene Expression) data. The data include SAGE libraries
generated by individual labs as well as those generated by the Cancer Genome
Anatomy
Project (CGAP, described <a href="#CGAP">above</a>), which have been submitted
to
Gene Expression Omnibus (GEO, described <a href="#GEO">above</a>). Gene
expression
profiles that compare the expression in different SAGE libraries are also
available
on the Entrez GEO Profiles pages. It is possible to enter a query sequence in
the
SAGEmap resource to determine what SAGE tags are in the sequence, then map to
associated SAGEtag records and view the expression of those tags in different
CGAP
SAGE libraries.
</td>
</tr>
</table>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Sequin/index.html">Sequin</a> - A submission tool
that
includes <a href="#ORFFinder">ORF Finder</a>, an alignment viewer/editor, and a
link
to <a href="#Entrez">Entrez</a>. More information about Sequin is <a
href="#Sequin">above</a>.</td>
</tr>
</table -->
<a NAME="Spidey"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/IEB/Research/Ostell/Spidey/">Spidey</a> -
an mRNA-to-genomic
alignment program designed to find good alignments regardless of intron
size, and to avoid being confused by nearby pseudogenes and paralogs. It uses
a
combination of alignment algorithms and heuristics to construct its models.
Spidey
has been optimized for both intraspecies and interspecies alignments. (See <a href="/IEB/Research/Ostell/Spidey/spideydoc.html">Spidey documentation</a> for more information.)</td>
</tr>
</table>
<a NAME="Splign"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/splign/">Splign</a> -
a utility for computing cDNA-to-genomic, or spliced, sequence alignments.
It is based on a variation of the Needleman-Wunsch global alignment algorithm and
specifically accounts for introns and splice signals. This algorithm makes
Splign accurate in determining splice sites and tolerant of sequencing errors.
Splign also uses BLAST hits to identify possible locations of genes and their
duplications on genomic sequences and to speed up the core dynamic programming.
(See <a href="/sutils/splign/splign.cgi?textpage=documentation">Splign documentation</a> for more information.)</td>
</tr>
</table>
<a NAME="UniGeneDDD"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/info_ddd.shtml">UniGene DDD</a> - Digital
Differential Display - an online tool to compare computed gene expression
profiles
between selected cDNA libraries. Using a statistical test, genes whose
expression
levels differ significantly from one tissue to the next are identified and shown
to
the user. <a href="#UniGene">Additional information</a> about UniGene is in the
<a
href="#Genes">Molecular Databases/Genes</a> section.</td>
</tr>
</table>
<a NAME="VecScreen"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/VecScreen/VecScreen.html">VecScreen</a> - a tool for
identifying segments of a nucleic acid sequence that may be of vector, linker or
adapter origin prior to sequence analysis or submission. VecScreen was
developed to
combat the problem of vector contamination in public sequence databases. It is
also
useful to run a new sequence through VecScreen before performing any kind of
analysis on the sequence, since the presence of vector sequences can lead to
misleading BLAST hits, etc. VecScreen compares a query sequence against the
<b>UniVec</b> database, described <a href="#UniVec">above</a>.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN TOOLS: PROTEIN SEQUENCE ANALYSIS======== -->
<a NAME="ProteinSequenceAnalysis"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Protein Sequence Analysis and
Proteomics</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/BLAST/">BLAST</a> - see sequence similarity
searching, <a
href="#BLAST">above</a>, for a complete list of BLAST programs.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/static/blinkhelp.html">BLink</a> - BLink
("BLAST
Link") displays the results of BLAST searches that have been done for every
protein
sequence in the Entrez Proteins data domain. <b>To access it</b>, follow the
BLink
link displayed beside any hit in the results of an Entrez Proteins search. In
contrast to Entrez's "Related Sequences" feature, which lists the titles of
similar
sequences, BLink displays the graphical output of pre-computed blastp results
against the protein non-redundant (nr) database. The output includes the
positions
of up to 200 BLAST hits on the query sequence, scores, and alignments. (View <a
href="http://www.ncbi.nlm.nih.gov/sutils/blink.cgi?pid=4557757">sample BLink
output
for human MLH1 protein</a>.) BLink offers a variety of display options,
including
the distribution of hits by taxonomic grouping, the best hit to each organism,
the
protein domains in the query sequence, similar sequences that have known 3-D
structures, and more.
Additional options allow you to specify which taxa you would like to exclude,
increase or decrease the BLAST cutoff score, or filter the BLAST hits to show
only
those from a specific source database, such as RefSeq or Swiss-Prot. &nbsp;See
the
<a href="/sutils/static/blinkhelp.html">BLink help document</a> for additional
information.</td>
</tr>
</table>
<a NAME="CD-Search"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/cdd/cdd.shtml">CD-Search</a> -
The Conserved Domain Search Service (CD-Search) can be used to identify
the conserved domains present in a protein sequence. CD-Search
uses RPS-BLAST (described <a href="#RPS-BLAST">above</a>) to compare
a query sequence against position-specific score matrices that
have been prepared from conserved domain alignments present in
the Conserved Domain Database (CDD) (described <a href="#CDD">above</a>).
Hits can be displayed as a pairwise alignment of the query sequence
with a representative domain sequence, or as a multiple alignment.
Alignments are also mapped to known 3-dimensional structures,
and can be displayed using Cn3D (described <a href="#Cn3D">above</a>).
In the Cn3D display, residues in sequence alignments are variously colored,
based on their degree of conservation. (<a
href="/blast/html/BLASThomehelp.html#CDSEARCH">more...</a>)</td>
</tr>
</table>
<a NAME="Cognitor"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/COG/old/xognitor.html">COGnitor</a> - compare your
sequence to the COGs database (described <a href="#COGs">above</a>) to identify
the
cluster of orthologous groups to which it belongs. A stand-alone <a
href="ftp://ftp.ncbi.nih.gov/pub/tatusov/dignitor/">dignitor</a> program is also
available. It runs COGnitor in batch mode, comparing a large group of proteins
to
the COGs database, and can be downloaded from the FTP site.</td>
</tr>
</table>
<a NAME="CDART"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/lexington/lexington.cgi?cmd=rps">Conserved
Domain Architecture Retrieval Tool (CDART)</a> - When given a protein query
sequence, CDART displays the functional domains that make up the protein and
lists
proteins with similar domain architectures. The functional domains for a
sequence
are found by comparing the protein sequence to a database of conserved domain
alignments, CDD (described <a href="#CDD">above</a>), using RPS-BLAST (described
<a
href="#RPS-BLAST">above</a>).
</td>
</tr>
</table>
<a NAME="OMSSA"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://pubchem.ncbi.nlm.nih.gov/omssa">Open Mass
Spectrometry Search Algorithm (OMSSA)</a> - a public search service that allows
proteomics researchers to submit the mass spectra of peptides and proteins for
identification. OMSSA compares these mass spectra to theoretical ions
generated from databases of known protein sequences and then ranks the results
using a score derived from classical hypothesis testing. References available
from the OMSSA home page describe the OMSSA algorithm and its validation.
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/ProtEST/">ProtEST</a> - a tool that presents
a graphical view of matches between nucleotide sequences in UniGene and possible
translational products. To generate the alignments, the 6-frame translations
of mRNA and EST sequences in UniGene are compared to protein sequences using BLASTX
with -e 1e-6. The translated nucleotide sequences are compared with proteins
from a number of model organisms and the best match in each organism is recorded.
ProtEST links are displayed in UniGene (<a href="#UniGene">description</a>)
reports in the section on model organism protein similarities.
<!-- [Text from 03/2003 version of "Genomic Biology Resources" fact sheet:
Pre-computed BLAST alignments between protein sequences from eight model organisms,
including H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elgans, S. cerevisiae,
A. thaliana, and E. coli, and the six-frame translations of UniGene nucleotide sequences.
ProtEST links are displayed in UniGene reports in the section on model organism protein
similarities.] --></td>
</tr>
</table>
<a NAME="TaxPlot"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/taxik2.cgi">TaxPlot</a> - a tool for 3-way
comparisons of genomes on the basis of the protein sequences they encode. To use
TaxPlot, one selects a reference genome to which two other genomes are compared.
Pre-computed BLAST results are then used to plot a point for each predicted
protein
in the reference genome, based on the best alignment with proteins in each of
the
two genomes being compared.</td>
</tr>
</table>
<br>
<!-- CATEGORY WITHIN TOOLS: 3-D STRUCTURE DISPLAY AND SIMILARITY SEARCHING -->
<a NAME="StructureTools"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">3-D Structure Display and
Similarity
Searching</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<a NAME="Cn3D"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/CN3D/cn3d.shtml">Cn3D</a> - "See in 3-D," a
structure and sequence alignment viewer for NCBI databases. It allows viewing
of
3-D structures and sequence-structure or structure-structure alignments. Cn3D
can
work as a helper application to your browser, or as a client-server application
that
retrieves structure records from MMDB (described <a href="#MMDB">above</a>)
directly
over the internet. The <a href="/Structure/CN3D/cn3d.shtml">Cn3D home page</a>
provides access to information on how to <a
href="/Structure/CN3D/cn3dinstall.shtml">install</a> the program, a <a
href="/Structure/CN3D/cn3dtut.shtml">tutorial</a> to get started, and a
comprehensive <a href="/Structure/CN3D/cn3dhelp.shtml">help document</a>.</td>
</tr>
</table>
<a NAME="VAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/VAST/vast.html">VAST</a> - Vector Alignment
Search Tool - a computer algorithm developed at NCBI and used to identify
similar
protein 3-dimensional structures. The "structure neighbors" for every structure
in
MMDB are pre-computed and accessible via links on the MMDB Structure Summary
pages.
These neighbors can be used to identify distant homologs that cannot be
recognized
by sequence comparison alone.</td>
</tr>
</table>
<a NAME="VASTSearch"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Entrez/">VAST Search</a> - a structure-structure
similarity search service. Compares 3D coordinates of a newly determined
protein
structure to those in the MMDB/PDB database. VAST Search computes a list of
structure neighbors that you may browse interactively, viewing superpositions
and
alignments by molecular graphics.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/cdd/wrpsb.cgi">CD-Search</a> -
The Conserved Domain Search Service (CD-Search) can be used to identify
the conserved domains present in a protein sequence. CD-Search
uses RPS-BLAST (described <a href="#RPS-BLAST">above</a>) to compare
a query sequence against position-specific score matrices that
have been prepared from conserved domain alignments present in
the Conserved Domain Database (CDD) (described <a href="#CDD">above</a>).
Hits can be displayed as a pairwise alignment of the query sequence
with a representative domain sequence, or as a multiple alignment.
Alignments are also mapped to known 3-dimensional structures,
and can be displayed using Cn3D (described <a href="#Cn3D">above</a>).
In the Cn3D display, residues in sequence alignments are variously colored,
based on their degree of conservation.</td>
</tr>
</table>
<a NAME="Threading"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Structure/RESEARCH/threading.shtml">Threading</a> -
As
part of NCBI's Computational Biology Branch (described <a
href="#CBB">above</a>),
the Structure group, led by Dr. Steve Bryant, conducts research in protein
threading. Protein threading predicts the three-dimensional structure of a
protein
sequence by threading it through known structures and calculating its energy.
The
experimental software developed by the NCBI Structure group is available on the
<a
href="ftp://ftp.ncbi.nih.gov/pub/pkb/">FTP</a> site. A <a
href="ftp://ftp.ncbi.nih.gov/pub/pkb/README">readme</a> file provides more
information as well as references.</td>
</tr>
</table>
<br>
<!-- ========CATEGORY WITHIN TOOLS: GENOME ANALYSIS======== -->
<a NAME="GenomeAnalysisTools"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Genome Analysis Tools</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=Genome">Entrez Genome</a> -
whole
genomes of over 1000 organisms. The genomes represent both completely sequenced
organisms and those for which sequencing is in progress. All three main domains
of
life - <a
HREF="/genomes/static/eub_g.html">bacteria</a>,
<a
HREF="/genomes/static/a_g.html">archaea</a>,
and <a
HREF="/genomes/static/euk_g.html">eukaryota</a>
- are represented, as well as many <a
HREF="/genomes/VIRUSES/viruses.html">viruses</a>,
<a
HREF="/genomes/static/phg.html">phages</a>,
<a
HREF="/genomes/static/vid.html">viroids</a>,
<a
HREF="/genomes/static/o.html">plasmids</a>,
and <a
HREF="/genomes/ORGANELLES/organelles.html">organelles</a>. Entrez Genome
provides
graphical overviews of complete genomes/chromosomes, and the ability to explore
regions of interest in progressively greater detail. <a
href="#ProtTaxTable">ProtTables and TaxTables</a> are provided for organisms on
which analyses have been done by NCBI staff.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/mapview/">Map Viewer</a> - shows integrated views of
chromosome maps for many organisms. Used to view the NCBI assembly of complete
genomes, including human, Map Viewer is a valuable tool for the identification
and
localization of genes, particularly those that contribute to diseases. <a
href="#MapViewer">Additional information</a> about Map Viewer is provided in the
<a
href="#Genomes">Genomes and Maps</a> section of this guide.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sky/">SKY/M-FISH &amp; CGH Database</a> - The NCI and
NCBI
SKY/M-FISH and CGH Database is a repository of publicly submitted data from
Spectral
Karyotyping (SKY), Multiplex Fluorescence In Situ Hybridization (M-FISH), and
Comparative Genomic Hybridization (CGH),
which are complementary fluorescent molecular cytogenetic techniques.
SKY/M-FISH permits the simultaneous visualization of each human
or mouse chromosome in a different color, facilitating the identification of
chromosomal aberrations; CGH can
be used to generate a map of DNA copy number changes in tumor genomes.
Collaborative
project with the National Cancer Institute. &nbsp;(<a
href="/sky/show_html_frag.cgi?filename=protocol.html_frag&header=Instructions+for+SKY+or+M-FISH+and+CGH+data+submission&title=Instructions+for+SKY+or+M-FISH+and+CGH+data+submission">data
submission instructions...</a>)</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/sutils/pasc/viridty.cgi?textpage=overview">PASC (PAirwise Sequence Comparison)</a> - a web tool for analysis of pairwise identity distribution within <b>viral families</b>. The identities are pre-computed for every pair of genomes within each family, and their distribution is plotted as a histogram in which each bar corresponds to an interval of identities. Only complete genomes should be used as query sequences; results from partial sequences are not suitable for the purpose of this tool. After you submit your sequence, PASC computes pairwise identities between the submitted genome and the existing genome sequences of the family. At the end of the process, you are presented with a list of the 15 closest matches to the genome within the family. The <a href="/sutils/pasc/viridty.cgi?textpage=documentation">documentation</a> provides more details about using PASC.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/retroviruses/">Retrovirus Resources</a> - A
collection of resources specifically designed to support research on retroviruses.
Resources include a genotyping tool that uses the BLAST algorithm to identify the genotype
of a query sequence; an alignment tool for global alignment of multiple sequences;
an HIV-1 automatic sequence annotation tool; and annotated maps of 16 retroviruses
viewable in GenBank, FASTA, and graphic formats, with links to associated sequence
records.</td>
</tr>
</table>
<p></p>
<!-- ========CATEGORY WITHIN TOOLS: GENE EXPRESSION======== -->
<a NAME="GeneExpressionTools"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Gene Expression Tools</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/geo/">Gene Expression Omnibus (GEO)</a> - provides
several tools to assist with the visualization and exploration of GEO data.
Datasets may be viewed as hierarchical cluster heat maps, providing insight into
the
relationships between samples and co-regulated genes. Individual gene
expression
profiles showing significant differences between experimental subsets may be
located
using average subset rank value comparisons. Related gene expression profiles
may
be identified on the basis of sequence similarity, profile similarity, or
homology.
Indicators of dataset normalization quality are provided as distribution graphs,
and
by flagging outliers. Links to other
NCBI sequence, mapping and publication database resources are provided where
possible. (<a href="#GEO">More information</a> about GEO is provided in the
Molecular Databases/Gene Expression section of this file.)<br></td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/SAGE/">SAGEmap</a> - SAGEmap provides a tool for
performing statistical tests designed specifically for differential-type
analyses of
SAGE (Serial Analysis of Gene Expression) data. The data include SAGE libraries
generated by individual labs as well as those generated by the Cancer Genome
Anatomy
Project (CGAP, described <a href="#CGAP">above</a>), which have been submitted
to
Gene Expression Omnibus (GEO, described <a href="#GEO">above</a>). Gene
expression
profiles that compare the expression in different SAGE libraries are available
on
the <a href="/entrez/query.fcgi?db=geo">Entrez GEO Profiles</a> pages. It is
also
possible to enter a query sequence in the <a href="/SAGE/">SAGEmap</a> resource
to
determine what SAGE tags are in the sequence, then map to associated SAGEtag
records
and view the expression of those tags in different CGAP SAGE libraries. (<a
href="#SAGEmap">More information</a> about SAGEmap is provided in the Molecular
Databases/Gene Expression section of this file.)
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/CGAP/">Cancer Genome Anatomy Project (CGAP)</a> - an
interdisciplinary program to identify the human genes expressed in different
cancerous states, based on cDNA (EST) libraries, and to determine the molecular
profiles of normal, precancerous, and malignant cells. CGAP is a collaboration
among the National Cancer Institute, the NCBI, and numerous research labs.
(Related
resources are listed under <a href="#CancerResearch">human genome/cancer
research</a>.) The following tools are provided by the National Cancer
Institute
(NCI) through their CGAP web page:
<ul>
<li><a href="http://cgap.nci.nih.gov/Tissues/LibrarySummarizer">Gene Library
Summarizer (GLS)</a> - Tool that finds all the genes in a specific cDNA library
or
group of libraries. <!-- Tool that searches for cDNA libraries by gene and
returns
library name and expression level. --></li>
<li><a href="http://cgap.nci.nih.gov/Tissues/xProfiler">cDNA Expression Profiler
(xProfiler)</a> - an online tool to compare computed gene expression profiles
between selected cDNA libraries. </li>
<li><a href="http://cgap.nci.nih.gov/Tissues/GXS">Differential Gene Expression
Displayer (DGED) </a> - distinguishes statistical differences in gene expression
between two pools of libraries. </li>
<!-- li><a href="/SAGE/">NCBI's Serial Analysis of Gene Expression Map </a> - An
online tool to compare computed gene expression profiles between selected SAGE
(Serial Analysis of Gene Expression) libraries. Also includes a comprehensive
analysis of SAGE tags in human GenBank records, in which a UniGene identifier is
assigned to each human sequence that contains a SAGE tag. (See additional
information about SAGEmap, <a href="#SAGEmap">below</a>.)</li -->
</ul>
</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/UniGene/info_ddd.shtml">UniGene DDD</a> - Digital
Differential Display - an online tool to compare computed gene expression
profiles
between selected cDNA libraries. Using a statistical test, genes whose
expression
levels differ significantly from one tissue to the next are identified and shown
to
the user. <a href="#UniGene">Additional information</a> about UniGene is in the
<a
href="#Genes">Molecular Databases/Genes</a> section.</td>
</tr>
</table>
<p></p>
<!-- =============================END_TOOLS=========================== -->
<!-- ========================= RESEARCH ========================= -->
<a NAME="Research"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" class="H3a">Research at NCBI</td>
<td WIDTH="13%" BGCOLOR="#6699CC" class="H4a"><a
href="/CBBresearch/">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<a NAME="CBB"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/CBBresearch/">Computational Biology Branch Home
Page</a>
- Overview of the research program in the Computational Biology Branch (CBB) of
NCBI
and a list of Senior Investigators. The research programs focus on theoretical,
analytical, and applied approaches to a broad range of fundamental problems in
molecular biology, including biomolecular structures, genome analysis, theory of
sequence analysis, hardware design, software and database design, and text
retrieval
and document analysis.</td>
</tr>
</table>
<a NAME="ResearchProjects"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://www.ncbi.nlm.nih.gov/Web/Research/proj.html">Research Projects</a>
-
List of research projects pertaining to biomolecular structures, genome
analysis,
theory of sequence analysis, hardware design, software and database design, and
text
retrieval and document analysis. Links are also provided to a staff
bibliography
and the full text of selected publications.</td>
</tr>
</table -->
<a NAME="SeniorInvestigatorsInPubMed"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/CBBresearch/senior.html">Senior Investigators in
PubMed</a> - publications written by senior investigators in the NCBI
Computational
Biology Branch and represented in the <a href="#PubMed">PubMed</a> database.
The
PubMed records include links to publisher web sites and/or full text articles
when
available.</td>
</tr>
</table>
<a NAME="SeminarSchedule"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/CBBresearch/Seminar/">Seminar Schedule</a> - Seminars
held at NCBI on a wide range of molecular biology and mathematical topics.
These
seminars are open to the NIH community and the general public, and are presented
by
NCBI staff as well as visiting scientists.</td>
</tr>
</table>
<a NAME="PostdoctoralFellows"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="Summary/postdoc.html">Postdoctoral Fellowships</a> -
general information, application procedure</td>
</tr>
</table>
<a NAME="StaffBibliography"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://www.ncbi.nlm.nih.gov/Web/Research/Bib/index.html">Staff
Bibliography</a> - A list of published and in-press papers and monographic
chapters
written by current and former NCBI staff. Includes links to corresponding
PubMed
records. Selected papers can also be seen on the full-text page, below.</td>
</tr>
</table -->
<a NAME="StaffFullText"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://www.ncbi.nlm.nih.gov/Web/Research/Papers/index.html">Full Text of
Selected Staff Publications</a> - full text of selected published and in-press
papers and monographic chapters written by current and former NCBI staff.</td>
</tr>
</table -->
<br>
<!-- ====================== END RESEARCH ========================= -->
<!-- ==================== SOFTWARE ENGINEERING ====================== -->
<a NAME="SoftwareEngineering"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" CLASS="H3a">Software Engineering</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a href="/IEB/">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<a NAME="IEB"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/IEB/">Information Engineering Branch Home Page</a> -
Overview of the functions of the Information Engineering Branch (IEB) of NCBI,
which
is responsible for designing and building NCBI's production software and
databases.</td>
</tr>
</table>
<a NAME="ToolBox"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/IEB/ToolBox/">NCBI ToolBox</a> - Supported software
tools
from IEB. Describes the three components of the ToolBox: data model, data
encoding,
and programming libraries. Provides access to documentation for the data model,
C
toolkit, C++ toolkit, NCBI Toolkit Source Browser, XML demo program, XML DTDs,
and
the <a href="ftp://ftp.ncbi.nih.gov/toolbox/">FTP site</a>. Additional
information
about the FTP site is provided <a href="#FTP_ToolBox">below</a>.</td>
</tr>
</table>
<a NAME="IEB_Research"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/IEB/Research/">R&amp;D Projects</a> - The IEB Research
and
Development Area is a place for IEB projects and datasets which may never become
fully supported NCBI resources. This includes early prototypes of software,
results
of early or one-off analyses, tools that are fully functional but not integrated
into
the main, public NCBI systems, or datasets that may have some value but do not
fit
well into the main NCBI pages.</td>
</tr>
</table>
<a NAME="ASN.1"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="Summary/asn1.html">ASN.1</a> - The software in the <a
href="/IEB/ToolBox/">NCBI ToolBox</a> is primarily designed to read Abstract
Syntax
Notation 1 (ASN.1) format records, an International Organization for Standardization (ISO)
data representation format. The readme files in the <a
href="ftp://ftp.ncbi.nih.gov/toolbox/">toolbox</a> and <a
href="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/">toolbox/ncbi_tools</a>
directories
of the FTP site contain more information about the toolbox and ASN.1. An <a
href="Summary/asn1.html">ASN.1 summary</a> is also available. The ToolBox can
produce data as either ASN.1, as before, or as XML (<a
href="/IEB/ToolBox/XML/ncbixml.txt">more about XML</a>). Additional information
about the ToolBox, documentation, and demo programs are available on the <a
href="/IEB/ToolBox/">NCBI ToolBox</a> page.</td>
</tr>
</table>
<a NAME="IEB_FTP"></a>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/toolbox/">FTP NCBI Software
ToolBox</a> - set of software and data exchange specifications used by NCBI to
produce portable, modular software for molecular biology. The software in the
Toolbox is primarily designed to read ASN.1 format records. It is available to
the
public in the toolbox directory of NCBI's ftp site, and can be used in its own
right
or as a foundation for building tools with similar properties. The readme files
in
the <a href="ftp://ftp.ncbi.nih.gov/toolbox/">toolbox</a> and <a
href="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/">toolbox/ncbi_tools</a>
directories
contain more information about the toolbox and ASN.1. An <a
href="Summary/asn1.html">ASN.1 summary</a> is also available. The ToolBox can
produce data as either ASN.1, as before, or as XML (<a
href="/IEB/ToolBox/XML/ncbixml.txt">more about XML</a>). Additional information
about the ToolBox, documentation, and demo programs are available on the <a
href="/IEB/ToolBox/">NCBI ToolBox home page</a>. </tr>
</table -->
<p></p>
<!-- ===================== END SOFTWARE ENGINEERING ================ -->
<!-- ================ EDUCATION ========================= -->
<p>
<a NAME="Education"></a>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" CLASS="H3a">Education</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/Education/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td CLASS="TEXT" WIDTH="95%" BGCOLOR="#FFFFFF">
<blockquote>
<a href="#News">News</a>, &nbsp;
<a href="#EducationBooks">Books</a>, &nbsp;
<a href="#Glossaries">Glossaries</a>, &nbsp;
<a href="#Tutorials">Tutorials</a>, &nbsp;
<a href="#Courses">Courses</a>, &nbsp;
<a href="#AdditionalResources">Additional Resources</a>
</blockquote>
</td>
<td CLASS="TEXT" WIDTH="5%" BGCOLOR="#FFFFFF">&nbsp;</td>
</tr>
</table>
<br>
<!-- ======== CATEGORY WITHIN EDUCATION: NEWS ======== -->
<a NAME="News"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">News - keeping up with
the
changes at NCBI</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<!-- table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/Education/BLASTinfo/milestones.html">Bioinformatics</a> -
A brief introduction to some milestones in bioinformatics.</td>
</tr>
</table -->
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=newsncbi">NCBI News</a> -
announcements about new resources, enhancements to existing resources, staff publications,
tutorials, FAQs.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/whatsnew.html">What's New</a> - recently
released
resources and enhancements to existing resources</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="Summary/email_lists.html">NCBI Announcements Email
Lists</a> -
Receive announcements about changes and updates to a variety of NCBI services.
In addition to a general NCBI-announce list, topic-specific e-mail lists are
available for BLAST, GenBank, dbSNP, Genomes, LinkOut, RefSeq, Sequin, and Entrez
Utilities (for making WWW Links to Entrez). Information on <a
href="Summary/email_lists.html">how to subscribe</a> is provided.
</td>
</tr>
</table>
<p></p>
<!-- ======== CATEGORY WITHIN EDUCATION: BOOKS ======== -->
<a NAME="EducationBooks"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Books</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?call=bv.View..ShowTOC&rid=coffeebrk.TOC&depth=1">Coffee
Break</a> - a collection of short reports on recent biological discoveries. Each
report incorporates interactive tutorials that show how bioinformatics tools are
used as a part of the research process.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?rid=gnd">Genes and Disease</a> - introduction to the
relationship between genetic factors and human disease. Summary information for
~60 genetic diseases with links to related databases and organizations.</td>
</tr>
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?call=bv.View..ShowTOC&rid=handbook.TOC&depth=2">NCBI
Handbook</a> - an online book, written by NCBI staff, that discusses
the many resources available at NCBI. Each chapter is
devoted to one service; after a brief overview on using
the resource, there is an account of how the resource works,
including topics such as how data are included in a database,
database design, query processing, and how the different
resources relate to each other.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/entrez/query.fcgi?db=books">Entrez Books</a> - In
collaboration with book publishers, the NCBI is adapting textbooks for the web
and
linking them to PubMed, the biomedical bibliographic database. The idea is to
provide background information to PubMed, so that users can explore unfamiliar
concepts found in PubMed search results.</td>
</tr>
</table>
<p></p>
<!-- ======== CATEGORY WITHIN EDUCATION: GLOSSARIES ======== -->
<a NAME="Glossaries"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Glossaries</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?call=bv.View..ShowSection&rid=handbook.glossary.1237">NCBI
Handbook Glossary</a> - part of the NCBI Handbook, described <a
href="#NCBIHandbook">above</a>. Includes a variety of terms pertaining to
biological data and bioinformatics.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/Class/FieldGuide/glossary.html">FieldGuide
Glossary</a> -
developed for the Field Guide course described <a
href="#Courses">below</a>.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/genome/glossary.htm">Genome Glossary</a> -
commonly used genome terms; includes links to associated literature for each
term.</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="/genome/guide/build.html#glossary">Human Genome
Build
Glossary</a> - accompanies the document that describes the <a
href="/genome/guide/build.html">NCBI Genomic Sequence Assembly and Annotation
Process</a>.</li></ul></td>
</tr>
<!-- tr>
<td CLASS="TEXT"><a href="/genome/guide/mouse/glossary.htm">Mouse Genome Build
Glossary</a> - accompanies the <a href="/genome/guide/mouse/mm_build.html">NCBI
Mouse Contig Assembly and Annotation Process</a>.</td>
</tr -->
<tr>
<td CLASS="TEXT"><a href="http://www.genome.gov/page.cfm?pageID=10002096">NHGRI
Talking Glossary of Genetic Terms</a> - by the National Human Genome Research
Institute (NHGRI).</td>
</tr>
</table>
<p></p>
<!-- ======== CATEGORY WITHIN EDUCATION: TUTORIALS ======== -->
<a NAME="Tutorials"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Tutorials</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/bookshelf/br.fcgi?book=comgen&part=blast">BLAST QuickStart: Example-Driven Web-Based
BLAST Tutorial</a> - a tutorial based on the former NCBI minicourse, "BLAST Quick Start", within the
Comparative Genomics online book.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/bookshelf/br.fcgi?book=comgen&part=psibl">PSI-BLAST Tutorial</a> - a chapter
within the Comparative Genomics online book.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/bookshelf/br.fcgi?book=comgen&part=gene">Identification of Disease Genes:
Example-Driven Web-Based Tutorial</a> - a tutorial based on the former NCBI minicourse, "Identification of
Disease Genes", within the Comparative Genomics online book.</td>
</tr>
</table>
<a NAME="SciencePrimer"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/About/primer/index.html">Science Primer</a> - The
science
behind our resources. An introduction for researchers, educators and the public.
Provides plain-language introductions to bioinformatics, genome mapping,
molecular
modeling, SNPs, ESTs, microarray technology, molecular genetics,
pharmacogenomics,
and phylogenetics.</td>
</tr>
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="http://www.nlm.nih.gov/bsd/pubmed_tutorial/m1001.html">PubMed Tutorial</a>
-
comprehensive instruction on using PubMed's various features <!--and<br> <a
href="/Literature/pubmed_search.html">PubMed Tour</a> (brief introduction
illustrating how to do a simple search) --></td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/Entrez/tutor.html">Entrez Tutorial</a> - shows users
how
to make use of the full power of the Entrez data retrieval system. Using a human
gene as an example, it demonstrates the variety of information that can be
gathered
for a single gene across a number of Entrez databases.</td>
</tr>
<!-- tr>
<td CLASS="TEXT"><a href="/Database/tut1.html">Entrez Nucleotides
Tutorial</a></td>
</tr -->
<!-- tr>
<td CLASS="TEXT"><a href="/Literature/omim_search.html">OMIM Tutorial</a></td>
</tr -->
<tr>
<td CLASS="TEXT"><a href="/BLAST/tutorial/Altschul-1.html">BLAST
Statistics</a></td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/Structure/CN3D/cn3dtut.shtml">3-D Protein Structure
Tutorial: Cn3D structure viewing program</a><SPACER TYPE=vertical
SIZE="5"></td>
</tr>
<tr>
<td CLASS="TEXT"><a href="/books/bv.fcgi?rid=handbook.chapter.ch24">Map Viewer
Exercises</a> - a chapter within the <a
href="/books/bv.fcgi?call=bv.View..ShowSection&rid=handbook">NCBI Handbook</a>
(described <a href="#NCBIHandbook">above</a>).<SPACER TYPE=vertical
SIZE="5"></td>
</tr>
</table>
<a NAME="CoffeeBreak"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a
href="/books/bv.fcgi?call=bv.View..ShowSection&rid=coffeebrk">Coffee
Break</a> - a collection of short reports on recent biological discoveries. Each
report incorporates interactive tutorials that show how bioinformatics tools are
used as a part of the research process.</td>
</tr>
</table>
<p></p>
<!-- ======== CATEGORY WITHIN EDUCATION: COURSES ======== -->
<a NAME="Courses"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Courses</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="/Education/index.html">Education</a> - for information on past NCBI courses, please see the
Education home page.
</td>
</tr>
<tr>
<td CLASS="TEXT"><FONT color="#336699">Getting Started with LinkOut</FONT> -
LinkOut is a feature of PubMed that provides users with links from PubMed
and other Entrez databases to a wide variety of relevant web-accessible
online resources, including full-text publications, biological databases,
consumer health information, research tools, and more. The goal is to
facilitate access to relevant online resources beyond the Entrez system to
extend, clarify, or supplement information found in the Entrez databases.
This hands-on class is designed to introduce students to LinkOut and provide
step-by-step instruction on activating LinkOut for print and electronic
journal collections, allowing users to see their own library's holdings and
access electronic full text through the PubMed interface. Topics covered are
registration for LinkOut, entering holdings, displaying a library's icon for
"branding" purposes, and access to free full text through LinkOut.
Getting Started with LinkOut is a free class that carries 4 MLA continuing
education credits. For more information and to register, visit the NLM's National
Training Center and Clearinghouse (NTCC) website:
<a href="http://nnlm.gov/mar/online/">http://nnlm.gov/mar/online/</a>.
Questions about the class can be sent to <a
href="mailto:lib-linkout@ncbi.nlm.nih.gov">lib-linkout@ncbi.nlm.nih.gov</a>.</td>
</tr>
<!-- tr>
<td CLASS="TEXT"><a href="http://bimas.cit.nih.gov/linkage/index.html">Genetic
Linkage Analysis</a></td>
</tr -->
<!-- tr>
<td CLASS="TEXT"><a href="http://www.nhgri.nih.gov/COURSE2000/">NHGRI Current
Topics
in Genome Analysis</a> - <a href="http://www.nhgri.nih.gov/COURSE2002/">Fall
2002</a> (handouts available on web)</td>
</tr -->
<!-- tr>
<td CLASS="TEXT">Smithsonian Institution and NHGRI Campus on the Mall (Winter
2001)
presents: "<a href="http://www.nhgri.nih.gov/CONF/SI/">The Human Genome Project:
From Maps to Medicine</a>"</td>
</tr -->
</table>
<p></p>
<!-- ========CATEGORY WITHIN EDUCATION: ADDITIONAL RESOURCES======== -->
<a NAME="AdditionalResources"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Additional Resources</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="http://cancer.gov/cancerinformation/">Cancer
Information</a> - a wide range of accurate, credible cancer information brought
to
you by the National Cancer Institute (NCI). CancerNet information is reviewed
regularly by oncology experts and is based on the latest research. It includes
information selected and organized for patients, health professionals, and basic
researchers.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="http://www.genome.gov/Research/">Human Genome
Project</a>
- an international research effort to characterize the genomes of humans and
selected
model organisms through complete mapping and sequencing of their DNA; to develop
technologies for genomic analysis; to examine the ethical, legal, and social
implications of human genetics research; and to train scientists who will be
able to
utilize the tools and resources developed through the HGP to pursue biological
studies that will improve human health. This link leads to the information
provided
on the <a href="http://www.genome.gov/">National Human Genome Research Institute
(NHGRI)</a> web site.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="http://www.genome.gov/Education/">NHGRI Educational
Resources</a> - the National Human Genome Research Institute (NHGRI) provides a
range of educational resources, including glossaries, fact sheets, multimedia
educational kits, genetic education modules for use by teachers, and a variety
of
online materials.</td>
</tr>
<!-- tr>
<td CLASS="TEXT"><a href="http://www.genome.gov/Careers/">NHGRI Careers and
Training</a> - describes education, training, and professional development
programs
at the National Human Genome Research Institute (NHGRI).</td>
</tr -->
<tr>
<td CLASS="TEXT"><a href="http://science-education.nih.gov/homepage.nsf">NIH
Office
of Science Education</a> - offers a wide variety of educational resources for students at various grade levels, teachers, and the general public. Resources cover a wide range of topics, including Genetics, and formats of educational materials range from lesson plans and curricula to multimedia, online materials, and more. Website also includes a section on career exploration.</td>
</tr>
<!-- tr>
<td CLASS="TEXT"><ul><li><a
href="http://science-education.nih.gov/Homepage.nsf/for+teachers?OpenForm">For
Teachers</a> - see <a
href="http://science-education.nih.gov/Homepage.nsf/menu?openform&ParentUNID=B126D47047197EE68525658D00081276">educational
resources</a>/<a
href="http://science-education.nih.gov/Homepage.nsf/menu?openform&ParentUNID=82D60BF825A3A55E852566F80050B84F">genetics
learning tools</a></li></ul></td>
</tr -->
<!-- tr>
<td CLASS="TEXT"><ul><li><a
href="http://science-education.nih.gov/Homepage.nsf/for+students?OpenForm">For
Students</a> - see <a
href="http://science-education.nih.gov/Homepage.nsf/menu?openform&ParentUNID=883668DEFFC4A338852565F60073195E">learn
about science and medicine</a>/<a
href="http://science-education.nih.gov/Homepage.nsf/menu?openform&ParentUNID=C74F69FEE01D4222852566F2006ED1DA">genetics</a></li></ul></td>
</tr -->
<!-- tr>
<td CLASS="TEXT"><ul><li><a
href="http://science-education.nih.gov/Homepage.nsf/for+the+public?OpenForm">For
the Public</a> - see also <a href="http://health.nih.gov/">NIH Health
Information Index</a></li></ul></td>
</tr -->
</table>
<p></p>
<!-- =========================END_EDUCATION======================== -->
<!-- =============================FTP_SITE========================= -->
<a NAME="FTPSite"></a>
<p>
<table BORDER="0" CELLSPACING="0" CELLPADDING="3" WIDTH="98%">
<tr>
<td WIDTH="83%" BGCOLOR="#6699CC" CLASS="H3a">FTP Site</td>
<td WIDTH="13%" BGCOLOR="#6699CC" CLASS="H4a"><a
href="/Ftp/index.html">Overview</a></td>
<td WIDTH="3%" BGCOLOR="#6699CC" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup_white.gif" border="0" width="14" height="14"
ALT="back to top"></a></td>
</tr>
</table>
</p>
<!-- ========CATEGORY WITHIN FTP_SITE: DATABASES======== -->
<a NAME="FTPDatabases"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Download Databases</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/blast/db/">BLAST databases</a>
- a
collection of databases formatted for use with the BLAST software. A <a
href="ftp://ftp.ncbi.nih.gov/blast/db/README">readme</a> file provides database
descriptions.</td>
</tr>
</table>
<a NAME="FTP_GenBank"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">GenBank and Daily Updates</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/genbank/">GenBank flat
file format</a> - see the <a href="samplerecord.html">sample GenBank record</a>
and the detailed description in the <a
href="ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt">GenBank release notes</a>;
download the most recent <b>full release</b> (described <a
href="#Overview">above</a>) and <b>daily cumulative or non-cumulative update</b>
files.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/ncbi-asn1/">ASN.1
format</a> - Abstract Syntax Notation 1, a data representation format
standardized by the International Organization for Standardization (ISO);
download the most recent full release (described <a
href="#Overview">above</a>) and daily cumulative or non-cumulative update files
(<a href="Summary/asn1.html">more on ASN.1</a>).</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/blast/db/FASTA/">FASTA
format</a> - definition line followed by sequence data only (<a
href="/BLAST/fasta.html">example</a>). The FASTA formatted data are available
in
the BLAST databases directory of the FTP site. A <a
href="ftp://ftp.ncbi.nih.gov/blast/db/README">readme</a> file in that directory
provides descriptions of the available data sets, such as <b>nt.Z</b> (daily
updated
non-redundant BLAST nucleotide database, contains GenBank+EMBL+DDBJ+PDB
sequences,
but no EST, STS, GSS, or HTGS sequences), <b>nr.Z</b> (daily updated
non-redundant
proteins), <b>est.Z</b>, <b>gss.Z</b>, <b>htg.Z</b>, <b>sts.Z</b>, and
others.</li></ul></td>
</tr>
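<tr>
<td CLASS="TEXT">The FASTA layout described above (a definition line followed
by sequence data) can be read with a few lines of code. The following is a
minimal sketch in Python, not an NCBI tool: the function name
<code>read_fasta</code> and the sample records are hypothetical, and the sketch
assumes that each definition line starts with the "&gt;" character and that the
file has already been uncompressed.
<pre>
```python
def read_fasta(lines):
    """Yield (definition_line, sequence) pairs from an iterable of text lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new definition line closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; collect them all.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    # Hypothetical records, used only to illustrate the layout.
    sample = [">seq1 illustrative record", "ACGTAC", "GTTA", ">seq2", "TTGA"]
    for definition, sequence in read_fasta(sample):
        print(definition, len(sequence))
```
</pre>
Because the function is a generator, it can be pointed at an open file handle
and will process a large data set one record at a time.</td>
</tr>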
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/refseq/">RefSeq</a> - NCBI
database
of Reference Sequences. Curated, non-redundant set including genomic DNA
contigs,
mRNAs and proteins for known genes, mRNAs and proteins for gene models, and
entire
chromosomes. Accession numbers consist of two letters, an underscore, and six
digits, for example: &nbsp;NT_123456, NM_123456, NP_123456, NC_123456,
NG_123456, XM_123456, XR_123456, XP_123456 (more info about <a
href="/RefSeq/key.html#accession">accession numbers</a> and <a
href="/RefSeq/RSfaq.html#access">access</a>).</td>
</tr>
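<tr>
<td CLASS="TEXT">The accession format described above can be checked
mechanically. The following Python sketch is an illustration only: the names
<code>ACCESSION_RE</code> and <code>parse_accession</code> are hypothetical, the
optional ".version" suffix is an assumption, and the pattern encodes only the
two-letter, underscore, six-digit format stated in this guide, so accessions
issued in any other layout would not match.
<pre>
```python
import re

# Two uppercase letters, an underscore, and six digits, as described in this
# guide. Assumption: an optional ".N" version suffix is also accepted.
ACCESSION_RE = re.compile(r"^([A-Z]{2})_(\d{6})(?:\.(\d+))?$")


def parse_accession(text):
    """Return (prefix, number, version) or None if the text does not match."""
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        return None
    prefix, number, version = match.groups()
    return (prefix, number, int(version) if version else None)


if __name__ == "__main__":
    for candidate in ["NM_123456", "NP_123456.2", "nm_123456", "NT_12345"]:
        print(candidate, parse_accession(candidate))
```
</pre>
The prefix returned by the function (for example NM or NP) can then be used to
sort records by molecule type.</td>
</tr>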
<tr>
<td CLASS="TEXT"><a
href="ftp://ftp.ncbi.nlm.nih.gov/gene/">Entrez Gene</a> -
a collection of files from the Entrez Gene database, which is <a
href="#EntrezGene">described</a> in the Molecular Databases/Genes section of this
guide.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/snp/">dbSNP</a> - database of
single nucleotide polymorphisms, small-scale insertions/deletions, polymorphic
repetitive elements, and microsatellite variation.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/pub/taxonomy/">Taxonomy</a> -
data from the NCBI Taxonomy database (described <a href="#Taxonomy">above</a>).
Includes a UNIX compressed tar file called "taxdump.tar.Z" that is updated daily
and contains a dump of the taxonomy information from Sybase. Note that the *.dmp
files are not intended to be human-readable, but they can be loaded into Sybase
with the BCP (bulk copy) utility. When you uncompress and untar the file, you
will see several files, including a Readme file that contains more
information.</td>
</tr>
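<tr>
<td CLASS="TEXT">Although the *.dmp files are meant for bulk loading, they can
also be read directly by a short script. The following Python sketch is an
illustration only: the function name <code>split_dmp_row</code> is hypothetical,
and it assumes the delimiters documented in the Readme file distributed with
the dump (fields separated by tab, vertical bar, tab; rows ended by tab,
vertical bar, newline). Check that Readme file before relying on the column
order of any particular file.
<pre>
```python
def split_dmp_row(line):
    """Split one row of a *.dmp file into a list of field strings."""
    body = line.rstrip("\n")
    # Assumed row terminator: a tab followed by a vertical bar.
    if body.endswith("\t|"):
        body = body[:-2]
    # Assumed field separator: tab, vertical bar, tab.
    return body.split("\t|\t")


if __name__ == "__main__":
    # Illustrative row in the assumed layout; empty fields are preserved.
    row = "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"
    print(split_dmp_row(row))
```
</pre>
Applying the function to each line of an uncompressed *.dmp file yields one
list of strings per row, which can then be written to any database or
spreadsheet format.</td>
</tr>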
</table>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/repository/">Repository of
databases</a> - This FTP directory contains a mix of NCBI databases (e.g.,
UniGene,
GeneMap, dbEST, dbGSS, dbSTS, OMIM) and a number of externally developed
databases
(e.g., EPD, TFD). The external databases are made available on the FTP site as
a
service to the scientific community. They are contributed by outside scientists
and
maintained independently of NCBI. All the files in the FTP directory of a
non-NCBI
database are placed there and maintained by the developers of that database.
Questions about non-NCBI databases should be directed to the contacts listed in
the
readme or other background files for the individual databases. Note that
additional
NCBI databases are also found in the <a href="ftp://ftp.ncbi.nih.gov/">root
directory</a> of the FTP site (under the database name, such as GenBank, Gene,
RefSeq), or in the <a href="ftp://ftp.ncbi.nih.gov/pub/">"pub" directory</a>
(usually under the name of the primary resource developer).</td>
</tr>
</table>
<p></p>
<!-- ======== CATEGORY WITHIN FTP_SITE: GENOMES ======== -->
<a NAME="FTP_Genomes"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Download Genomes</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<p></p>
<!-- ======== HUMAN_GENOME_PROJECT_FTP_DATA ======== -->
<a NAME="FTP_HumanGenome"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/"><b>Human
Genome Project Data</b></a> - the <a
href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/">ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/</a>
directory contains one folder per chromosome, each of which includes genomic
contigs (NT_* records) built from finished and unfinished sequence data. The
contigs are available in various formats, described below. The <a
href="/genome/guide/build.html">contig assembly and annotation process</a> is
described in a separate document.
<table border="0" cellpadding="0">
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.asn</td>
<td CLASS="TEXT" width="79%" valign="top">ASN.1 format (description <a
href="#ASN1">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.fa.gz</td>
<td CLASS="TEXT" width="79%" valign="top">FASTA format (description <a
href="#FASTA">above</a>)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.gbk.gz</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank flat file format<br>
(annotations currently include STS markers; known and
predicted genes will be added in coming months)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.gbs</td>
<td CLASS="TEXT" width="79%" valign="top">GenBank summary format<br>
(this format does not contain sequence data, but instead
contains a "CONTIG" field, showing how the contig is assembled
from individual GenBank accessions)</td>
</tr>
<tr>
<td CLASS="TEXT" width="21%" valign="top">hs_chr*.mfa.gz</td>
<td CLASS="TEXT" width="79%" valign="top">masked FASTA format (masked
nucleotides
are lower case)</td>
</tr>
</table>
Data from the Map Viewer (described <a
href="#HumanChromosomeMapViews">above</a>)
are available in the <a
href="ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/maps/mapview/">ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/maps/mapview/</a>
subdirectory.
</td>
</tr>
</table>
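<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">In the masked FASTA files described above, masked nucleotides
appear in lower case, so the masked share of a sequence can be measured by
counting letter case. The following Python sketch is an illustration only: the
function name <code>masked_fraction</code> is hypothetical, and it assumes the
sequence has already been read into a string without its definition line.
<pre>
```python
def masked_fraction(sequence):
    """Return the share of lowercase (masked) letters among all letters."""
    letters = [ch for ch in sequence if ch.isalpha()]
    if not letters:
        return 0.0  # avoid dividing by zero for an empty sequence
    masked = sum(1 for ch in letters if ch.islower())
    return masked / len(letters)


if __name__ == "__main__":
    # Hypothetical sequence: four unmasked and four masked nucleotides.
    print(masked_fraction("ACGTacgt"))
```
</pre>
Non-letter characters such as line breaks are ignored, so the function can be
applied to sequence text that still contains its original line wrapping.</td>
</tr>
</table>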
<!-- ======== END_HUMAN_GENOME_PROJECT_FTP_DATA ======== -->
<!-- ======== OTHER_GENOMES_FTP_DATA ======== -->
<a NAME="FTP_OtherGenomes"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT">
<FONT CLASS="H3">Other Genomes</FONT> - genomes of other organisms, such as
bacteria, nematode, and mouse, can be downloaded from one of two directories:
<ul>
<li><a
href="ftp://ftp.ncbi.nih.gov/genomes/">ftp://ftp.ncbi.nih.gov/genomes/</a> -
designed to contain genomes assembled and/or curated at NCBI. These genomes are
part of the <b>RefSeq</b> database, described <a href="#RefSeq">above</a>.</li>
<li><a
href="ftp://ftp.ncbi.nih.gov/genbank/genomes/">ftp://ftp.ncbi.nih.gov/genbank/genomes/</a>
- designed to contain complete genomes submitted in their entirety to
<b>GenBank</b>.</li>
</ul>
<blockquote><i>Note:</i> In some cases, an organism might be listed in both
directories. This can happen for either of two reasons: (1) two versions of the
genome are available, one in GenBank and one in RefSeq; or (2) the organism's
data were assembled at NCBI and were available from the "/genbank/genomes/"
directory before the new "/genomes/" directory was set up. In the latter case,
the data now exist in the new "/genomes/" directory, but a symbolic link was
preserved in the original directory to facilitate user access.</blockquote>
</td>
</tr>
</table>
<!-- ======== END_OTHER_GENOMES_FTP_DATA ======== -->
<!-- ================CATEGORY WITHIN FTP_SITE: SOFTWARE================ -->
<a NAME="FTPSoftware"></a>
<table BORDER="0" WIDTH="91%" CELLSPACING="0" BGCOLOR="#e0eeee">
<tr>
<td WIDTH="96%" BGCOLOR="#e0eeee" class="H3">Download Software</td>
<td WIDTH="4%" BGCOLOR="#e0eeee" VALIGN="top" ALIGN="center">
<a href="#Top"><img SRC="arrowup.gif" border="0" width="14" height="14"
ALT="back to
top"></a></td>
</tr>
</table>
<a NAME="FTP_BLAST"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="H3">BLAST Programs</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST-BLAST/">BLAST
Stand-Alone Program</a> - a set of executables that are <b>run from the command
line</b>. Binaries are provided for IRIX 6.2, Solaris 2.6, DEC OSF1 (ver. 4.0d),
Linux, and Win32 systems. Please read the <a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/documents/blast.txt">README</a> for more
information. Additional information on setting up stand-alone BLAST is
available at the NHGRI site at <a
href="http://genome.nhgri.nih.gov/blastall/blast_install/">http://genome.nhgri.nih.gov/blastall/blast_install/</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST-WWWBLAST">BLAST Web
Server Program</a> - allows you to set up your own <b>in-house version of the
NCBI BLAST web pages</b> on a UNIX web server. You can set up the program to
search your own custom databases or downloaded copies of the NCBI databases.
This server is not intended to handle the heavy loads that may occur in public
service settings. A <a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST-WWWBLAST/readme.html">Readme</a>
file provides more information.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST-NETBLAST/">Network
BLAST</a> - a TCP/IP-based client-server version of WWW Gapped BLAST (2.0).
Makes a
direct connection with the NCBI databases over the Internet to retrieve data.
Client
software is available for PC, Mac, and Unix. For general information about
Gapped
BLAST, see <a href="#BLAST">above</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><blockquote><b>NOTE:</b> <i><a
href="ftp://ftp.ncbi.nih.gov/blast/db/">Preformatted BLAST databases</a> are also
available for downloading, in addition to the software listed above. A <a
href="ftp://ftp.ncbi.nih.gov/blast/db/README">readme</a> file provides database
descriptions.</i></blockquote></td>
</tr>
<tr>
<td CLASS="H3">Client/server programs</td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/sequin/">Sequin</a> -
submission software for single or multiple submissions, long sequences, complete
genomes, alignments, and population, phylogenetic, or mutation studies. Can be
used as a stand-alone application or in a TCP/IP-based "network aware" mode,
with links to other NCBI resources and software such as <a
href="#Entrez">Entrez</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a href="ftp://ftp.ncbi.nih.gov/entrez/">Network
Entrez</a>
- a TCP/IP-based client-server version of WWW Entrez. Makes a direct connection
with
the NCBI databases over the Internet to retrieve data. Client software is
available
for PC, Mac, and Unix. For general information about Entrez, see <a
href="#Entrez">above</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><ul><li><a
href="ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST-NETBLAST/">Network
BLAST</a> - see description <a href="#FTP_BLAST">above</a>.</li></ul></td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/cn3d/">Cn3D</a> - "See in 3-D,"
a
structure and sequence alignment viewer for NCBI databases. It allows viewing
of
3-D structures and sequence-structure or structure-structure alignments. Cn3D
can
work as a helper application to your browser, or as a client-server application
that
retrieves structure records from MMDB (described <a href="#MMDB">above</a>)
directly
over the internet. The <a href="/Structure/CN3D/cn3d.shtml">Cn3D home page</a>
provides access to information on how to <a
href="/Structure/CN3D/cn3dinstall.shtml">install</a> the program, a <a
href="/Structure/CN3D/cn3dtut.shtml">tutorial</a> to get started, and a
comprehensive <a href="/Structure/CN3D/cn3dhelp.shtml">help document</a>.</td>
</tr>
</table>
<a NAME="FTP_ToolBox"></a>
<table BORDER="0" CELLSPACING="5" WIDTH="90%" BGCOLOR="#FFFFFF">
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/toolbox/">NCBI Software
ToolBox</a>
- set of software and data exchange specifications used by NCBI to produce
portable,
modular software for molecular biology. The software in the ToolBox is primarily
designed to read records in Abstract Syntax Notation 1 (ASN.1), a data
representation format standardized by the International Organization for
Standardization (ISO). The software is available
to the public in the toolbox/ncbi_tools directory of NCBI's ftp site, and can be
used in its own right or as a foundation for building tools with similar
properties.
The readme files in the <a href="ftp://ftp.ncbi.nih.gov/toolbox/">toolbox</a>
and
<a href="ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/">toolbox/ncbi_tools</a>
directories of the FTP site contain more information about the toolbox and
ASN.1.
An <a href="Summary/asn1.html">ASN.1 summary</a> is also available. The ToolBox
can output data either as ASN.1 or as XML (<a
href="/IEB/ToolBox/XML/ncbixml.txt">more about XML</a>). Documentation, demo
programs, and further information about the ToolBox are available on the <a
href="/IEB/ToolBox/">NCBI ToolBox</a> page. Additional information about the
Information Engineering Branch (IEB) of NCBI, which develops the ToolBox, is
provided <a href="#SoftwareEngineering">above</a>, along with other items of
interest to software developers.</td>
</tr>
<tr>
<td CLASS="TEXT"><a href="ftp://ftp.ncbi.nih.gov/pub/">Software programs
developed
as personal projects by various NCBI scientists</a> - /pub directory of FTP site
contains programs such as MACAW (multiple sequence alignments) and e-PCR
(description <a href="#ePCR">above</a>).</td>
</tr>
</table>
<p></p>
<!-- ===============================END_FTP_SITE===================== -->
<!-- ============PLACE_EXTRA_TITLE_BARS_ABOVE_HERE================ -->
<!-- ===================END_OF_CONTENT============================ -->
<table BORDER=0 CELLSPACING=0 CELLPADDING=3 WIDTH="100%" BGCOLOR="#003366" >
<tr ALIGN=CENTER>
<td WIDTH="20%"><a href="mailto:info@ncbi.nlm.nih.gov" class="BAR">Help
Desk</a></td>
<td WIDTH="20%"><a href="http://www.ncbi.nlm.nih.gov" class="BAR">NCBI</a></td>
<td WIDTH="20%"><a href="http://www.nlm.nih.gov" class="BAR">NLM</a></td>
<td WIDTH="20%"><a href="http://www.nih.gov" class="BAR">NIH</a></td>
<td WIDTH="20%"><a href="credits.html" class="BAR">Credits</a></td>
</tr>
</table>
<table BORDER=0 CELLSPACING=0 CELLPADDING=3 WIDTH="100%" BGCOLOR="#FFFFFF" >
<tr>
<td CLASS="TEXT2">
<p><i><FONT size="2">Revised: February 3, 2009.</FONT></i><br>
<FONT size="2"><i>Questions about NCBI resources to</i> &nbsp;<a
href="mailto:info@ncbi.nlm.nih.gov">info@ncbi.nlm.nih.gov</a></FONT><br>
<p><FONT size="2"><a href="/About/disclaimer.html">Disclaimer</a></FONT>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<FONT size="2"><a
href="http://www.nlm.nih.gov/privacy.html">Privacy statement</a></FONT></p>
</td>
</tr>
</table>
</body>
</html>