Entrez Help Document

Last modified: July 18, 2000

Summary Matrices

This document provides the following summary tables for the Entrez Nucleotide, Protein, Genome, Structure, and PopSet data domains:

Limits Available by Database
Search Fields Available by Database
Search Field Descriptions and Qualifiers
Display Formats

The PubMed help document contains separate information about the Limits, Search Fields, and Display Formats available for that database.

Back to the Entrez Help Document

Limits Available by Database

                       Databases
Limits                 Nucleotide  Protein  Genome  Structure  PopSet
Search Fields          Yes         Yes      Yes     Yes        Yes
Exclude ESTs           Yes         No       No      No         No
Exclude STSs           Yes         No       No      No         No
Exclude GSSs           Yes         No       No      No         No
Exclude Working Draft  Yes         No       No      No         No
Exclude Patents        Yes         Yes      No      No         No
Molecule Type          Yes         No       No      No         No
Gene Location          Yes         Yes      No      No         No
Segmented Sequences    Yes         Yes      No      No         No
Database Source        Yes         Yes      No      No         No
Modification Date      Yes         Yes      No      No         No


Search Fields Available by Database

                       Databases
Search Field           Nucleotide  Protein  Genome  Structure  PopSet
Accession              Yes         Yes      Yes     Yes        Yes
All Fields             Yes         Yes      Yes     Yes        Yes
Author Name            Yes         Yes      Yes     Yes        Yes
EC/RN Number           Yes         Yes      Yes     Yes        Yes
Feature Key            Yes         No       Yes     No         Yes
Filter                 Yes         Yes      Yes     Yes        Yes
Gene Name              Yes         Yes      Yes     No         Yes
Issue                  Yes         Yes      Yes     Yes        Yes
Journal Name           Yes         Yes      Yes     Yes        Yes
Keyword                Yes         Yes      Yes     No         Yes
Modification Date      Yes         Yes      Yes     Yes        Yes
Molecular Weight       No          Yes      No      No         No
Organism               Yes         Yes      Yes     Yes        Yes
Page Number            Yes         Yes      Yes     Yes        Yes
Primary Accession      Yes         Yes      Yes     No         Yes
Properties             Yes         Yes      Yes     No         Yes
Protein Name           Yes         Yes      Yes     No         Yes
Publication Date       Yes         Yes      Yes     Yes        Yes
SeqID String           Yes         Yes      Yes     No         Yes
Sequence Length        Yes         Yes      Yes     No         No
Substance Name         Yes         Yes      No      Yes        No
Text Word              Yes         Yes      Yes     Yes        Yes
Title Word             Yes         Yes      Yes     No         No
Uid                    No          No       No      No         No
Volume                 Yes         Yes      Yes     Yes        Yes

The definition and qualifier for each field are given under Search Field Descriptions and Qualifiers.
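
The availability table above can also be read as a simple lookup structure. The following Python sketch transcribes the table into a dictionary and answers "which databases index this field?". The names DATABASES, FIELD_AVAILABILITY, databases_for, and is_available are hypothetical helpers introduced here for illustration; they are not part of Entrez.

```python
# Databases in the column order used by the table above.
DATABASES = ("Nucleotide", "Protein", "Genome", "Structure", "PopSet")

# One flag per database, in DATABASES order (Y = available, N = not available).
# Values are transcribed from the "Search Fields Available by Database" table.
FIELD_AVAILABILITY = {
    "Accession": "YYYYY",
    "All Fields": "YYYYY",
    "Author Name": "YYYYY",
    "EC/RN Number": "YYYYY",
    "Feature Key": "YNYNY",
    "Filter": "YYYYY",
    "Gene Name": "YYYNY",
    "Issue": "YYYYY",
    "Journal Name": "YYYYY",
    "Keyword": "YYYNY",
    "Modification Date": "YYYYY",
    "Molecular Weight": "NYNNN",
    "Organism": "YYYYY",
    "Page Number": "YYYYY",
    "Primary Accession": "YYYNY",
    "Properties": "YYYNY",
    "Protein Name": "YYYNY",
    "Publication Date": "YYYYY",
    "SeqID String": "YYYNY",
    "Sequence Length": "YYYNN",
    "Substance Name": "YYNYN",
    "Text Word": "YYYYY",
    "Title Word": "YYYNN",
    "Uid": "NNNNN",
    "Volume": "YYYYY",
}


def databases_for(field):
    """Return the databases in which the given search field is available."""
    flags = FIELD_AVAILABILITY[field]
    return [db for db, flag in zip(DATABASES, flags) if flag == "Y"]


def is_available(field, database):
    """Return True if the search field is available in the given database."""
    return database in databases_for(field)
```

For example, databases_for("Molecular Weight") lists only Protein, and is_available("Gene Name", "Structure") is False, matching the corresponding table rows.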


Search Field Descriptions and Qualifiers

Search Field [Qualifier]
    Definition

Accession [ACCN]
    Contains the unique accession number of the sequence or record, assigned to the nucleotide, protein, structure, genome record, or PopSet by a sequence database builder. The Structure database accession index contains the PDB IDs but not the MMDB IDs.

All Fields [ALL]
    Contains all terms from all searchable fields in the database.

Author Name [AUTH]
    Contains all authors from all references in the database records. The format is the last name, a space, and the first initial(s), without punctuation (e.g., marley jf).

EC/RN Number [ECNO]
    Contains the number assigned by the Enzyme Commission or the Chemical Abstracts Service (CAS) to designate a particular enzyme or chemical, respectively.

Feature Key [FKEY]
    Contains the biological features assigned or annotated to the nucleotide sequences and defined in the DDBJ/EMBL/GenBank Feature Table (http://www.ncbi.nlm.nih.gov/projects/collab/FT/index.html). Not available for the Protein or Structure databases.

Filter [FILT]
    Contains predetermined or filtered subsets of the various databases. These subsets or filters are created by grouping records that are commonly linked to other Entrez databases or within the same database.

    For example, the PopSet database Filter index includes PopSet all, PopSet medline, PopSet nucleotide, and PopSet protein. The PopSet medline filter includes all PopSet records with links to PubMed; the PopSet nucleotide filter includes all PopSet records with links to the Nucleotide database; and the PopSet protein filter includes all PopSet records with links to the Protein database. The PopSet all filter includes all PopSet records.

    The Nucleotide database Filter index contains many more filters because its records are linked to numerous external resources. For more information, see Link Out.

Gene Name [GENE]
    Contains the standard and common names of genes found in the database records. A Gene Name index is not available in the Structure database.

Issue [ISS]
    Contains the issue number of the journal in which the data were published.

Journal Name [JOUR]
    Contains the name of the journal in which the data were published. Journal names are indexed in the database in abbreviated form (e.g., J Biol Chem). Journals are also indexed by their ISSNs. Browse the index if you do not know the ISSN or are not sure how a particular journal name is abbreviated.

Keyword [KYWD]
    Contains special index terms from the controlled vocabularies associated with the GenBank, EMBL, DDBJ, SWISS-PROT, PIR, PRF, or PDB databases. Browse the Keyword indexes of the individual databases to become familiar with these vocabularies. A Keyword index is not available in the Structure database.

Modification Date [MDAT]
    Contains the date on which the most recent modification to the record was indexed in Entrez, in the format YYYY/MM/DD (e.g., 1999/08/05). A year alone (e.g., 1999) retrieves all records modified in that year; a year and month (e.g., 1999/03) retrieves all records modified in that month that are indexed in Entrez.

Molecular Weight [MOLWT]
    Contains the molecular weight of a protein, in daltons (Da), calculated by the method described in the Searching by Molecular Weight section of the Entrez help document. The molecular weight must be entered as a fixed six-digit field, padded with leading zeros (the digit 0, not the letter O), e.g., 002002 [MOLWT].

Organism [ORGN]
    Contains the scientific and common names for the organisms associated with protein and nucleotide sequences.

Page Number [PAGE]
    Contains the number of the first journal page of the article in which the data were published.

Primary Accession [PACC]
    Contains the primary accession number of the sequence or record, assigned to the nucleotide, protein, structure, genome record, or PopSet by a sequence database builder. A Primary Accession index is not available in the Structure database.

Properties [PROP]
    Contains properties of the nucleotide or protein sequence. For example, the Nucleotide database's Properties index includes molecule types, publication status, molecule locations, and GenBank divisions. A Properties index is not available in the Structure database.

Protein Name [PROT]
    Contains the standard names of proteins found in database records. Common names may not be indexed in this field, so it is best to also search All Fields or Text Word. A Protein Name index is not available in the Structure database.

Publication Date [PDAT]
    Contains the date on which records were released into Entrez, in the format YYYY/MM/DD (e.g., 1999/08/05). It is the date on which the entry first appeared in GenBank, as explicitly indexed in Entrez. A year alone (e.g., 1999) retrieves all records for that year; a year and month (e.g., 1999/03) retrieves all records released into GenBank in that month.

SeqID String [SQID]
    Contains the special string identifier, similar to a FASTA identifier, for a given sequence. A SeqID String index is not available in the Structure database.

Sequence Length [SLEN]
    Contains the total length of the sequence. Sequence Length indexes are not available in the Structure or PopSet databases.

Substance Name [SUBS]
    Contains the names of any chemicals associated with the record from the CAS registry and the MEDLINE Name of Substance field. Substance Name indexes are not available in the Genome or PopSet databases.

Text Word [WORD]
    Contains all of the "free text" associated with a record.

Title Word [TITL]
    Includes only those words found in the definition line of a record. The definition line summarizes the biology of the sequence and is carefully constructed by database staff. A standard definition line includes the organism, product name, gene symbol, molecule type, and whether it is a partial or complete cds. Title Word indexes are not available in the Structure or PopSet databases.

Uid [UID]
    Contains the MEDLINE unique identifier for records that contain published references that are linked to PubMed. The Uid index is not browsable.

Volume [VOL]
    Contains the volume number of the journal in which the data were published.
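
Several of the definitions above prescribe a term format: author names as last name plus initials without punctuation, molecular weight as a six-digit zero-padded value, and dates as YYYY, YYYY/MM, or YYYY/MM/DD. The Python sketch below shows one way to produce such terms. The function names are hypothetical and not part of Entrez. Writing the bracketed qualifier after the term, separated by a space, follows the 002002 [MOLWT] example and is assumed here for the other fields as well; lowercasing the author name simply mirrors the marley jf example.

```python
def author_term(last_name, initials):
    """Format an author as 'lastname initials [AUTH]', without punctuation."""
    # Drop periods, spaces, and any other non-letters from the initials.
    cleaned = "".join(ch for ch in initials if ch.isalpha())
    return f"{last_name.strip().lower()} {cleaned.lower()} [AUTH]"


def molwt_term(daltons):
    """Format a molecular weight as a fixed six-digit, zero-padded term."""
    if not 0 <= daltons <= 999999:
        raise ValueError("molecular weight must fit in six digits")
    return f"{daltons:06d} [MOLWT]"


def date_term(qualifier, year, month=None, day=None):
    """Format a date term as YYYY, YYYY/MM, or YYYY/MM/DD.

    qualifier is 'MDAT' (Modification Date) or 'PDAT' (Publication Date).
    """
    if qualifier not in ("MDAT", "PDAT"):
        raise ValueError("qualifier must be 'MDAT' or 'PDAT'")
    if day is not None and month is None:
        raise ValueError("a day requires a month")
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
    if day is not None:
        parts.append(f"{day:02d}")
    return "/".join(parts) + f" [{qualifier}]"
```

A year-only term such as date_term("MDAT", 1999) corresponds to the "year alone" case described under Modification Date, and date_term("PDAT", 1999, 3) to the year-and-month case under Publication Date.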


Display Formats

Display Format
    Description
    Databases Available
    Link

Summary
    Description: Default display, hotlinked Accession number and brief description
    Databases Available: All databases
    Link: None

Brief
    Description: Hotlinked Accession number and abbreviated description
    Databases Available: All databases
    Link: None

GenBank/GenPept
    Description: Full report format
    Databases Available: Nucleotide, Protein
    Link: None

ASN.1
    Description: Abstract Syntax Notation 1 form, the computer-readable form of the data
    Databases Available: All databases
    Link: None

FASTA
    Description: The definition line and sequence characters
    Databases Available: All databases
    Link: None

Nucleotide Neighbors
    Description: Retrieves all similar nucleotide sequences for all documents retrieved and displays them in the default format
    Databases Available: Nucleotide
    Link: Related Sequences

Protein Neighbors
    Description: Retrieves all similar protein sequences for all documents retrieved and displays them in the default format
    Databases Available: Protein
    Link: Related Sequences

Genome Neighbors
    Description: Retrieves all similar genome sequences for all documents retrieved and displays them in the default format
    Databases Available: Genome
    Link: Related Sequences

Structure Neighbors
    Description: Retrieves all similar structures for all documents retrieved and displays them in the default format
    Databases Available: Structure
    Link: Related Sequences

Provider Links
    Description: Retrieves all external links for all documents retrieved and displays them in the default format; see Link Out for more information
    Databases Available: All databases
    Link: LinkOut

PubMed Links
    Description: Retrieves all MEDLINE links for all documents retrieved and displays them in the default format
    Databases Available: All databases
    Link: PubMed

Nucleotide Links
    Description: Retrieves all Nucleotide links for all documents retrieved and displays them in the default format
    Databases Available: All databases, except Nucleotide
    Link: Nucleotide

Protein Links
    Description: Retrieves all Protein links for all documents retrieved and displays them in the default format
    Databases Available: All databases, except Protein
    Link: Protein

Genome Links
    Description: Retrieves all Genome links for all documents retrieved and displays them in the default format
    Databases Available: Nucleotide, Protein, and Structure
    Link: Genomes

Structure Links
    Description: Retrieves all Structure links for all documents retrieved and displays them in the default format
    Databases Available: Nucleotide, Protein, and Genome
    Link: Structure

PopSet Links
    Description: Retrieves all PopSet links for all documents retrieved and displays them in the default format
    Databases Available: All databases
    Link: PopSet

Graphic Summary
    Description: The graphical view of the sequence, accessible by selecting the hotlinked Accession numbers
    Databases Available: Nucleotide, Protein, and Genome
    Link: None

Structure Summary
    Description: The Structure Summary, accessible by selecting the hotlinked PDB numbers
    Databases Available: Structure
    Link: None

PopSet Summary
    Description: The complete set of Accession numbers comprising the PopSet, accessible by selecting the hotlinked PopSet Accession numbers
    Databases Available: PopSet
    Link: None
