Entrez Help Document: Summary Matrices

Last modified: July 18, 2000

Summary Matrices

This document provides the following summary tables for the Entrez Nucleotide, Protein, Genome, Structure, and Popset data domains:

Limits Available by Database
Search Fields Available by Database
Search Field Descriptions and Qualifiers
Display Formats

The PubMed help document contains separate information about the Limits, Search Fields, and Display Formats available for that database.

Back to the Entrez Help Document

Limits Available by Database
						Limits	Nucleotide	Protein	Genome	Structure	PopSet
						Search Fields	Yes	Yes	Yes	Yes	Yes
						Exclude ESTs	Yes	No	No	No	No
						Exclude STSs	Yes	No	No	No	No
						Exclude GSSs	Yes	No	No	No	No
						Exclude Working Draft	Yes	No	No	No	No
						Exclude Patents	Yes	Yes	No	No	No
						Molecule Type	Yes	No	No	No	No
						Gene Location	Yes	Yes	No	No	No
						Segmented Sequences	Yes	Yes	No	No	No
						Database Source	Yes	Yes	No	No	No
						Modification Date	Yes	Yes	No	No	No

Search Fields Available by Database
						Search Field	Nucleotide	Protein	Genome	Structure	PopSet
						Accession	Yes	Yes	Yes	Yes	Yes
						All Fields	Yes	Yes	Yes	Yes	Yes
						Author Name	Yes	Yes	Yes	Yes	Yes
						EC/RN Number	Yes	Yes	Yes	Yes	Yes
						Feature Key	Yes	No	Yes	No	Yes
						Filter	Yes	Yes	Yes	Yes	Yes
						Gene Name	Yes	Yes	Yes	No	Yes
						Issue	Yes	Yes	Yes	Yes	Yes
						Journal Name	Yes	Yes	Yes	Yes	Yes
						Keyword	Yes	Yes	Yes	No	Yes
						Modification Date	Yes	Yes	Yes	Yes	Yes
						Molecular Weight	No	Yes	No	No	No
						Organism	Yes	Yes	Yes	Yes	Yes
						Page Number	Yes	Yes	Yes	Yes	Yes
						Primary Accession	Yes	Yes	Yes	No	Yes
						Properties	Yes	Yes	Yes	No	Yes
						Protein Name	Yes	Yes	Yes	No	Yes
						Publication Date	Yes	Yes	Yes	Yes	Yes
						SeqID String	Yes	Yes	Yes	No	Yes
						Sequence Length	Yes	Yes	Yes	No	No
						Substance Name	Yes	Yes	No	Yes	No
						Text Word	Yes	Yes	Yes	Yes	Yes
						Title Word	Yes	Yes	Yes	No	No
						Uid	No	No	No	No	No
						Volume	Yes	Yes	Yes	Yes	Yes
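
The availability matrix above lends itself to a simple lookup. The following is a minimal sketch in Python, not part of Entrez itself: the names DATABASES, FIELD_AVAILABILITY, is_field_available, and databases_for are hypothetical, and only a handful of rows are transcribed from the table for illustration.

```python
# Column order of the table above.
DATABASES = ("Nucleotide", "Protein", "Genome", "Structure", "PopSet")

# A few rows transcribed from the table (True = "Yes", False = "No").
# This dictionary is illustrative only; it is not an Entrez data structure.
FIELD_AVAILABILITY = {
    "Feature Key":      (True,  False, True,  False, True),
    "Gene Name":        (True,  True,  True,  False, True),
    "Molecular Weight": (False, True,  False, False, False),
    "Sequence Length":  (True,  True,  True,  False, False),
    "Substance Name":   (True,  True,  False, True,  False),
    "Title Word":       (True,  True,  True,  False, False),
}


def is_field_available(field, database):
    """Return True if the table lists `field` as searchable in `database`."""
    row = FIELD_AVAILABILITY[field]
    return row[DATABASES.index(database)]


def databases_for(field):
    """Return the databases, in table order, in which `field` is searchable."""
    row = FIELD_AVAILABILITY[field]
    return [db for db, available in zip(DATABASES, row) if available]
```

For example, databases_for("Substance Name") lists Nucleotide, Protein, and Structure, matching the row above.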

Search Field Descriptions and Qualifiers
			Search Field	Definition	Qualifier
			Accession	Contains the unique accession number of the sequence or record, assigned to the nucleotide, protein, structure, genome record, or PopSet by a sequence database builder. The Structure database accession index contains the PDB IDs but not the MMDB IDs.	[ACCN]
			All Fields	Contains all terms from all searchable database fields in the database.	[ALL]
			Author Name	Contains all authors from all references in the database records. The format is last name space first initial(s), without punctuation (e.g., marley jf).	[AUTH]
			EC/RN Number	Number assigned by the Enzyme Commission or Chemical Abstracts Service (CAS) to designate a particular enzyme or chemical, respectively.	[ECNO]
			Feature Key	Contains the biological features assigned or annotated to the nucleotide sequences and defined in the DDBJ/EMBL/GenBank Feature Table (http://www.ncbi.nlm.nih.gov/projects/collab/FT/index.html). Not available for the Protein or Structure databases.	[FKEY]
			Filter	Contains predetermined or filtered subsets of the various databases. These subsets or filters are created by grouping records that are commonly linked to other Entrez databases or within the same database. For example, the PopSet database Filter index includes PopSet all, PopSet medline, PopSet nucleotide, and PopSet protein. The PopSet medline filter includes all PopSet records with links to PubMed; the PopSet nucleotide filter includes all PopSet records with links to the nucleotide database; and the PopSet protein filter includes all PopSet records with links to the protein database. The PopSet all filter includes all PopSet records. The Nucleotide database Filter index contains many more filters because its records are linked to numerous external resources. For more information see LinkOut.	[FILT]
			Gene Name	Contains the standard and common names of genes found in the database records. This field is not available in the Structure database.	[GENE]
			Issue	Contains the issue number of the journal in which the data were published.	[ISS]
			Journal Name	Contains the name of the journal in which the data were published. Journal names are indexed in the database in abbreviated form (e.g., J Biol Chem). Journals are also indexed by their ISSNs. Browse the index if you do not know the ISSN or are not sure how a particular journal name is abbreviated.	[JOUR]
			Keyword	Contains special index terms from the controlled vocabularies associated with the GenBank, EMBL, DDBJ, SWISS-PROT, PIR, PRF, or PDB databases. Browse the Keyword indexes of the individual databases to become familiar with these vocabularies. A Keyword index is not available in the Structure database.	[KYWD]
			Modification Date	Contains the date on which the most recent modification to the record was indexed in Entrez, in the format YYYY/MM/DD (e.g., 1999/08/05). A year alone (e.g., 1999) retrieves all records modified in that year; a year and month (e.g., 1999/03) retrieves all records modified in that month that are indexed in Entrez.	[MDAT]
			Molecular Weight	Molecular weight of a protein, in daltons (Da), calculated by the method described in the Searching by Molecular Weight section of the Entrez help document. Molecular weight must be entered as a fixed six-digit field, padded with leading zeros (not the letter O), e.g., 002002 [MOLWT].	[MOLWT]
			Organism	Contains the scientific and common names for the organisms associated with protein and nucleotide sequences.	[ORGN]
			Page Number	Contains the number of the first journal page of the article in which the data were published.	[PAGE]
			Primary Accession	Contains the primary accession number of the sequence or record, assigned to the nucleotide, protein, structure, genome record, or PopSet by a sequence database builder. A Primary Accession index is not available in the Structure database.	[PACC]
			Properties	Contains properties of the nucleotide or protein sequence. For example, the Nucleotide database's Properties index includes molecule types, publication status, molecule locations, and GenBank divisions. A Properties index is not available in the Structure database.	[PROP]
			Protein Name	Contains the standard names of proteins found in database records. Common names may not be indexed in this field, so it is best to also search the All Fields or Text Word index. A Protein Name index is not available in the Structure database.	[PROT]
			Publication Date	Contains the date that records are released into Entrez, in the format YYYY/MM/DD (e.g., 1999/08/05). This is the date on which the entry first appeared in GenBank, as indexed in Entrez. A year alone (e.g., 1999) retrieves all records released in that year; a year and month (e.g., 1999/03) retrieves all records released into GenBank in that month.	[PDAT]
			SeqID String	Contains the special string identifier, similar to a FASTA identifier, for a given sequence. A SeqID String index is not available in the Structure database.	[SQID]
			Sequence Length	Contains the total length of the sequence. Sequence Length indexes are not available in the Structure or PopSet databases.	[SLEN]
			Substance Name	Contains the names of any chemicals associated with this record from the CAS registry and the MEDLINE Name of Substance field. Substance Name indexes are not available in the Genome or PopSet databases.	[SUBS]
			Text Word	Contains all of the "free text" associated with a record.	[WORD]
			Title Word	Includes only those words found in the definition line of a record. The definition line summarizes the biology of the sequence and is carefully constructed by database staff. A standard definition line will include the organism, product name, gene symbol, molecule type and whether it is a partial or complete cds. Title Word indexes are not available in the Structure or PopSet databases.	[TITL]
			Uid	Contains the MEDLINE unique identifier for records that contain published references that are linked to PubMed. The Uid index is not browsable.	[UID]
			Volume	Contains the volume number of the journal in which the data were published.	[VOL]
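
The formatting rules in the table above (author names without punctuation, six-digit molecular weights, YYYY/MM/DD dates) can be sketched as small helper functions. This is an illustrative Python sketch, not an Entrez API: the function names tag, author_term, molecular_weight_term, and date_term are hypothetical. The space between the term and the bracketed qualifier follows the 002002 [MOLWT] example, and lowercasing author names simply mirrors the marley jf example rather than a stated requirement.

```python
import re


def tag(term, qualifier):
    """Append a bracketed field qualifier to a term, e.g. '002002 [MOLWT]'."""
    return f"{term} [{qualifier}]"


def author_term(last_name, initials):
    """Format an author as last name, space, initials, without punctuation."""
    # Drop periods, commas, and spaces from the initials ("J.F." -> "jf").
    cleaned = re.sub(r"[^A-Za-z]", "", initials).lower()
    return tag(f"{last_name.strip().lower()} {cleaned}", "AUTH")


def molecular_weight_term(daltons):
    """Format a molecular weight as a fixed six-digit, zero-padded field."""
    if not 0 <= daltons <= 999999:
        raise ValueError("molecular weight must fit in six digits")
    return tag(f"{daltons:06d}", "MOLWT")


def date_term(qualifier, year, month=None, day=None):
    """Format a date term as YYYY, YYYY/MM, or YYYY/MM/DD for MDAT or PDAT."""
    if day is not None and month is None:
        raise ValueError("a day requires a month")
    parts = [f"{year:04d}"]
    if month is not None:
        parts.append(f"{month:02d}")
    if day is not None:
        parts.append(f"{day:02d}")
    return tag("/".join(parts), qualifier)
```

With these helpers, molecular_weight_term(2002) produces the 002002 [MOLWT] form shown in the table, and date_term("PDAT", 1999, 3) produces a year-and-month term that would match all records released in March 1999.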

Display Formats
				Display Format	Description	Databases Available	Link
				Summary	Default display, hotlinked Accession number and brief description	All databases	None
				Brief	Hotlinked Accession number and abbreviated description	All databases	None
				GenBank/GenPept	Full report format	Nucleotide, Protein	None
				ASN.1	Abstract Syntax Notation 1 form, the computer-readable form of the data	All databases	None
				FASTA	The definition line and sequence characters	All databases	None
				Nucleotide Neighbors	Retrieves all similar nucleotide sequences for all documents retrieved and displays in default format	Nucleotide	Related Sequences
				Protein Neighbors	Retrieves all similar protein sequences for all documents retrieved and displays in default format	Protein	Related Sequences
				Genome Neighbors	Retrieves all similar genome sequences for all documents retrieved and displays in default format	Genome	Related Sequences
				Structure Neighbors	Retrieves all similar structures for all documents retrieved and displays in default format	Structure	Related Sequences
				Provider Links	Retrieves all external links for all documents retrieved and displays in default format; see LinkOut for more information	All databases	LinkOut
				PubMed Links	Retrieves all MEDLINE links for all documents retrieved and displays in default format	All databases	PubMed
				Nucleotide Links	Retrieves all Nucleotide links for all documents retrieved and displays in default format	All databases, except Nucleotide	Nucleotide
				Protein Links	Retrieves all Protein links for all documents retrieved and displays in default format	All databases, except Protein	Protein
				Genome Links	Retrieves all Genome links for all documents retrieved and displays in default format	Nucleotide, Protein, and Structure	Genomes
				Structure Links	Retrieves all Structure links for all documents retrieved and displays in default format	Nucleotide, Protein, and Genome	Structure
				PopSet Links	Retrieves all PopSet links for all documents retrieved and displays in default format	All databases	PopSet
				Graphic Summary	The graphical view of the sequence accessible by selecting the hotlinked Accession numbers	Nucleotide, Protein, and Genome	None
				Structure Summary	The Structure Summary accessible by selecting the hotlinked PDB numbers	Structure	None
				PopSet Summary	The complete set of Accession Numbers comprising the PopSet accessible by selecting the hotlinked PopSet Accession Numbers	PopSet	None
