accession
download an ortholog dataset by RefSeq nucleotide or protein accession
Name
datasets download ortholog accession - download an ortholog dataset by RefSeq nucleotide or protein accession
Synopsis
datasets download ortholog accession <refseq-accession> [flags]
Description
Download an ortholog dataset by RefSeq nucleotide or protein accession. Ortholog data is calculated by NCBI for vertebrates and insects. Ortholog data packages include gene, transcript, and protein sequences, a data table, and a data report. Each dataset is downloaded as a zip file.
The default ortholog dataset includes the following files:
- gene.fna (gene sequences)
- rna.fna (transcript sequences)
- protein.faa (protein sequences)
- data_report.jsonl (data report with gene metadata)
- data_table.tsv (data table with gene metadata, one transcript per row)
- dataset_catalog.json (a list of files and file types included in the dataset)
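The file list above can be checked against a downloaded archive. The following is a minimal sketch, not part of the datasets tool: the helper name `missing_files` is hypothetical, and it assumes Info-ZIP's `unzip` is available for listing archive members. Because the directory layout inside the zip is not described here, members are matched by base name only.

```shell
# Default files named in the list above.
expected_files="gene.fna rna.fna protein.faa data_report.jsonl data_table.tsv dataset_catalog.json"

# Hypothetical helper: reads archive member names on stdin (one per line)
# and prints each expected default file that does not appear.
missing_files() {
  listing=$(cat)
  for f in $expected_files; do
    found=0
    while IFS= read -r name; do
      # Compare only the base name, since the in-archive directory is an unknown here.
      [ "${name##*/}" = "$f" ] && found=1
    done <<EOF
$listing
EOF
    [ "$found" -eq 1 ] || printf '%s\n' "$f"
  done
}

# Usage (assumes Info-ZIP unzip and the default file name ncbi_dataset.zip):
#   unzip -Z1 ncbi_dataset.zip | missing_files
```

An empty result means every default file is present. Files dropped with the --exclude flags will show up as missing, which is expected in that case.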
Refer to NCBI’s download and install documentation for information about getting started with the command-line tools.
Examples
datasets download ortholog accession NP_000483.3
datasets download ortholog accession NM_000546.6
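For more than a couple of accessions, the --inputfile flag listed under Options can replace the positional argument. The sketch below is illustrative only: it assumes the input file holds one accession per line, and the helper names and the `DATASETS_BIN` variable are hypothetical (the variable exists so the call can be dry-run with `echo`).

```shell
# Hypothetical override so the command can be previewed without the real binary.
DATASETS_BIN="${DATASETS_BIN:-datasets}"

# Write the given accessions to a list file, one per line (assumed input format).
write_accession_list() {
  list_file=$1
  shift
  printf '%s\n' "$@" > "$list_file"
}

# Pass the list file to the accession subcommand via --inputfile.
download_from_list() {
  "$DATASETS_BIN" download ortholog accession --inputfile "$1"
}

# Usage:
#   write_accession_list accessions.txt NP_000483.3 NM_000546.6
#   download_from_list accessions.txt
```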
Options
      --api-key string        NCBI Datasets API Key
      --exclude-gene          exclude gene.fna (gene sequence file)
      --exclude-protein       exclude protein.faa (protein sequence file)
      --exclude-rna           exclude rna.fna (transcript sequence file)
      --filename string       specify a custom file name for the downloaded dataset (default "ncbi_dataset.zip")
  -h, --help                  help for accession
      --include-3p-utr        include 3p_utr.fna (3'-UTR sequence file)
      --include-5p-utr        include 5p_utr.fna (5'-UTR sequence file)
      --include-cds           include cds.fna (CDS sequence file)
      --inputfile string      read a list of RefSeq nucleotide or protein accessions from a file to use as input
      --no-progressbar        hide progress bar
      --taxon-filter strings  limit results to ortholog data for a specified taxonomic group
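Several of these options can be combined in one call. The following sketch wraps a protein-only download restricted to one taxonomic group. The function name and `DATASETS_BIN` are hypothetical, and the taxon value in the usage line is an illustrative assumption rather than a documented example.

```shell
# Hypothetical override so the command can be previewed with echo.
DATASETS_BIN="${DATASETS_BIN:-datasets}"

# Download only protein sequences (plus the reports) for one accession,
# limited to a taxonomic group, into a zip named after the accession.
download_protein_orthologs() {
  accession=$1
  taxon=$2
  "$DATASETS_BIN" download ortholog accession "$accession" \
    --exclude-gene --exclude-rna \
    --taxon-filter "$taxon" \
    --filename "${accession}_orthologs.zip"
}

# Usage (the taxon name is an assumed example value):
#   download_protein_orthologs NP_000483.3 mammals
```

Naming the output after the accession avoids overwriting the default ncbi_dataset.zip when the function is called repeatedly.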
Generated March 11, 2025