Review

Sequence Alignment

In: Handbook of Discrete and Combinatorial Mathematics. 2nd edition. Boca Raton (FL): CRC Press/Taylor & Francis; 2017 Nov. 20.1.

Stephen F. Altschul et al.

Excerpt

Alignments are a powerful way to compare related DNA or protein sequences. They can be used to capture various facts about the aligned sequences, such as common evolutionary descent or common structural function. We take the general view that the alignment of letters from two or more sequences represents the hypothesis that they are descended from a common ancestral sequence.

DNA molecules are composed of chains of nucleotides, and protein molecules are composed of chains of amino acids. The specific orders of nucleotides and amino acids within these chains are called DNA and protein sequences, respectively. Perhaps chief among the various biological functions of DNA sequences is to encode protein sequences, because proteins are involved in most of the biological functions of living cells.

DNA sequences, and the protein sequences they encode, evolve by mutation followed by natural selection. There are a variety of mechanisms for DNA mutation, but the most common result is the substitution of a single nucleotide for another, or the deletion or insertion of one or several adjacent nucleotides. At the protein level, the most common resulting mutations are the substitution of one amino acid for another, or the insertion or deletion of one or several adjacent amino acids. There is no simple biological mechanism for exchanging the order of two letters in a DNA or protein sequence, so an alignment representing the common descent of two DNA or protein sequences is co-linear, with no “crossovers” between corresponding letters.
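
The mutation model described above (substitutions, insertions, and deletions, with no reordering of letters) can be illustrated with a small sketch of global pairwise alignment by dynamic programming, in the style of Needleman-Wunsch. This is only an illustration, not the chapter's own method: the function name `global_align` and the scores (match +1, mismatch -1, linear gap cost -2) are assumptions chosen for the example, and practical tools typically use substitution matrices and affine gap costs instead.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Hypothetical scoring: +1 match, -1 mismatch, -2 per gap letter.
    A '-' marks an insertion or deletion relative to the other sequence.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # a[i-1] aligned to a gap
                score[i][j - 1] + gap,      # b[j-1] aligned to a gap
            )

    # Trace back from the bottom-right corner. Each step moves up, left,
    # or diagonally, so the alignment is co-linear by construction.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    s, top, bottom = global_align("ACGT", "AGT")
    print(s)
    print(top)
    print(bottom)
```

With these assumed scores, aligning `ACGT` with `AGT` gives `ACGT` over `A-GT`: three matches and one deleted letter. Removing the gap characters from either row recovers the original sequence in its original order, which is the co-linearity property stated above.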

