Sequence Alignment
- PMID: 29206392
- Bookshelf ID: NBK464187
Excerpt
Alignments are a powerful way to compare related DNA or protein sequences. They can be used to capture various facts about the aligned sequences, such as common evolutionary descent or common structural function. We take the general view that aligning letters from two or more sequences represents the hypothesis that those letters are descended from a common ancestral sequence.
DNA molecules are composed of chains of nucleotides, and protein molecules are composed of chains of amino acids. The specific orders of nucleotides and of amino acids within these chains are called DNA sequences and protein sequences, respectively. Perhaps chief among the biological functions of DNA sequences is encoding protein sequences, because proteins are involved in most of the biological functions of living cells.
DNA sequences, and the protein sequences they encode, evolve by mutation followed by natural selection. DNA can mutate through a variety of mechanisms, but the most common results are the substitution of one nucleotide for another and the deletion or insertion of one or more adjacent nucleotides. At the protein level, the most common resulting mutations are the substitution of one amino acid for another and the insertion or deletion of one or more adjacent amino acids. There is no simple biological mechanism for exchanging the order of two letters in a DNA or protein sequence, so an alignment representing the common descent of two DNA or protein sequences is co-linear, with no “crossovers” between corresponding letters.
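The mutation types and the co-linearity property described here can be illustrated with a minimal Python sketch. It assumes an alignment is written as two equal-length rows with "-" marking a gap; the function names, the example rows, and the convention of calling a column an "insertion" or "deletion" relative to the first row are illustrative assumptions, not taken from the chapter.

```python
GAP = "-"  # assumed gap symbol for this sketch


def classify_columns(row_a, row_b):
    """Label each alignment column, treating row_a as the reference."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    labels = []
    for a, b in zip(row_a, row_b):
        if a == GAP and b == GAP:
            raise ValueError("a column cannot consist of gaps only")
        if a == GAP:
            labels.append("insertion")     # letter present only in row_b
        elif b == GAP:
            labels.append("deletion")      # letter present only in row_a
        elif a == b:
            labels.append("match")
        else:
            labels.append("substitution")
    return labels


def aligned_pairs(row_a, row_b):
    """Return (i, j) positions of letters that share a column."""
    pairs = []
    i = j = 0  # positions in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != GAP and b != GAP:
            pairs.append((i, j))
        if a != GAP:
            i += 1
        if b != GAP:
            j += 1
    return pairs


def is_colinear(pairs):
    """True if the pairs strictly increase in both coordinates (no crossovers)."""
    return all(p[0] < q[0] and p[1] < q[1] for p, q in zip(pairs, pairs[1:]))


# Made-up example rows, for illustration only.
row_a = "ACGT-AC"
row_b = "AC-TTGC"
labels = classify_columns(row_a, row_b)
pairs = aligned_pairs(row_a, row_b)
print(labels)
print(pairs, is_colinear(pairs))
```

Any alignment written as gapped rows is co-linear by construction, because reading the columns left to right advances through both sequences in order. A pairing such as `[(0, 1), (1, 0)]`, which would swap two letters, fails the `is_colinear` check.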
© 2018 by Taylor & Francis Group, LLC.
Sections
- 20.1.1. GLOBAL AND LOCAL PAIRWISE ALIGNMENTS
- 20.1.2. PAIRWISE ALIGNMENT SCORES
- 20.1.3. PATH GRAPHS AND OPTIMAL GLOBAL PAIRWISE ALIGNMENT
- 20.1.4. OPTIMAL LOCAL PAIRWISE ALIGNMENT
- 20.1.5. SUBSTITUTION MATRICES
- 20.1.6. AFFINE GAP SCORES
- 20.1.7. SEQUENCE ALIGNMENT HEURISTICS
- 20.1.8. EXACT STRING MATCHING
- 20.1.9. INDEXING METHODS FOR STRING MATCHING
- REFERENCES