DNA Sequence Explorer
Align DNA sequences using global and local alignment algorithms, translate codons to amino acids, introduce mutations to study their effects, and build phylogenetic trees to visualize evolutionary relationships.
Alignment View
Results
Scoring Matrix Heatmap
| | - | A | T | G | C | A | A | G | C | T | T | C | G | A | T | C | G |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| - | 0 | -2 | -4 | -6 | -8 | -10 | -12 | -14 | -16 | -18 | -20 | -22 | -24 | -26 | -28 | -30 | -32 |
| A | -2 | 1 | -1 | -3 | -5 | -7 | -9 | -11 | -13 | -15 | -17 | -19 | -21 | -23 | -25 | -27 | -29 |
| T | -4 | -1 | 2 | 0 | -2 | -4 | -6 | -8 | -10 | -12 | -14 | -16 | -18 | -20 | -22 | -24 | -26 |
| G | -6 | -3 | 0 | 3 | 1 | -1 | -3 | -5 | -7 | -9 | -11 | -13 | -15 | -17 | -19 | -21 | -23 |
| C | -8 | -5 | -2 | 1 | 4 | 2 | 0 | -2 | -4 | -6 | -8 | -10 | -12 | -14 | -16 | -18 | -20 |
| T | -10 | -7 | -4 | -1 | 2 | 3 | 1 | -1 | -3 | -3 | -5 | -7 | -9 | -11 | -13 | -15 | -17 |
| A | -12 | -9 | -6 | -3 | 0 | 3 | 4 | 2 | 0 | -2 | -4 | -6 | -8 | -8 | -10 | -12 | -14 |
| G | -14 | -11 | -8 | -5 | -2 | 1 | 2 | 5 | 3 | 1 | -1 | -3 | -5 | -7 | -9 | -11 | -11 |
| C | -16 | -13 | -10 | -7 | -4 | -1 | 0 | 3 | 6 | 4 | 2 | 0 | -2 | -4 | -6 | -8 | -10 |
| A | -18 | -15 | -12 | -9 | -6 | -3 | 0 | 1 | 4 | 5 | 3 | 1 | -1 | -1 | -3 | -5 | -7 |
| T | -20 | -17 | -14 | -11 | -8 | -5 | -2 | -1 | 2 | 5 | 6 | 4 | 2 | 0 | 0 | -2 | -4 |
| C | -22 | -19 | -16 | -13 | -10 | -7 | -4 | -3 | 0 | 3 | 4 | 7 | 5 | 3 | 1 | 1 | -1 |
| G | -24 | -21 | -18 | -15 | -12 | -9 | -6 | -3 | -2 | 1 | 2 | 5 | 8 | 6 | 4 | 2 | 2 |
| A | -26 | -23 | -20 | -17 | -14 | -11 | -8 | -5 | -4 | -1 | 0 | 3 | 6 | 9 | 7 | 5 | 3 |
| T | -28 | -25 | -22 | -19 | -16 | -13 | -10 | -7 | -6 | -3 | 0 | 1 | 4 | 7 | 10 | 8 | 6 |
| C | -30 | -27 | -24 | -21 | -18 | -15 | -12 | -9 | -6 | -5 | -2 | 1 | 2 | 5 | 8 | 11 | 9 |
| G | -32 | -29 | -26 | -23 | -20 | -17 | -14 | -11 | -8 | -7 | -4 | -1 | 2 | 3 | 6 | 9 | 12 |
Reference Guide
Needleman-Wunsch Scoring
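Global alignment finds the best end-to-end alignment of two sequences using dynamic programming. The score F(i,j) at each cell is the best of three possibilities: aligning x_i with y_j (diagonal move), or placing a gap in either sequence (vertical or horizontal move).

$$
F(i,j) = \max \begin{cases} F(i-1,j-1) + s(x_i, y_j) \\ F(i-1,j) - d \\ F(i,j-1) - d \end{cases}
\qquad F(0,0) = 0,\quad F(i,0) = -i\,d,\quad F(0,j) = -j\,d
$$

Where s(x,y) is the match/mismatch score and d is the gap penalty, written here as a positive value that is subtracted. Traceback from F(m,n), where m and n are the sequence lengths, recovers the optimal alignment.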
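The fill and traceback described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's own implementation: the function name `needleman_wunsch` and its parameters are hypothetical, and the defaults (match +1, mismatch -1, gap penalty d = 2) are an assumption inferred from the values in the heatmap. The two sequences are the row and column labels of that heatmap.

```python
def needleman_wunsch(x, y, match=1, mismatch=-1, d=2):
    """Return (F, aligned_x, aligned_y) for a global alignment of x and y.

    The scoring defaults are assumptions inferred from the heatmap values.
    """
    m, n = len(x), len(y)
    F = [[0] * (n + 1) for _ in range(m + 1)]
    # First column and first row: a prefix aligned entirely against gaps.
    for i in range(1, m + 1):
        F[i][0] = -d * i
    for j in range(1, n + 1):
        F[0][j] = -d * j

    def s(a, b):
        return match if a == b else mismatch

    # Fill: each cell takes the best of diagonal, up, and left moves.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            F[i][j] = max(
                F[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                F[i - 1][j] - d,
                F[i][j - 1] - d,
            )

    # Traceback from F[m][n], preferring the diagonal move on ties.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] - d:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return F, "".join(reversed(ax)), "".join(reversed(ay))


seq_rows = "ATGCTAGCATCGATCG"  # row labels of the heatmap
seq_cols = "ATGCAAGCTTCGATCG"  # column labels of the heatmap
F, aligned_rows, aligned_cols = needleman_wunsch(seq_rows, seq_cols)
print(F[-1][-1])  # 12, the bottom-right cell of the heatmap
```

With these parameters the optimal alignment contains no gaps: 14 matches and 2 mismatches give 14 - 2 = 12.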
The Genetic Code
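The standard genetic code maps the 64 possible three-nucleotide codons to 20 amino acids and a stop signal: 61 codons specify amino acids and 3 (TAA, TAG, TGA) are stop codons.
The code is degenerate: most amino acids are encoded by 2 to 6 different codons; only methionine and tryptophan have a single codon each. Third-position changes are often "silent" (synonymous).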
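A compact way to see both the mapping and its degeneracy is to build the codon table programmatically. The sketch below assumes DNA (T rather than U) and reading frame 0; the names `CODON_TABLE` and `translate` are illustrative only. The 64-letter amino-acid string lists the standard code in T, C, A, G order, with `*` marking stop codons.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 0; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# How many codons encode each amino acid (and the stop signal '*').
degeneracy = Counter(CODON_TABLE.values())

print(translate("ATGCTAGCATCGATCG"))  # MLASI
print(translate("ATGCAAGCTTCGATCG"))  # MQASI
print(degeneracy["L"], degeneracy["M"], degeneracy["*"])  # 6 1 3
```

The two example sequences differ at two sites. The change CTA to CAA (second codon position) swaps leucine for glutamine, while GCA to GCT (third codon position) still encodes alanine, which is a silent change.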
Jukes-Cantor Distance
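The Jukes-Cantor model corrects for multiple substitutions at the same site. Because repeated or reverted changes at one site are counted at most once, the raw proportion of differing sites underestimates the true evolutionary distance.

$$
d_{JC} = -\frac{3}{4}\,\ln\!\left(1 - \frac{4}{3}\,p\right)
$$

Where p is the observed proportion of different sites and d_JC is the estimated number of substitutions per site. The formula diverges as p approaches 0.75 (saturation) and is undefined for p at or above 0.75.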
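As a worked example, the correction can be applied to two equal-length, gap-free sequences. This is a sketch under stated assumptions: the helper names `p_distance` and `jukes_cantor` are hypothetical, and returning infinity at saturation is one possible convention, not necessarily what the tool does.

```python
import math


def p_distance(a, b):
    """Observed proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p):
    """Jukes-Cantor distance: -3/4 * ln(1 - 4p/3).

    Returns infinity when p >= 0.75 (assumed convention for saturation).
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


p = p_distance("ATGCTAGCATCGATCG", "ATGCAAGCTTCGATCG")
d_jc = jukes_cantor(p)
print(p)               # 0.125 (2 differences over 16 sites)
print(round(d_jc, 4))  # 0.1367
```

The corrected distance (about 0.137) is slightly larger than the raw proportion (0.125), and the gap between the two widens as p grows toward 0.75.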
UPGMA Clustering
UPGMA (Unweighted Pair Group Method with Arithmetic Mean) builds a rooted tree from a distance matrix using agglomerative clustering.
- Find the pair with smallest distance
- Merge them at height = distance / 2
- Update distances using average linkage
- Repeat until one cluster remains
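UPGMA assumes a constant rate of evolution (a molecular clock), so every leaf ends up at the same distance from the root and branch lengths are proportional to evolutionary time. When rates differ between lineages, the inferred tree can be wrong.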
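The four steps above can be sketched as follows. This is an illustrative implementation, not the tool's own: the function name `upgma`, the nested-tuple tree representation, and the example distances are all made up for demonstration. Average linkage is computed as a size-weighted mean of the two merged clusters' distances, which equals the mean over all leaf pairs.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster `names` by UPGMA.

    dist maps a pair (a, b) of names to their distance.
    Returns (tree, merges): tree is a nested tuple, merges lists
    (left, right, height) in the order the clusters were joined.
    """
    size = {name: 1 for name in names}  # leaves per active cluster
    d = {frozenset(pair): value for pair, value in dist.items()}
    merges = []
    while len(size) > 1:
        # 1. Find the pair of active clusters with the smallest distance.
        a, b = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        # 2. Merge them at height = distance / 2.
        new = (a, b)
        merges.append((a, b, d[frozenset((a, b))] / 2))
        # 3. Update distances using average linkage (weighted by cluster size).
        for c in size:
            if c != a and c != b:
                d[frozenset((new, c))] = (
                    size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
                ) / (size[a] + size[b])
        size[new] = size.pop(a) + size.pop(b)
        # 4. The loop repeats until one cluster remains.
    (tree,) = size
    return tree, merges


# Hypothetical distances, chosen only to illustrate the procedure.
example = {
    ("A", "B"): 2, ("A", "C"): 5, ("B", "C"): 7,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree, merges = upgma(["A", "B", "C", "D"], example)
print([height for _, _, height in merges])  # [1.0, 3.0, 5.0]
```

In this example A and B join first at height 1. The distance from the merged cluster to C is then (5 + 7) / 2 = 6, so C joins at height 3, and D joins last at height 5.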