Data Compression & Image Encoding Explorer
Type text to see Run-Length Encoding and Huffman coding in action, or paint pixels on a grid to explore how images are stored and compressed. Compare lossless and lossy methods side by side.
Run-Length Encoding (RLE)
RLE scans the text from left to right and replaces each run of consecutive identical characters with the length of the run followed by the character.
Step-by-step encoding
AAA → 3A
BBB → 3B
CC → 2C
DDDD → 4D
EE → 2E
Encoded output
3A3B2C4D2E
Original vs Encoded
Original (14 chars, 112 bits): AAABBBCCDDDDEE
Encoded (10 chars, 80 bits): 3A3B2C4D2E
Original Size: 112 bits (14 bytes)
Compressed Size: 80 bits (10 bytes)
Compression Ratio: 1.40x smaller
Space Savings: 28.6% saved
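The ratio and savings figures above follow from two simple formulas: ratio = original size / compressed size, and savings = 1 - compressed size / original size. A minimal sketch, where the function name `compression_stats` is an illustrative assumption and not part of the tool:

```python
def compression_stats(original_bits: int, compressed_bits: int) -> tuple[float, float]:
    """Return (compression ratio, space savings as a percentage)."""
    ratio = original_bits / compressed_bits
    savings = (1 - compressed_bits / original_bits) * 100
    return ratio, savings

# 14 characters at 8 bits each versus 10 characters at 8 bits each.
ratio, savings = compression_stats(112, 80)
print(f"{ratio:.2f}x smaller, {savings:.1f}% saved")  # 1.40x smaller, 28.6% saved
```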
Compression Comparison
| Method | Original | Compressed | Ratio | Savings | Type |
|---|---|---|---|---|---|
| Uncompressed (ASCII) | 14 B | 14 B | 1.00x | 0.0% | Lossless |
| Run-Length Encoding | 14 B | 10 B | 1.40x | 28.6% | Lossless |
| Huffman Coding | 14 B | 4 B | 3.50x | 71.4% | Lossless |
Reference Guide
Run-Length Encoding
RLE is one of the simplest compression algorithms. It replaces consecutive identical symbols with a count and the symbol. For example, "AAABBBCC" becomes "3A3B2C".
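The count-then-symbol scheme can be sketched in a few lines of Python. This is an illustrative version, not the tool's own code; the name `rle_encode` is an assumption, and the sketch assumes the input contains no digit characters, since digits would be confused with the counts.

```python
def rle_encode(text: str) -> str:
    """Encode text as count+symbol pairs, e.g. 'AAAB' -> '3A1B'."""
    if not text:
        return ""
    pieces = []
    run_char = text[0]   # symbol of the run currently being counted
    run_length = 1
    for ch in text[1:]:
        if ch == run_char:
            run_length += 1
        else:
            # The run ended: emit it and start counting the new symbol.
            pieces.append(f"{run_length}{run_char}")
            run_char, run_length = ch, 1
    pieces.append(f"{run_length}{run_char}")  # emit the final run
    return "".join(pieces)

print(rle_encode("AAABBBCC"))        # 3A3B2C
print(rle_encode("AAABBBCCDDDDEE"))  # 3A3B2C4D2E
print(rle_encode("ABCDE"))           # 1A1B1C1D1E
```

The last call shows the weak spot: with no repeats, every character gains a count and the output is twice as long as the input.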
RLE works best on data with long runs of repeated values, such as simple graphics, fax transmissions, and binary data with long stretches of identical bytes. It performs poorly on text with few repeats because every run, even a single character, carries its own count: "ABC" grows to "1A1B1C".
RLE is lossless, meaning the original data can be perfectly reconstructed from the compressed output.
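Decoding simply reverses the process, which is why nothing is lost. A small sketch, assuming the count+symbol format shown earlier and symbols that are never digits (`rle_decode` is an illustrative name):

```python
import re

def rle_decode(encoded: str) -> str:
    """Expand count+symbol pairs, e.g. '3A1B' -> 'AAAB'."""
    # Each pair is one or more digits followed by a single non-digit symbol.
    pairs = re.findall(r"(\d+)(\D)", encoded)
    return "".join(symbol * int(count) for count, symbol in pairs)

print(rle_decode("3A3B2C4D2E"))  # AAABBBCCDDDDEE
```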
Huffman Coding
Huffman coding assigns variable-length binary codes based on character frequency. Characters that appear more often get shorter codes. The algorithm builds a binary tree from the bottom up by repeatedly merging the two lowest-frequency nodes.
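The bottom-up merge can be sketched with a priority queue. This is one possible implementation under stated assumptions, not necessarily how the tool builds its tree: the name `huffman_codes` is illustrative, and ties between equal frequencies are broken by insertion order, so the exact bit patterns may differ from another implementation even though the total encoded length is the same.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_codes(text: str) -> dict[str, str]:
    """Build a code table by repeatedly merging the two lowest-frequency nodes."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    # Each heap entry is (frequency, tiebreak, {symbol: code built so far}).
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merging adds one bit in front of every code in each subtree.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

text = "AAABBBCCDDDDEE"
codes = huffman_codes(text)
total_bits = sum(len(codes[ch]) for ch in text)
print(codes)
print(total_bits, "bits")  # 32 bits
```

For this input the encoded message takes 32 bits (4 bytes), compared with 112 bits at 8 bits per character. That total counts only the encoded message; a real file would also need to store the code table or the frequencies so the decoder can rebuild the tree.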
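The resulting code is prefix-free, meaning no code is the beginning of another code. This makes decoding unambiguous: the decoder can read bits one at a time and emit a character as soon as the bits match a code. ASCII text is typically stored using 8 bits per character regardless of frequency, while Huffman codes adapt to the input data.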
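To see why the prefix-free property makes decoding unambiguous, consider the sketch below. The code table `CODES` is a hypothetical example chosen for illustration (a real table depends on the input's frequencies), and the helper names are assumptions as well.

```python
# Hypothetical prefix-free code table, chosen only for illustration.
CODES = {"A": "00", "B": "01", "D": "10", "C": "110", "E": "111"}

def is_prefix_free(codes: dict[str, str]) -> bool:
    """True if no code is the beginning of another code."""
    words = list(codes.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )

def decode(bits: str, codes: dict[str, str]) -> str:
    """Read bits left to right, emitting a symbol whenever a full code matches."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""  # start reading the next code
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)

encoded = "".join(CODES[ch] for ch in "ABCDE")
print(is_prefix_free(CODES))   # True
print(encoded)                 # 000111010111
print(decode(encoded, CODES))  # ABCDE
```

No separators are needed between codes: because no code is a prefix of another, the first match is always the right one.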
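Huffman coding is used as one stage inside the ZIP, GZIP, JPEG, and MP3 formats. ZIP and GZIP, for example, use it through the DEFLATE algorithm, which combines it with LZ77 dictionary matching.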
Lossless vs Lossy
Lossless compression preserves every bit of the original data. You can decompress and get back exactly what you started with. RLE and Huffman coding are lossless methods, and ZIP and PNG are lossless formats.
Lossy compression discards some data to achieve smaller file sizes. The original cannot be perfectly recovered. JPEG, MP3, and video codecs like H.264 use lossy compression.
The tradeoff is file size vs quality. Lossy compression achieves much higher compression ratios but introduces artifacts that become more noticeable at higher compression levels.
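The tradeoff can be illustrated with a toy uniform quantizer. This is a deliberately simplified stand-in, not how JPEG or MP3 actually work, and the sample values and function names are assumptions. A larger `step` leaves fewer distinct values to store, but the reconstruction drifts further from the original.

```python
def quantize(samples: list[int], step: int) -> list[int]:
    """Lossy step: keep only which bucket of width `step` each sample falls in."""
    return [s // step for s in samples]

def reconstruct(levels: list[int], step: int) -> list[int]:
    """Approximate the originals using the midpoint of each bucket."""
    return [level * step + step // 2 for level in levels]

samples = [12, 13, 15, 200, 201, 203]  # hypothetical 8-bit values
results = {}
for step in (4, 32):
    restored = reconstruct(quantize(samples, step), step)
    max_error = max(abs(a - b) for a, b in zip(samples, restored))
    results[step] = (restored, max_error)
    print(f"step={step}: {restored} (max error {max_error})")
# step=4: [14, 14, 14, 202, 202, 202] (max error 2)
# step=32: [16, 16, 16, 208, 208, 208] (max error 8)
```

In both cases the exact original values are gone for good, which is what separates this from RLE or Huffman coding.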
Image Color Depth
Color depth determines how many bits represent each pixel. True color uses 24 bits (8 per channel) for 16.7 million possible colors. Reducing to 8 bits gives only 256 colors, 4 bits gives 16 colors, and 1 bit gives black and white.
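The color counts above are powers of two, and the raw image size scales directly with the bit depth. A short sketch, using a hypothetical 16 x 16 grid and illustrative function names:

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct colors representable at a given depth."""
    return 2 ** bits_per_pixel

def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel data size, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8

for bits in (24, 8, 4, 1):
    print(bits, color_count(bits), raw_size_bytes(16, 16, bits))
# 24 16777216 768
# 8 256 256
# 4 16 128
# 1 2 32
```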
Indexed color takes a different approach by building a palette of unique colors used in the image, then storing each pixel as an index into that palette. If an image uses only 4 unique colors, each pixel needs just 2 bits instead of 24.
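The palette idea can be sketched as follows. The pixel data and the name `index_image` are assumptions made for illustration. Note that the palette itself must also be stored, so indexing pays off most when the image has many pixels and few colors.

```python
def index_image(pixels: list[tuple[int, int, int]]):
    """Return (palette, indices, bits_per_index) for a list of RGB tuples."""
    palette = list(dict.fromkeys(pixels))  # unique colors in first-seen order
    position = {color: i for i, color in enumerate(palette)}
    indices = [position[p] for p in pixels]
    # Bits needed to address every palette entry (at least 1).
    bits_per_index = max(1, (len(palette) - 1).bit_length())
    return palette, indices, bits_per_index

R, G, B, W = (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)
pixels = [R, R, G, B, W, W, G, R]  # hypothetical 8-pixel image, 4 unique colors

palette, indices, bits = index_image(pixels)
direct_bits = len(pixels) * 24
indexed_bits = len(pixels) * bits + len(palette) * 24  # pixel indices + palette
print(indices)       # [0, 0, 1, 2, 3, 3, 1, 0]
print(bits)          # 2
print(direct_bits)   # 192
print(indexed_bits)  # 112
```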
PNG supports indexed color as one of its color types, with palettes of up to 256 entries, alongside grayscale and true color. GIF is always indexed, with a maximum palette of 256 colors.