Data Representation

84 Compression - Huffman Encoding

Compression - Huffman Coding

Watch the video and make notes. You can pause the video at any time.

Keywords 🗝️

frequency, binary, tree, compression, lossless.

Summary 📝

Huffman coding is a method used to reduce the size of data by using shorter binary codes for more common characters and longer codes for less common ones.

It starts by calculating the frequency of each character in the data, then builds a binary tree by repeatedly joining the two least frequent items into a new node. As a result, characters with lower frequencies are placed deeper in the tree.
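The tree-building step can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names `build_tree` and `depths`, the tuple-based node layout, and the sample text "aaaabbc" are all assumptions made for this example, and the text is assumed to be non-empty.

```python
import heapq
from collections import Counter

def build_tree(text):
    """Return the root of a Huffman tree for a non-empty text.

    Assumed layout: a leaf is a single character (str),
    an internal node is a (left, right) tuple.
    """
    freq = Counter(text)  # frequency of each character
    # Heap entries are (frequency, tie_breaker, node). The unique
    # tie-breaker stops Python comparing nodes when frequencies are equal.
    heap = [(count, i, char) for i, (char, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Join the two least frequent items into one new node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    return heap[0][2]

def depths(node, depth=0):
    """Map each character to its depth in the tree (root is depth 0)."""
    if isinstance(node, str):
        return {node: depth}
    left, right = node
    result = depths(left, depth + 1)
    result.update(depths(right, depth + 1))
    return result

tree = build_tree("aaaabbc")
char_depths = depths(tree)
print(char_depths)  # 'a' is at depth 1; 'b' and 'c' are at depth 2
```

In this sample, 'a' appears four times and ends up one step from the root, while the rarer 'b' and 'c' sit one level deeper.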

The binary codes are assigned by following the path from the root to each character, writing ‘0’ for each left branch and ‘1’ for each right branch. Because every character sits at the end of a branch (a leaf), no code is a prefix of another, which makes decoding unambiguous.
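A short Python sketch of the code assignment is shown here. The tree is written by hand as nested tuples, which is an assumed layout chosen for this example. `assign_codes` and `is_prefix_free` are hypothetical helper names, not part of any library.

```python
def assign_codes(node, path=""):
    """Walk the tree from the root: add '0' going left, '1' going right."""
    if isinstance(node, str):      # a leaf holds one character
        return {node: path}
    left, right = node
    codes = assign_codes(left, path + "0")
    codes.update(assign_codes(right, path + "1"))
    return codes

def is_prefix_free(codes):
    """True if no code is the start of a different code."""
    values = list(codes.values())
    return not any(a != b and b.startswith(a) for a in values for b in values)

# Hand-built example tree: 'a' on the left, 'b' and 'c' under the right branch.
tree = ("a", ("b", "c"))
codes = assign_codes(tree)
print(codes)                  # {'a': '0', 'b': '10', 'c': '11'}
print(is_prefix_free(codes))  # True
```

A table such as a = 0, b = 01 would fail the check, because a decoder reading 0 could not tell whether it had a complete 'a' or the start of a 'b'.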

Huffman coding is a form of lossless compression, meaning the original data can be exactly recreated from the compressed version. This method is widely used in file compression programs and helps to save storage space and reduce transmission time.
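The lossless property can be checked with a small round trip in Python. The code table `CODES` and the sample text are hypothetical values chosen for this sketch, and the comparison assumes the uncompressed text uses 8 bits per character.

```python
# Hypothetical code table for a text where 'a' is the most common character.
CODES = {"a": "0", "b": "10", "c": "11"}

def encode(text, codes):
    """Replace each character with its binary code."""
    return "".join(codes[ch] for ch in text)

def decode(bits, codes):
    """Read bits one at a time until they match a complete code."""
    lookup = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:   # safe because no code is a prefix of another
            out.append(lookup[current])
            current = ""
    return "".join(out)

text = "aaaabbc"
bits = encode(text, CODES)
restored = decode(bits, CODES)

fixed_size = len(text) * 8   # 8 bits per character (assumption)
huffman_size = len(bits)

print(bits)                      # 0000101011
print(restored == text)          # True: nothing was lost
print(fixed_size, huffman_size)  # 56 10
```

The bit counts leave out the space needed to store the code table itself, which a real file would also have to include.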

Key Learning Points 📌

Huffman coding is used for lossless compression of data.
Each character is given a binary code based on how frequently it appears.
More frequent characters get shorter codes; less frequent ones get longer codes.
A binary tree is built to create the codes, starting by joining the least frequent items.
Codes are created by travelling through the tree from the root: left = 0, right = 1.
No code is a prefix of another – this makes decoding possible and accurate.
It helps reduce file size while keeping the original data intact.
Often used in text file compression and communication systems.

Link to Craig n Dave videos

Link to BBC Bitesize
