Huffman

This compression method use variable bitrate (Flag Code) for fixed or variable length data. The more frequent a character or string occur the less bit flag is assign to it and the less frequent occurance will have more bit flag assigned to it. A dictionary will be built as it is progressing.

Eg: source: ABCDCBBAAA (10 bytes).

Encoding:
A = 4, B = 3, C = 2, D = 1.

Result: (dictionary) + 0101101111101010000
There are many ways to store dictionary, but the goal is the same, which way give less (efficient) space require ? one example is:
To store the character which has the highest probability first then follow by the less frequent character until finish, so we know then how to assign flag code to them later (start with sortest flag code length to the longest flag code length). In our example dictionary content should be 'ABCD' (highest to lowest).

Decoding:
1. Decode the dictionary to get each character's flag code.
2. Decode the flag code to output character

Flag Code Output Character
0 A
10 B
110 C
1110 D
110 C
10 B
10 B
0 A
0 A
0 A

Result: ABCDCBBAAA

Note: the programmer can determine how to assign bit value (Flag Code) for each character, there is no rule for it, as long as each character are different.
Pros: good result if the source is big but has only a few different characters.
Cons: bit value length is limited, so it can not have so many symbols to compress.
Conclusion: this compression is heavily depends on the frequent repeation of character, many compression programs use this method, and they only vary a little.

Arithmetic

Author Site Map Disclaimer
HMaxF Ultimate Recursive Lossless Compression Research
2001 - 2003 (c) All Rights Reserved.