|
Huffman
This compression method use variable bitrate (Flag Code) for fixed or
variable length data. The more frequent a character or string occur the
less bit flag is assign to it and the less frequent occurance will have
more bit flag assigned to it. A dictionary will be built as it is progressing.
Eg: source: ABCDCBBAAA (10 bytes).
Encoding:
A = 4, B = 3, C = 2, D = 1.
Result: (dictionary) + 0101101111101010000
There are many ways to store dictionary, but the goal is the same, which
way give less (efficient) space require ? one example is:
To store the character which has the highest probability first then follow
by the less frequent character until finish, so we know then how to assign
flag code to them later (start with sortest flag code length to the longest
flag code length). In our example dictionary content should be 'ABCD'
(highest to lowest).
Decoding:
1. Decode the dictionary to get each character's flag code.
2. Decode the flag code to output character
| Flag Code |
Output Character |
| 0 |
A |
| 10 |
B |
| 110 |
C |
| 1110 |
D |
| 110 |
C |
| 10 |
B |
| 10 |
B |
| 0 |
A |
| 0 |
A |
| 0 |
A |
Result: ABCDCBBAAA
Note: the programmer can determine how to assign bit value (Flag Code)
for each character, there is no rule for it, as long as each character
are different.
Pros: good result if the source is big but has only a few different characters.
Cons: bit value length is limited, so it can not have so many symbols
to compress.
Conclusion: this compression is heavily depends on the frequent repeation
of character, many compression programs use this method, and they only
vary a little.
|