Sliding Window (sliding dictionary)

This method use a fixed length dictionary size to compress and the dictionary content is changing (FIFO - First In First Out) in cycle as it progressing. The size of the dictionary can be determined at the design time or at the run time (first have to go through source content at least once before determine the best size) that is depend on how the programmer want to achieve to get better result (smaller result size).

Encoding:
Eg: dictionary size is 8 bytes.
Code:
(S) = Store as is.
(Lxxx) = Length (xxx = total bytes).
(CC) = Compression Code.
(SPxxx) = Start Position (xxx = start at the position of dictionary)
Source: ABCDEFGHABCDEFGHJKLABCDEFGH

Input Dictionary
(8 bytes long)
Code Output
ABCDEFGH (empty / first run) (S) (S)(L8) ABCDEFGH
ABCDEFGH ABCDEFGH (CC) (CC)(SP0)(L8) (none - just Code)
JKLABCDEFGH ABCDEFGH (S) (S)(L11) JKLABCDEFGH

Result: (S)(L8) ABCDEFGH (CC)(SP0)(L8)(S)(L11) JKLABCDEFGH

Note: every operation has a flag code to indicate whether it is a compression (CC) or just store as is (S). in case of store as is (S) it followed by a length to indicate the total byte stored as is and compression (CC) will be followed by a location (in dictionary) to start then a length to indicate how many bytes to copy.

Decoding:
Input: (S)(L8) ABCDEFGH (CC0)(L8)(S)(L11) JKLABCDEFGH

Input Code Dictionary Output
(S)(L8) (none) ABCDEFGH
(CC)(SP0) ABCDEFGH ABCDEFGH
(S)(L11) ABCDEFGH JKLABCDEFGH

Result: ABCDEFGHABCDEFGHJKLABCDEFGH
Pros: do not have to build dictionary (dictionary is the previous content)
Cons: it can only compress if the string within dictionary, the example above shows the last 'abcdefgh' should be able to be compressed but the dictionary size does not allow.
Conclusion: the dictionary size is very important and it is quite hard to determine, but programmer can make a variable length dictionary size. Below is a table shows advantage and disadvantage of sort and long dictionary size:

  Sort (small) dictionary size Long (big) dictionary size
Advantage 1. easier to maintain.
2. faster, requires less searching.
3. Use less byte (resource)
for flag code.
1. may get better (smaller) result file.
Disadvantage 1. sometimes the string is outside the dictionary size, resulting a not so good result file. 1. slow, need more computing resource for dictionary searching .
2. may require more memory space to put dictionary (depend on its location)

 

Huffman

Author Site Map Disclaimer
HMaxF Ultimate Recursive Lossless Compression Research
2001 - 2003 (c) All Rights Reserved.