User:Xavier Combelle/7zip long range compression
According to Dbzip2, the Wikimedia Foundation is looking for a way to improve compression speed.
The rzip experiments show that exploiting long-range redundancy gives a compression ratio similar to that of 7zip at a much higher compression speed (20x faster). One problem raised is that it would be hard to get users of the dumps to adopt a new compressor.
My proposal is to integrate long-range redundancy detection into the 7zip compressor itself, aiming for a similar compression ratio with a higher compression speed than the existing implementation.
The idea is to add a new method to the C/LzFind.c source file of p7zip to search for LZ matches.
This new method would have two parts:
- first, a long-range search that indexes the whole window and looks for long-range similarity in a way similar to rzip
- second, a close-range search that indexes only the parts of the window not covered by a long-range match and searches them with a classic match finder.
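The two parts above can be sketched roughly as follows. This is only an illustration in Python, not p7zip code and not rzip's actual algorithm: the names `long_range_matches`, `uncovered_ranges`, `close_range_matches` and `two_stage_search`, and the constants `BLOCK`, `MIN_SHORT` and `WINDOW`, are all hypothetical. For simplicity the long-range stage looks up whole block-sized slices in a dictionary instead of using a rolling hash, and the close-range stage uses a plain list of earlier positions per 3-byte prefix as a stand-in for a "classic" search.

```python
BLOCK = 32      # hypothetical long-range block size
MIN_SHORT = 3   # hypothetical minimum match length for the close-range stage
WINDOW = 4096   # hypothetical close-range window size


def long_range_matches(data, block=BLOCK):
    """Stage 1: index block-aligned chunks and look for long repeats."""
    index = {}          # chunk bytes -> earliest aligned start position
    matches = []        # (position, distance, length)
    i = 0
    indexed_upto = 0    # next aligned block that has not been indexed yet
    while i + block <= len(data):
        # Index every aligned block that lies entirely before position i.
        while indexed_upto + block <= i:
            index.setdefault(data[indexed_upto:indexed_upto + block], indexed_upto)
            indexed_upto += block
        src = index.get(data[i:i + block])
        if src is None:
            i += 1
            continue
        # Extend the match forward as far as the bytes agree.
        length = block
        while i + length < len(data) and data[src + length] == data[i + length]:
            length += 1
        matches.append((i, i - src, length))
        i += length
    return matches


def uncovered_ranges(n, matches):
    """Return the [start, end) ranges not covered by any long-range match."""
    ranges, prev = [], 0
    for pos, _dist, length in matches:
        if pos > prev:
            ranges.append((prev, pos))
        prev = pos + length
    if prev < n:
        ranges.append((prev, n))
    return ranges


def close_range_matches(data, ranges, window=WINDOW, min_len=MIN_SHORT):
    """Stage 2: classic prefix-table search, indexing uncovered bytes only."""
    table = {}      # min_len-byte prefix -> earlier uncovered positions
    matches = []
    for start, end in ranges:
        i = start
        while i < end:
            best_len, best_src = 0, -1
            if i + min_len <= end:
                for src in reversed(table.get(data[i:i + min_len], ())):
                    if i - src > window:
                        break   # older candidates are even further away
                    n = 0
                    while i + n < end and data[src + n] == data[i + n]:
                        n += 1
                    if n > best_len:
                        best_len, best_src = n, src
            step = 1
            if best_len >= min_len:
                matches.append((i, i - best_src, best_len))
                step = best_len
            # Index the positions just consumed (only full prefixes in range).
            for p in range(i, min(i + step, end - min_len + 1)):
                table.setdefault(data[p:p + min_len], []).append(p)
            i += step
    return matches


def two_stage_search(data):
    """Run the long-range stage, then the close-range stage on what is left."""
    long_m = long_range_matches(data)
    close_m = close_range_matches(data, uncovered_ranges(len(data), long_m))
    return long_m, close_m
```

Calling `two_stage_search(data)` on a byte string returns two lists of `(position, distance, length)` tuples, one per stage. The expected gain is that the close-range table stays small, because bytes already explained by a long-range match are never inserted into it; whether this actually speeds up 7zip would have to be measured.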
I have two questions:
- Where should I get the p7zip source so that it is easier to work with the Wikimedia Foundation? (I would like to develop it in my Debian environment.)
- What would be the simplest algorithm to adapt for the close-range search that still has good enough properties?