Ratio implementation overflows? #17
Comments
We want to reset when
So we can drop the inaccurate division. Then I will use GMP to implement it. Two GMP multiplications for every 10000 bytes are not a big deal. I can implement it in |
I've found a small issue:
The 3 header bytes are not related to compression. You forgot to remove them from
The main condition is a bit wrong:
I think it should not be possible for us to go into with clear code. We can just use 256:
|
I couldn't decode this "bomb":
It looks impossible. |
@vapier, hello, could you please describe what boff is? Let's try to simplify it.
Can I just ignore this formula and continue implementing LZW? No. Let's modify the code a bit.
Now let's try to compress the example.
You can see 32 zero bits (4 bytes). My
You can see 64 zero bits (8 bytes) without any reason. I found version 4.0 from 2001 here.
So this surprise was introduced between 4.0 (2001) and 4.2.4 (2006). |
i don't think the code is obfuscated so much as it was written decades ago when coding standards were significantly different. advice on the original implementation is pretty much lost at this point so your guess is probably as good as mine. iirc, gzip & ncompress had a lot of code/algorithm copied between them long ago which is why they're similar. |
@vapier, I've just finished implementing the decompressor, and I can confirm that I can't decompress ncompress output because of the zeroes produced by the formula above. I would never touch this formula if it didn't cause a serious issue. Please use the example provided above about
I can fix this issue in ncompress by removing the formula, but the new version of ncompress won't be compatible with the previous versions in production, and nobody will accept such a pull request. So we can't fix this issue in ncompress. We have to document it as a known bug. We need to find a way for new software to detect that data was compressed by ncompress with the broken formula. Then the decompressor should skip the zeroes, and the compressor has to output the same number of zeroes. |
Hello. I couldn't understand the ratio implementation. I think there are some overflow issues in the current implementation. Please correct me if I am wrong.
I see the main idea: we have a source and a destination.
The ratio equals `source_length / destination_length` = `s / d`.
A new ratio of `(s + 2) / (d + 1)` is good, `(s + 1) / (d + 2)` is bad.
So we want to reset when `new_ratio < old_ratio`.
We won't reset when `(s + 2) / (d + 1) > s / d`, we will reset when `(s + 1) / (d + 2) < s / d`.
Then we add a lag of `10000` bytes for `source_length` to get a more consolidated ratio.
I see an implementation of this algorithm in Ruby in rb-compress-lzw. I have no questions about that implementation because there is a GMP library behind it, so it will never overflow.
Now I am trying to read the implementation in `void compress(fdin, fdout)`. I see these manipulations:
int bytes_out = 0; bytes_in = 0
bytes_out += OBUFSIZ
bytes_out += (outbits+7)>>3
bytes_in += i
if (rpos > rlop) bytes_in += rpos-rlop
Let's imagine a large input. Both `bytes_in` and `bytes_out` will overflow. When the dictionary is full, we will use `bytes_in` and `bytes_out` to compute a wrong ratio.
I see the same problem here. `rat = (bytes_out+(outbits>>3)) >> 8;`: `bytes_out + (outbits >> 3)` can overflow, and `rat` will be invalid.
How to fix it?
I have not yet invented any solution =)