zip_entries_delete fails for in memory stream zip #330
I've created an MR in my local fork of this repo to illustrate some changes that may help fix the issue, at prot0man@2287565, though those changes result in:
Actually, with the changes in my fork, it doesn't actually delete the correct file (file2), based on the hexdump of the resulting zip:
Unfortunately, deleting entries has some limitations. I mean, you cannot open the archive for writing and delete entries in the same session. In other words, the workflow should be like this: look at the readme example.
So for my use case, I need to create an in-memory zip and potentially remove entries if my archive ever exceeds a size limit. With my modifications, it almost works on the in-memory zip, but the resulting zip clearly has vestigial evidence of the removed entry.
It's important to note that with my example code, I'm able to create an in-memory zip and remove an entry (after the changes in the linked branch) such that the resulting zip decompresses to the expected file1 and file3. If that's API abuse, it may be worth adding checks to zip_entries_delete that assert we are in the correct mode. I'm going to see if I can find a clean workaround in my branch to enable this use case, though.
I've spent the better part of the day digging into zip_entries_delete_mark trying to find where the issue might be, but it sounds like what I'm trying to accomplish just shouldn't work by design.
As a workaround, I guess I could create a shadow zip, attempt to compress the file into it, and then determine whether the file would put me over the size limit before adding it to the real archive, but doing that hurts my soul.
…On Fri, Jan 12, 2024, 3:39 PM Kuba Podgórski ***@***.***> wrote:
Unfortunately, deleting entries has some limitations. I mean, you cannot open the archive for writing ('w') and delete entries. You have to close it first (zip_close) and then open it for deleting (there is also, just for consistency, a special 'd' mode for deleting).
In other words, the workflow should be like this:
    zip_open
    {
        zip_entry_open
        zip_entry_write
        zip_entry_close
    }
    zip_close

    zip_open("foo.zip", 0, 'd');
    {
        zip_entries_delete
        // you can also delete by index, instead of by name
        // zip_entries_deletebyindex
    }
    zip_close
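Concretely, the workflow above might look like the following sketch against the kuba--/zip API (not compiled here; error handling omitted, and the file names and contents are made up for illustration):

```c
#include <zip.h>

/* Phase 1: create the archive for writing ('w'). */
struct zip_t *zip = zip_open("foo.zip", ZIP_DEFAULT_COMPRESSION_LEVEL, 'w');
zip_entry_open(zip, "file1");
zip_entry_write(zip, "one", 3);
zip_entry_close(zip);
zip_entry_open(zip, "file2");
zip_entry_write(zip, "two", 3);
zip_entry_close(zip);
zip_close(zip); /* flushes the entries and the central directory */

/* Phase 2: reopen for deleting ('d') and drop entries by name. */
zip = zip_open("foo.zip", 0, 'd');
char *names[] = {"file2"};
zip_entries_delete(zip, names, 1);
zip_close(zip);
```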
Look at the readme example.
It's all about calculating the indexes and offsets of the entries. It all has to be written to the central directory at the end.
Yeah, I was following through
Generally, deleting is kind of an experimental feature, so I would be very careful and definitely not mix it with writing.
It's actually still not clear to me why it's

    while (i < entry_num) {
        while ((i < entry_num) && (entry_mark[i].type == MZ_KEEP)) {
            ...
        }
        while ((i < entry_num) && (entry_mark[i].type == MZ_DELETE)) {
            ...
        }
        while ((i < entry_num) && (entry_mark[i].type == MZ_MOVE)) {
            ...
        }
        ...
    }

instead of:

    while (i < entry_num) {
        if (entry_mark[i].type == MZ_KEEP) {
            ...
        } else if (entry_mark[i].type == MZ_DELETE) {
            ...
        } else if (entry_mark[i].type == MZ_MOVE) {
            ...
        }
        ...
    }

Is it some sort of optimization, because doing those actions in bulk via the inner while loops is more efficient?

After I resolved some of my technical debt with the zip file format, I realized the problem I'm running into is that the central_dir is not in sync with the file headers in some way (probably because the seeks that are done when we're dealing with a file stream need to be mirrored for a zip on the heap). I suspect I will be forced to treat the mode 'w' as delete as well, or else I'd have to do a disgusting workaround that:
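For what it's worth, here is a self-contained sketch of the difference between the two loop shapes (the enum and function names are hypothetical stand-ins, not the library's actual code; `DROP` stands in for MZ_DELETE to avoid the Windows `DELETE` macro): the nested whiles consume a whole *run* of identical marks per pass, so a contiguous block of MZ_MOVE entries could be relocated with one bulk copy instead of one copy per entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the library's MZ_KEEP / MZ_DELETE / MZ_MOVE. */
enum mark_type { KEEP, DROP, MOVE };

/* Nested-while shape: each inner loop consumes a whole run of equal
 * marks, and we count one bulk operation (e.g. one large copy) per run. */
static size_t ops_run_coalesced(const enum mark_type *m, size_t n) {
    size_t i = 0, ops = 0;
    while (i < n) {
        enum mark_type t = m[i];
        while (i < n && m[i] == t) /* consume the whole run of `t` marks */
            i++;
        ops++;                     /* one bulk operation for the run */
    }
    return ops;
}

/* if/else shape: one operation per entry, no coalescing. */
static size_t ops_per_entry(const enum mark_type *m, size_t n) {
    (void)m; /* the mark type would be inspected per entry in real code */
    return n;
}
```

With marks {KEEP, KEEP, DROP, MOVE, MOVE, MOVE}, the coalesced form performs 3 operations (one per run) versus 6 for the per-entry form, which is the kind of saving that matters when each MOVE is a copy of entry data within the archive.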
I understand that separating the delete and write operations simplifies the API, because keeping everything up to date for both in-memory zips and file-backed zips can get complex, but are there any other compelling reasons for this design? I'll continue pushing forward in my branch and hopefully come up with a solution that doesn't require jumping through a lot of flaming hoops.
I wonder if the issue is that in
Nevermind,
Alright, finally got things working in #332. I plan on doing some more thorough testing of zip_entries_delete when I get time, to see if it handles the edge cases I can think of.
Thanks! If you want to add more comments, do not hesitate to reopen it or create a new one.
It looks like zip_entries_delete fails to delete files for in-memory stream zips because pState->m_pFile is always NULL for them. Some example code to trigger the issue is:
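The original snippet isn't shown above; as a stand-in, a repro of roughly this shape, sketched against the kuba--/zip stream API (not compiled here; entry names and contents are made up), illustrates the setup:

```c
#include <zip.h>

/* Create a zip entirely in memory ('w' mode on a stream). */
struct zip_t *zip = zip_stream_open(NULL, 0, ZIP_DEFAULT_COMPRESSION_LEVEL, 'w');
zip_entry_open(zip, "file1");
zip_entry_write(zip, "aaaa", 4);
zip_entry_close(zip);
zip_entry_open(zip, "file2");
zip_entry_write(zip, "bbbb", 4);
zip_entry_close(zip);

/* Deleting from the still-open stream zip fails: internally
 * pState->m_pFile is NULL because there is no backing file. */
char *names[] = {"file2"};
zip_entries_delete(zip, names, 1);
zip_stream_close(zip);
```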