Add --no-name gzip flag to compression file output #50

Closed
gwaybio opened this issue Jun 1, 2020 · 2 comments
Labels: enhancement (New feature or request), Version 2 Wishlist (Items to process before a version 2 release)

Comments

gwaybio commented Jun 1, 2020

We get annoying file diffs when reprocessing the pipeline, even if nothing in the file changes. This is important to fix so that we can isolate the actual changes that result from reprocessing output data.

As @shntnu notes in #48, the gzip files trigger positive diffs because of an added timestamp.

The way to remove the timestamp from the file is to pass a --no-name (-n) flag to the gzip command. See http://linuxcommand.org/lc3_man_pages/gzip1.html

Fortunately, it looks like pandas-dev/pandas#33398 added the ability to pass arguments to pandas gzip compression. This improvement will be included in pandas version 1.1, which is scheduled for an Aug 1 release.
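As a rough illustration of the Python route, here is a minimal sketch using only the standard library `gzip` module. It is not the fix that was eventually merged, and `gzip_bytes_deterministic` is a hypothetical helper name. Passing `mtime=0` zeroes the timestamp in the gzip header, and `filename=""` leaves the original file name out, which should approximate what `gzip -n` does on the command line.

```python
import gzip
import io


def gzip_bytes_deterministic(data: bytes) -> bytes:
    """Compress data so the gzip header stores no timestamp and no file name.

    Hypothetical helper for illustration only.
    """
    buffer = io.BytesIO()
    # mtime=0 writes a zero timestamp instead of the current time;
    # filename="" keeps the optional name field out of the header.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buffer, mtime=0) as handle:
        handle.write(data)
    return buffer.getvalue()


if __name__ == "__main__":
    payload = b"Metadata_Well,Feature_1\nA01,0.5\n"
    first = gzip_bytes_deterministic(payload)
    second = gzip_bytes_deterministic(payload)
    # Identical input should now give byte-identical output across runs.
    print(first == second)
```

With pandas 1.1 or later, the same `mtime` argument can probably be passed through the `compression` dictionary of `to_csv` (for example `compression={"method": "gzip", "mtime": 1}`), but that depends on the pandas version installed, so treat it as an assumption to verify.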

Three Options

For the pandas or python option, the solution should ideally live in pycytominer. I've created a stub for this at cytomining/pycytominer#83

gwaybio added the enhancement and Version 2 Wishlist labels Jun 1, 2020

gwaybio commented Apr 20, 2021

Fixed in #63.

gwaybio closed this as completed Apr 20, 2021

shntnu commented Apr 21, 2021

Awesome!!!
