Compressed pdf larger than original? #52

Open
eashalm opened this issue Jun 5, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@eashalm

eashalm commented Jun 5, 2024

$ pdfly compress in.pdf out.pdf
Original Size  : 1,996,123
Compressed Size: 2,014,972 (100.9% of original)

How is this possible?

@pubpub-zz

Please add the test code, the input file, and the output file.
Without them, we cannot review this.

@eashalm
Author

eashalm commented Jun 5, 2024

> Please add the test code, the input file, and the output file. Without them, we cannot review this.

I cannot provide the input and output files as they contain sensitive personal information. Just try it out with some PDFs on your computer and you'll see that the compress command is broken.

@JellyJoe198

I am having the same issue with multiple pdf files.

$ pdfly compress Lockhart_2002_-_A_Mathematician\'s_Lament.pdf Lockhart_compressed.pdf
Ignoring wrong pointing object 0 0 (offset 0)
Ignoring wrong pointing object 91 0 (offset 0)
Ignoring wrong pointing object 93 0 (offset 0)
Original Size  : 400,277
Compressed Size: 418,320 (104.5% of original)

Lockhart_2002_-_A_Mathematician's_Lament.pdf
Lockhart_compressed.pdf

Another example:

$ pdfly compress Example_form.pdf Output.pdf 
Original Size  : 95,569
Compressed Size: 103,325 (108.1% of original)

Strangely, compressing the output of this form a second time reduces the size, although the result is still larger than the original:

$ pdfly compress Output.pdf Out2.pdf
Original Size  : 103,325
Compressed Size: 98,634 (95.5% of original)

Example_form.pdf
Output.pdf
Out2.pdf

@pubpub-zz

These cases are possible. The compress command applies lossless compression to streams, but other techniques, such as building object streams, could reduce the size too. However, pypdf currently has no capability to build such streams or to define a strategy for compressing them.
The only easy solution I can currently imagine would be to write the output into a stream, compare its size with the original, and, if it is larger, just return the original file. If this sounds good to you, do not hesitate to propose a PR.
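
The fallback described above could look roughly like the sketch below. It is only an illustration, not pdfly's actual code: the function name `write_smaller` and the `write_compressed` callback are hypothetical, and the callback stands in for whatever writes the compressed PDF into a stream.

```python
import io
import shutil
from pathlib import Path
from typing import BinaryIO, Callable


def write_smaller(
    original: Path,
    output: Path,
    write_compressed: Callable[[BinaryIO], object],
) -> bool:
    """Write the compressed result to `output` only if it is smaller.

    `write_compressed` is a hypothetical callback that writes the
    compressed document into the stream it receives.
    Returns True if the compressed bytes were written, False if the
    original file was copied unchanged. `original` and `output` are
    assumed to be different paths.
    """
    # Render the compressed document in memory first, so its size can be checked.
    buffer = io.BytesIO()
    write_compressed(buffer)
    compressed = buffer.getvalue()

    if len(compressed) < original.stat().st_size:
        output.write_bytes(compressed)
        return True

    # Compression did not help: fall back to the untouched original.
    shutil.copyfile(original, output)
    return False
```

pypdf's `PdfWriter.write` accepts a file-like object, so a bound `writer.write` could presumably be passed as `write_compressed`. The reported sizes would then need to reflect which branch was taken.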

Lucas-C added the bug label on Oct 29, 2024