fix for failing numcodecs.zarr3 codecs #3326

Open: wants to merge 4 commits into base: main


@terraputix terraputix commented Aug 3, 2025

Closes #2900.

I did not add a unit test because #2900 already mentions that all numcodecs codecs should be better tested in CI, and I think that belongs in a separate PR.

You can verify that it works by running the following script:

#!/usr/bin/env -S uv run
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     # "zarr==3.0.2,<3.0.3", # WORKS
#     # "zarr==3.0.4", # FAILS
#     "zarr@git+https://github.com/terraputix/zarr-python.git@16bd1c7825b895d0247b08b255ffcfa214b6e150",  # WORKS
#     "numcodecs==0.15.0",
#     "zfpy==1.0.1",
#     "pcodec==0.3.2",
# ]
# ///
#


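import numpy as np
import zarr

# Override the reported version before importing numcodecs.zarr3
# (assumption: this lets the import accept a git install of zarr).
zarr.__version__ = "3.0.2"
from numcodecs.zarr3 import ZFPY, PCodec

# Write the same 2x2 array once per serializer; the write fails on affected zarr versions.
for serializer in [
    ZFPY(mode=4, tolerance=0.01),
    PCodec(level=8, mode_spec="auto"),
]:
    array = zarr.create_array(
        store=zarr.storage.LocalStore("test"),
        shape=[2, 2],
        chunks=[2, 1],  # each chunk is a single column of the array
        dtype=np.float32,
        serializer=serializer,
        compressors=None,
        overwrite=True,
    )
    # Assigning the whole array sends every chunk through the codec pipeline.
    array[...] = np.array([[0, 1], [2, 3]])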

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Aug 3, 2025
@terraputix terraputix force-pushed the fix-pcodec-compression branch from 83413cb to 16bd1c7 Compare August 6, 2025 16:01

codecov bot commented Aug 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.54%. Comparing base (a26926c) to head (f665b5b).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3326   +/-   ##
=======================================
  Coverage   94.54%   94.54%           
=======================================
  Files          78       78           
  Lines        9423     9423           
=======================================
  Hits         8909     8909           
  Misses        514      514           
Files with missing lines Coverage Δ
src/zarr/core/codec_pipeline.py 93.13% <100.00%> (ø)

@terraputix
Author

I improved this fix so that it avoids the extra copy I introduced earlier.

Basically, this change reverts the codec pipeline modifications introduced in #2851 and provides an alternative fix for handling chunks at the end of the array when the chunk shape does not evenly divide the array shape.
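To make the edge-chunk case concrete, here is a minimal sketch in plain Python. It is not the code from this PR or from zarr-python; `edge_chunk_extents` is a hypothetical helper that only shows which chunks end up partially filled when the chunk shape does not evenly divide the array shape.

```python
from itertools import product


def edge_chunk_extents(array_shape, chunk_shape):
    """Map each chunk grid index to the part of that chunk lying inside the array.

    Hypothetical helper for illustration only; not part of zarr-python.
    """
    # Number of chunks along each axis (ceiling division).
    grid = [(a + c - 1) // c for a, c in zip(array_shape, chunk_shape)]
    extents = {}
    for index in product(*(range(n) for n in grid)):
        # A chunk at the end of an axis may cover fewer elements than chunk_shape.
        extents[index] = tuple(
            min(c, a - i * c) for i, a, c in zip(index, array_shape, chunk_shape)
        )
    return extents


# A (5, 4) array with (2, 3) chunks has a 3 x 2 chunk grid.
extents = edge_chunk_extents(array_shape=(5, 4), chunk_shape=(2, 3))
for index, extent in sorted(extents.items()):
    print(index, extent)
```

Only chunk indices (0, 0) and (1, 0) are full here; the others extend past the array bounds along at least one axis, which is the situation the pipeline has to handle when encoding and decoding.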

Would you mind taking a look at this one, @dcherian or @d-v-b?

@d-v-b
Contributor

d-v-b commented Aug 8, 2025

I had a look, but I don't know this part of the code well and that function has no comments or docstring(!), so I'm not sure how much my review is worth. If the tests pass and someone who knows the code a bit better gives it a thumbs-up (cc @dcherian) then I think we can merge.

but why are we fixing this in zarr python, instead of in the individual codecs?

@terraputix
Author

terraputix commented Aug 8, 2025

> but why are we fixing this in zarr python, instead of in the individual codecs?

I actually tried the solution proposed in #2900 (comment), but it was breaking a handful of other tests in numcodecs...

This PR is basically an improved version of #2851. It also does not require copies in downstream codecs, so (I think) it should be preferred.
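For readers wondering what "copies in downstream codecs" would involve, here is a small NumPy sketch. It assumes, without confirmation from this thread, that the copy in question is the kind a codec makes when it needs a C-contiguous buffer but receives a strided view. The shapes mirror the verification script above; nothing here calls zarr or numcodecs.

```python
import numpy as np

# A 2x2 float32 array, matching the shape and dtype in the verification script.
value = np.array([[0, 1], [2, 3]], dtype=np.float32)

# With chunks of shape (2, 1), one chunk corresponds to one column of `value`.
column_view = value[:, 0:1]

# Basic slicing returns a view: no data is copied, but the elements are strided.
is_view = np.shares_memory(column_view, value)
view_is_contiguous = column_view.flags["C_CONTIGUOUS"]

# np.ascontiguousarray copies only when the input is not already C-contiguous.
column_copy = np.ascontiguousarray(column_view)
copy_is_contiguous = column_copy.flags["C_CONTIGUOUS"]
copy_shares_memory = np.shares_memory(column_copy, value)

print(is_view, view_is_contiguous, copy_is_contiguous, copy_shares_memory)
```

Under that assumption, a codec-side fix would pay for one such copy per non-contiguous chunk, whereas handing codecs contiguous buffers from the pipeline avoids it.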

Development

Successfully merging this pull request may close these issues.

Some numcodecs.zarr3 codecs fail with 3.0.4+