oci: Generate composefs EROFS at pull time, track via config refs #263

Open
cgwalters wants to merge 16 commits into composefs:main from cgwalters:add-image-refs-for-gc

@cgwalters cgwalters commented Mar 14, 2026

Depends: #265


Motivation: For bootc I want to store both EROFS images (the plain one and
the boot-transformed one) automatically, without needing to manually hold
references.

More generally, for generic OCI container images we want a clean model
where a tag points to a manifest, which in turn references everything
else automatically.

With this change, pulling an OCI container image generates the EROFS
image by default and references it from the splitstream for the config.

The next step: for bootable images, the config can be rewritten with additional refs
(e.g. "composefs.image.boot").

@cgwalters cgwalters force-pushed the add-image-refs-for-gc branch 2 times, most recently from d62471b to c141eeb Compare March 14, 2026 20:39
@cgwalters cgwalters force-pushed the add-image-refs-for-gc branch 2 times, most recently from 5d15c15 to 8876ea0 Compare March 16, 2026 22:35
Johan-Liebert1 previously approved these changes Mar 17, 2026
@Johan-Liebert1 Johan-Liebert1 left a comment

Skimming through the code and trying it out, LGTM

@Johan-Liebert1 Johan-Liebert1 dismissed their stale review March 17, 2026 05:51

The merge-base changed after approval.

@Johan-Liebert1 Johan-Liebert1 force-pushed the add-image-refs-for-gc branch from 8876ea0 to 07cd372 Compare March 17, 2026 05:51
@Johan-Liebert1
Rebased and pushed

Johan-Liebert1 added a commit to Johan-Liebert1/composefs-rs that referenced this pull request Mar 17, 2026
While trying out composefs#263
locally, I found that Rust is rearranging the SplitStreamHeader
struct's fields to the following

```
[32, 0, 0, 0, 0, 0, 0, 0, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0,
83, 112, 108, 105, 116, 83, 116, 114, 101, 97, 109, 0, 2, 12]
```

The first two U64 are the start and end of the field `info` when `info`
is the last field in the struct; then comes the flags which is the third
field...

We probably don't want this nondeterministic behaviour, which might also
change from arch to arch.
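
Rust's default struct representation makes no guarantee about field order, which is why the bytes above do not follow declaration order. One common way to pin the layout is `#[repr(C)]`. The sketch below is only an illustration: `HeaderSketch` and its field names and types are hypothetical stand-ins, not the real `SplitStreamHeader` definition, and nothing here is claimed to be the fix that was actually committed.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical 32-byte on-disk header (illustrative names and types only).
#[repr(C)] // lay fields out in declaration order, following C layout rules
pub struct HeaderSketch {
    pub magic: [u8; 12],
    pub version: u8,
    pub algorithm: u8,
    pub flags: u16,
    pub info_start: u64,
    pub info_end: u64,
}

/// Byte offset of each field, listed in declaration order.
pub fn field_offsets() -> [usize; 6] {
    [
        offset_of!(HeaderSketch, magic),
        offset_of!(HeaderSketch, version),
        offset_of!(HeaderSketch, algorithm),
        offset_of!(HeaderSketch, flags),
        offset_of!(HeaderSketch, info_start),
        offset_of!(HeaderSketch, info_end),
    ]
}

/// Total size of the header in bytes.
pub fn header_size() -> usize {
    size_of::<HeaderSketch>()
}
```

With `#[repr(C)]` the offsets increase in declaration order, so `info_start` and `info_end` land at the end of the header rather than at the front.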

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
github-merge-queue bot pushed a commit that referenced this pull request Mar 17, 2026
@cgwalters cgwalters force-pushed the add-image-refs-for-gc branch from 07cd372 to 4bd6e6c Compare March 20, 2026 12:13
cgwalters and others added 15 commits March 20, 2026 13:25
Add privileged_mount_image test that creates a composefs image and
mounts it, exercising the full overlayfs lowerdir setup. This catches
kernel compat regressions like the pre-6.15 fsconfig_set_fd bug.

Add CentOS Stream 9 to the CI matrix so all three mount compat tiers
are covered:
  - debian-bootc (6.18): default, no compat features
  - centos-bootc:stream10 (6.12): pre-6.15
  - centos-bootc:stream9 (5.14): rhel9

Pass per-distro cfsctl_features through Containerfile build-arg so
each VM builds cfsctl with the right feature for its kernel.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Enable Renovate dependency updates by inheriting the shared config
from bootc-dev/infra. See docs/SOP-new-repository.md in that repo
for details.

Assisted-by: OpenCode (claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Add cap-std and cap-tempfile as dev-dependencies to composefs and
composefs-oci for capability-scoped filesystem manipulation in tests.

Add TestRepo::path() for accessing the repository's filesystem path,
and TestRepo::dir() for getting a cap_std::fs::Dir handle scoped to
the repository root (preventing accidental path traversal in tests).

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Previously fsck was just `unimplemented!()`; now it's implemented.

We needed fsck on general principle anyway, but the specific
motivation is that as we add complex things like OCI artifacts,
I want to have good cross-checking.

There is a `--json` output mode, which always exits with 0.
That said, I think this would be much better with varlink,
as we could report incremental progress and such.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
While trying out composefs#263
locally, I found that Rust is rearranging the SplitStreamHeader
struct's fields to the following

```
[32, 0, 0, 0, 0, 0, 0, 0, 112, 0, 0, 0, 0, 0, 0, 0, 0, 0,
83, 112, 108, 105, 116, 83, 116, 114, 101, 97, 109, 0, 2, 12]
```

The first two U64 are the start and end of the field `info` when `info`
is the last field in the struct; then comes the flags which is the third
field...

We probably don't want this nondeterministic behaviour, which might also
change from arch to arch.

Signed-off-by: Pragyan Poudyal <pragyanpoudyal41999@gmail.com>
Synchronized from bootc-dev/infra@56e4f61.

Signed-off-by: bootc-dev Bot <bot@bootc.dev>
The composefs-dump(5) spec leaves several fields unspecified or
explicitly ignored. Canonicalize them at parse time so that parsed
entries have a single canonical representation regardless of which
implementation produced them:

- **Directory sizes**: "This is ignored for directories." Drop the
  size field from Item::Directory, always emit 0.

- **Hardlink metadata**: "We ignore all the fields except the
  payload." Zero uid/gid/mode/mtime and skip xattrs, matching the
  C parser which bails out early (mkcomposefs.c:477-491).

- **Xattr ordering**: The spec doesn't define an order. Sort
  lexicographically so output is deterministic regardless of
  on-disk ordering.

The parser still accepts any input values for backward compatibility.
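
The three rules above can be sketched as a single normalization pass. This is a minimal illustration, not the crate's parser: `Meta`, `Item`, `Entry` and `canonicalize` are hypothetical, and the sketch zeroes the directory size in place, whereas the commit drops the field from `Item::Directory` altogether.

```rust
/// Hypothetical, simplified entry metadata (not the crate's real types).
#[derive(Debug, Clone, PartialEq, Eq, Default)]
pub struct Meta {
    pub uid: u32,
    pub gid: u32,
    pub mode: u32,
    pub mtime_sec: i64,
    pub xattrs: Vec<(Vec<u8>, Vec<u8>)>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Item {
    Directory { size: u64 },
    Hardlink { target: String },
    Regular { size: u64 },
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub path: String,
    pub item: Item,
    pub meta: Meta,
}

/// Accept any input values, but return one canonical representation.
pub fn canonicalize(mut entry: Entry) -> Entry {
    match &mut entry.item {
        // Directory sizes are ignored by the spec: always use 0.
        Item::Directory { size } => *size = 0,
        // Only the payload of a hardlink matters: reset all other metadata,
        // which also discards any xattrs.
        Item::Hardlink { .. } => entry.meta = Meta::default(),
        Item::Regular { .. } => {}
    }
    // Sort xattrs lexicographically so output order is deterministic.
    entry.meta.xattrs.sort();
    entry
}
```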

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
XFS limits symlink targets to 1024 bytes, and since generic Linux
containers are commonly backed by XFS, enforce that limit in both
the dumpfile parser and the EROFS reader rather than allowing up to
PATH_MAX (4096).

This also avoids exercising a known limitation in our EROFS reader
where symlink data that spills into a non-inline data block (which
can happen with long symlinks + xattrs) is not read back correctly.
See composefs/composefs#342 for the
corresponding C fix for that edge case.
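
A length check of this kind might look like the sketch below. The constant and error names are hypothetical, and treating exactly 1024 bytes as still valid is an assumption about where the boundary sits.

```rust
/// Assumed name; the 1024-byte value is the XFS-derived limit described above.
pub const MAX_SYMLINK_TARGET_LEN: usize = 1024;

/// Hypothetical error type carrying the offending length.
#[derive(Debug, PartialEq, Eq)]
pub struct SymlinkTooLong {
    pub len: usize,
}

/// Reject symlink targets longer than the limit (assumed inclusive).
pub fn check_symlink_target(target: &[u8]) -> Result<(), SymlinkTooLong> {
    if target.len() > MAX_SYMLINK_TARGET_LEN {
        Err(SymlinkTooLong { len: target.len() })
    } else {
        Ok(())
    }
}
```

Calling the same helper from both the dumpfile parser and the EROFS reader would keep the two code paths consistent.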

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Align dumpfile generation more closely with the composefs C
implementation, both on general principle and because of
the goal of reimplementing it in Rust here.

The C composefs implementation uses named escapes for backslash,
newline, carriage return, and tab (\\ \n \r \t), while our writer
was hex-escaping them uniformly (\x5c \x0a etc). Both forms parse
correctly, but byte-identical output matters for cross-implementation
comparison.

Similarly, C only escapes '=' in xattr key/value fields (where it
separates key from value). We were escaping it as \x3d in all fields
including paths and content, where '=' is a normal graphic character.
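
The escaping rules described above can be sketched roughly as follows. `Field` and `escape` are hypothetical names, and the fallback rule (hex-escape every byte that is not ASCII graphic, including space) is an assumption made for illustration rather than a statement of what either implementation does in every case.

```rust
/// Which kind of dumpfile field is being written (hypothetical enum).
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum Field {
    /// Paths, content and other ordinary fields.
    Plain,
    /// Xattr key/value fields, where '=' separates key from value.
    Xattr,
}

/// Escape raw bytes for one dumpfile field.
pub fn escape(bytes: &[u8], field: Field) -> String {
    let mut out = String::new();
    for &byte in bytes {
        match byte {
            // Named escapes, matching the C implementation's output.
            b'\\' => out.push_str("\\\\"),
            b'\n' => out.push_str("\\n"),
            b'\r' => out.push_str("\\r"),
            b'\t' => out.push_str("\\t"),
            // '=' is only special inside xattr key/value fields.
            b'=' if field == Field::Xattr => out.push_str("\\x3d"),
            // Other graphic ASCII passes through unchanged.
            b if b.is_ascii_graphic() => out.push(b as char),
            // Assumption: everything else is hex-escaped.
            b => out.push_str(&format!("\\x{b:02x}")),
        }
    }
    out
}
```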

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Going over dumpfile canonicalization for hardlinks made me think
about the potential for skew between the nlink stored on an EROFS
inode and the links actually present.

That led to the conclusion that our reader should enforce that these match.
While there, my agent also pointed out we could be checking `.`/`..`
among other things.

The previously unused InvalidSelfReference and InvalidParentReference
error variants are now wired up.
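
To illustrate the kind of checks involved, here is a minimal sketch over an in-memory directory model. The `Dir` and `DirEntry` types, the `NlinkMismatch` variant and the convention that `.` and `..` entries count toward the link total are assumptions; only the `InvalidSelfReference` and `InvalidParentReference` names come from the commit message, and their real payloads may differ.

```rust
use std::collections::HashMap;

pub struct DirEntry {
    pub name: String,
    pub inode: u64,
}

pub struct Dir {
    pub inode: u64,
    pub parent: u64,
    pub entries: Vec<DirEntry>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CheckError {
    InvalidSelfReference { dir: u64 },
    InvalidParentReference { dir: u64 },
    /// Hypothetical variant for a stored/observed link-count mismatch.
    NlinkMismatch { inode: u64, stored: u32, found: u32 },
}

/// Verify `.`/`..` targets and compare stored nlink values with the number
/// of directory entries that actually point at each inode.
pub fn check(dirs: &[Dir], stored_nlink: &HashMap<u64, u32>) -> Result<(), CheckError> {
    let mut found: HashMap<u64, u32> = HashMap::new();
    for dir in dirs {
        for entry in &dir.entries {
            match entry.name.as_str() {
                "." if entry.inode != dir.inode => {
                    return Err(CheckError::InvalidSelfReference { dir: dir.inode })
                }
                ".." if entry.inode != dir.parent => {
                    return Err(CheckError::InvalidParentReference { dir: dir.inode })
                }
                _ => {}
            }
            // Every entry, including "." and "..", is one link to its target.
            *found.entry(entry.inode).or_insert(0) += 1;
        }
    }
    for (&inode, &stored) in stored_nlink {
        let seen = found.get(&inode).copied().unwrap_or(0);
        if seen != stored {
            return Err(CheckError::NlinkMismatch { inode, stored, found: seen });
        }
    }
    Ok(())
}
```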

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
The old sealing approach stored an fsverity digest in OCI config labels
(containers.composefs.fsverity) and provided seal()/mount() functions to
write and consume it. This is being replaced by EROFS image refs stored
directly in config splitstreams, which integrates with the GC model and
avoids mutating the OCI config.

Remove the seal() and mount() library functions, the seal_digest() and
is_sealed() methods on OciImage, the "sealed" field from ImageInfo, and
the corresponding Seal/Mount CLI subcommands and SEALED table column.
Also remove the now-obsolete implementation plan doc.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Describe the current OCI storage model: naming conventions for
manifest/config/layer/blob splitstreams, how tags map to refs under
streams/refs/oci/, the named_ref chains (manifest→config+layers,
config→layers), and how the GC walks from tags to objects.

Also notes the current gap: EROFS images derived from OCI content are
not referenced by any splitstream, so their lifecycle must be managed
separately.
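
The GC walk described above is essentially a reachability pass over named refs. The sketch below models it with hypothetical types (`Stream`, `reachable_objects`) and plain string names in place of real splitstream identifiers; it is an illustration of the idea, not the repository's GC code.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stream: the streams it names and the objects it uses.
#[derive(Default)]
pub struct Stream {
    pub named_refs: Vec<String>,
    pub objects: Vec<String>,
}

/// Collect every object reachable from the given tag targets by following
/// named refs (manifest -> config + layers, config -> layers).
pub fn reachable_objects(
    tags: &[&str],
    streams: &BTreeMap<String, Stream>,
) -> BTreeSet<String> {
    let mut seen = BTreeSet::new();
    let mut live = BTreeSet::new();
    let mut stack: Vec<String> = tags.iter().map(|t| t.to_string()).collect();
    while let Some(name) = stack.pop() {
        // Shared layers are reached more than once; visit each stream once.
        if !seen.insert(name.clone()) {
            continue;
        }
        if let Some(stream) = streams.get(&name) {
            live.extend(stream.objects.iter().cloned());
            stack.extend(stream.named_refs.iter().cloned());
        }
    }
    live
}
```

Anything not in the returned set is eligible for collection, which is why an EROFS image that no splitstream references needs its lifecycle managed separately.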

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Add test utilities for creating multi-layer OCI images from composefs
dumpfile strings. This uses the real dumpfile format parsed by
dumpfile_to_filesystem(), then walks the resulting FileSystem tree to
emit tar bytes for import_layer().

Two convenience builders with versioned boot content:
- create_base_image: 5-layer busybox-like app image
- create_bootable_image(version): 20-layer bootable OS with kernel and UKI

v1 and v2 share userspace layers (busybox, libs, systemd, configs) but
differ in kernel version (6.1.0 vs 6.2.0), initramfs, modules, and UKI.
When both are pulled into the same repo the shared layers deduplicate,
exercising GC correctness with content referenced by multiple images.

Prep for adding boot image management API.

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Motivation: For bootc I want to store *both* EROFS images automatically
without needing to manually hold references.

More generally, for generic OCI container images we want a clean model
where a tag points to a manifest, which in turn references everything
else automatically.

With this change, pulling an OCI container image generates the EROFS
image by default and references it from the splitstream for the config.

The next step: for bootable images, the config can be rewritten with additional refs
(e.g. "composefs.image.boot").

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>
Bootc needs both a plain EROFS image (for composefs mounts) and a
boot-transformed EROFS (with /boot emptied, SELinux labels applied).
This commit adds the bootable variant as a second named ref on the
config splitstream, using BOOT_IMAGE_REF_KEY ("composefs.image.boot")
alongside the existing IMAGE_REF_KEY ("composefs.image").

The same cascade rewrite pattern applies: adding a boot EROFS ref
rewrites config -> manifest -> tag, and GC keeps the boot EROFS
alive through the config ref chain.
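
The cascade can be pictured with a toy content-addressed model: a stream's identity covers its refs, so adding a ref to the config changes the config's id, which changes the manifest's ref to it, which changes what the tag must point at. In the sketch below, `Stream`, `Image` and `stream_id` are hypothetical, `DefaultHasher` is only a stand-in for the real digest, and the `"config"` ref name is made up; the two key constants use the names and values given above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

pub const IMAGE_REF_KEY: &str = "composefs.image";
pub const BOOT_IMAGE_REF_KEY: &str = "composefs.image.boot";

/// Hypothetical stream: an opaque body plus named refs to other ids.
#[derive(Hash, Clone, Default)]
pub struct Stream {
    pub body: String,
    pub named_refs: BTreeMap<String, u64>,
}

/// Stand-in for a content digest over body and refs.
pub fn stream_id(stream: &Stream) -> u64 {
    let mut hasher = DefaultHasher::new();
    stream.hash(&mut hasher);
    hasher.finish()
}

pub struct Image {
    pub config: Stream,
    pub manifest: Stream,
    /// Id of the manifest stream the tag currently points at.
    pub tag: u64,
}

impl Image {
    pub fn new(config_body: &str, manifest_body: &str) -> Self {
        let config = Stream { body: config_body.to_string(), ..Stream::default() };
        let mut manifest = Stream { body: manifest_body.to_string(), ..Stream::default() };
        manifest.named_refs.insert("config".to_string(), stream_id(&config));
        let tag = stream_id(&manifest);
        Image { config, manifest, tag }
    }

    /// Add an EROFS ref to the config, then rewrite manifest and tag.
    pub fn add_config_ref(&mut self, key: &str, erofs_id: u64) {
        self.config.named_refs.insert(key.to_string(), erofs_id);
        self.manifest
            .named_refs
            .insert("config".to_string(), stream_id(&self.config));
        self.tag = stream_id(&self.manifest);
    }
}
```

Note that the config `body` is never touched in this model; only the refs attached to the stream change.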

CLI:
- cfsctl oci pull --bootable
- cfsctl oci mount --bootable

Assisted-by: OpenCode (Claude claude-opus-4-6)
Signed-off-by: Colin Walters <walters@verbum.org>

# Conflicts:
#	crates/composefs-oci/src/lib.rs
The open_config() return type changed from a tuple to the OpenConfig
struct. Point the bootc reverse-dep CI at bootc-dev/bootc#2044 which
has the matching API update, until that PR is merged to main.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
@cgwalters cgwalters force-pushed the add-image-refs-for-gc branch 3 times, most recently from 8567826 to 37b7da3 Compare March 20, 2026 18:28