As part of the bootable containers effort, users are expected to easily be able to derive from FCOS and add their own layers. This introduces a strong tension between our current update model and users building and rolling out their own OS images.
First, we expect most FCOS users will not be deriving. This is still the "happy path" and we still want e.g. phased rollouts and barriers for those. Users that do want to derive can do so but must be aware that (1) they're responsible for the reliability of their updates, and (2) they need to keep nodes up to date to not run into barrier-related issues.
That said, we also want deriving users to be able to benefit from the same update features FCOS enjoys. This makes the most sense for deriving users who have a whole fleet of nodes and e.g. want phased rollouts or "barriers" of their own. (Though even the single node case could benefit from it; more below.) However, the requirement to host a Cincinnati server is a huge impediment. Yet we don't strictly need it to achieve those same features. Even for FCOS, not having it would lower the maintenance burden.
The proposal then is to:
Come up with a new "v2 update graph" schema that provides all the needed metadata (this likely looks like a much simpler JSON/YAML file than the current schema).
Write user-facing tooling to create/manage/push v2 update graphs. Use this tooling to start also publishing FCOS' graph as an OCI artifact in e.g. quay.io/fedora/fedora-coreos.
Teach Zincati to consume v2 update graphs, e.g. behind a knob; this notably includes moving rollout logic from Cincinnati (server-side) to Zincati (client-side).
Switch FCOS over to use v2 update graphs. Stop publishing new updates in the Cincinnati graph (but we'll need to keep it online for a while).
Add documentation for users that want to build derived container images to either (1) not worry about update graphs, but be aware of the gotchas, or (2) use our tooling to build their own update graphs. Because it's just an OCI artifact, it can be built and pushed to the same OCI repo users push their derivation to, from the same CI pipeline.
The tool should also support "inheriting" from the canonical update graph, so that a derived graph can include upstream barriers.
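To make the "rollout logic moves client-side" point concrete, here is a minimal sketch of how a node could decide on its own whether a phased rollout has reached it yet. Nothing here is an agreed-upon schema: the graph layout, the field names (`releases`, `rollout`, `start_epoch`, `duration_seconds`), the version strings, and the function names are all hypothetical, and this is not how Zincati is implemented today.

```python
import hashlib
import json

# Hypothetical v2 update graph. All field names are assumptions made
# for illustration; the real schema is still to be designed.
GRAPH = json.loads("""
{
  "stream": "example",
  "releases": [
    {"version": "1.0.0", "rollout": null},
    {"version": "2.0.0",
     "rollout": {"start_epoch": 1000, "duration_seconds": 100}}
  ]
}
""")


def rollout_fraction(rollout, now):
    """Fraction of the fleet (0.0 to 1.0) that should see the release at `now`."""
    if rollout is None:
        return 1.0  # no phasing: everyone sees the release
    elapsed = now - rollout["start_epoch"]
    if elapsed <= 0:
        return 0.0
    duration = rollout["duration_seconds"]
    if duration <= 0:
        return 1.0
    return min(1.0, elapsed / duration)


def node_threshold(node_id, version):
    """Stable pseudo-random position of this node within the rollout."""
    digest = hashlib.sha256(f"{node_id}:{version}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def is_offered(release, node_id, now):
    """Client-side decision: has the rollout reached this node yet?"""
    fraction = rollout_fraction(release["rollout"], now)
    return fraction >= 1.0 or node_threshold(node_id, release["version"]) < fraction
```

Because the threshold is derived from the node ID and the version, a node that has been offered a release keeps being offered it as the rollout progresses, and no server needs to track per-node state. The graph itself can stay a static file, which is what makes shipping it as an OCI artifact plausible.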
In general, as long as images are rebuilt promptly and rolled out to nodes frequently, there shouldn't be any issues. Some cases that might cause issues are:
Barriers: if we emit a barrier release and very quickly do another release after it, users may not have rebuilt their derived containers in time, or may not have had time to roll them out completely. In practice, this is quite rare. We should be able to avoid it (e.g. by requiring at least 1 week between two such releases), or if we really do have to do this, we should send out a status post.
Deadends: if we do a release and shortly afterwards roll it back and deadend it, users may have already built derived containers on top of it and rolled them out. This is not actually specific to deriving users; even those not deriving may enter deadends. In practice, this is also very rare, but it has a large impact when it happens. There are no deadend releases on stable. There is one deadend release each on testing (the very first one, on f30) and next. One thing we'll want to make sure of going forward is that we also roll back the Quay.io tags when rolling back a release.
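As a rough illustration of the two cases above, the sketch below picks the next update target for a node from an ordered release list. It assumes a hypothetical graph format in which each release may carry `barrier` and `deadend` flags and releases are listed oldest to newest; the field names, the version strings, and the choice to skip deadended releases as targets are assumptions for illustration, not the current Cincinnati or Zincati behavior.

```python
# Hypothetical release list, oldest to newest. Field names are assumptions.
RELEASES = [
    {"version": "38.1"},
    {"version": "38.2", "barrier": True},
    {"version": "39.1"},
    {"version": "39.2", "deadend": True},
    {"version": "39.3"},
]


def next_target(releases, current):
    """Return the version a node on `current` should update to, or None."""
    versions = [r["version"] for r in releases]
    if current not in versions:
        return None  # unknown release: nothing safe to offer
    idx = versions.index(current)
    if releases[idx].get("deadend"):
        return None  # deadend: no outgoing update edges
    # Never offer a deadended release as an update target.
    later = [r for r in releases[idx + 1:] if not r.get("deadend")]
    if not later:
        return None  # already on the newest usable release
    for release in later:
        if release.get("barrier"):
            return release["version"]  # must stop at the first barrier
    return later[-1]["version"]
```

With this list, a node on `38.1` is sent to the barrier `38.2` first and only then to `39.3`, while a node stuck on the deadended `39.2` gets no target at all. A derived image rebuilt directly on top of `39.3` and pushed to a node still on `38.1` would bypass that stop, which is the barrier risk described above.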