
Introduce ManagedOSChangelog resource #891

Draft
wants to merge 8 commits into main

Conversation

fgiudici
Member

@fgiudici fgiudici commented Dec 9, 2024

This PR introduces the new ManagedOSChangelog resource to allow exposing OS image changelogs.
The prerequisite is to have ElementalOSVersionChannel images that include changelog data, as tracked in rancher/elemental#1654.

Once a ManagedOSChangelog resource is created, the associated controller spawns a Pod and a Service to extract the changelog data from the ManagedOSVersionChannel container image and expose it via the httpfy micro HTTP server.

The elemental-operator then proxies requests to https://${Rancher URL}/elemental/changelog/${managedosversion name}/ to the Service belonging to the related ManagedOSChangelog resource.
The pattern is similar to the one already used for the SeedImage resources.
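
For illustration, a minimal sketch of how such a proxy route could be wired on the operator side. This is not the PR's actual code: the Service naming convention, the namespace and the listen address are assumptions made only for this example.

package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// changelogProxyHandler forwards /elemental/changelog/<managedosversion-name>/...
// to the in-cluster Service spawned for the matching ManagedOSChangelog.
func changelogProxyHandler(w http.ResponseWriter, r *http.Request) {
	name := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/elemental/changelog/"), "/", 2)[0]
	if name == "" {
		http.NotFound(w, r)
		return
	}
	// Assumed convention: one Service per changelog, named after the
	// ManagedOSVersion and living in the fleet-default namespace.
	target, err := url.Parse(fmt.Sprintf("http://%s.fleet-default.svc", name))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
}

func main() {
	http.HandleFunc("/elemental/changelog/", changelogProxyHandler)
	_ = http.ListenAndServe(":8080", nil) // the listen address is arbitrary here
}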

This is a working draft PR: to test it, a sample channel image with changelog data is available at:
quay.io/fgiudici/elemental-channel:6.1-base

A sample ManagedOSChangelog resource:

apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSChangelog
metadata:
  name: changelog
  namespace: fleet-default
spec:
  channelRef:
    name: sl-micro-6.1-base
    namespace: fleet-default
  osVersion: base-v2.2.0-3.12-os

Commits (all Signed-off-by: Francesco Giudici <[email protected]>):
- config/crd/kustomization.yaml: add the managedoschangelogs yaml and run make build-crds
- set ManagedOSChangelog.status.changelogURL, allowing the changelog to be displayed through the Rancher UI URL https://<RancherURL>/elemental/changelog/<ManagedOSVersion.Name>
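
As a rough sketch only, the changelogURL recorded in the status could be assembled along these lines (the helper name and its parameters are made up for illustration, not taken from the PR):

package sketch

import (
	"fmt"
	"strings"
)

// changelogURL builds the value stored in ManagedOSChangelog.status.changelogURL,
// i.e. https://<RancherURL>/elemental/changelog/<ManagedOSVersion.Name>.
func changelogURL(rancherURL, managedOSVersionName string) string {
	return fmt.Sprintf("%s/elemental/changelog/%s", strings.TrimSuffix(rancherURL, "/"), managedOSVersionName)
}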
@fgiudici fgiudici added the status/do not merge (Do not merge!) and spike (Some research required) labels Dec 9, 2024
@fgiudici fgiudici requested a review from a team as a code owner December 9, 2024 06:57
@github-actions github-actions bot added the area/operator (operator related changes) and area/tests (test related changes) labels Dec 9, 2024
@kkaempf
Contributor

kkaempf commented Dec 9, 2024

Not sure about

request to https://${Rancher URL}/elemental/changelog/${managedosversion name}/

Didn't we talk about computed changelogs, based on "old" vs "new" versions?

@anmazzotti
Contributor

| based on "old" vs "new" versions?

The idea is still to base it on "old" vs "new". As a user you can check one ManagedOSVersion changelog, and the assumption is that the changelog is "from the previous image within this channel".

I think this also implies that, for clarity, all images should be kept in the channel, not limited to the last n (3) as we do right now. However, this would also imply that a channel will contain hundreds of versions, since we can rebuild multiple times a week. Lastly, some images may have no changelog at all, for example if a new image was rebuilt with only a recompilation of packages and no changelog entry was added for it (maybe because it did not involve any CVE etc.). Not for this PR however; this mostly depends on how we maintain/populate the channels, and on the criteria OBS uses to generate the updateinfo.

@anmazzotti
Contributor

@fgiudici is it correct that this spawns a pod per version? What I thought initially is that we could have a pod per channel; the pod should always be running (with constrained cpu/memory limits), and probably be re-synced/restarted together with the channel sync, to pick up updates.

To the point that we could move the channel sync logic to the pod itself and let it handle all the ManagedOSVersions, on top of offering the optional http endpoint to retrieve changelogs.

If the pod can offer changelogs, it can keep running after syncing the ManagedOSVersions; otherwise it can shut down and wait for the next sync time. The pod status, a feature toggle, etc. can just be added to the ManagedOSVersionChannel resource, I wouldn't create a new one.

What worries me the most is that users will have to wait for the pod start-up time; this is acceptable for building SeedImages, but I wouldn't want to wait to read a changelog. The other problem is that if a user is very curious and looks at all the changelogs in multiple channels, they may spawn a huge number of pods.
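
For the record, a minimal sketch of what "constrained cpu/memory limits" for such a long-running per-channel pod could look like; the container name, the image handling and the concrete numbers are placeholders, nothing decided in this thread:

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// changelogContainer returns a container spec with constrained resources for a
// long-running per-channel changelog server.
func changelogContainer(channelImage string) corev1.Container {
	return corev1.Container{
		Name:  "changelog-server",
		Image: channelImage, // the ManagedOSVersionChannel image, resolved elsewhere
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("50m"),
				corev1.ResourceMemory: resource.MustParse("64Mi"),
			},
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("200m"),
				corev1.ResourceMemory: resource.MustParse("128Mi"),
			},
		},
	}
}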

@fgiudici
Member Author

fgiudici commented Dec 9, 2024

This is just a PoC (a working one, which was useful to experiment).
@anmazzotti , yep, currently it spawns a pod per changelog. I don't think this is something we should do or merge.
Anyway, this followed the pre-computed per-image OS changelog we decided to put in the channel image (built on top of the PoC from rancher/elemental#1654). So @kkaempf, the new vs old changelog is already precomputed in the channel for each released OS image.
If we follow the worker Pod path, I think David's idea of computing the changelog on the fly, starting from a single set of changelog data, makes sense (since we would need a worker Pod in any case, why do the computation twice?).
Agreed on having one Pod per channel, which would make more sense, and on avoiding the introduction of a new resource, expanding the channel instead (but for the PoC I preferred to keep it separate: the OS channel controller will become more complex and the integration of the new feature should be done properly to keep it easy to maintain, not something you want when experimenting with a PoC).
We still have to make our final call on how the changelog feature should look: at least clearly define the format and the behavior.
I would even consider keeping the changelog out of the operator, pre-computing it and publishing it outside the Rancher cluster (SUSE Docs?). Something we could discuss in the next few days.

Contributor

@davidcassany davidcassany left a comment


@fgiudici is this correct that it spawns a pod per-version? What I thought initially is that we could have a pod per channel, the pod should always be running (with constrained cpu/memory limits), and probably be re-sync/restarted together with the channel sync, to pick on updates.

Yes, this was also my idea, so having a deployed service per active channel. Anyway, I do not have a strong opinion on what's better; I'd essentially try to keep it as simple as possible.

Also, having a resource per query looks a bit overkill to me: how is the resource cleanup handled then? I don't think we should expect the user to clean it up manually.
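
A rough sketch of the kind of automatic cleanup being asked about, assuming a lifetime field like the cleanupAfterMinutes one quoted below (the helper name and return convention are made up, this is not necessarily how the PR handles it):

package sketch

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// requeueOrExpire tells the reconciler whether an object created at `created`
// with a lifetime of `minutes` should be kept (requeued) or cleaned up now.
func requeueOrExpire(created metav1.Time, minutes int32) (ctrl.Result, bool) {
	lifetime := time.Duration(minutes) * time.Minute
	elapsed := time.Since(created.Time)
	if elapsed < lifetime {
		// Not expired yet: ask controller-runtime to reconcile again later.
		return ctrl.Result{RequeueAfter: lifetime - elapsed}, false
	}
	// Expired: the caller can delete the ManagedOSChangelog plus its Pod and Service.
	return ctrl.Result{}, true
}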

LifetimeMinutes int32 `json:"cleanupAfterMinutes"`
// Refresh triggers rebuilding a changelog that has been cleaned up.
// +optional
Refresh bool `json:"refresh"`
I don't think this should be part of the spec. IMHO this belongs to the state and should be handled there. I know subresource updates via kubectl are still a beta feature, expected to be fully supported starting from v1.31, but I think it would make more sense to track it as a condition.
I am not sure that having user input manipulated and updated in spec by the controller is a good pattern. It is prone to cache miss errors and issues during backup & restore.
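
A minimal sketch of the condition-based alternative suggested above; the condition type, reason and message are invented for illustration only:

package sketch

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markRefreshRequested records the "refresh" intent as a status condition
// instead of a mutable spec field handled by the controller.
func markRefreshRequested(conditions *[]metav1.Condition) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:    "ChangelogPopulated",
		Status:  metav1.ConditionFalse,
		Reason:  "RefreshRequested",
		Message: "changelog data was cleaned up and needs to be rebuilt",
	})
}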

logger.V(5).Info("Sync WorkerPodReady condition from worker Pod", "pod-phase", pod.Status.Phase)

switch pod.Status.Phase {
case corev1.PodPending:
A bit unrelated to this specific feature, but I'd also love to see all this logic shared and restructured across the seedimage, channel and any other resource. It would be nice to have consistent utilities for managing a pod life cycle. I am also wondering whether we couldn't manage and abstract all this as k8s Jobs to make lifecycle management simpler.
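
As a sketch of the kind of shared utility suggested here (all names invented), the worker Pod phase handling could be factored into one helper reused by the SeedImage, channel and changelog controllers:

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// workerPodCondition converts a worker Pod phase into a generic
// "WorkerPodReady" condition that any owning resource could reuse.
func workerPodCondition(phase corev1.PodPhase) metav1.Condition {
	cond := metav1.Condition{Type: "WorkerPodReady"}
	switch phase {
	case corev1.PodRunning:
		cond.Status, cond.Reason = metav1.ConditionTrue, "PodRunning"
	case corev1.PodPending:
		cond.Status, cond.Reason = metav1.ConditionFalse, "PodPending"
	case corev1.PodFailed:
		cond.Status, cond.Reason = metav1.ConditionFalse, "PodFailed"
	default:
		cond.Status, cond.Reason = metav1.ConditionUnknown, "Pod"+string(phase)
	}
	return cond
}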

@fgiudici
Member Author

Tried to capture the notes here and the discussion outcome in the original issue: #890

@fgiudici fgiudici marked this pull request as draft December 10, 2024 11:28
4 participants