
Introduce ManagedOSChangelog resource #891

Draft
wants to merge 8 commits into main

Conversation

fgiudici
Member

@fgiudici fgiudici commented Dec 9, 2024

This PR introduces the new ManagedOSChangelog resource to allow exposing OS image changelogs.
The prerequisite is to have ElementalOSVersionChannel images that include changelog data, as tracked in rancher/elemental#1654.

Once a ManagedOSChangelog resource is created, the associated controller spawns a Pod and a Service to extract the changelog data from the ManagedOSVersionChannel container image and expose it via the httpfy micro HTTP server.

The elemental-operator then proxies requests to https://${Rancher URL}/elemental/changelog/${managedosversion name}/ to the Service belonging to the related ManagedOSChangelog resource.
The pattern is similar to the one already used for the SeedImage resources.
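
For illustration, a minimal sketch of how such a proxy route could be wired on the operator side. This is not the PR's actual code: the Service naming convention, the namespace and the listen address are assumptions made only for this example.

package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

// changelogProxyHandler forwards /elemental/changelog/<managedosversion-name>/...
// to the in-cluster Service spawned for the matching ManagedOSChangelog.
func changelogProxyHandler(w http.ResponseWriter, r *http.Request) {
	name := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/elemental/changelog/"), "/", 2)[0]
	if name == "" {
		http.NotFound(w, r)
		return
	}
	// Assumed convention: one Service per changelog, named after the
	// ManagedOSVersion and living in the fleet-default namespace.
	target, err := url.Parse(fmt.Sprintf("http://%s.fleet-default.svc", name))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	httputil.NewSingleHostReverseProxy(target).ServeHTTP(w, r)
}

func main() {
	http.HandleFunc("/elemental/changelog/", changelogProxyHandler)
	_ = http.ListenAndServe(":8080", nil) // the listen address is arbitrary here
}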

This is a working draft PR: to test it, a sample channel image with changelog data is available at:
quay.io/fgiudici/elemental-channel:6.1-base

A sample ManagedOSChangelog resource:

apiVersion: elemental.cattle.io/v1beta1
kind: ManagedOSChangelog
metadata:
  name: changelog
  namespace: fleet-default
spec:
  channelRef:
    name: sl-micro-6.1-base
    namespace: fleet-default
  osVersion: base-v2.2.0-3.12-os

Commits (all Signed-off-by: Francesco Giudici <[email protected]>):
- config/crd/kustomization.yaml: add the managedoschangelogs yaml and run make build-crds
- set ManagedOSChangelog.status.changelogURL, allowing the changelog to be displayed through the Rancher UI URL https://<RancherURL>/elemental/changelog/<ManagedOSVersion.Name>
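
As a rough sketch only, the changelogURL recorded in the status could be assembled along these lines (the helper name and its parameters are made up for illustration, not taken from the PR):

package sketch

import (
	"fmt"
	"strings"
)

// changelogURL builds the value stored in ManagedOSChangelog.status.changelogURL,
// i.e. https://<RancherURL>/elemental/changelog/<ManagedOSVersion.Name>.
func changelogURL(rancherURL, managedOSVersionName string) string {
	return fmt.Sprintf("%s/elemental/changelog/%s", strings.TrimSuffix(rancherURL, "/"), managedOSVersionName)
}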
@fgiudici fgiudici added the status/do not merge (Do not merge!) and spike (Some research required) labels Dec 9, 2024
@fgiudici fgiudici requested a review from a team as a code owner December 9, 2024 06:57
@github-actions github-actions bot added the area/operator (operator related changes) and area/tests (test related changes) labels Dec 9, 2024
@kkaempf
Contributor

kkaempf commented Dec 9, 2024

Not sure about

request to https://${Rancher URL}/elemental/changelog/${managedosversion name}/

Didn't we talk about computed changelogs, based on "old" vs "new" versions?

@anmazzotti
Contributor

| based on "old" vs "new" versions?

The idea is still to base it on "old" vs "new". As a user you can check one ManagedOSVersion changelog, and the assumption is that the changelog is "from the previous image within this channel".

I think this also implies that, for clarity, all images should be kept in the channel, not limited to the last n (3) as we do right now. However, this would also imply that a channel will contain hundreds of versions, since we can rebuild multiple times a week. Lastly, some images may have no changelog at all, for example if a new image was rebuilt with only a recompilation of packages and no changelog entry was added for it (maybe because it did not involve any CVE etc.). Not for this PR however; this mostly depends on how we maintain/populate the channels, and on the criteria OBS uses to generate the updateinfo.

@anmazzotti
Contributor

@fgiudici is it correct that this spawns a pod per version? What I thought initially is that we could have a pod per channel; the pod should always be running (with constrained cpu/memory limits), and probably be re-synced/restarted together with the channel sync, to pick up updates.

To the point that we could move the channel sync logic to the pod itself and let it handle all the ManagedOSVersions, on top of offering the optional http endpoint to retrieve changelogs.

If the pod can offer changelogs, it can keep running after syncing the ManagedOSVersions; otherwise it can shut down and wait for the next sync time. The pod status, a feature toggle, etc. can just be added to the ManagedOSVersionChannel resource, I wouldn't create a new one.

What worries me the most is that users will have to wait for the pod start-up time; this is acceptable for building SeedImages, but I wouldn't want to wait to read a changelog. The other problem is that if a user is very curious and looks at all the changelogs in multiple channels, they may spawn a huge number of pods.
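
For the record, a minimal sketch of what "constrained cpu/memory limits" for such a long-running per-channel pod could look like; the container name, the image handling and the concrete numbers are placeholders, nothing decided in this thread:

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// changelogContainer returns a container spec with constrained resources for a
// long-running per-channel changelog server.
func changelogContainer(channelImage string) corev1.Container {
	return corev1.Container{
		Name:  "changelog-server",
		Image: channelImage, // the ManagedOSVersionChannel image, resolved elsewhere
		Resources: corev1.ResourceRequirements{
			Requests: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("50m"),
				corev1.ResourceMemory: resource.MustParse("64Mi"),
			},
			Limits: corev1.ResourceList{
				corev1.ResourceCPU:    resource.MustParse("200m"),
				corev1.ResourceMemory: resource.MustParse("128Mi"),
			},
		},
	}
}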

@fgiudici
Member Author

fgiudici commented Dec 9, 2024

This is just a PoC (a working one, which was useful to experiment).
@anmazzotti , yep, currently it spawns a pod per changelog. I don't think this is something we should do or merge.
Anyway, this followed the pre-computed per-image OS changelog we decided to put in the channel image (built on top of the PoC from rancher/elemental#1654). So @kkaempf, the new vs old changelog is already precomputed in the channel for each released OS image.
If we follow the worker Pod path, I think David's idea of computing the changelog on the fly, starting from a single set of changelog data, makes sense (since we would need a worker Pod in any case, why do the computation twice?).
Agreed on having one Pod per channel, which would make more sense, and on avoiding the introduction of a new resource, expanding the channel instead (but for the PoC I preferred to keep it separate: the OS channel controller will become more complex and the integration of the new feature should be done properly to keep it easy to maintain, not something you want when experimenting with a PoC).
We still have to make our final call on how the changelog feature should look: at least clearly define the format and the behavior.
I would even consider keeping the changelog out of the operator, pre-computing it and publishing it outside the Rancher cluster (SUSE Docs?). Something we could discuss in the next few days.

Contributor

@davidcassany davidcassany left a comment


@fgiudici is this correct that it spawns a pod per-version? What I thought initially is that we could have a pod per channel, the pod should always be running (with constrained cpu/memory limits), and probably be re-sync/restarted together with the channel sync, to pick on updates.

Yes, this was also my idea, so having a deployed service per active channel. Anyway, I do not have a strong opinion on what's better; I'd essentially try to keep it as simple as possible.

Also, having a resource per query looks a bit overkill to me: how is the resource cleanup handled then? I don't think we should expect the user to clean it up manually.
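
A rough sketch of the kind of automatic cleanup being asked about, assuming a lifetime field like the cleanupAfterMinutes one quoted below (the helper name and return convention are made up, this is not necessarily how the PR handles it):

package sketch

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// requeueOrExpire tells the reconciler whether an object created at `created`
// with a lifetime of `minutes` should be kept (requeued) or cleaned up now.
func requeueOrExpire(created metav1.Time, minutes int32) (ctrl.Result, bool) {
	lifetime := time.Duration(minutes) * time.Minute
	elapsed := time.Since(created.Time)
	if elapsed < lifetime {
		// Not expired yet: ask controller-runtime to reconcile again later.
		return ctrl.Result{RequeueAfter: lifetime - elapsed}, false
	}
	// Expired: the caller can delete the ManagedOSChangelog plus its Pod and Service.
	return ctrl.Result{}, true
}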

LifetimeMinutes int32 `json:"cleanupAfterMinutes"`
// Refresh triggers rebuilding a changelog that has been cleaned up.
// +optional
Refresh bool `json:"refresh"`
I don't think this should be part of the spec. IMHO this belongs to the state and should be handled there. I know subresource updates via kubectl are still a beta feature, expected to be fully supported starting from v1.31, but I think it would make more sense to track it as a condition.
I am not sure that having user input manipulated and updated in spec by the controller is a good pattern. It is prone to cache miss errors and issues during backup & restore.
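
A minimal sketch of the condition-based alternative suggested above; the condition type, reason and message are invented for illustration only:

package sketch

import (
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// markRefreshRequested records the "refresh" intent as a status condition
// instead of a mutable spec field handled by the controller.
func markRefreshRequested(conditions *[]metav1.Condition) {
	meta.SetStatusCondition(conditions, metav1.Condition{
		Type:    "ChangelogPopulated",
		Status:  metav1.ConditionFalse,
		Reason:  "RefreshRequested",
		Message: "changelog data was cleaned up and needs to be rebuilt",
	})
}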

logger.V(5).Info("Sync WorkerPodReady condition from worker Pod", "pod-phase", pod.Status.Phase)

switch pod.Status.Phase {
case corev1.PodPending:
A bit unrelated to this specific feature, but I'd also love to see all this logic shared and restructured across the seedimage, channel and any other resource. It would be nice to have consistent utilities for managing a pod life cycle. I am also wondering whether we couldn't manage and abstract all this as k8s Jobs to make lifecycle management simpler.
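
As a sketch of the kind of shared utility suggested here (all names invented), the worker Pod phase handling could be factored into one helper reused by the SeedImage, channel and changelog controllers:

package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// workerPodCondition converts a worker Pod phase into a generic
// "WorkerPodReady" condition that any owning resource could reuse.
func workerPodCondition(phase corev1.PodPhase) metav1.Condition {
	cond := metav1.Condition{Type: "WorkerPodReady"}
	switch phase {
	case corev1.PodRunning:
		cond.Status, cond.Reason = metav1.ConditionTrue, "PodRunning"
	case corev1.PodPending:
		cond.Status, cond.Reason = metav1.ConditionFalse, "PodPending"
	case corev1.PodFailed:
		cond.Status, cond.Reason = metav1.ConditionFalse, "PodFailed"
	default:
		cond.Status, cond.Reason = metav1.ConditionUnknown, "Pod"+string(phase)
	}
	return cond
}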

@fgiudici
Member Author

Tried to capture the notes here and the discussion outcome in the original issue: #890

@fgiudici fgiudici marked this pull request as draft December 10, 2024 11:28
4 participants