Conversation

@OwenCorrigan76
Contributor

This PR migrates the Java Operator SDK from v4.4.2 to v5.1.2. The updated version will allow the Access Operator to watch Kafka clusters in a remote Kubernetes cluster, as requested in issue #44.

This PR directly addresses the following issue: #89

@OwenCorrigan76 OwenCorrigan76 added this to the 0.2.0 milestone Sep 24, 2025
@OwenCorrigan76 OwenCorrigan76 self-assigned this Sep 24, 2025
Member

@im-konge im-konge left a comment

Thanks for the PR, a few comments.

@OwenCorrigan76 OwenCorrigan76 force-pushed the Migrate_JOSDK_to_v5 branch 2 times, most recently from ec144f7 to a7b7659 Compare September 25, 2025 11:15
Member

@katheris katheris left a comment

Thanks for the PR @OwenCorrigan76. I think we need some additional changes for the switch to SSA and I don't think the Kafka Access Secret event source is being used correctly. Also from reviewing the migration docs I think we need to change the way we use the SDK generation to use the maven plugin.

@katheris katheris modified the milestones: 0.2.0, 0.3.0 Oct 2, 2025
@OwenCorrigan76 OwenCorrigan76 force-pushed the Migrate_JOSDK_to_v5 branch 2 times, most recently from ceca15e to dedd7b5 Compare October 10, 2025 10:59
@OwenCorrigan76
Contributor Author

OwenCorrigan76 commented Oct 14, 2025

@katheris @im-konge I've reverted the SSA changes in CreateOrUpdateSecret. Tests are passing, and I've tested the changes in a cluster, where they work as expected.
Do we need to revert these changes:

@im-konge
Member

@OwenCorrigan76 about the unused declared dependency -> you removed the ObservedGenerationAwareStatus from the KafkaAccessStatus so I think it's not needed anymore in the api module. Did you try to remove it?

@OwenCorrigan76
Contributor Author

That's exactly the reason. Thanks @im-konge

Member

@im-konge im-konge left a comment

Thanks for the changes, a few more comments.

Member

@katheris katheris left a comment

Thanks @OwenCorrigan76, I think we're pretty close now, but a couple more changes are needed.

Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
Signed-off-by: OwenCorrigan76 <[email protected]>
@OwenCorrigan76
Contributor Author

OwenCorrigan76 commented Oct 22, 2025

@katheris @im-konge When I have the SSA flag in the operator initialization, I get the following error when deploying the KafkaAccess:

 k logs  strimzi-access-operator-66d67969fb-6kxws
2025-10-22 15:44:04 INFO  KafkaAccessOperator:28 - Kafka Access operator starting
2025-10-22 15:44:05 WARN  Default ConfigurationService implementation:152 - Configuration for reconciler 'kafkaaccessreconciler' was not found. Known reconcilers: None.
2025-10-22 15:44:05 INFO  Default ConfigurationService implementation:177 - Created configuration for reconciler io.strimzi.kafka.access.KafkaAccessReconciler with name kafkaaccessreconciler
2025-10-22 15:44:05 INFO  KafkaAccessReconciler:175 - Preparing event sources
2025-10-22 15:44:05 INFO  KafkaAccessReconciler:204 - Finished preparing event sources
2025-10-22 15:44:05 INFO  Operator:222 - Registered reconciler: 'kafkaaccessreconciler' for resource: 'class io.strimzi.kafka.access.model.KafkaAccess' for namespace(s): [all namespaces]
2025-10-22 15:44:05 INFO  Operator:124 - Operator SDK 5.1.2 (commit: a942970) built on 2025-08-05T12:06:43.000+0000 starting...
2025-10-22 15:44:05 INFO  Operator:130 - Client version: 7.2.0
2025-10-22 15:44:05 INFO  Controller:342 - Starting 'kafkaaccessreconciler' controller for reconciler: io.strimzi.kafka.access.KafkaAccessReconciler, resource: io.strimzi.kafka.access.model.KafkaAccess
2025-10-22 15:44:05 WARN  VersionUsageUtils:60 - The client is using resource type 'kafkaaccesses' with unstable version 'v1alpha1'
2025-10-22 15:44:06 WARN  VersionUsageUtils:60 - The client is using resource type 'kafkausers' with unstable version 'v1beta2'
2025-10-22 15:44:06 WARN  VersionUsageUtils:60 - The client is using resource type 'kafkas' with unstable version 'v1beta2'
2025-10-22 15:44:08 INFO  Controller:357 - 'kafkaaccessreconciler' controller started
2025-10-22 15:44:08 INFO  Server:384 - jetty-11.0.24; built: 2024-08-26T18:11:22.448Z; git: 5dfc59a691b748796f922208956bd1f2794bcd16; jvm 17.0.16+8-LTS
2025-10-22 15:44:08 INFO  AbstractConnector:376 - Started ServerConnector@5fe8b721{HTTP/1.1, (http/1.1)}{0.0.0.0:8080}
2025-10-22 15:44:08 INFO  Server:439 - Started Server@2d72f75e{STARTING}[11.0.24,sto=0] @5082ms
2025-10-22 15:44:08 INFO  KafkaAccessOperator:40 - Kafka Access operator is now ready (health server listening)
2025-10-22 15:45:08 INFO  KafkaAccessReconciler:82 - Reconciling KafkaAccess myproject/my-kafka-access
2025-10-22 15:45:08 ERROR ReconciliationDispatcher:218 - updateErrorStatus failed for resource: 7ae186f7-ee30-48e0-a549-ad0a6d375175 with version: 16432 for error Failure executing: PATCH at: https://10.96.0.1:443/apis/access.strimzi.io/v1alpha1/namespaces/myproject/kafkaaccesses/my-kafka-access/status. Message: the server rejected our request due to an error in our request. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1:443/apis/access.strimzi.io/v1alpha1/namespaces/myproject/kafkaaccesses/my-kafka-access/status. Message: the server rejected our request due to an error in our request. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:205) ~[io.fabric8.kubernetes-client-api-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:507) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:419) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:397) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handlePatch(BaseOperation.java:764) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:231) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:236) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.editStatus(HasMetadataOperation.java:76) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.editStatus(HasMetadataOperation.java:44) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher$CustomResourceFacade.editStatus(ReconciliationDispatcher.java:485) ~[io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher$CustomResourceFacade.patchStatus(ReconciliationDispatcher.java:472) ~[io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleErrorStatusHandler(ReconciliationDispatcher.java:215) [io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:132) [io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:97) [io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:68) [io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:475) [io.javaoperatorsdk.operator-framework-core-5.1.2.jar:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PATCH at: https://10.96.0.1:443/apis/access.strimzi.io/v1alpha1/namespaces/myproject/kafkaaccesses/my-kafka-access/status. Message: the server rejected our request due to an error in our request. Received status: Status(apiVersion=v1, code=422, details=StatusDetails(causes=[], group=null, kind=null, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=the server rejected our request due to an error in our request, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Invalid, status=Failure, additionalProperties={}).
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:642) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.requestFailure(OperationSupport.java:622) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.assertResponseCode(OperationSupport.java:582) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.lambda$handleResponse$0(OperationSupport.java:549) ~[io.fabric8.kubernetes-client-7.2.0.jar:?]
	at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
	at io.fabric8.kubernetes.client.http.StandardHttpClient.lambda$completeOrCancel$10(StandardHttpClient.java:141) ~[io.fabric8.kubernetes-client-api-7.2.0.jar:?]
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147) ~[?:?]
	at io.fabric8.kubernetes.client.utils.AsyncUtils.lambda$retryWithExponentialBackoff$3(AsyncUtils.java:91) ~[io.fabric8.kubernetes-client-api-7.2.0.jar:?]
	at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:614) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:844) ~[?:?]
	at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482) ~[?:?]
	... 1 more
2025-10-22 15:45:08 WARN  EventProcessor:362 - Uncaught error during event processing ExecutionScope{ resource id: ResourceID{name='my-kafka-access', namespace='myproject'}, version: 16432} - but another reconciliation will be attempted because a superseding event has been received or another retry attempt is pending.

I couldn't figure out where the error was coming from until I removed the flag and went back to:

        final Operator operator = new Operator();

Now it builds and works fine. I can't see where the issue lies. Would you be able to have a look if you have some time? Thanks

@im-konge
Member

@OwenCorrigan76 the log you shared is cut off. Could you share the full log?

@OwenCorrigan76
Contributor Author

@im-konge I've updated the comment above with the full log. Thanks

@im-konge
Member

It says something about "unprocessable entity", so possibly you have some field wrong there? What does the KafkaAccess look like? Isn't it failing because of the observedGeneration?

@im-konge
Member

Also, you can check what the patch actually looks like, what you are updating there, etc.
It's not easy to tell what the issue is from that log.

@OwenCorrigan76
Contributor Author

OwenCorrigan76 commented Oct 23, 2025

The KafkaAccess looks like the following:

apiVersion: access.strimzi.io/v1alpha1
kind: KafkaAccess
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"access.strimzi.io/v1alpha1","kind":"KafkaAccess","metadata":{"annotations":{},"name":"my-kafka-access","namespace":"myproject"},"spec":{"kafka":{"listener":"plain","name":"my-cluster","namespace":"myproject"},"secretName":"owen-my-kafka-access-secret"}}
  creationTimestamp: "2025-10-23T13:14:52Z"
  generation: 1
  name: my-kafka-access
  namespace: myproject
  resourceVersion: "1009"
  uid: c34052fb-f4eb-4bde-8577-8c8beb691f64
spec:
  kafka:
    listener: plain
    name: my-cluster
    namespace: myproject
  secretName: my-kafka-access-secret

These are the two points where we are using it [1] [2].

In v4.x we used
return UpdateControl.updateStatus(kafkaAccess);
and
return ErrorStatusUpdateControl.patchStatus(kafkaAccess);

In v5.x updateStatus is deprecated.

[1] [2]

@OwenCorrigan76
Contributor Author

Also, I tried removing observedGeneration but still had the same issue. As I mentioned, once I remove the SSA flag in the operator, everything seems to work fine.

@im-konge
Member

Which SSA flag do you mean?

@im-konge
Member

If you have some SSA flag set there, which configures the operator to use SSA, we discussed that you should disable that and handle SSA in a different PR. Maybe I'm blind, but I don't see any flag like that in the code. Is it happening in the unit tests? Is it happening when you deploy the operator on Kube?

@OwenCorrigan76
Contributor Author

OwenCorrigan76 commented Oct 23, 2025

Sorry @im-konge. I removed it in the last commit to see whether it would build and pass the tests (which it did). This is the flag that is breaking everything. The flag is used to explicitly NOT use SSA, as discussed here.
Unit tests are all passing. These errors occur when I deploy in Minikube.

final Operator operator = new Operator(overrider -> overrider
.withUseSSAToPatchPrimaryResource(false));

@im-konge
Member

So that means it's using SSA, right? IMO it should be consistent in the main class and in the test class as well, otherwise you are testing and running two different things with two different behaviors.
You should put it back and check why the patch doesn't work with SSA disabled (the old approach we have today). Maybe it's related to Kate's comment about building the whole KafkaAccessStatus, from here I guess: #93 (comment).
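
One way to keep the main class and the tests on the same code path, as suggested above, is to define the override once and hand the same function to both. The following is a minimal, self-contained sketch of that idea only. `SettingsOverrider` and `SharedOperatorConfig` are hypothetical stand-ins, not SDK classes; only the method name `withUseSSAToPatchPrimaryResource` is taken from the snippet earlier in this thread, and the default of `true` is an assumption made for illustration.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for the SDK's configuration overrider; not the real class. */
final class SettingsOverrider {
    // Assumed default, for illustration only.
    private boolean useSsaToPatchPrimaryResource = true;

    SettingsOverrider withUseSSAToPatchPrimaryResource(boolean value) {
        this.useSsaToPatchPrimaryResource = value;
        return this;
    }

    boolean useSsaToPatchPrimaryResource() {
        return useSsaToPatchPrimaryResource;
    }
}

/** Single place that defines the overrides shared by production and test code. */
final class SharedOperatorConfig {
    private SharedOperatorConfig() { }

    /** The one definition of the overrides; both main and test code would pass this along. */
    static final Consumer<SettingsOverrider> OVERRIDES =
            overrider -> overrider.withUseSSAToPatchPrimaryResource(false);

    /** Applies the shared overrides to a fresh overrider and returns it. */
    static SettingsOverrider apply() {
        SettingsOverrider overrider = new SettingsOverrider();
        OVERRIDES.accept(overrider);
        return overrider;
    }
}
```

With something along these lines, the operator's main class and the test setup would both reference `SharedOperatorConfig.OVERRIDES` instead of each building its own configuration, so a change to the flag cannot apply to one and not the other.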

Signed-off-by: OwenCorrigan76 <[email protected]>
@OwenCorrigan76
Contributor Author

@im-konge Yes, I agree. I have reverted it so the flag is set with SSA disabled (false), and will continue to investigate why it is failing.
I had applied the changes that Kate suggested above, but when removing SSA I reverted those changes too. I'll look at it again.
Thanks for taking the time to investigate with me.

@im-konge
Member

No worries, I think that building the status from scratch will help. Let's see if it works (I checked the branch and there is no other way, just the patch 😄, so we need to fix it).
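
To illustrate what "building the status from scratch" can mean, here is a minimal, self-contained sketch. The `ReadyCondition` and `AccessStatus` records and the `StatusSketch` helper are hypothetical stand-ins, not the project's `KafkaAccessStatus` API. The point is only that each reconciliation constructs a complete new status value from its inputs instead of mutating the status that was read from the cluster.

```java
import java.util.List;

/** Hypothetical stand-in for a single status condition. */
record ReadyCondition(String type, String status, String reason, String message) { }

/** Hypothetical stand-in for the status section of the custom resource. */
record AccessStatus(String bindingName, long observedGeneration, List<ReadyCondition> conditions) { }

final class StatusSketch {
    private StatusSketch() { }

    /** Builds a complete "ready" status for the given generation. */
    static AccessStatus ready(String bindingName, long generation) {
        ReadyCondition condition = new ReadyCondition("Ready", "True", "Ready", "Ready");
        return new AccessStatus(bindingName, generation, List.of(condition));
    }

    /** Builds a complete "not ready" status carrying the failure reason and message. */
    static AccessStatus notReady(String bindingName, long generation, String reason, String message) {
        ReadyCondition condition = new ReadyCondition("Ready", "False", reason, message);
        return new AccessStatus(bindingName, generation, List.of(condition));
    }
}
```

Because every call returns a fresh, fully populated value, nothing left over from a previous reconciliation can leak into the patch that is sent to the API server.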

@katheris
Member

Hey @OwenCorrigan76, I took a look at the error you're getting and I think it is related to the change to use the UpdateControl.patchStatus method. The Javadoc indicates that you can't use this if you have implemented initStatus, which we have. However, since the reconcile method always calls patchStatus, I don't think we actually need to implement initStatus.

So I tested with your branch and removing the initStatus method from KafkaAccess.java seems to resolve the problem and as far as I can tell has no other implications. Can you test that as well?

Signed-off-by: OwenCorrigan76 <[email protected]>
@OwenCorrigan76
Contributor Author

@katheris Thanks so much for looking at this and discovering the issue. I have tested it successfully and pushed the changes.

Member

@im-konge im-konge left a comment

The code LGTM, thanks for the changes.
Just one more question about the status issues during the patch phase: I saw that you removed initStatus (as it caused the exception you mentioned earlier). Does it still work when you update the KafkaAccess resource so that its status section is updated again (for example from Ready to "not ready" and back to Ready)?

@OwenCorrigan76
Contributor Author

OwenCorrigan76 commented Oct 30, 2025

@im-konge The status is updating fine. For example when deploying with the incorrect namespace:

status:
  binding:
    name: my-kafka-access
  conditions:
  - lastTransitionTime: "2025-10-30T13:35:27.624655923Z"
    message: Kafka kafka/my-cluster missing
    reason: MissingKubernetesResource
    status: "False"
    type: Ready
  observedGeneration: 5

and then with the correct namespace:

status:
  binding:
    name: my-kafka-access
  conditions:
  - lastTransitionTime: "2025-10-30T13:36:01.532973717Z"
    message: Ready
    reason: Ready
    status: "True"
    type: Ready
  observedGeneration: 6

Is that the type of thing you mean?

@im-konge
Member

Yes, thanks

@im-konge
Member

im-konge commented Nov 3, 2025

@katheris are we waiting for something else, or can we merge this?

@katheris
Member

katheris commented Nov 3, 2025

@im-konge I reached out to the Java Operator SDK community to see if anyone has time to review this PR. I would say let's give them until Wednesday and if no one responds to my message I'll merge it


@xstefank xstefank left a comment

From JOSDK side, LGTM

@katheris
Member

katheris commented Nov 5, 2025

Thanks @xstefank for giving it a look over

@katheris katheris merged commit 2289a34 into strimzi:main Nov 5, 2025
11 checks passed
4 participants