Kafka Update to Kraft Production #4427
📝 Walkthrough
This pull request updates the Kafka cluster production configuration and the operator deployment script. In the Kafka configuration file, three new PersistentVolumes are defined with unique storage paths, and a new KafkaNodePool is added with dual roles. The Kafka resource is updated by bumping its version to 3.9.0, adding a metadataVersion field, revising resource requests/limits, and removing legacy configurations such as inter.broker.protocol.version and Zookeeper settings. The operator deployment script now downloads and extracts Strimzi Kafka Operator version 0.45.0 instead of 0.42.0.
Sequence Diagram(s)
sequenceDiagram
participant User
participant "Deployment Script"
participant "Strimzi Server"
User->>Deployment Script: Execute deployment-script.sh
Deployment Script->>Strimzi Server: Request strimzi-0.45.0.tar.gz
Strimzi Server-->>Deployment Script: Send tarball
Deployment Script->>Deployment Script: Extract tarball
Deployment Script->>Deployment Script: Change directory to strimzi-0.45.0/
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff            @@
##           staging   #4427   +/-   ##
========================================
  Coverage    11.24%   11.24%
========================================
  Files          156      156
  Lines        18006    18006
  Branches       388      388
========================================
  Hits          2025     2025
  Misses       15979    15979
  Partials         2        2
Actionable comments posted: 1
🧹 Nitpick comments (1)
k8s/kafka/clusters/kafka-cluster-prod.yaml (1)
1-64: Verify PersistentVolume paths and node availability.
The PersistentVolume configuration looks good with proper node affinity and unique storage paths. However:
- Ensure the paths /mnt/kafka-data* exist on the high-mem nodes
- Consider increasing the storage size (4Gi might be small for production)
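The path check suggested in this comment could be scripted roughly as follows. This is a minimal sketch meant to run on each high-mem node; the function name check_kafka_dirs is hypothetical, and the exact directory names are not given in the review beyond the /mnt/kafka-data* pattern.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report which of the given directories are missing.
# Prints one line per missing path and returns non-zero if any is absent.
check_kafka_dirs() {
  local missing=0
  local dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir"
      missing=1
    fi
  done
  return "$missing"
}
```

On a node it might be called as check_kafka_dirs /mnt/kafka-data*. If the glob matches nothing, bash passes the pattern through literally and it is reported as missing.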
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- k8s/kafka/clusters/kafka-cluster-prod.yaml (2 hunks)
- k8s/kafka/operator/deployment-script.sh (1 hunk)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
k8s/kafka/operator/deployment-script.sh
[warning] 4-4: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.
(SC2164)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (javascript)
🔇 Additional comments (5)
k8s/kafka/operator/deployment-script.sh (1)
2-4: Verify Strimzi Kafka Operator version compatibility.
The upgrade from 0.42.0 to 0.45.0 is significant. Please ensure compatibility with Kafka 3.9.0 and KRaft mode.
✅ Verification successful
🌐 Web query:
What is the latest version of Strimzi Kafka Operator and which Kafka versions does it support?
💡 Result:
The latest version of the Strimzi Kafka Operator is 0.45.0, released on December 17, 2024[1][7]. This version supports Apache Kafka 3.8.x (3.8.1) and 3.9.x (3.9.0)[1][4][7]. Previous Kafka versions like 3.7.x were removed in this release[4].
Key Compatibility Details:
| Component | Supported Versions | Notes |
|---|---|---|
| Apache Kafka | 3.8.x, 3.9.x | Requires KRaft mode (ZooKeeper-based clusters deprecated)[4][10] |
| Kubernetes | 1.25+ | Older Kubernetes versions (1.23-1.24) no longer supported[1][7] |
| Strimzi Bridge | 0.31.x | Updated HTTP bridge component[4] |
| OAuth Support | 0.15.x | Maintained from previous releases[1] |
Important Changes:
- Final support for ZooKeeper-based clusters - All ZooKeeper dependencies must be migrated to KRaft before upgrading beyond Strimzi 0.45[4][7][10]
- Deprecated components removed:
- MirrorMaker 1 (use MirrorMaker 2 instead)
- Legacy EnvVar Configuration Provider
- MirrorMaker 2 Extensions[4][7]
- Requires manual intervention when upgrading from Strimzi ≤0.22[7]
For new deployments, Strimzi now enforces KRaft mode by default and requires Kubernetes ≥1.25[1][10]. The container images are available via Quay.io registry using tags matching the Kafka versions (e.g., quay.io/strimzi/kafka:3.9.0)[7].
Citations:
- 1: https://strimzi.io/downloads/
- 2: https://github.com/strimzi/strimzi-kafka-operator/blob/main/kafka-versions.yaml
- 3: https://artifacthub.io/packages/helm/strimzi-kafka-operator/strimzi-kafka-operator/0.43.0
- 4: https://github.com/strimzi/strimzi-kafka-operator/blob/main/CHANGELOG.md
- 5: https://strimzi.io/docs/operators/0.40.0/overview
- 6: https://lists.cncf.io/g/cncf-strimzi-users/topic/rc1_of_strimzi_kafka/94652720
- 7: https://github.com/strimzi/strimzi-kafka-operator/releases
- 8: https://operatorhub.io/operator/strimzi-kafka-operator
- 9: https://strimzi.io
- 10: https://strimzi.io/docs/operators/latest/deploying
Strimzi Kafka Operator Compatibility Verified
- The Strimzi Kafka Operator version 0.45.0 is the latest release and supports both Apache Kafka 3.8.x and 3.9.x, as confirmed by the documentation.
- This release enforces KRaft mode by default, ensuring compatibility with Kafka 3.9.0 clusters when using KRaft.
- No additional compatibility issues were detected with the upgrade from 0.42.0 to 0.45.0.
k8s/kafka/clusters/kafka-cluster-prod.yaml (4)
66-85: LGTM! Well-configured KRaft node pool.
The KafkaNodePool configuration is well structured with:
- 3 replicas for high availability
- Combined controller and broker roles
- Proper JBOD storage configuration with KRaft metadata sharing
92-94: LGTM! Proper KRaft mode enablement.
The annotations correctly enable both node pools and KRaft mode.
134-141: Review resource allocation for production workload.
The current resource configuration might need adjustment:
- Memory request (500Mi) seems low for production Kafka brokers
- CPU limit (6000m) is quite high compared to request (100m)
Consider adjusting based on your production workload patterns:
 resources:
   requests:
-    memory: "500Mi"
+    memory: "2Gi"
-    cpu: "100m"
+    cpu: "1000m"
   limits:
     memory: "2Gi"
-    cpu: "6000m"
+    cpu: "4000m"
97-98: Verify Kafka version and metadata version compatibility.
The upgrade to Kafka 3.9.0 with metadata version 3.9-IV0 is significant. Ensure:
- All clients support Kafka 3.9.0
- Plan for any necessary client updates
- Consider documenting rollback procedures
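As a rough illustration of the version pairing mentioned in this comment, the sketch below checks whether a metadataVersion such as 3.9-IV0 shares its major.minor prefix with a Kafka version such as 3.9.0. The function name metadata_matches_kafka is hypothetical and not part of the PR. A metadataVersion that lags the Kafka version can be intentional during a staged upgrade, so a mismatch is only something to look at, not necessarily an error.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check: does the metadataVersion (e.g. 3.9-IV0) share
# its major.minor prefix with the Kafka version (e.g. 3.9.0)?
metadata_matches_kafka() {
  local kafka_version="$1"
  local metadata_version="$2"
  local kafka_prefix="${kafka_version%.*}"          # 3.9.0   -> 3.9
  local metadata_prefix="${metadata_version%%-*}"   # 3.9-IV0 -> 3.9
  [ "$kafka_prefix" = "$metadata_prefix" ]
}
```

With the values from this PR, metadata_matches_kafka 3.9.0 3.9-IV0 succeeds, since both reduce to 3.9.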
✅ Verification successful
🌐 Web query:
What are the breaking changes in Kafka 3.9.0 compared to previous versions?
💡 Result:
Apache Kafka 3.9 introduces several significant breaking changes and deprecations compared to earlier versions:
Core Architecture Changes
- Final ZooKeeper Support
  3.9 is the last major release supporting ZooKeeper-based metadata storage[4][6]. This requires administrators using ZooKeeper to complete migration to KRaft mode before upgrading beyond 3.x versions[4][7].
- Dynamic KRaft Controller Membership (KIP-853)
  Static controller quorum configurations are replaced by dynamic membership management via kafka-metadata-quorum.sh or AdminClient API[4][7]. Existing static configurations must be migrated to dynamic ones during upgrade.
Upgrade Requirements
- Two Rolling Bounce Requirement
  Upgrades from versions ≤3.4 now require two rolling restarts of brokers rather than one[7]. This impacts maintenance planning for clusters running older versions.
Tiered Storage Changes
- Mandatory Configuration Updates
  New tiered storage features require configuration adjustments:
  - Per-topic disablement via remote.storage.disable[4]
  - Quota enforcement through remote.log.reader.bps/remote.log.writer.bps[4]
  - Metadata handling changes in kafka-dump-log.sh tool[4]
Client/API Changes
- Consumer Group Behavior
  Modified consumer close behavior through new leaveGroup parameter in Consumer API (KIP-1092)[2], requiring client code adjustments for explicit group departure control.
Streams and Connect
- MirrorMaker2 Configuration Breaking Changes
  New heartbeats.replication.enabled parameter defaults differ from previous implicit behavior[2][4]. Existing mirroring setups using heartbeat topics need explicit configuration review.
Deprecation Timeline Enforcement
- All deprecated ZooKeeper-related APIs/configuration parameters now generate fatal errors rather than warnings[4][10]. This includes:
  - zookeeper.connect configuration
  - ZK-based ACL management tools
  - Legacy migration utilities
These changes require careful planning for upgrades from pre-3.x versions due to architectural shifts in metadata management and removal of legacy components[1][7][10]. Administrators should validate KRaft configurations and test tiered storage implementations before production deployment[4][6].
Citations:
- 1: https://github.com/strimzi/strimzi-kafka-operator/blob/main/CHANGELOG.md
- 2: https://developers.redhat.com/blog/2024/10/01/kafka-monthly-digest-september-2024
- 3: https://kroxylicious.io/blog/kroxylicious/releases/2024/12/17/release-0_9_0.html
- 4: https://www.confluent.io/blog/introducing-apache-kafka-3-9/
- 5: https://kafka.apache.org/39/documentation/
- 6: https://developers.redhat.com/blog/2025/01/07/kafka-monthly-digest-december-2024
- 7: https://kafka.apache.org/39/documentation/streams/upgrade-guide
- 8: https://spring.io/blog/2024/12/17/spring-for-apache-kafka-3-3-1-and-3-2-6-available-now
- 9: https://github.com/apache/kafka/blob/trunk/docs/upgrade.html
- 10: https://docs.tibco.com/pub/msg-akd-core/3.9.0/TIB_msg-akd-core_3.9.0_relnotes.pdf
Verifying Kafka 3.9.0 Upgrade Considerations:
The web query confirms that Kafka 3.9.0 introduces several significant breaking changes, including final ZooKeeper support, dynamic KRaft controller membership, and revamped client behaviors, which makes the upgrade non-trivial. In this context, the snippet in k8s/kafka/clusters/kafka-cluster-prod.yaml setting the cluster to version 3.9.0 with metadata version 3.9-IV0 reinforces the need to:
- Validate Client Compatibility: Ensure all Kafka clients and applications are compatible with the new behaviors and API changes in Kafka 3.9.0.
- Plan Client Updates: Identify and update any client code or configurations that might be affected by changes such as consumer group behavior modifications or deprecations of ZooKeeper-based APIs.
- Establish Rollback Procedures: Develop clear rollback procedures in case the transition to the new version leads to unanticipated issues—particularly given that some deprecated configurations now trigger fatal errors.
The review comment accurately highlights these concerns, so no changes are needed in the configuration file itself beyond ensuring that these upgrade implications are fully addressed in your overall deployment and upgrade strategy.
-cd strimzi-0.42.0/
+wget https://github.com/strimzi/strimzi-kafka-operator/releases/download/0.45.0/strimzi-0.45.0.tar.gz
+tar -zxvf strimzi-0.45.0.tar.gz
+cd strimzi-0.45.0/
🛠️ Refactor suggestion
Add error handling for directory change.
The cd command should handle failures to prevent silent errors.
-cd strimzi-0.45.0/
+cd strimzi-0.45.0/ || { echo "Failed to change directory"; exit 1; }
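The same guard could be extended to the download and extract steps as well. The following is a sketch only: the STRIMZI_VERSION variable and the function names strimzi_url and fetch_strimzi are illustrative and do not appear in the PR, and the functions are defined without being called so the file can be sourced safely.

```shell
#!/usr/bin/env bash
# Sketch of the operator download steps with failures made explicit.
# STRIMZI_VERSION and the function names are illustrative, not from the PR.
STRIMZI_VERSION="${STRIMZI_VERSION:-0.45.0}"

# Build the release tarball URL for a given Strimzi version.
strimzi_url() {
  echo "https://github.com/strimzi/strimzi-kafka-operator/releases/download/$1/strimzi-$1.tar.gz"
}

# Download, extract, and enter the release directory; stop at the first failure.
fetch_strimzi() {
  local version="$1"
  wget "$(strimzi_url "$version")" || { echo "Download failed" >&2; return 1; }
  tar -zxvf "strimzi-${version}.tar.gz" || { echo "Extract failed" >&2; return 1; }
  cd "strimzi-${version}/" || { echo "Failed to change directory" >&2; return 1; }
}
```

A deployment script would then call fetch_strimzi "$STRIMZI_VERSION" || exit 1 before applying any manifests, so a failed download or extract stops the run instead of continuing in the wrong directory.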
Hi @BenjaminSsempala, please consider this refactor suggestion for your next PR.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
thanks @BenjaminSsempala
Description
Kafka Update to Kraft Production