Kafka Update to Kraft Production #4427

Merged
merged 1 commit into from
Feb 13, 2025

@BenjaminSsempala BenjaminSsempala commented Feb 12, 2025

Description

Kafka Update to Kraft Production

Summary by CodeRabbit

  • New Features

    • Enhanced Kafka storage with dedicated persistent volumes.
    • Introduced a dual-role node pool for improved cluster flexibility.
  • Chores

    • Upgraded Kafka to version 3.9.0 with optimized resource allocations.
    • Updated the Kafka operator to the latest release.
    • Streamlined configurations by removing obsolete settings.

coderabbitai bot commented Feb 12, 2025

Walkthrough

This pull request updates the Kafka cluster production configuration and the operator deployment script. In the Kafka configuration file, three new PersistentVolumes are defined with unique storage paths, and a new KafkaNodePool is added with dual roles. The Kafka resource is updated by bumping its version to 3.9.0, adding a metadataVersion field, revising resource requests/limits, and removing legacy configurations such as inter.broker.protocol.version and Zookeeper settings. The operator deployment script now downloads and extracts Strimzi Kafka Operator version 0.45.0 instead of 0.42.0.

Changes

| File(s) | Change Summary |
|---|---|
| k8s/kafka/clusters/kafka-cluster-prod.yaml | - Added three PersistentVolume resources with distinct names and local paths.<br>- Introduced a new KafkaNodePool (kafka-node-pool-dual) with replica count and dual roles (controller & broker).<br>- Updated Kafka version (3.6.0 → 3.9.0) and added metadataVersion.<br>- Adjusted resource requests/limits and removed outdated Zookeeper configurations. |
| k8s/kafka/operator/deployment-script.sh | - Modified the script to download and extract Strimzi Kafka Operator version 0.45.0 instead of 0.42.0. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant "Deployment Script"
    participant "Strimzi Server"

    User->>Deployment Script: Execute deployment-script.sh
    Deployment Script->>Strimzi Server: Request strimzi-0.45.0.tar.gz
    Strimzi Server-->>Deployment Script: Send tarball
    Deployment Script->>Deployment Script: Extract tarball
    Deployment Script->>Deployment Script: Change directory to strimzi-0.45.0/

Suggested reviewers

  • NicholasTurner23
  • Baalmart
  • Psalmz777

Poem

In our cluster’s realm, new volumes rise,
A node pool dual, reaching for the skies.
Versions uplifted with careful art,
Code refined, setting a brand new start.
Cheers to progress in every line—
A symphony of change so fine! 🎶

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • JIRA integration encountered authorization issues. Please disconnect and reconnect the integration in the CodeRabbit UI.

codecov bot commented Feb 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 11.24%. Comparing base (b79bd18) to head (0c3c4cc).

Additional details and impacted files

@@           Coverage Diff            @@
##           staging    #4427   +/-   ##
========================================
  Coverage    11.24%   11.24%           
========================================
  Files          156      156           
  Lines        18006    18006           
  Branches       388      388           
========================================
  Hits          2025     2025           
  Misses       15979    15979           
  Partials         2        2           

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
k8s/kafka/clusters/kafka-cluster-prod.yaml (1)

1-64: Verify PersistentVolume paths and node availability.

The PersistentVolume configuration looks good with proper node affinity and unique storage paths. However:

  1. Ensure the paths /mnt/kafka-data* exist on the high-mem nodes
  2. Consider increasing the storage size (4Gi might be small for production)
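
The first check can be scripted. The following is a minimal sketch, assuming it is run directly on each high-mem node; `check_pv_paths` is a hypothetical helper name, and the example directory names in the usage comment are placeholders, since the review only refers to the paths as /mnt/kafka-data*.

```shell
# Report every directory from the argument list that does not exist.
# Returns 0 when all directories are present, 1 otherwise.
check_pv_paths() {
  missing=0
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing: $dir" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (placeholder names, adjust to the paths in the PersistentVolume specs):
# check_pv_paths /mnt/kafka-data-0 /mnt/kafka-data-1 /mnt/kafka-data-2
```

A non-zero exit status means at least one local path still has to be created before the PersistentVolumes can bind usefully on that node.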
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b79bd18 and 0c3c4cc.

📒 Files selected for processing (2)
  • k8s/kafka/clusters/kafka-cluster-prod.yaml (2 hunks)
  • k8s/kafka/operator/deployment-script.sh (1 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
k8s/kafka/operator/deployment-script.sh

[warning] 4-4: Use 'cd ... || exit' or 'cd ... || return' in case cd fails.

(SC2164)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyze (python)
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (5)
k8s/kafka/operator/deployment-script.sh (1)

2-4: Verify Strimzi Kafka Operator version compatibility.

The upgrade from 0.42.0 to 0.45.0 is significant. Please ensure compatibility with Kafka 3.9.0 and KRaft mode.

✅ Verification successful

🌐 Web query:

What is the latest version of Strimzi Kafka Operator and which Kafka versions does it support?

💡 Result:

The latest version of the Strimzi Kafka Operator is 0.45.0, released on December 17, 2024[1][7]. This version supports Apache Kafka 3.8.x (3.8.1) and 3.9.x (3.9.0)[1][4][7]. Previous Kafka versions like 3.7.x were removed in this release[4].

Key Compatibility Details:

| Component | Supported Versions | Notes |
|---|---|---|
| Apache Kafka | 3.8.x, 3.9.x | Supports KRaft mode; ZooKeeper-based clusters are deprecated[4][10] |
| Kubernetes | 1.25+ | Older Kubernetes versions (1.23-1.24) no longer supported[1][7] |
| Strimzi Bridge | 0.31.x | Updated HTTP bridge component[4] |
| OAuth Support | 0.15.x | Maintained from previous releases[1] |

Important Changes:

  • Final support for ZooKeeper-based clusters - All ZooKeeper dependencies must be migrated to KRaft before upgrading beyond Strimzi 0.45[4][7][10]
  • Deprecated components removed:
    • MirrorMaker 1 (use MirrorMaker 2 instead)
    • Legacy EnvVar Configuration Provider
    • MirrorMaker 2 Extensions[4][7]
  • Requires manual intervention when upgrading from Strimzi ≤0.22[7]

For new deployments, Strimzi 0.45.0 supports KRaft mode and requires Kubernetes ≥1.25[1][10]. The container images are available via the Quay.io registry using tags that combine the Strimzi and Kafka versions (e.g., quay.io/strimzi/kafka:0.45.0-kafka-3.9.0)[7].

Strimzi Kafka Operator Compatibility Verified

  • The Strimzi Kafka Operator version 0.45.0 was the latest release at the time of this review and supports both Apache Kafka 3.8.x and 3.9.x, as confirmed by the documentation.
  • This release supports KRaft mode and is the last to support ZooKeeper-based clusters, so it is compatible with Kafka 3.9.0 clusters running in KRaft mode.
  • No additional compatibility issues were detected with the upgrade from 0.42.0 to 0.45.0.
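
As a rough guard, the deployment script could refuse to proceed when the target Kafka version falls outside the release lines named above. This is only a sketch: `kafka_version_supported` is a hypothetical helper, and the accepted patterns are taken from this comment (3.8.x and 3.9.x), not read from the operator itself.

```shell
# Succeeds when the given Kafka version belongs to a release line that this
# review lists as supported by Strimzi 0.45.0 (assumption: 3.8.x and 3.9.x).
kafka_version_supported() {
  case "$1" in
    3.8.*|3.9.*) return 0 ;;
    *)           return 1 ;;
  esac
}

# Usage:
# kafka_version_supported 3.9.0 || { echo "unsupported Kafka version" >&2; exit 1; }
```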

k8s/kafka/clusters/kafka-cluster-prod.yaml (4)

66-85: LGTM! Well-configured KRaft node pool.

The KafkaNodePool configuration is well structured with:

  • 3 replicas for high availability
  • Combined controller and broker roles
  • Proper JBOD storage configuration with KRaft metadata sharing

92-94: LGTM! Proper KRaft mode enablement.

The annotations correctly enable both node pools and KRaft mode.


134-141: Review resource allocation for production workload.

The current resource configuration might need adjustment:

  • Memory request (500Mi) seems low for production Kafka brokers
  • CPU limit (6000m) is quite high compared to request (100m)

Consider adjusting based on your production workload patterns:

     resources:
       requests:
-        memory: "500Mi"
+        memory: "2Gi"
-        cpu: "100m"
+        cpu: "1000m"
       limits:
         memory: "2Gi"
-        cpu: "6000m"
+        cpu: "4000m"
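
To make the gap between request and limit concrete, the ratio can be computed from the quantities in the manifest. The sketch below uses hypothetical helper names and assumes CPU values are written either in millicores ("100m") or as whole cores ("2"); fractional values such as "0.5" are not handled.

```shell
# Convert a Kubernetes CPU quantity ("100m" or "2") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

# Print how many times larger the limit is than the request (integer division).
cpu_burst_ratio() {
  request_m=$(to_millicores "$1")
  limit_m=$(to_millicores "$2")
  echo $(( limit_m / request_m ))
}

cpu_burst_ratio 100m 6000m    # current values in the manifest: prints 60
cpu_burst_ratio 1000m 4000m   # values suggested above: prints 4
```

A 60x ratio means the scheduler reserves very little CPU for the pod relative to what it may consume under load, which is the concern raised in this comment.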

97-98: Verify Kafka version and metadata version compatibility.

The upgrade to Kafka 3.9.0 with metadata version 3.9-IV0 is significant. Ensure:

  1. All clients support Kafka 3.9.0
  2. Plan for any necessary client updates
  3. Consider documenting rollback procedures
✅ Verification successful

🌐 Web query:

What are the breaking changes in Kafka 3.9.0 compared to previous versions?

💡 Result:

Apache Kafka 3.9 introduces several significant breaking changes and deprecations compared to earlier versions:

Core Architecture Changes

  • Final ZooKeeper Support
    3.9 is the last major release supporting ZooKeeper-based metadata storage[4][6]. This requires administrators using ZooKeeper to complete migration to KRaft mode before upgrading beyond 3.x versions[4][7].

  • Dynamic KRaft Controller Membership (KIP-853)
    Static controller quorum configurations are replaced by dynamic membership management via kafka-metadata-quorum.sh or AdminClient API[4][7]. Existing static configurations must be migrated to dynamic ones during upgrade.

Upgrade Requirements

  • Two Rolling Bounce Requirement
    Upgrades from versions ≤3.4 now require two rolling restarts of brokers rather than one[7]. This impacts maintenance planning for clusters running older versions.

Tiered Storage Changes

  • Mandatory Configuration Updates
    New tiered storage features require configuration adjustments:
    • Per-topic disablement via remote.storage.disable[4]
    • Quota enforcement through remote.log.reader.bps/remote.log.writer.bps[4]
    • Metadata handling changes in kafka-dump-log.sh tool[4]

Client/API Changes

  • Consumer Group Behavior
    Modified consumer close behavior through new leaveGroup parameter in Consumer API (KIP-1092)[2], requiring client code adjustments for explicit group departure control.

Streams and Connect

  • MirrorMaker2 Configuration Breaking Changes
    New heartbeats.replication.enabled parameter defaults differ from previous implicit behavior[2][4]. Existing mirroring setups using heartbeat topics need explicit configuration review.

Deprecation Timeline Enforcement

  • ZooKeeper-related APIs and configuration parameters remain supported but deprecated in 3.9, and are removed in Kafka 4.0[4][10]. This includes:
    • zookeeper.connect configuration
    • ZK-based ACL management tools
    • Legacy migration utilities

These changes require careful planning for upgrades from pre-3.x versions due to architectural shifts in metadata management and removal of legacy components[1][7][10]. Administrators should validate KRaft configurations and test tiered storage implementations before production deployment[4][6].

Verifying Kafka 3.9.0 Upgrade Considerations:
The web query confirms that Kafka 3.9.0 introduces several significant breaking changes—including final ZooKeeper support, dynamic KRaft controller membership, and revamped client behaviors—which makes the upgrade non-trivial. In this context, the snippet in k8s/kafka/clusters/kafka-cluster-prod.yaml setting the cluster to version 3.9.0 with metadata version 3.9-IV0 reinforces the need to:

  • Validate Client Compatibility: Ensure all Kafka clients and applications are compatible with the new behaviors and API changes in Kafka 3.9.0.
  • Plan Client Updates: Identify and update any client code or configurations that might be affected by changes such as consumer group behavior modifications or deprecations of ZooKeeper-based APIs.
  • Establish Rollback Procedures: Develop clear rollback procedures in case the transition to the new version leads to unanticipated issues.

The review comment accurately highlights these concerns, so no changes are needed in the configuration file itself beyond ensuring that these upgrade implications are fully addressed in your overall deployment and upgrade strategy.

-cd strimzi-0.42.0/
+wget https://github.com/strimzi/strimzi-kafka-operator/releases/download/0.45.0/strimzi-0.45.0.tar.gz
+tar -zxvf strimzi-0.45.0.tar.gz
+cd strimzi-0.45.0/

@coderabbitai coderabbitai bot Feb 12, 2025

🛠️ Refactor suggestion

Add error handling for directory change.

The cd command should handle failures to prevent silent errors.

-cd  strimzi-0.45.0/
+cd strimzi-0.45.0/ || { echo "Failed to change directory"; exit 1; }
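
The same idea can be applied to every step of the script, not only the cd. The following is a sketch under assumptions: `enter_dir` and `install_strimzi` are hypothetical function names, and only the three commands quoted in this thread are covered, since the rest of deployment-script.sh is not shown here.

```shell
# Change into a directory, or report the failure and return non-zero so the
# caller can stop instead of running later commands in the wrong place.
enter_dir() {
  cd "$1" || { echo "Failed to change directory to $1" >&2; return 1; }
}

# Download, extract, and enter the Strimzi release directory, stopping at the
# first step that fails.
install_strimzi() {
  version="$1"
  wget "https://github.com/strimzi/strimzi-kafka-operator/releases/download/${version}/strimzi-${version}.tar.gz" || return 1
  tar -zxvf "strimzi-${version}.tar.gz" || return 1
  enter_dir "strimzi-${version}/"
}

# Usage:
# install_strimzi 0.45.0 || exit 1
```

Functions return instead of calling exit so the caller decides whether a failure should end the whole script.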

Hi @BenjaminSsempala, please consider this refactor suggestion for your next PR.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@Baalmart Baalmart left a comment

@Baalmart Baalmart merged commit f875cd3 into staging Feb 13, 2025
52 checks passed
@Baalmart Baalmart deleted the kafka-prod branch February 13, 2025 04:58
@Baalmart Baalmart mentioned this pull request Feb 13, 2025
2 tasks