Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Relic Monitoring Setup #4423

Merged
merged 1 commit into from
Feb 12, 2025
Merged

New Relic Monitoring Setup #4423

merged 1 commit into from
Feb 12, 2025

Conversation

BenjaminSsempala
Copy link
Contributor

@BenjaminSsempala BenjaminSsempala commented Feb 12, 2025

Description

New Relic Monitoring Setup

Summary by CodeRabbit

  • Documentation
    • Updated instructions to guide users on installing and configuring New Relic monitoring in Kubernetes.
  • New Features
    • Introduced comprehensive configuration files and instrumentation settings for both production and staging environments.
    • Added updated Kubernetes Node definitions with refined labels to improve cluster management.
  • Chores
    • Removed outdated node configuration files to streamline the integration setup.

Copy link
Contributor

coderabbitai bot commented Feb 12, 2025

📝 Walkthrough

Walkthrough

This pull request updates the New Relic integration for Kubernetes. It enhances the README with installation instructions, adds a new instrumentation configuration for APM auto-instrumentation, and introduces environment-specific values files for production and staging. Additionally, it revises the Node resources for both airqo and airqo-stage clusters by adding new YAML definitions with updated labels and removing legacy JSON configurations.

Changes

Files Change Summary
k8s/new-relic/[README.md, instrumentation.yaml, values.prod.yaml, values.stage.yaml] Added documentation for installing New Relic via Helm, introduced a new Instrumentation resource for auto-instrumentation, and provided comprehensive configuration files with toggles for various New Relic components in production and staging environments.
k8s/nodes/airqo-k8s-[controller.yaml, worker-1.yaml, worker-2.yaml, worker-3.yaml]
k8s/nodes/airqo-k8s-worker-{0,1,2,3,4}.json (deleted)
Updated airqo Kubernetes node definitions by adding new YAML configurations with updated labels and removing legacy JSON node resources.
k8s/nodes/airqo-stage-k8s-[controller.yaml, worker-1.yaml, worker-2.yaml, worker-3.yaml]
k8s/nodes/airqo-stage-k8s-worker-{0,1,2,3,4}.json (deleted)
Revised staging Node definitions by introducing new YAML node configurations with updated roles and removing old JSON configurations.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Helm
    participant KubeCluster as Kubernetes Cluster
    participant NewRelic as New Relic Bundle

    User->>Helm: Run helm install command with parameters
    Helm->>KubeCluster: Deploy New Relic bundle & enable features
    KubeCluster->>NewRelic: Initialize monitoring agents
    User->>KubeCluster: Apply instrumentation YAML for APM auto-instrumentation
Loading

Suggested reviewers

  • Baalmart
  • Psalmz777

Poem

In YAML and JSON we trust,
Nodes evolve and settings adjust,
New Relic lights the monitoring sky,
Helm commands and clusters fly high!
Code flows smooth—onward we rise! 🚀

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • JIRA integration encountered authorization issues. Please disconnect and reconnect the integration in the CodeRabbit UI.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Feb 12, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 10.07%. Comparing base (8d2fadf) to head (fb5d60b).
Report is 310 commits behind head on staging.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           staging    #4423      +/-   ##
===========================================
- Coverage    11.24%   10.07%   -1.17%     
===========================================
  Files          156      156              
  Lines        17930    21144    +3214     
  Branches       388      499     +111     
===========================================
+ Hits          2016     2130     +114     
- Misses       15912    19012    +3100     
  Partials         2        2              

see 16 files with indirect coverage changes

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🔭 Outside diff range comments (1)
k8s/nodes/airqo-stage-k8s-worker-2.yaml (1)

1-7: ⚠️ Potential issue

Fix naming inconsistency between file and node name.

The filename is airqo-stage-k8s-worker-2.yaml but the node name in the configuration is set to airqo-stage-k8s-worker-1. This inconsistency could lead to confusion in node management.

Apply this diff to align the node name with the filename:

  apiVersion: v1
  kind: Node
  metadata:
-   name: airqo-stage-k8s-worker-1
+   name: airqo-stage-k8s-worker-2
    labels:
      role: high-mem
🧹 Nitpick comments (4)
k8s/nodes/airqo-stage-k8s-worker-2.yaml (1)

5-6: Consider adding more descriptive labels for better node management.

While the role: high-mem label is good for indicating the node's resource characteristics, consider adding additional labels to improve node selection and management. This is particularly important for New Relic monitoring.

Consider adding these common Kubernetes labels:

  labels:
    role: high-mem
+   kubernetes.io/os: linux
+   node.kubernetes.io/instance-type: high-memory
+   topology.kubernetes.io/zone: your-zone
+   env: staging
k8s/new-relic/instrumentation.yaml (1)

4-6: Consider adding resource limits and security context.

The instrumentation configuration should include resource limits and security context to ensure proper resource allocation and security.

 metadata:
   name: newrelic-instrumentation
   namespace: newrelic-monitoring
+  labels:
+    app.kubernetes.io/name: newrelic-instrumentation
+    app.kubernetes.io/part-of: monitoring
 spec:
+  resources:
+    limits:
+      memory: "128Mi"
+      cpu: "100m"
+    requests:
+      memory: "64Mi"
+      cpu: "50m"
+  securityContext:
+    runAsNonRoot: true
+    runAsUser: 1000
k8s/new-relic/README.md (1)

1-4: Enhance installation documentation with security considerations.

Add a section about security best practices, including:

  • License key management using Kubernetes secrets
  • RBAC configuration
  • Network policies
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

3-3: Bare URL used
null

(MD034, no-bare-urls)

k8s/new-relic/values.stage.yaml (1)

136-140: Review privileged mode requirement.

Privileged mode is enabled which poses security risks. Verify if this is absolutely necessary.

Consider implementing a more restrictive security context if possible:

-  privileged: true
+  privileged: false
+  containerSecurityContext:
+    allowPrivilegeEscalation: false
+    capabilities:
+      drop: ["ALL"]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebf52fb and fb5d60b.

📒 Files selected for processing (22)
  • k8s/new-relic/README.md (1 hunks)
  • k8s/new-relic/instrumentation.yaml (1 hunks)
  • k8s/new-relic/values.prod.yaml (1 hunks)
  • k8s/new-relic/values.stage.yaml (1 hunks)
  • k8s/nodes/airqo-k8s-controller.yaml (1 hunks)
  • k8s/nodes/airqo-k8s-worker-0.json (0 hunks)
  • k8s/nodes/airqo-k8s-worker-1.json (0 hunks)
  • k8s/nodes/airqo-k8s-worker-1.yaml (1 hunks)
  • k8s/nodes/airqo-k8s-worker-2.json (0 hunks)
  • k8s/nodes/airqo-k8s-worker-2.yaml (1 hunks)
  • k8s/nodes/airqo-k8s-worker-3.json (0 hunks)
  • k8s/nodes/airqo-k8s-worker-3.yaml (1 hunks)
  • k8s/nodes/airqo-k8s-worker-4.json (0 hunks)
  • k8s/nodes/airqo-stage-k8s-controller.yaml (1 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-0.json (0 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-1.json (0 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-1.yaml (1 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-2.json (0 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-2.yaml (1 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-3.json (0 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-3.yaml (1 hunks)
  • k8s/nodes/airqo-stage-k8s-worker-4.json (0 hunks)
💤 Files with no reviewable changes (10)
  • k8s/nodes/airqo-k8s-worker-3.json
  • k8s/nodes/airqo-stage-k8s-worker-0.json
  • k8s/nodes/airqo-k8s-worker-1.json
  • k8s/nodes/airqo-k8s-worker-2.json
  • k8s/nodes/airqo-stage-k8s-worker-3.json
  • k8s/nodes/airqo-k8s-worker-4.json
  • k8s/nodes/airqo-stage-k8s-worker-1.json
  • k8s/nodes/airqo-stage-k8s-worker-4.json
  • k8s/nodes/airqo-stage-k8s-worker-2.json
  • k8s/nodes/airqo-k8s-worker-0.json
✅ Files skipped from review due to trivial changes (7)
  • k8s/nodes/airqo-k8s-worker-3.yaml
  • k8s/nodes/airqo-stage-k8s-controller.yaml
  • k8s/nodes/airqo-k8s-worker-2.yaml
  • k8s/nodes/airqo-stage-k8s-worker-1.yaml
  • k8s/nodes/airqo-k8s-controller.yaml
  • k8s/nodes/airqo-stage-k8s-worker-3.yaml
  • k8s/nodes/airqo-k8s-worker-1.yaml
🧰 Additional context used
🪛 LanguageTool
k8s/new-relic/README.md

[grammar] ~8-~8: There seems to be a noun/verb agreement error. Did you mean “adds” or “added”?
Context: ...lease using the values file. helm repo add newrelic https://helm-charts.newrelic.c...

(SINGULAR_NOUN_VERB_AGREEMENT)


[duplication] ~8-~8: Possible typo: you repeated a word.
Context: ...lic/nri-bundle -n newrelic-monitoring --values values.yaml 4. Run this command to enable APM...

(ENGLISH_WORD_REPEAT_RULE)

🪛 markdownlint-cli2 (0.17.2)
k8s/new-relic/README.md

3-3: Bare URL used
null

(MD034, no-bare-urls)


8-8: Bare URL used
null

(MD034, no-bare-urls)

🪛 YAMLlint (1.35.1)
k8s/new-relic/values.stage.yaml

[error] 35-35: trailing spaces

(trailing-spaces)


[warning] 113-113: wrong indentation: expected 4 but found 6

(indentation)


[warning] 117-117: wrong indentation: expected 16 but found 14

(indentation)


[error] 143-143: trailing spaces

(trailing-spaces)


[error] 151-151: trailing spaces

(trailing-spaces)

k8s/new-relic/values.prod.yaml

[error] 35-35: trailing spaces

(trailing-spaces)


[warning] 117-117: wrong indentation: expected 4 but found 6

(indentation)


[warning] 121-121: wrong indentation: expected 16 but found 14

(indentation)


[error] 147-147: trailing spaces

(trailing-spaces)


[error] 155-155: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Analyze (javascript)
🔇 Additional comments (2)
k8s/new-relic/instrumentation.yaml (1)

22-26: Verify namespace selector configuration.

The namespace selector is hardcoded to "staging". Ensure this aligns with your deployment strategy.

❌ Verification inconclusive

🏁 Script executed:

#!/bin/bash
# Verify if the staging namespace exists and has the correct label
kubectl get ns staging --show-labels

Length of output: 136


Manual Verification Required: Confirm Namespace Selector Configuration

The script couldn’t retrieve any output because the environment lacks the kubectl command. Please manually verify on your cluster that the "staging" namespace exists with the appropriate label, ensuring that this hardcoded selector meets your deployment strategy.

k8s/new-relic/README.md (1)

3-3: Verify helm chart version compatibility.

The command uses latest versions. Let's verify compatibility with our cluster version.

❌ Verification inconclusive

🌐 Web query:

What is the latest stable version of New Relic's nri-bundle helm chart and its Kubernetes compatibility matrix?

💡 Result:

The latest stable version of New Relic's nri-bundle Helm chart is version 4, which includes the Kubernetes integration appVersion v3[1]. This update focuses on configurability and security improvements while maintaining backward compatibility with reported data formats[1].

For compatibility:

  • The nri-bundle supports Kubernetes clusters running versions within n-3 of the Kubernetes client API used by Helm[9]. For example:
Helm Version Supported Kubernetes Versions
3.17.x 1.32.x - 1.29.x
... ...

Key configuration changes in nri-bundle v4 include:

  • Mandatory --port and --tls flags for discovery commands[1]
  • Simplified integrations configuration format[1]
  • Modular architecture with optional components like kube-state-metrics (KSM) and Prometheus integrations[2][4]

To verify your environment:

helm repo update
helm search repo newrelic/nri-bundle --versions

For upgrades requiring KSM adjustments[4]:

kube-state-metrics:
  enabled: false # For custom KSM instances
newrelic-infrastructure:
  ksm:
    config:
      selector: "app.kubernetes.io/name=kube-state-metrics"

Citations:


Helm Chart Compatibility Needs Manual Verification

The web query confirms that the latest stable version of New Relic’s nri-bundle chart is version 4—with notable configuration changes (e.g., mandatory --port and --tls flags and simplified integrations). It also outlines that the chart supports Kubernetes clusters running versions within “n-3” of the Helm client’s API (for example, Helm 3.17.x supports Kubernetes 1.29.x–1.32.x).

In the current README snippet, the chart version isn’t explicitly pinned, and you’re still using a legacy KSM image tag (v2.13.0). While there’s no immediate issue with the command itself, you should verify that your cluster’s Kubernetes version falls within the supported range for the version of Helm you’re using—and consider any required updates or configuration adjustments when moving to nri-bundle v4.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

3-3: Bare URL used
null

(MD034, no-bare-urls)

cluster: staging

# -- The license key for your New Relic Account. This will be preferred configuration option if both `licenseKey` and `customSecret` are specified.
licenseKey: <<LICENSE-KEY>>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Secure the license key using Kubernetes secrets.

Instead of using a placeholder, consider using Kubernetes secrets for the license key.

-  licenseKey: <<LICENSE-KEY>>
+  customSecretName: "newrelic-license"
+  customSecretLicenseKey: "license-key"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
licenseKey: <<LICENSE-KEY>>
customSecretName: "newrelic-license"
customSecretLicenseKey: "license-key"

Comment on lines +134 to +141
# -- (bool) Reduces number of metrics sent in order to reduce costs
# @default -- false
lowDataMode: true

# -- (bool) In each integration it has different behavior. See [Further information](#values-managed-globally-3) but all aims to send less metrics to the backend to try to save costs |
# @default -- false
privileged: true

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Review production security settings.

Similar to staging, production has privileged mode enabled. This should be carefully reviewed.

Consider implementing Pod Security Policies or OPA Gatekeeper policies to enforce security standards across both environments.

Comment on lines +58 to +61
k8s-agents-operator:
# k8s-agents-operator.enabled -- Install the [`k8s-agents-operator` chart](https://github.com/newrelic/k8s-agents-operator/tree/main/charts/k8s-agents-operator)
enabled: false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify agent configuration differences between environments.

The k8s-agents-operator is disabled in production but enabled in staging. This inconsistency should be reviewed.


🏁 Script executed:

#!/bin/bash
# Compare agent configurations between environments
diff k8s/new-relic/values.stage.yaml k8s/new-relic/values.prod.yaml | grep "enabled:"

Length of output: 141


Agent Configuration Discrepancy Between Environments

I verified that in the staging configuration the k8s-agents-operator.enabled flag is set to true, whereas in production it is set to false. This inconsistency should be reviewed to ensure it is intentional—especially considering any security or observability implications. Please confirm if production is meant to disable the operator or if this was an oversight.

@Baalmart Baalmart merged commit 7b61bd8 into staging Feb 12, 2025
51 of 52 checks passed
@Baalmart Baalmart deleted the new-relic branch February 12, 2025 19:05
@Baalmart Baalmart mentioned this pull request Feb 12, 2025
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants