
feat(spire): enable persistent disk storage for keys #329

Open
maishivamhoo123 wants to merge 5 commits into container-registry:main from maishivamhoo123:feat/spire-persistence

Conversation

@maishivamhoo123 maishivamhoo123 commented Feb 15, 2026

Description

This PR enables persistent storage for SPIRE Server keys, resolving an issue where keys were lost upon restart (previously stored in memory).

Changes

  • Updated KeyManager: Switched from "memory" to "disk" in internal/spiffe/embedded_server.go.
  • Configured Storage Path: Set keys_path to <data_dir>/keys.json.
  • Added Tests: Created internal/spiffe/persistence_test.go to verify:
    • Correct generation of server.conf (checking for "disk" and keys_path).
    • End-to-End persistence behavior (skips gracefully if spire-server binary is missing).
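
For reviewers who want to see the shape of the change, here is a minimal sketch of the KeyManager fragment and the kind of check the config test performs. The helper names `renderKeyManager` and `usesDiskPersistence` are hypothetical and are not the functions in embedded_server.go; only the "disk" plugin name and the `<data_dir>/keys.json` path come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// renderKeyManager returns an illustrative KeyManager fragment of server.conf.
// Hypothetical helper: the real template lives in embedded_server.go.
func renderKeyManager(dataDir string) string {
	return fmt.Sprintf(`    KeyManager "disk" {
        plugin_data {
            keys_path = "%s/keys.json"
        }
    }
`, dataDir)
}

// usesDiskPersistence reports whether conf selects the disk KeyManager and
// points keys_path at <dataDir>/keys.json.
func usesDiskPersistence(conf, dataDir string) bool {
	want := fmt.Sprintf(`keys_path = "%s/keys.json"`, dataDir)
	return strings.Contains(conf, `KeyManager "disk"`) && strings.Contains(conf, want)
}

func main() {
	conf := renderKeyManager("/var/lib/spire")
	fmt.Print(conf)
	fmt.Println("disk persistence:", usesDiskPersistence(conf, "/var/lib/spire"))
}
```

Checking the exact keys_path string, rather than only the presence of "keys_path", catches a template whose arguments are passed in the wrong order.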

Additional context

Verified locally with go test ./internal/spiffe/....


Summary by cubic

Enable persistent disk storage for SPIRE Server keys so they survive restarts, replacing the in-memory KeyManager. Addresses #288.

  • Bug Fixes
    • Switched KeyManager to "disk" and set keys_path to <data_dir>/keys.json in the embedded server config.
    • Refactored tests and strengthened E2E assertions: exact keys_path check, start/stop helpers, fail on any keys.json stat error, ensure keys.json is non-empty; skip if spire-server is missing.

Written for commit fed3be5. Summary will update on new commits.

Summary by CodeRabbit

  • Infrastructure
    • SPIRE server now uses disk-based persistence for configuration and cryptographic keys, with automatic management of persistent data locations.
  • Tests
    • Added unit and end-to-end tests to validate the storage configuration and to verify keys are persisted and remain usable across server stop/start cycles.

Signed-off-by: maishivamhoo123 <maishivamhoo@gmail.com>

coderabbitai bot commented Feb 15, 2026

Walkthrough

Switch the embedded SPIRE server KeyManager from in-memory to disk-backed, add a keys_path template parameter, update config write call to include dataDir, and add unit and end-to-end tests that validate disk persistence of SPIRE keys across restarts.

Changes

| Cohort / File(s) | Summary |
|---|---|
| SPIRE Server Configuration: ground-control/internal/spiffe/embedded_server.go | Replace KeyManager "memory" with "disk", add plugin_data containing keys_path = "<data_dir>/keys.json", and update the writeConfig invocation to pass the additional dataDir argument (template parameter order changed). |
| Persistence Tests: ground-control/internal/spiffe/persistence_test.go | Add tests TestSpireConfigUsesDiskPersistence and TestEndToEndPersistence, helper setupTestConfig, and os/exec import; tests write/read server.conf, verify keys.json creation, and exercise server restart to confirm persisted keys are reused. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: enabling persistent disk storage for SPIRE keys, which is the primary objective of this PR. |
| Linked Issues check | ✅ Passed | The PR successfully implements Phase 1 of issue #288 by switching KeyManager to disk, configuring keys_path, adding comprehensive tests, and meeting all coding requirements. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing disk-based persistence for SPIRE keys as required by issue #288; no unrelated modifications detected. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |
| Description check | ✅ Passed | The pull request description comprehensively covers all required template sections with clear details about changes, testing, and issue resolution. |



codacy-production bot commented Feb 15, 2026

Codacy's Analysis Summary

0 new issue (≤ 0 issue)
0 new security issue
19 complexity
0 duplications


@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 2 files



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@ground-control/internal/spiffe/persistence_test.go`:
- Around line 17-23: In TestSpireConfigUsesDiskPersistence replace the manual
temp dir creation (os.MkdirTemp + err check + defer os.RemoveAll) with
t.TempDir() — call tmpDir := t.TempDir() and remove the os.MkdirTemp error
handling and defer os.RemoveAll(tmpDir) lines so the test uses the testing
package’s automatic cleanup.
- Around line 97-102: Replace manual temporary directory creation using
os.MkdirTemp and explicit cleanup with the testing helper t.TempDir(): remove
the error handling and defer os.RemoveAll(tmpDir) around the tmpDir variable and
instead assign tmpDir := t.TempDir() inside the test (where tmpDir and err are
currently declared), so the test runtime handles cleanup automatically; update
references to tmpDir accordingly (look for the tmpDir variable in this test in
persistence_test.go).
- Around line 120-125: The goroutine calling t.Logf (inside the anonymous
goroutine invoking server1.Start(ctx1) and likewise for the server2 block around
lines 152-156) can race with the test and panic; instead either send the error
back to the test via a channel (or use a sync.WaitGroup) and call t.Logf from
the main test goroutine, or replace the in-goroutine t.Logf with a non-test
logger such as fmt.Fprintf(os.Stderr, ...) so logging does not use the testing.T
from a non-test goroutine; ensure the goroutine's lifetime is coordinated with
the test (via context cancellation or waiting) so Start(ctx1) cannot outlive the
test.
🧹 Nitpick comments (4)
ground-control/internal/spiffe/embedded_server.go (2)

195-199: Inconsistent indentation in HCL config template.

Lines 197–198 use hard tabs while the rest of the template uses spaces. This produces misaligned output in the generated server.conf, which hurts readability and could confuse HCL-aware tooling.

🔧 Align indentation with the rest of the template
     KeyManager "disk" {
         plugin_data {
-		    keys_path = "%s/keys.json"
-		}
+            keys_path = "%s/keys.json"
+        }
     }

174-209: Consider HCL injection risk from unescaped DataDir in fmt.Sprintf.

DataDir is interpolated into the HCL config string six times (including the new keys_path) without any escaping. If DataDir contains a " or newline, the generated config will be malformed or could alter the config semantics. While unlikely in normal usage, a defensive strings.ContainsAny check (or using an HCL library) would harden this.
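
A minimal sketch of the defensive check suggested above, assuming validation happens before the config template is rendered. The function name `validateDataDir` and the exact rejected character set are illustrative assumptions, not code from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// validateDataDir rejects values that would break out of a quoted HCL string
// when interpolated with fmt.Sprintf. Hypothetical helper for illustration.
func validateDataDir(dir string) error {
	if dir == "" {
		return fmt.Errorf("data dir is empty")
	}
	// A double quote ends the HCL string early; a newline splits the line.
	if strings.ContainsAny(dir, "\"\n\r") {
		return fmt.Errorf("data dir %q contains a quote or newline", dir)
	}
	return nil
}

func main() {
	for _, dir := range []string{"/var/lib/spire", "/tmp/x\"\n  injected = true"} {
		fmt.Printf("%q -> %v\n", dir, validateDataDir(dir))
	}
}
```

Rejecting the input up front is simpler than escaping it at each of the six interpolation sites, though building the config with an HCL library would remove the need for either.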

ground-control/internal/spiffe/persistence_test.go (2)

36-46: Calling server.Start() just to trigger writeConfig() is unnecessarily heavy.

Start() also launches a subprocess and blocks waiting for readiness. Since the test only cares about the generated config file, calling the unexported writeConfig() directly (or exporting it / extracting the config-generation logic) would make this test faster, deterministic, and free of goroutine/sleep hacks.

As a simpler alternative within the current design, note that writeConfig is an unexported method on the same package—you can call it directly from the test:

♻️ Simplify by calling writeConfig directly
 	server := NewEmbeddedSpireServer(cfg)
 
-	// 3. Trigger config writing by attempting to start
-	// We run this in a goroutine because Start() might block or fail if binary is missing
-	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
-	defer cancel()
-
-	go func() {
-		_ = server.Start(ctx)
-	}()
-
-	// Give it a tiny moment to write the file
-	time.Sleep(100 * time.Millisecond)
+	// Trigger config writing directly (same package, unexported method accessible)
+	if err := os.MkdirAll(cfg.DataDir, 0o700); err != nil {
+		t.Fatal(err)
+	}
+	if err := server.writeConfig(); err != nil {
+		t.Fatal(err)
+	}

This also eliminates the retry loop (lines 52–60) since the file is guaranteed to exist after writeConfig returns.


88-94: This is an E2E/integration test—consider gating it with a build tag or test flag.

The TestEndToEndPersistence test starts real SPIRE server subprocesses, sleeps for seconds, and only skips at runtime if the binary is absent. A build tag (e.g., //go:build integration) or -short flag check would let CI and local go test skip it by default, keeping the fast unit-test path clean.
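
One way to express the gating decision, sketched as a standalone helper so it can be run outside a test binary. `skipReason` is a hypothetical name; inside the real test its result would be passed to t.Skip, with testing.Short() supplying the first argument.

```go
package main

import (
	"fmt"
	"os/exec"
)

// skipReason returns a non-empty message when the E2E test should be skipped:
// either the caller asked for the short test path, or the binary is not on PATH.
func skipReason(short bool, binary string) string {
	if short {
		return "skipping E2E test in -short mode"
	}
	if _, err := exec.LookPath(binary); err != nil {
		return fmt.Sprintf("%s not found in PATH: %v", binary, err)
	}
	return ""
}

func main() {
	// In a test: if r := skipReason(testing.Short(), "spire-server"); r != "" { t.Skip(r) }
	fmt.Println(skipReason(true, "spire-server"))
}
```

A `//go:build integration` tag at the top of the test file is the stricter alternative: the file is then excluded from compilation unless the tag is supplied with `go test -tags integration`.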

Signed-off-by: maishivamhoo123 <maishivamhoo@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ground-control/internal/spiffe/persistence_test.go`:
- Around line 73-96: The background goroutine calling server1.Start must
propagate errors and you must avoid blind sleeps: create an error channel
(errCh) and start server1 in a goroutine that sends any Start error to errCh;
wait (with a timeout context) for either Start to return successfully or an
error on errCh instead of sleeping; after the server is confirmed running, call
server1.Stop() explicitly (not just canceling exec context) to allow graceful
shutdown and ensure keys.json is flushed, then check keys.json existence;
reference server1.Start, server1.Stop, ctx1/cancel1 and keysPath when making
these changes.
🧹 Nitpick comments (4)
ground-control/internal/spiffe/embedded_server.go (1)

195-199: Minor indentation inconsistency in the config template.

Line 195 uses a 3-space indent (KeyManager) while the sibling plugin blocks (DataStore on Line 184, NodeAttestor on Line 191) use a 4-space indent. Not functionally impactful (HCL is whitespace-tolerant), but worth aligning for readability.

🔧 Suggested fix
-   KeyManager "disk" {
+    KeyManager "disk" {
ground-control/internal/spiffe/persistence_test.go (3)

32-35: Redundant os.MkdirAll: t.TempDir() already creates the directory.

setupTestConfig uses t.TempDir() which returns an already-existing directory. The os.MkdirAll on line 33 is a no-op.

♻️ Suggested fix
 	cfg := setupTestConfig(t)
 
-	// Create DataDir explicitly
-	if err := os.MkdirAll(cfg.DataDir, 0700); err != nil {
-		t.Fatal(err)
-	}
-
 	server := NewEmbeddedSpireServer(cfg)

54-60: Consider also verifying the exact keys_path value, not just its presence.

The test checks that keys_path = exists but doesn't verify it points to the correct directory. A stronger assertion would be:

expectedKeysPath := fmt.Sprintf(`keys_path = "%s/keys.json"`, cfg.DataDir)
if !strings.Contains(content, expectedKeysPath) {
    t.Errorf("Config has wrong keys_path. Expected %q in:\n%s", expectedKeysPath, content)
}

7-7: Remove the inline comment on the import.

The comment "// Added this for LookPath" is commit-log noise that doesn't belong in the source.

-	"os/exec" // Added this for LookPath
+	"os/exec"

Signed-off-by: maishivamhoo123 <maishivamhoo@gmail.com>
Signed-off-by: maishivamhoo123 <maishivamhoo@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ground-control/internal/spiffe/persistence_test.go`:
- Around line 94-98: The test currently uses os.Stat(keysPath) with
os.IsNotExist(err) which ignores other errors; change the check to call
os.Stat(keysPath), if err != nil then call t.Fatalf or t.Errorf with the actual
err to fail the test (instead of only checking os.IsNotExist), otherwise proceed
— update the block referencing keysPath, cfg.DataDir, os.Stat and t.Error so any
non-nil err (permission, IO, etc.) causes the test to fail and prints the error
details.
🧹 Nitpick comments (2)
ground-control/internal/spiffe/persistence_test.go (2)

73-76: E2E test should live in test/e2e/ per project conventions.

This is an end-to-end test (binary lookup, full server lifecycle) placed in the unit-test file. The project convention requires E2E tests in test/e2e/ with configurations in test/e2e/testconfig/. The t.Skip guard mitigates CI breakage, but consider relocating this test to the correct directory.

Based on learnings: "E2E tests must be located in test/e2e/ with test configurations in test/e2e/testconfig/"


100-109: Server 2 restart block doesn't assert anything beyond "no crash".

After restarting the server with persisted keys, consider asserting that keys.json still exists (or hasn't changed) and potentially that the server is actually healthy — otherwise the restart test only proves the binary can start twice, not that it reused the persisted keys.

Signed-off-by: maishivamhoo123 <maishivamhoo@gmail.com>
@maishivamhoo123 (Author)

@bupd @Vad1mo and @team can you please review this PR?

@bupd bupd self-requested a review February 17, 2026 13:58

Development

Successfully merging this pull request may close these issues.

Embedded SPIRE: Use Persistent KeyManager Instead of In-Memory
