Fix concurrent cache access in AcquireToken method by 4gust · Pull Request #578 · AzureAD/microsoft-authentication-library-for-go

4gust · 2025-08-28T14:49:34Z

PR: Fix Concurrent Cache Access in `AcquireToken` Method

Summary

This PR addresses a potential race condition in the AcquireToken method by synchronizing access to the cache with a mutex. Related issue: #569

Changes

Introduced cacheAccessorMu.Lock() / defer cacheAccessorMu.Unlock() around cache operations to ensure thread-safe access.
Removed the canRefresh atomic variable, which was previously used to prevent concurrent refreshes. It is no longer needed because all cache access now happens under the mutex.
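
As a rough illustration of the change described above, here is a minimal, self-contained sketch of the lock-around-everything pattern. The `tokenCache` type, its fields, and the simulated fetch are hypothetical stand-ins for illustration only; they are not the library's `storage.Manager` or the real `AcquireToken` code.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenCache is a hypothetical stand-in for the package-level cache.
type tokenCache struct {
	mu      sync.Mutex
	tokens  map[string]string
	fetches int // number of simulated identity-provider requests
}

// acquire returns the cached token for resource, fetching it on a miss.
// The lock is held for the whole call, mirroring the approach in this PR.
func (c *tokenCache) acquire(resource string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if tok, ok := c.tokens[resource]; ok {
		return tok
	}
	c.fetches++ // simulated network request
	tok := "token-for-" + resource
	c.tokens[resource] = tok
	return tok
}

// fetchesAfter runs n concurrent acquire calls against one cache and
// reports how many simulated fetches happened.
func fetchesAfter(n int) int {
	c := &tokenCache{tokens: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.acquire("https://resource")
		}()
	}
	wg.Wait()
	return c.fetches
}

func main() {
	fmt.Println(fetchesAfter(100)) // prints 1
}
```

Because the first goroutine to take the lock fills the cache before releasing it, every later caller sees a hit, so only one fetch occurs. The cost is that the simulated fetch runs while the lock is held.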

Testing

Validated using the existing TestAcquireTokenConcurrency unit test to ensure correct behavior under concurrent conditions.

Breaking Changes

None. This is an internal, non-breaking implementation change that does not affect the public API.

sonarqubecloud · 2025-08-28T14:50:03Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

chlowell · 2025-09-25T20:35:26Z

 		httpClient:         shared.DefaultClient,
 		retryPolicyEnabled: true,
 		source:             source,
-		canRefresh:         &zero,


Also remove the declaration of zero

chlowell · 2025-09-25T20:38:53Z

+	for err := range errors {
+		t.Error(err)
+	}


Consider this approach instead. If there were 100 identical errors, would you want to see all of them in go test output?

chlowell · 2025-09-25T20:41:27Z

+
+	// Mock client should only need to respond once if caching works correctly
+	mockClient := mock.NewClient()
+	// Add multiple responses in case caching fails (but we'll verify it doesn't)


If you simply appended one response, the mock would panic when it gets a second request, and you wouldn't have to count the requests yourself.
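
To illustrate the behavior the reviewer is relying on, here is a small sketch of a mock that panics once its queued responses run out. `queueResponder` is a hypothetical type written for this example; it is not the library's mock client and does not reproduce its API.

```go
package main

import (
	"fmt"
	"sync"
)

// queueResponder is a hypothetical mock, not the library's mock client.
type queueResponder struct {
	mu        sync.Mutex
	responses []string
}

// next pops the next queued response and panics when none remain.
func (q *queueResponder) next() string {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.responses) == 0 {
		panic("mock: unexpected request, no responses queued")
	}
	r := q.responses[0]
	q.responses = q.responses[1:]
	return r
}

// secondRequestPanics queues one response and reports whether a second
// request panics.
func secondRequestPanics() (panicked bool) {
	q := &queueResponder{responses: []string{`{"access_token":"fake"}`}}
	q.next() // first request consumes the only response
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	q.next() // second request has nothing left to return
	return false
}

func main() {
	fmt.Println(secondRequestPanics()) // prints true
}
```

With a mock like this, a test that queues exactly one response fails loudly if caching does not work, without any request counting in the test itself.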

chlowell · 2025-09-25T20:43:48Z

+				return
+			}
+
+			// Capture the token received


Why? There can only be one token because the mock client returns a static value. You also don't need this to know whether the goroutine succeeded: if it hadn't, it would have written an error to the channel.

Robbie-Microsoft · 2026-04-22T20:32:21Z

 	c.authParams.Scopes = []string{resource}

+	cacheAccessorMu.Lock()
+	defer cacheAccessorMu.Unlock()


The lock is held for the entire duration of AcquireToken, including the c.getToken() HTTP call on the refresh path. This means all goroutines calling AcquireToken are fully serialized globally — if IMDS is slow or times out (up to 30s), every other goroutine blocks.

The original canRefresh CAS was more surgical: it only gated concurrent refreshes; cache hits were still concurrent. Consider holding the lock only around cache reads and writes, not across the network call:

cacheAccessorMu.RLock()
stResp, err := cacheManager.Read(...)
ar, err := base.AuthResultFromStorage(stResp)
cacheAccessorMu.RUnlock()

if needsRefresh {
    tr, er := c.getToken(ctx, resource) // no lock held during HTTP
    if er == nil {
        // re-acquire write lock to update cache
        return tr, nil
    }
}
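
A self-contained sketch of this idea follows. It is one possible arrangement under stated assumptions, not the reviewer's exact proposal and not the library's code: the `cache` type, the `refreshMu` field, and the simulated `fetch` are all hypothetical. If the cache lock is simply released before the network call, goroutines that miss at the same time would each fetch, so the sketch adds a separate refresh mutex with a re-check to keep a single fetch while leaving cache hits concurrent.

```go
package main

import (
	"fmt"
	"sync"
)

// cache is a hypothetical stand-in for the package-level storage manager.
type cache struct {
	mu        sync.RWMutex // guards tokens only
	refreshMu sync.Mutex   // serializes fetches; never taken on a cache hit
	tokens    map[string]string
	fetches   int // only modified while refreshMu is held
}

// read looks up a token under the shared read lock.
func (c *cache) read(resource string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	tok, ok := c.tokens[resource]
	return tok, ok
}

// acquire returns a cached token or fetches one without holding c.mu
// during the slow call.
func (c *cache) acquire(resource string) string {
	if tok, ok := c.read(resource); ok {
		return tok
	}
	c.refreshMu.Lock()
	defer c.refreshMu.Unlock()
	// Re-check: another goroutine may have filled the cache while we waited.
	if tok, ok := c.read(resource); ok {
		return tok
	}
	c.fetches++
	tok := "token-for-" + resource // simulated HTTP call
	c.mu.Lock()
	c.tokens[resource] = tok
	c.mu.Unlock()
	return tok
}

// concurrentFetches runs n concurrent acquire calls and reports the
// number of simulated fetches.
func concurrentFetches(n int) int {
	c := &cache{tokens: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.acquire("https://resource")
		}()
	}
	wg.Wait()
	return c.fetches
}

func main() {
	fmt.Println(concurrentFetches(100)) // prints 1
}
```

Callers that miss still queue behind a slow fetch, but callers that hit the cache only take the read lock and are not blocked by it.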

Robbie-Microsoft · 2026-04-22T20:32:31Z


 // cache never uses the client because instance discovery is always disabled.
 var cacheManager *storage.Manager = storage.New(nil)
+var cacheAccessorMu *sync.RWMutex = &sync.RWMutex{}


cacheAccessorMu is declared as *sync.RWMutex but Lock() (exclusive write lock) is always called — RLock() is never used. This gives no benefit over a plain sync.Mutex and adds confusion. Either switch to sync.Mutex:

var cacheAccessorMu sync.Mutex

or use RLock/RUnlock for the read-only cache lookup path and reserve Lock only for writes (which also addresses the HTTP-hold concern above).

Robbie-Microsoft · 2026-04-22T20:32:43Z

Design note: global mutex serializes across all Client instances

Both cacheManager and the new cacheAccessorMu are package-level globals, so all Client instances — regardless of resource or identity — contend on the same lock. A caller that creates two clients for different resources will have their token acquisitions serialized against each other even though they have no shared state at the application level.

This is a pre-existing consequence of cacheManager being global, but the new mutex makes it more visible. It may be worth considering a per-client sync.Mutex paired with a per-client cache, or at minimum documenting the serialization behavior so callers know concurrent multi-resource clients don't actually run concurrently.
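
A minimal sketch of the per-client alternative mentioned above, assuming a hypothetical `client` type that owns both its mutex and its token map. It does not reflect the library's actual `Client` or cache types; it only shows that two such clients do not block each other.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical type with its own lock and its own cache.
type client struct {
	mu     sync.Mutex
	tokens map[string]string
}

func newClient() *client { return &client{tokens: map[string]string{}} }

// acquire returns a cached token or calls fetch while holding only this
// client's lock.
func (c *client) acquire(resource string, fetch func(string) string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if tok, ok := c.tokens[resource]; ok {
		return tok
	}
	tok := fetch(resource)
	c.tokens[resource] = tok
	return tok
}

// independent reports whether client b can acquire a token while client a
// is still blocked inside a slow fetch.
func independent() bool {
	a, b := newClient(), newClient()
	started, release := make(chan struct{}), make(chan struct{})
	done := make(chan struct{})
	go func() {
		defer close(done)
		a.acquire("resource-a", func(r string) string {
			close(started)
			<-release // simulate a slow identity endpoint
			return "token-for-" + r
		})
	}()
	<-started // a now holds its own lock inside fetch
	tok := b.acquire("resource-b", func(r string) string { return "token-for-" + r })
	close(release)
	<-done
	return tok == "token-for-resource-b"
}

func main() {
	fmt.Println(independent()) // prints true
}
```

The trade-off in this sketch is that clients no longer share cached tokens, so two clients for the same resource would each fetch their own.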

sonarqubecloud · 2026-04-24T08:39:17Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

added mutex for cache in Managedidentity

b1a0758

4gust requested review from bgavrilMS and rayluo as code owners August 28, 2025 14:49

4gust requested a review from chlowell August 28, 2025 14:49

bgavrilMS approved these changes Aug 28, 2025

bgavrilMS linked an issue Sep 4, 2025 that may be closed by this pull request

[Bug] Concurrent managedidentity.Client.AcquireToken() calls cause multiple token requests to identity provider before caching #569

Open

8 tasks

chlowell requested changes Sep 25, 2025

resolved comments

ac870d4

Robbie-Microsoft reviewed Apr 22, 2026

updated the concurrent cache refresh with mutex

3275115

4gust wants to merge 3 commits into main from 4gust/concurrent-cache-failure
