Conversation
There was a problem hiding this comment.
Pull request overview
This pull request partially reverts PR #19707, which introduced a dependency on LongRunningOperationService for Examine index rebuilding. That change caused index rebuilding to be skipped on cold boot, as pending operations from the database prevented the rebuild from occurring. This PR fixes the issue by switching from database-persisted operation tracking to an in-memory ConcurrentDictionary<string, Task> to track active rebuild operations.
Changes:
- Removed
ILongRunningOperationServicedependency fromExamineIndexRebuilderconstructor - Implemented in-memory task tracking using a static
ConcurrentDictionaryto prevent multiple concurrent rebuilds - Changed delay mechanism from
Task.DelaytoThread.Sleepin rebuild methods - Simplified
IsRebuildingAsyncto check in-memory task dictionary instead of database
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/Umbraco.Infrastructure/Examine/ExamineIndexRebuilder.cs | Removed LongRunningOperationService dependency and implemented in-memory task tracking using static ConcurrentDictionary |
| tests/Umbraco.Tests.Integration/DependencyInjection/UmbracoBuilderExtensions.cs | Updated test class constructor to remove ILongRunningOperationService parameter |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
src/Umbraco.Infrastructure/Services/IndexingRebuilderService.cs
Outdated
Show resolved
Hide resolved
src/Umbraco.Infrastructure/Services/IndexingRebuilderService.cs
Outdated
Show resolved
Hide resolved
…exes concurrently.
AndyButland
left a comment
There was a problem hiding this comment.
I've done a bit of work on this @Zeegaan - please have a read through and look at the commits, and let me know what you think.
Firstly, you've identified an oversight here. We need to consider Examine rebuilds as different to database cache rebuilds. Both currently use the long running operation service, but they differ in that the former are per instance rather than per "set of load balanced servers".
So first I looked at the in-memory reversion. This looks good, but I found a problem that would manifest itself in a triggered rebuild continuing to spin and not reporting that it was complete when the indexing had finished.
This was because we had an AppCaches runtime cache with a 5-minute TTL to track whether an index is rebuilding. This cache was never cleared on the success path, so IsRebuildingAsync would report "rebuilding" for up to 5 minutes after completion. I replaced this with a ConcurrentDictionary in ExamineIndexRebuilder that is set when a rebuild starts and cleared in a finally block when it finishes.
I also found that ExamineIndexRebuilder was registered as both Singleton and Transient in the same method. I removed the latter so the Singleton is respected and all consumers share the same state (which is necessary for tracking rebuilds).
I also reworked the obsoletions such that the synchronous Thread.Sleep is used only for the obsolete callers.
Since tracking now lives in ExamineIndexRebuilder, I could simplify IndexingRebuilderService back to a pure delegator with no caching logic.
Then I wanted to look at the reason why the PR you have reverted was considered necessary. We do have a problem again of a user triggering a rebuild on Server A, the next status poll lands on Server B, and Server B's state has no knowledge of Server A's rebuild.
So I thought if we could get the best of both worlds, and add cross-server rebuild status visibility for the load-balanced backoffice scenario. IndexingRebuilderService now wraps user-triggered rebuilds in ILongRunningOperationService.RunAsync(), so when a user triggers a rebuild on Server A and polls for status from Server B, the status is visible via the shared database. The IsRebuildingAsync method checks local in-memory state first (fast path), then falls back to the database for cross-server visibility, with graceful degradation if the DB query fails.
To support subscriber servers with read-only database access, all ILongRunningOperationService calls are gated behind a UseDatabaseOperationTracking check that only returns true for Single or SchedulingPublisher server roles. Subscriber servers fall back to local-only tracking, but that's OK, because if you have these servers, you are doing load balancing the older way and not load balancing the backoffice itself.
The controllers (DetailsIndexerController, AllIndexerController) and the IndexPresentationFactory test were also updated to use the async CreateAsync path instead of the obsolete sync Create method.
With this in place I'm seeing the backoffice triggered rebuilds work as expected.
I also see the indexes being created on start-up when I follow the test steps you described above.
|
@AndyButland I think your approach is much better! 💪 |
|
Thanks @Zeegaan - I've fixed the build by resolving the breaking change in a constructor. |
|
@AndyButland this looks and tests out great 💪 |
Fixes (#21716)
Notes
LongRunningOperationService, which in turned caused the above issues, as index rebuilding was skipped on cold boot.How to test
To replicate the original behavior
umbracoLongRunningOperationtable to fake that we are currently rebuilding another server like this:RebuildOnStartupHandler)