Examine: Keep track of rebuilding in memory#21821

Open
Zeegaan wants to merge 15 commits into main from v17/bugfix/#21716

@Zeegaan
Member

@Zeegaan Zeegaan commented Feb 19, 2026

Fixes #21716

Notes

  • Partial revert of "Use new submit and poll solution for examine index rebuild" (#19707)
  • That PR made Examine use the LongRunningOperationService, which in turn caused the issue above, as index rebuilding was skipped on cold boot.
  • This PR now keeps track of the rebuilding in memory, using an in-memory cache, just as we did before.
  • I think we need to rethink how Examine works in a load-balanced backoffice. For example, you can get false positives if you're not using sticky sessions. That will be addressed in another PR; this one just gets load balancing with Examine to work again 😁

How to test

To replicate the original behavior

  • Start the site, create some documents
  • Navigate to the examine management dashboard in Settings -> Examine management, and assert the indexes have documents in them
  • Stop the site
  • Open up your local folder that contains the indexes (Umbraco/Data/Temp/ExamineIndexes), and delete the indexes in there
  • Add a record to the umbracoLongRunningOperation table to fake that a rebuild is currently running on another server, like this:
id	type	status	result	createDate	updateDate	expirationDate
019C73E3-99F3-71BD-8E48-AC9EDD59D8C8	RebuildAllExamineIndexes	Enqueued	NULL	2026-02-19 03:13:41.877	2026-02-19 03:13:41.877	2026-02-19 03:18:41.877
  • remember that the expiration has to be later than the current date and time 😛
  • Start up your site again, wait one minute (or remove the delay from RebuildOnStartupHandler)
  • Assert that your indexes are still empty, and will remain empty until you manually trigger a rebuild

Copilot AI review requested due to automatic review settings February 19, 2026 04:40
Contributor

Copilot AI left a comment

Pull request overview

This pull request partially reverts PR #19707, which introduced a dependency on LongRunningOperationService for Examine index rebuilding. That change caused index rebuilding to be skipped on cold boot, as pending operations from the database prevented the rebuild from occurring. This PR fixes the issue by switching from database-persisted operation tracking to an in-memory ConcurrentDictionary<string, Task> to track active rebuild operations.

Changes:

  • Removed ILongRunningOperationService dependency from ExamineIndexRebuilder constructor
  • Implemented in-memory task tracking using a static ConcurrentDictionary to prevent multiple concurrent rebuilds
  • Changed delay mechanism from Task.Delay to Thread.Sleep in rebuild methods
  • Simplified IsRebuildingAsync to check in-memory task dictionary instead of database

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File	Description
src/Umbraco.Infrastructure/Examine/ExamineIndexRebuilder.cs	Removed LongRunningOperationService dependency and implemented in-memory task tracking using static ConcurrentDictionary
tests/Umbraco.Tests.Integration/DependencyInjection/UmbracoBuilderExtensions.cs	Updated test class constructor to remove ILongRunningOperationService parameter

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

@Zeegaan Zeegaan changed the title from "V17: Keep track of rebuilding in memory" to "Examine: Keep track of rebuilding in memory" Feb 20, 2026
Contributor

@AndyButland AndyButland left a comment

I've done a bit of work on this @Zeegaan - please have a read through and look at the commits, and let me know what you think.

Firstly, you've identified an oversight here. We need to consider Examine rebuilds as different to database cache rebuilds. Both currently use the long running operation service, but they differ in that the former are per instance rather than per "set of load balanced servers".

So first I looked at the in-memory reversion. This looks good, but I found a problem that would manifest itself in a triggered rebuild continuing to spin and not reporting that it was complete when the indexing had finished.

This was because we had an AppCaches runtime cache with a 5-minute TTL to track whether an index is rebuilding. This cache was never cleared on the success path, so IsRebuildingAsync would report "rebuilding" for up to 5 minutes after completion. I replaced this with a ConcurrentDictionary in ExamineIndexRebuilder that is set when a rebuild starts and cleared in a finally block when it finishes.
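As a rough illustration of the set-on-start, clear-in-finally pattern described here, the following is a minimal sketch. The `RebuildTracker` class and its member names are hypothetical stand-ins chosen for the example and are not taken from the actual `ExamineIndexRebuilder` changes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the tracking described above; not the real ExamineIndexRebuilder.
public sealed class RebuildTracker
{
    // Keyed by index name. The byte value is unused; the dictionary acts as a concurrent set.
    private readonly ConcurrentDictionary<string, byte> _rebuilding = new();

    public bool IsRebuilding(string indexName) => _rebuilding.ContainsKey(indexName);

    // Returns false without doing any work if this index is already being rebuilt.
    public async Task<bool> RebuildAsync(
        string indexName,
        Func<CancellationToken, Task> rebuild,
        CancellationToken cancellationToken = default)
    {
        if (!_rebuilding.TryAdd(indexName, 0))
        {
            return false;
        }

        try
        {
            await rebuild(cancellationToken);
            return true;
        }
        finally
        {
            // Runs on success, failure and cancellation alike, so the flag cannot linger
            // the way a cache entry with a fixed TTL can.
            _rebuilding.TryRemove(indexName, out _);
        }
    }
}
```

Because the entry is removed in `finally`, a status check made right after the rebuild completes reports "not rebuilding" immediately instead of waiting for a cache entry to expire.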

I also found that ExamineIndexRebuilder was registered as both Singleton and Transient in the same method. I removed the latter so the Singleton is respected and all consumers share the same state (which is necessary for tracking rebuilds).

I also reworked the obsoletions such that the synchronous Thread.Sleep is used only for the obsolete callers.
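A minimal sketch of what that split could look like, assuming a delayed rebuild with one obsolete synchronous entry point and one asynchronous one. The `DelayedRebuilder` class, its method names and its constructor are hypothetical and only illustrate the idea; they are not the signatures used in this PR.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical illustration; the names here are not taken from the PR.
public sealed class DelayedRebuilder
{
    private readonly Action<string> _rebuild;

    public DelayedRebuilder(Action<string> rebuild) => _rebuild = rebuild;

    [Obsolete("Use RebuildIndexAsync instead.")]
    public void RebuildIndex(string indexName, TimeSpan delay)
    {
        // Only the obsolete synchronous path blocks the calling thread.
        Thread.Sleep(delay);
        _rebuild(indexName);
    }

    public async Task RebuildIndexAsync(
        string indexName,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        // The asynchronous path waits without holding a thread and honours cancellation.
        await Task.Delay(delay, cancellationToken);
        _rebuild(indexName);
    }
}
```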

Since tracking now lives in ExamineIndexRebuilder, I could simplify IndexingRebuilderService back to a pure delegator with no caching logic.


Then I wanted to look at the reason why the PR you have reverted was considered necessary. With the revert we again have the problem where a user triggers a rebuild on Server A, the next status poll lands on Server B, and Server B has no knowledge of Server A's rebuild.

So I looked at whether we could get the best of both worlds and add cross-server rebuild status visibility for the load-balanced backoffice scenario. IndexingRebuilderService now wraps user-triggered rebuilds in ILongRunningOperationService.RunAsync(), so when a user triggers a rebuild on Server A and polls for status from Server B, the status is visible via the shared database. The IsRebuildingAsync method checks local in-memory state first (fast path), then falls back to the database for cross-server visibility, with graceful degradation if the DB query fails.
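The local-first check with a database fallback can be sketched roughly as follows. The `RebuildStatusChecker` class and both delegates are hypothetical stand-ins for the in-memory lookup and the shared-database query; they are not Umbraco APIs, and the real method may handle failures differently (for example by logging).

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper; both delegates are illustrative stand-ins, not Umbraco APIs.
public sealed class RebuildStatusChecker
{
    private readonly Func<string, bool> _isRebuildingLocally;
    private readonly Func<string, Task<bool>> _isRebuildingInSharedStore;

    public RebuildStatusChecker(
        Func<string, bool> isRebuildingLocally,
        Func<string, Task<bool>> isRebuildingInSharedStore)
    {
        _isRebuildingLocally = isRebuildingLocally;
        _isRebuildingInSharedStore = isRebuildingInSharedStore;
    }

    public async Task<bool> IsRebuildingAsync(string indexName)
    {
        // Fast path: this server is rebuilding the index itself.
        if (_isRebuildingLocally(indexName))
        {
            return true;
        }

        try
        {
            // Cross-server visibility: another server may have recorded a rebuild.
            return await _isRebuildingInSharedStore(indexName);
        }
        catch (Exception)
        {
            // Graceful degradation: if the shared store cannot be queried,
            // fall back to the local answer, which at this point is "not rebuilding".
            return false;
        }
    }
}
```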

To support subscriber servers with read-only database access, all ILongRunningOperationService calls are gated behind a UseDatabaseOperationTracking check that only returns true for Single or SchedulingPublisher server roles. Subscriber servers fall back to local-only tracking, but that's OK, because if you have these servers, you are doing load balancing the older way and not load balancing the backoffice itself.
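The role gate amounts to a small predicate. In the sketch below the `ServerRole` enum is a local stand-in that mirrors the role names mentioned above so the example is self-contained, and the `OperationTracking` class is hypothetical; how the PR actually resolves the current role is not shown here.

```csharp
// Local stand-in mirroring the role names mentioned above, so the example compiles on its own.
public enum ServerRole
{
    Unknown,
    Single,
    Subscriber,
    SchedulingPublisher,
}

// Hypothetical container for the check; only the rule itself comes from the description above.
public static class OperationTracking
{
    // Only roles that are expected to have write access to the database use it for tracking.
    // Every other role falls back to local, in-memory tracking.
    public static bool UseDatabaseOperationTracking(ServerRole role) =>
        role is ServerRole.Single or ServerRole.SchedulingPublisher;
}
```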

The controllers (DetailsIndexerController, AllIndexerController) and the IndexPresentationFactory test were also updated to use the async CreateAsync path instead of the obsolete sync Create method.


With this in place I'm seeing the backoffice triggered rebuilds work as expected.

I also see the indexes being created on start-up when I follow the test steps you described above.

@Zeegaan
Member Author

Zeegaan commented Feb 22, 2026

@AndyButland I think your approach is much better! 💪
It's not building right now, so let me know when it's ready and I'll take a look 😁

@AndyButland
Contributor

Thanks @Zeegaan - I've fixed the build by resolving the breaking change in a constructor.

@Zeegaan
Member Author

Zeegaan commented Feb 24, 2026

@AndyButland this looks and tests out great 💪
Note that when we rebuild on startup we no longer use the LongRunningOperationService, so it will always be bypassed (which is the desired behavior).
If you're happy with it, let's get it merged 😁
