Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store: Have UpdateMachine.load check account still exists after async gaps #1386

Merged
merged 10 commits into from
Mar 13, 2025

Conversation

chrisbobbe
Copy link
Collaborator

Fixes: #1354

@chrisbobbe chrisbobbe added the maintainer review PR ready for review by Zulip maintainers label Mar 7, 2025
@chrisbobbe chrisbobbe requested review from gnprice and PIG208 March 7, 2025 01:55
Copy link
Member

@PIG208 PIG208 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I like the new UpdateMachineTestGlobalStore. Left some thoughts and comments.

/// Load per-account data for the given account, unconditionally.
/// Load per-account data for the given account ID, unconditionally.
///
/// The account for `accountId` must exist.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making this explicit in dartdocs seems useful; I think this also applies to removeAccount?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the validity of the accountId, I feel that this is analogous to memory management — an object is allocated some memory, and we have pointers to it in the data store; freeing that object leaves the pointers dangling, hence the need for null-checks and assertions.

I vaguely feel that we can make stopAndThrowIfNoAccount available as a helper similar to getAccount where the account is required, and only check for its existence on use, not after async gaps, but I'm not sure if that will be cleaner/easier to maintain.

Copy link
Collaborator Author

@chrisbobbe chrisbobbe Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think I wouldn't be satisfied with an approach like that.

Within some synchronous code, say point A is where we retrieve the Account data (getAccount), and points B, C, and D are where we need to make decisions about it or pass it across some boundary where point A couldn't go (like if we call a platform API). I'd be happier if we could freely put point A where it makes the most sense for code organization, which will depend on the context; maybe we want to shorten the paths to B, C, and D, or maybe to put it neatly at the top of a function body. If the placement of point A affects behavior (because that's where we do the null-check), It feels much less flexible and harder to reason about. We can get that flexibility by doing stopAndThrowIfNoAccount after the async gaps.

only check for its existence on use

This wouldn't answer the need to interrupt the retry loop in _registerQueueWithRetry (the symptom I mentioned at #1354 (comment) ). We need to know if the account has been logged out, to stop retrying in that case, but we don't actually use the account's data.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bumping #1386 (comment); I guess we still want to mention the constraint to its dartdoc?


Thanks for the explanation. I think Account will go well with a version of use_build_context_synchronously, if we have a way to annotate this type somehow.

class TestGlobalStore extends GlobalStore {
TestGlobalStore({required super.accounts});

mixin _ApiConnectionsMixin on GlobalStore {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty cool! I think the layout of these mixins will probably be helpful guidance to later break down GlobalStore itself as well; similar to what we did with ChannelStore, etc.

/// See also:
/// * [TestZulipBinding.globalStore], which provides one of these.
/// * [UpdateMachineTestGlobalStore], which prepares per-account data
/// using [UpdateMachine.load] (like [LiveGlobalStore] does)..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is the extra "." intentional here?

Comment on lines 1041 to 1089
connection.prepare(
apiException: eg.apiExceptionBadEventQueueId());
globalStore.clearCachedApiConnections();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand how this works. We prepare an error with the current connection, but call clearCachedApiConnections immediately after.

Is it expected that the update machine still has the connection with the prepared response when polling, but it won't be after reloading?

Copy link
Collaborator Author

@chrisbobbe chrisbobbe Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, thanks for pointing out that this is confusing. I wasn't sure where would be the least confusing place to put globalStore.clearCachedApiConnections();.

My idea in putting it here was that it's right next to the globalStore.prepareRegisterQueueResponse =(newConnection) { /* … */ }.

The "cache" is test infrastructure; see _ApiConnectionsMixin.useCachedApiConnections. When we look at app code that runs during a test, the cache is only relevant when the code asks to create a new ApiConnection, which it does with GlobalStore.apiConnection or GlobalStore.apiConnectionFromAccount.

In the app code exercised by these tests, that happens twice:

  1. When the test calls await globalStore.perAccount (near the top of prepareReload)
  2. When the UpdateMachine responds to the bad-event-queue error by re-registering.

The test 'user logged out before new store is loaded' would break if step 2 picked up a connection that was cached in step 1, because the one from step 1 gets disposed on logout.

The UpdateMachineTestGlobalStore uses the caching behavior to make doLoadPerAccount work, but it's still fine to clear the cache between calls to doLoadPerAccount.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to make it more clear in my next revision :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a good comment on this clearCachedApiConnections call can probably replace the need for this bit below:

        // Assert this is actually a new connection, not the one on
        // globalStore._perAccountStores, which at least one test will close
        // via logOutAccount.
        assert(!identical(connection, newConnection));

(Because that assert is basically checking that this clearCached… happened, right?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! The new ordering and block of comments makes it quite clear to me.

final globalStore = UpdateMachineTestGlobalStore(accounts: [eg.selfAccount]);
globalStore.prepareRegisterQueueResponse = (connection) =>
connection.prepare(
delay: TestGlobalStore.removeAccountDuration + Duration(seconds: 1),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because both UpdateMachineTestGlobalStore and TestGlobalStore share removeAccountDuration through _DatabaseMixin, perhaps we should find a new home to this constant. Maybe a "k"-prefixed const?

Copy link
Collaborator Author

@chrisbobbe chrisbobbe Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe; let's see if @gnprice has thoughts on that. I showed him in the office that a static _DatabaseMixin.removeAccountDuration wouldn't work (callers would have to access it using the mixin's name, like FooMixin.removeAccountDuration, and it doesn't make sense for the mixin to be public). Greg suggested just leaving it as a static on TestGlobalStore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is fine. If we find ourselves with more items like this, maybe we'll find some other way to organize them.

final stopwatch = Stopwatch()..start();
final initialSnapshot = await _registerQueueWithRetry(connection);
final initialSnapshot = await _registerQueueWithRetry(connection,
stopAndThrowIfNoAccount: stopAndThrowIfNoAccount);
final t = (stopwatch..stop()).elapsed;
assert(debugLog("initial fetch time: ${t.inMilliseconds}ms"));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further down there is another async gap at await globalStore.updateAccount; should we call stopAndThrowIfNoAccount there as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there isn't a bug that we could catch by doing so.

  1. We hope and expect that await globalStore.updateAccount always fails if the account is removed concurrently. If it doesn't, that's a bug in the database implementation.
  2. If await globalStore.updateAccount succeeds, that means there's still an Account object in GlobalStore._accounts. Since stopAndThrowIfNoAccount looks at GlobalStore._accounts too, it would find it there and wouldn't throw. So even if the (1) bug exists, stopAndThrowIfNoAccount wouldn't help us catch it.

store: store, initialSnapshot: eg.initialSnapshot());
updateMachine.debugPauseLoop();
final store = await globalStore.perAccount(eg.selfAccount.id);
final updateMachine = store.updateMachine!;
updateMachine.poll();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this poll call be here? It looks like it now duplicates one that UpdateMachine.load makes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed; thanks.

/// See also:
/// * [TestGlobalStore], which prepares per-account data
/// without using [UpdateMachine.load].
class UpdateMachineTestGlobalStore extends GlobalStore with _ApiConnectionsMixin, _DatabaseMixin {
Copy link
Member

@gnprice gnprice Mar 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent some time staring at this vs. TestGlobalStore and thinking about the relationship between them. I didn't see any big changes I'd want to make; but the updateMachines field over on that other class was one thing that was a bit incongruous with the naming of the classes, and I realized it wasn't needed.

I'll push an extra commit that cuts that out, as well as a prep commit in a different area that this brought up:
dd9b29b test [nfc]: Simplify out TestGlobalStore.updateMachines
247bf7f store test [nfc]: Clarify what prepareReload is doing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great; thanks!

@gnprice gnprice force-pushed the pr-load-async-gaps-logout branch from fa6e8aa to 8405ffc Compare March 8, 2025 05:37
@chrisbobbe
Copy link
Collaborator Author

Thanks for the reviews! Revision pushed, but GitHub seems stuck and I'm not sure if it took—the branch should be at 1ea55a9.

@PIG208
Copy link
Member

PIG208 commented Mar 11, 2025

@gnprice
Copy link
Member

gnprice commented Mar 11, 2025

Weird — that's a long delay at this point (40 minutes since Chris's comment above).

Try pushing again (with some trivial change)? Possibly that'd unwedge it.

@PIG208 PIG208 assigned gnprice and unassigned PIG208 Mar 11, 2025
@PIG208 PIG208 added integration review Added by maintainers when PR may be ready for integration and removed maintainer review PR ready for review by Zulip maintainers labels Mar 11, 2025
@PIG208
Copy link
Member

PIG208 commented Mar 11, 2025

Looks good to me other than a nit. Thanks for the updates!

@gnprice gnprice force-pushed the pr-load-async-gaps-logout branch from 1ea55a9 to e0cd3a9 Compare March 11, 2025 20:25
@gnprice
Copy link
Member

gnprice commented Mar 11, 2025

I fetched 1ea55a9 from Chris's fork, made a no-op change, and pushed that. This time the push seems to have worked.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Generally this looks great. One added test I'd like to see, and otherwise nits.

Comment on lines 147 to 151
/// Load per-account data for the given account, unconditionally.
/// Load per-account data for the given account ID, unconditionally.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think "given account" is right — the per-account data is for the account, not for the account ID. (Or it's "for" the account ID only indirectly, via the account ID identifying an account.)

The "given" account is the one identified by the parameter accountId. The name says it's an ID for an account, so I think that doesn't require elaboration in the doc.

/// Load per-account data for the given account, unconditionally.
/// Load per-account data for the given account ID, unconditionally.
///
/// The account for `accountId` must exist.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is fine to add, though.)

Comment on lines 1041 to 1043
globalStore.prepareRegisterQueueResponse = (newConnection) {
prepareRegisterQueueResponse(newConnection);
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
globalStore.prepareRegisterQueueResponse = (newConnection) {
prepareRegisterQueueResponse(newConnection);
};
globalStore.prepareRegisterQueueResponse = prepareRegisterQueueResponse;

Comment on lines +150 to +194
// no poll, server-emoji-data, or register-token requests
check(connection.takeRequests()).isEmpty();
check(connection).isOpen.isFalse();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make a positive version of these tests, to put just above these negative versions, for contrast?

Doesn't need to get into the details of the poll, server-emoji-data, etc. requests — just .length.equals(3) would be enough to make a contrast with this .isEmpty(). Plus checking isOpen is still true for a contrast with the isOpen-false check.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these tests

You mean a test where the loading succeeds, right, so just one test? That's the only condition where the poll, server-emoji-data, and register-token requests are made.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, exactly.

Comment on lines 1056 to 1058
await prepareReload(async,
prepareRegisterQueueResponse: (newConnection) {
newConnection.prepare(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just call it connection as we typically do; also can make a bit more compact:

Suggested change
await prepareReload(async,
prepareRegisterQueueResponse: (newConnection) {
newConnection.prepare(
await prepareReload(async, prepareRegisterQueueResponse: (connection) {
connection.prepare(

Comment on lines 1015 to 1045
// [PerAccountStore.fromInitialSnapshot] requires the account
// to be in the global store when called; do so before logging out.
final newStore = eg.store(globalStore: globalStore, account: eg.selfAccount);
await logOutAccount(globalStore, eg.selfAccount.id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The situation this comment was explaining felt like a bit of a signal that our old test setup here wasn't quite simulating things right. Glad to see we get to delete it.

Comment on lines 919 to 920
/// Throws [AccountNotFoundException] after an async gap
/// if the account has been removed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Throws [AccountNotFoundException] after an async gap
/// if the account has been removed.
/// Throws [AccountNotFoundException] if after any async gap
/// the account has been removed.

Otherwise it sounds like it's saying: if the account has been removed, throws said exception, but first causes an async gap.

@chrisbobbe chrisbobbe force-pushed the pr-load-async-gaps-logout branch from e0cd3a9 to 6c12175 Compare March 12, 2025 00:33
@chrisbobbe
Copy link
Collaborator Author

Thanks for the reviews! Revision pushed.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Small comments below.

Comment on lines 246 to 248
/// Update an account in the store, returning the new version.
///
/// The account with the given account ID will be updated.
/// The account with the given account will be updated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙂

Comment on lines 162 to 163
final globalStore = UpdateMachineTestGlobalStore(
globalSettings: eg.globalSettings(), accounts: [eg.selfAccount]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably a good sign that the globalSettings parameter can be made optional on these test classes, defaulting to eg.globalSettings().

With maybe a prep commit to do that for TestGlobalStore. (FYI @PIG208 since we just introduced this in #1167.)

It differs from accounts in that almost every test that interacts with the store at all cares what set of tests exist on it, so there's no good default for that even in tests; but most tests don't care about the specific values of the settings.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sent #1403 with that prep commit as a quick PR.

Comment on lines 162 to 170
final globalStore = UpdateMachineTestGlobalStore(
globalSettings: eg.globalSettings(), accounts: [eg.selfAccount]);
final connection = globalStore.apiConnectionFromAccount(eg.selfAccount) as FakeApiConnection;

NotificationService.instance.token = ValueNotifier('asdf');
addTearDown(NotificationService.debugReset);

final future = globalStore.perAccount(eg.selfAccount.id);
check(connection.takeRequests()).length.equals(1); // register request
await future;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's tighten the comparison with the tests below by rearranging slightly, like so:

Suggested change
final globalStore = UpdateMachineTestGlobalStore(
globalSettings: eg.globalSettings(), accounts: [eg.selfAccount]);
final connection = globalStore.apiConnectionFromAccount(eg.selfAccount) as FakeApiConnection;
NotificationService.instance.token = ValueNotifier('asdf');
addTearDown(NotificationService.debugReset);
final future = globalStore.perAccount(eg.selfAccount.id);
check(connection.takeRequests()).length.equals(1); // register request
await future;
NotificationService.instance.token = ValueNotifier('asdf');
addTearDown(NotificationService.debugReset);
final globalStore = UpdateMachineTestGlobalStore(
globalSettings: eg.globalSettings(), accounts: [eg.selfAccount]);
final connection = globalStore.apiConnectionFromAccount(eg.selfAccount) as FakeApiConnection;
final future = globalStore.perAccount(eg.selfAccount.id);
check(connection.takeRequests()).length.equals(1); // register request
await future;

gnprice added a commit to gnprice/zulip-flutter that referenced this pull request Mar 12, 2025
This has a natural default, at least in tests, because most tests
don't care about the specific value of the global settings.

(By contrast most tests that interact with the store at all will
care what set of accounts exist, so there isn't as natural a default
for `accounts`.)

Originally noticed this here:
  zulip#1386 (comment)
chrisbobbe pushed a commit that referenced this pull request Mar 12, 2025
This has a natural default, at least in tests, because most tests
don't care about the specific value of the global settings.

(By contrast most tests that interact with the store at all will
care what set of accounts exist, so there isn't as natural a default
for `accounts`.)

Originally noticed this here:
  #1386 (comment)
I believe this is effectively already done by the existing assert
`_perAccountStores.containsKey`, because of the invariant that
_accounts contains all the keys in _perAccountStores. But the
explicitness seems helpful. (And loadPerAccount relies on the
account existing.)
In particular:
  GlobalStore.loadPerAccount
  GlobalStore.doLoadPerAccount
  UpdateMachine.load

The first two already have an `assert` that the account exists, and
the third has a ! null-check. They're called in a chain from two
places, and we have the account in both: the initial account load
and the reload (GlobalStore._reloadPerAccount).
…tId]

In GlobalStore.perAccount, the loadPerAccount future can throw, and
in that case the `_perAccountStoresLoading.remove` in these lines of
code hasn't been getting called.

That's only a latent bug, currently, because loadPerAccount only
throws if the account was logged out, and we always do a
`_perAccountStoresLoading.remove` on logout (in
GlobalStore.removeAccount).

But we'd like to add another case where the loadPerAccount future
throws (disallow connecting to ancient servers), which doesn't
involve logout / removeAccount. So this try/finally will be
important for that case.
For zulip#1354, we'll need an alternative to TestGlobalStore that runs
UpdateMachine.load with prepared API responses instead of mocking
it. Since FakeApiConnection doesn't yet offer a `Completer`-based
way of completing/awaiting requests, these two tests will have to
diverge, preparing and awaiting different delays as needed.
chrisbobbe and others added 5 commits March 12, 2025 16:47
Soon, we'll add an alternative to TestGlobalStore whose
doLoadPerAccount makes fake API requests; its doLoadPerAccount will
call UpdateMachine.load. These methods will be useful for that new
class.

(The new class makes sense as a sibling of TestGlobalStore, rather
than a subclass, because TestGlobalStore's dartdoc says it makes no
network requests.)
Thanks to the introduction of PerAccountStore.updateMachine
in be6698e, this feature in the test code is no longer needed.
@chrisbobbe chrisbobbe force-pushed the pr-load-async-gaps-logout branch from 6c12175 to 0922ca5 Compare March 12, 2025 23:48
@chrisbobbe
Copy link
Collaborator Author

Thanks for the review! Revision pushed.

@gnprice gnprice merged commit 0922ca5 into zulip:main Mar 13, 2025
1 check passed
@gnprice
Copy link
Member

gnprice commented Mar 13, 2025

Thanks! Looks good; merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integration review Added by maintainers when PR may be ready for integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Runtime error if user logs out when reloading PerAccountStore
3 participants