
Loading indicator does not go away when the event queue appears to be active #979


Open
PIG208 opened this issue Oct 3, 2024 · 2 comments
Labels
a-api Implementing specific parts of the Zulip server API a-sync Event queue; retry; local echo; races

Comments


PIG208 commented Oct 3, 2024

We added a loading indicator in #852 that appears when event queue polling fails with any error.

However, it seems that the loading indicator can stay on even when the event queue is doing fine, and it does not go away until you get a new event.

I don't have a consistent way to reproduce this, but what I did was basically:

  1. Leave the app on and lock the phone
  2. After a while, unlock the phone
  3. Observe if the loading indicator stays there longer than usual
  4. Try to do something (like starring a message), and the loading indicator should go away promptly

My guess is that when the event queue goes stale, isLoading gets set to true. When we do get the new event queue, we do not set isLoading back to false.
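Here's a minimal sketch of the flow I have in mind; all of the names below (getEvents, registerQueue, EventQueueGoneError, and even the exact shape of isLoading) are illustrative stand-ins, not the app's actual code:

```dart
// Illustrative sketch only; not the app's actual code.
class EventQueueGoneError implements Exception {}

class StoreSketch {
  bool isLoading = false;

  Future<List<String>> getEvents() async => <String>[]; // stub long-poll
  Future<void> registerQueue() async {}                 // stub re-registration
  void handleEvent(String event) {}                     // stub event handling

  Future<void> poll() async {
    while (true) {
      try {
        final events = await getEvents();
        if (events.isNotEmpty) {
          // The indicator clears only here, i.e. once a new event arrives.
          isLoading = false;
          events.forEach(handleEvent);
        }
      } on EventQueueGoneError {
        // The queue went stale: show the loading indicator and re-register.
        isLoading = true;
        await registerQueue();
        // Suspected gap: nothing resets isLoading here, even though the
        // fresh registration means the data is fully up to date again.
      }
    }
  }
}
```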

@PIG208 PIG208 added the a-api Implementing specific parts of the Zulip server API label Oct 3, 2024
@PIG208 PIG208 added this to the Beta 4: Fall 2024 milestone Oct 3, 2024
satyam372 added a commit to satyam372/zulip-flutter that referenced this issue Oct 20, 2024
@gnprice gnprice added the a-sync Event queue; retry; local echo; races label Oct 22, 2024

gnprice commented Oct 22, 2024

Thanks for the report!

I don't think I've seen this symptom myself yet. It sounds reminiscent of another event-queue-related symptom I saw once back in July, but I haven't seen that one since then either.

I think this won't be actionable until we hear more reports of it happening, so I'll push it out to the post-launch milestone. If we do see or hear of it again, this thread will help us aggregate those reports.

@gnprice gnprice modified the milestones: M6: Post-launch, M7: Future Nov 21, 2024

gnprice commented Nov 23, 2024

I happened to see this symptom yesterday during development, and thanks to seeing the debug log I think I have a diagnosis. This was on my Pixel 8 running Android 15, while connected by USB to my desktop for debugging.

What happens is:

  • The device goes to sleep, perhaps while the app was in the foreground.
  • The app actually keeps running — one can tell because it keeps printing log messages to the console.
  • But it isn't able to make network requests.
    • The symptom of this is that the requests fail with SocketException, generally with DNS errors saying it can't find chat.zulip.org (or whatever the host is).
      • We'd previously seen that those errors are common; that's why they're the one type of transient failure we silenced from reporting to the user, in Show poll-failure details #868.
    • In retrospect this makes sense as a compromise for the OS to choose: let the foreground app keep running, at least for a little while, in case the user wakes the device right back up again; but disable the network, because the wifi radio and especially the cell radio are expensive for the battery.
  • So our event-poll loop keeps getting failures, and backs off farther and farther. So far, so good.
  • Now the user wakes the device, and the app is able to use the network again. Two things go wrong that both contribute to this symptom.
  • First, we may be in the middle of a long backoff. So it could be up to 10 seconds (typically more like 3 seconds) before we make the next request. During the remaining backoff time, the loading indicator will continue; and we won't learn of new events, even events that already happened while we were offline.
    • This first issue, I was basically already aware of. It'd be good to fix but isn't the main cause of the symptom this thread is about.
  • Then, once the current backoff expires and we do make a new request… that request may take up to a minute to return, even if the network and server are functioning ideally. Until the request completes, we keep the loading indicator going. (See the sketch after this list.)
    • This is the new point which I hadn't thought of before I saw it happening yesterday.
    • Specifically this happens if there are no events waiting for us, and no new events show up. After all, our event-polling is a long-poll — the point of which is that the server may sit on the request a long time, so that when it does have something to say it can send a response immediately.
      • The only reason this ends even at a minute is that the server sends a "heartbeat" event just for the sake of making a response, as a workaround for some network intermediaries breaking connections that sit silently for longer than that.
    • That's why starring a message stops the loading indicator: it generates an event, causing the server to send a response.
    • While this issue is active, the app's data is actually up to date — if there were news for us, the server would have responded with it already. So the only problem is that we're showing the loading indicator, making it look out of date. In that sense it's milder than the first issue above.
    • On the other hand, this one can last much longer — it'll routinely be almost one minute, if the realm isn't busy. So in that sense it's more severe.

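To make the timing concrete, here's a rough sketch of the kind of poll loop described above. The names, the backoff policy, and the heartbeat stub are assumptions for illustration, not the app's actual implementation:

```dart
import 'dart:io' show SocketException;
import 'dart:math' show min;

// Illustrative sketch only; names and numbers are assumptions.
class PollLoopSketch {
  bool isLoading = false;
  int _failures = 0;

  // Exponential backoff capped at 10 seconds, as described above.
  Duration get _backoff => Duration(seconds: min(10, 1 << min(_failures, 4)));

  Future<List<String>> longPollForEvents() async => ['heartbeat']; // stub
  void handleEvent(String event) {}                                // stub

  Future<void> run() async {
    while (true) {
      try {
        // Long-poll: if no events are pending, the server sits on the
        // request until its heartbeat interval (~1 minute) forces a reply.
        final events = await longPollForEvents();
        isLoading = false; // only now does the indicator go away
        _failures = 0;
        events.forEach(handleEvent);
      } on SocketException {
        // Device asleep: network/DNS errors, so keep the indicator on
        // and back off farther and farther.
        isLoading = true;
        _failures++;
        // If the device wakes during this delay, we still wait it out
        // (the first issue above); then the next long-poll may take up to
        // a minute to return even though nothing is wrong (the second).
        await Future<void>.delayed(_backoff);
      }
    }
  }
}
```

In this sketch the remaining backoff and the heartbeat interval stack, which is why the indicator can linger for close to a minute after the device wakes.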
So, now that we have a diagnosis, moving this issue earlier. I think it's still a post-launch issue, though — I wouldn't hold up shipping the app for it.

@gnprice gnprice modified the milestones: M7: Future, M6: Post-launch Nov 23, 2024