-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(ui): Fix performance of TimelineEventHandler::deduplicate_local_timeline_item
#4601
fix(ui): Fix performance of TimelineEventHandler::deduplicate_local_timeline_item
#4601
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #4601 +/- ##
==========================================
+ Coverage 85.72% 85.73% +0.01%
==========================================
Files 291 291
Lines 33288 33292 +4
==========================================
+ Hits 28536 28543 +7
+ Misses 4752 4749 -3 ☔ View full report in Codecov by Sentry. |
d8ac3b5
to
f6f7ea2
Compare
…timeline_item`. This patch drastically improves the performance of `TimelineEventHandler::deduplicate_local_timeline_item`. Before this patch, `rfind_event_item` was used to iterate over all timeline items: for each item in reverse order, if it was an event timeline item, and if it was a local event timeline item, and if it was matching the event ID or transaction ID, then a duplicate was found. The problem is the following: all items are traversed. However, local event timeline items are always at the back of the items. Even virtual timeline items are before local event timeline items. Thus, it is not necessary to traverse all items. It is possible to stop the iteration as soon as (i) a non event timeline item is met, or (ii) a non local event timeline item is met. This patch updates `TimelineEventHandler::deduplicate_local_timeline_item` to replace to use of `rfind_event_item` by a custom iterator that stops as soon as a non event timeline item, or a non local event timeline item, is met, or —of course— when a local event timeline item is a duplicate. To do so, [`Iterator::try_fold`] is probably the best companion. [`Iterator::try_find`] would have been nice, but it is available on nightlies, not on stable versions of Rust. However, many methods in `Iterator` are using `try_fold`, like `find` or any other methods that need to do a “short-circuit”. Anyway, `try_fold` works pretty nice here, and does exactly what we need. Our use of `try_fold` is to return a `ControlFlow<Option<(usize, TimelineItem)>, ()>`. After `try_fold`, we call `ControlFlow::break_value`, which returns an `Option`. Hence the need to call `Option::flatten` at the end to get a single `Option` instead of having an `Option<Option<(usize, TimelineItem)>>`. I'm testing this patch with the `test_lazy_back_pagination` test in matrix-org#4594. With 10_000 events in the sync, the test was taking 13s to run on my machine. With this patch, it takes 10s to run. It's a 23% improvement. This `deduplicate_local_timeline_item` method was taking a large part of the computation according to the profiler. With this patch, this method is barely visible in the profiler it is so small. [`Iterator::try_fold`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_fold [`Iterator::try_find`]: https://doc.rust-lang.org/std/iter/trait.Iterator.html#method.try_find
32abd82
to
e8edb9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the rust-version bump if you use find_map()
, otherwise lgtm.
// Indeed, it is a waste of time to iterate over all items in `items`. Local event | ||
// timeline items are necessarily at the end of `items`: as soon as they have been | ||
// iterated, we can stop the entire iteration. | ||
.try_fold((), |(), (nth, timeline_item)| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use a find_map()
and not have to bump the minimum rust version, and it would be a bit more noob-friendly :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterator::find_map
is not appropriate here. It returns the first non-None
value. It means that we cannot differentiate the break the loop because we've found a value, vs. break the loop because no value can be found.
We need to stop the iteration when all local event timeline items have been traversed. This is not possible to represent that with Iterator::find_map
. That's exactly the problem this patch is trying to fix :-).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not a plain loop then… 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That, that's a good question ;-).
This patch adds matrix-org#4601, matrix-org#4608, matrix-org#4612 and matrix-org#4616 in their respective `CHANGELOG.md`s. +
This patch adds matrix-org#4601, matrix-org#4608, matrix-org#4612 and matrix-org#4616 in their respective `CHANGELOG.md`s.
This patch drastically improves the performance of
TimelineEventHandler::deduplicate_local_timeline_item
.Before this patch,
rfind_event_item
was used to iterate over all timeline items: for each item in reverse order, if it was an event timeline item, and if it was a local event timeline item, and if it was matching the event ID or transaction ID, then a duplicate was found.The problem is the following: all items are traversed.
However, local event timeline items are always at the back of the items. Even virtual timeline items are before local event timeline items. Thus, it is not necessary to traverse all items. It is possible to stop the iteration as soon as (i) a non event timeline item is met, or (ii) a non local event timeline item is met.
This patch updates
TimelineEventHandler::deduplicate_local_timeline_item
to replace to use ofrfind_event_item
by a custom iterator that stops as soon as a non event timeline item, or a non local event timeline item, is met, or —of course— when a local event timeline item is a duplicate.To do so,
Iterator::try_fold
is probably the best companion.Iterator::try_find
would have been nice, but it is available on nightlies, not on stable versions of Rust. However, many methods inIterator
are usingtry_fold
, likefind
or any other methods that need to do a “short-circuit”. Anyway,try_fold
works pretty nice here, and does exactly what we need.Our use of
try_fold
is to return aControlFlow<Option<(usize, TimelineItem)>, ()>
. Aftertry_fold
, we callControlFlow::break_value
, which returns anOption
. Hence the need to callOption::flatten
at the end to get a singleOption
instead of having anOption<Option<(usize, TimelineItem)>>
.I'm testing this patch with the
test_lazy_back_pagination
test in #4594. With 10_000 events in the sync, the test was taking 13s to run on my machine. With this patch, it takes 10s to run. It's a 23% improvement. Thisdeduplicate_local_timeline_item
method was taking a large part of the computation according to the profiler. With this patch, this method is barely visible in the profiler it is so small.Flamegraph of the
TimelineEventHandler::add_item
method (that callsdeduplicate_local_timeline_item
), before this patch:And after:
EventCache
storage #3280.