Fix a leak in HttpEncodedResponse #5858

Merged
merged 5 commits into from
Aug 8, 2024

Conversation

@ikhoon ikhoon commented Aug 5, 2024

Motivation:

An HttpData produced in HttpEncodedResponse.beforeComplete() is not collected by CollectingSubscriberAndSubscription but is leaked.

  Hint: {10B, pooled, <unknown>}
  com.linecorp.armeria.common.HttpData.wrap(HttpData.java:110)
  com.linecorp.armeria.server.encoding.HttpEncodedResponse.beforeComplete(HttpEncodedResponse.java:163)
  com.linecorp.armeria.common.stream.FilteredStreamMessage.lambda$collect$0(FilteredStreamMessage.java:201)
  java.base/java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:934)
  java.base/java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:950)
  java.base/java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2340)
  com.linecorp.armeria.common.stream.FilteredStreamMessage.collect(FilteredStreamMessage.java:142)

CollectingSubscriberAndSubscription was designed to apply only filter() to the result of upstream.collect(). I didn't consider that an object could be published via onNext() in beforeComplete(). The purpose of CollectingSubscriberAndSubscription was to provide an optimized code path for unary calls. It didn't seem that the code provided more than a trivial performance improvement, yet the implementation was complex and error-prone.

I was able to fix the code so that it no longer leaks the data, but I didn't want to add more complexity to it. It might be cleaner to use the Reactive Streams API instead of keeping the custom collect() implementation. There will be no change in performance for normal message sizes.

Modifications:

  • Remove the custom collect() implementation in FilteredStreamMessage.

Result:

Fix a potential leak when sending compressed responses.
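
To make the failure mode in the description easier to picture, here is a minimal sketch in plain Java. It is not Armeria code: `TrailerLeakSketch`, `Hooks`, `shortcutCollect()` and `subscriberCollect()` are hypothetical names, and strings stand in for pooled `HttpData`. It only illustrates the idea that a collecting shortcut which maps upstream items through `filter()` can miss an item emitted from `beforeComplete()`, while a path that records everything handed to the downstream does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a filtered stream; not Armeria's FilteredStreamMessage. */
final class TrailerLeakSketch {

    /** Hooks loosely modeled on filter() and beforeComplete(). */
    interface Hooks {
        String filter(String item);

        void beforeComplete(Consumer<String> downstream);
    }

    /** Shortcut path: maps the upstream items and ignores whatever beforeComplete() emits. */
    static List<String> shortcutCollect(List<String> upstream, Hooks hooks) {
        final List<String> out = new ArrayList<>();
        for (String item : upstream) {
            out.add(hooks.filter(item));
        }
        // The trailing item is neither collected nor released here.
        hooks.beforeComplete(item -> { });
        return out;
    }

    /** Subscriber-style path: every item handed to the downstream is collected. */
    static List<String> subscriberCollect(List<String> upstream, Hooks hooks) {
        final List<String> out = new ArrayList<>();
        for (String item : upstream) {
            out.add(hooks.filter(item));
        }
        hooks.beforeComplete(out::add);
        return out;
    }

    /** An encoder-like filter that emits one trailing item on completion. */
    static Hooks encoder() {
        return new Hooks() {
            @Override
            public String filter(String item) {
                return "enc(" + item + ")";
            }

            @Override
            public void beforeComplete(Consumer<String> downstream) {
                downstream.accept("footer");
            }
        };
    }
}
```

With a pooled buffer in place of the `"footer"` string, the item dropped by the shortcut path would be the kind of object a leak detector reports.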

@ikhoon ikhoon added the defect label Aug 5, 2024
@ikhoon ikhoon added this to the 1.30.0 milestone Aug 5, 2024
@ikhoon ikhoon mentioned this pull request Aug 5, 2024
@ikhoon ikhoon marked this pull request as draft August 6, 2024 04:20
@@ -752,6 +754,11 @@ default CompletableFuture<List<T>> collect(EventExecutor executor, SubscriptionO
requireNonNull(executor, "executor");
requireNonNull(options, "options");
final StreamMessageCollector<T> collector = new StreamMessageCollector<>(options);
if (!containsNotifyCancellation(options)) {
Contributor

Before setting this PR as ready for review, it would be easier to review if you explain why this change is needed in the PR description

Contributor Author

StreamMessageCollector collects the published elements until onComplete() or onError() is called. It generally worked but I found a corner case where Subscription.cancel() made collector.collect() incomplete forever.

You can reproduce it by disabling this block and then running it below.

I believe it makes more sense to specify NOTIFY_CANCELLATION for the collect() method because it is used to fully consume the upstream data rather than part of it.
CancelledSubscriptionException will then complete the collecting future exceptionally.
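
The corner case above can be sketched without Armeria. The snippet below is an illustration under assumptions: `CancelNotificationSketch` and its `Collector` are hypothetical, and an `IllegalStateException` stands in for `CancelledSubscriptionException`. It shows only that a collector whose future is completed from `onComplete()` or `onError()` stays incomplete when a cancellation is never signaled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical model of a collecting subscriber; not Armeria's StreamMessageCollector. */
final class CancelNotificationSketch {

    static final class Collector {
        final List<Integer> items = new ArrayList<>();
        final CompletableFuture<List<Integer>> future = new CompletableFuture<>();

        void onNext(int item) {
            items.add(item);
        }

        void onComplete() {
            future.complete(items);
        }

        void onError(Throwable cause) {
            future.completeExceptionally(cause);
        }
    }

    /** Simulates a stream that is cancelled after its first item. */
    static CompletableFuture<List<Integer>> collectThenCancel(boolean notifyCancellation) {
        final Collector collector = new Collector();
        collector.onNext(1);
        // The subscription is cancelled at this point, so onComplete() never arrives.
        if (notifyCancellation) {
            // Stand-in for CancelledSubscriptionException.
            collector.onError(new IllegalStateException("cancelled"));
        }
        return collector.future;
    }
}
```

Without the notification, a caller waiting on the returned future would wait indefinitely; with it, the caller observes an exceptional completion.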

Contributor Author

For context, previously, the custom collect() implementation in FilteredStreamMessage hooked a Subscription.cancel() event and completed the collecting future with the accumulated items so far.

Contributor

I see, I understood the issue now. Agree with the approach 👍

onError(ex);
upstream.cancel();
Contributor

So in general, do we always cancel the upstream before propagating the error downstream now?

Contributor Author

If we cancel the upstream first, cancel() may recursively call downstream.onError(CancelledSubscriptionException) which could prevent the error from being propagated to downstream.

So I think onError(ex) should be invoked first to propagate the cause to the downstream that will be eventually passed to ServerErrorHandler and RequestLog.*Cause()
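
A small sketch of the ordering argument, assuming a downstream that keeps only the first terminal signal. All names here (`ErrorOrderingSketch`, `Downstream`, `Upstream`) are hypothetical, and an `IllegalStateException` stands in for `CancelledSubscriptionException`.

```java
/** Hypothetical model of error propagation order; not Armeria code. */
final class ErrorOrderingSketch {

    static final class Downstream {
        Throwable cause;

        void onError(Throwable t) {
            // Only the first terminal signal is recorded.
            if (cause == null) {
                cause = t;
            }
        }
    }

    static final class Upstream {
        private final Downstream downstream;

        Upstream(Downstream downstream) {
            this.downstream = downstream;
        }

        void cancel() {
            // Cancellation is reported to the downstream as an error.
            downstream.onError(new IllegalStateException("cancelled"));
        }
    }

    /** Propagates the real cause first, then cancels. */
    static Throwable errorThenCancel(Throwable ex) {
        final Downstream downstream = new Downstream();
        final Upstream upstream = new Upstream(downstream);
        downstream.onError(ex);
        upstream.cancel();
        return downstream.cause;
    }

    /** Cancels first, so the cancellation masks the real cause. */
    static Throwable cancelThenError(Throwable ex) {
        final Downstream downstream = new Downstream();
        final Upstream upstream = new Upstream(downstream);
        upstream.cancel();
        downstream.onError(ex);
        return downstream.cause;
    }
}
```

In this model, only `errorThenCancel()` leaves the original exception visible to whatever reads the cause afterwards.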

Contributor

So this change is unrelated to the leak, am I understanding correctly?

Contributor Author

Right. I found the bug while fixing broken tests after removing CollectingSubscriberAndSubscription and related code.

@ikhoon ikhoon marked this pull request as ready for review August 7, 2024 04:45
@jrhee17 jrhee17 left a comment

Other changes look good, left a question about the early return in onComplete

if (completed) {
// onError(Throwable) or onComplete() has been called in filter().
StreamMessageUtil.closeOrAbort(filtered);
return;
Contributor

I didn't understand this change 😅 Shouldn't the last filtered object be also passed downstream?

Contributor Author

We should not publish items after calling onError() or onComplete().
https://github.com/reactive-streams/reactive-streams-jvm#1.7

Otherwise, did you mean something else?
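
As a rough illustration of the rule cited above, the sketch below models a subscriber that stops delivering once a terminal signal has been sent and releases the item it would otherwise have published. `TerminalGuardSketch`, `Resource`, and the `filterTerminates` flag are hypothetical; the flag simulates a `filter()` call that triggers `onError()` or `onComplete()` as a side effect.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a terminal-state guard; not Armeria's FilteringSubscriber. */
final class TerminalGuardSketch {

    /** Stand-in for a releasable object such as a pooled buffer. */
    static final class Resource {
        boolean closed;

        void close() {
            closed = true;
        }
    }

    final List<Resource> delivered = new ArrayList<>();
    boolean completed;

    void onNext(Resource item, boolean filterTerminates) {
        if (completed) {
            // Already terminated: release instead of delivering.
            item.close();
            return;
        }
        // Identity filter for simplicity.
        final Resource filtered = item;
        if (filterTerminates) {
            // Simulates filter() causing a terminal signal.
            completed = true;
        }
        if (completed) {
            // No items may follow a terminal signal, so release the filtered object.
            filtered.close();
            return;
        }
        delivered.add(filtered);
    }
}
```

The second check is the interesting one: the item was filtered successfully, but publishing it would come after the terminal signal, so it is closed instead.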

@jrhee17 jrhee17 Aug 8, 2024

I imagined FilteredStreamMessage#beforeComplete publishing an item via:

completed = true;
try {
beforeComplete(delegate);

where completed=true is set but onNext is still called in the following scenario:

I thought this was the scenario this PR was trying to address

@ikhoon ikhoon Aug 8, 2024

I fixed this to pass StreamMessageCollectingTest.filteredStreamMessage_cancel().

protected HttpData filter(HttpData obj) {
count++;
if (count < 3) {
return obj;
} else {
subscription.cancel();

subscription.cancel() made the upstream signal subscriber.onError(CancelledSubscriptionException). Afterward, subscriber.onNext() was called.

It's impossible to know what all subclasses do. So we may need to prevent onNext() from being called by beforeComplete() or beforeOnError().

@Override
public void onNext(T o) {
    if (complete) {
        return;
    }

    U filtered;
    try {
        filtered = filter(o);
}

Contributor Author

By the way, only the Subscription API of FilteringSubscriber is exposed to the subclasses. It seems FilteringSubscriber.onNext() can't be called by beforeComplete().

Contributor

By the way, only the Subscription API of FilteringSubscriber is exposed to the subclasses.

I just realized that the downstream Subscriber#onNext is called instead of FilteringSubscriber#onNext. I guess the current implementation should be fine then 👍

Contributor Author

It's not nice that a Subscriber is exposed through beforeComplete(). We may need to redesign the API later.

@minwoox minwoox left a comment

👍 👍 👍

@@ -298,17 +219,21 @@ public void onNext(T o) {
try {
filtered = filter(o);
} catch (Throwable ex) {
StreamMessageUtil.closeOrAbort(o);
Contributor

We don't need to call StreamMessageUtil.closeOrAbort(filtered);?

Contributor Author

Isn't filtered null when we reach here?

Contributor Author

For context, I found a case where o was double-released when a maximum length was exceeded. Calling StreamMessageUtil.closeOrAbort(o) didn't make sense since the ownership has been transferred to filter() method.

Contributor

That is correct. 👍 The previous logic was wrong.
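
The double-release case discussed in this thread can be illustrated with a reference-counted stand-in. This is a sketch under assumptions: `OwnershipSketch` and `Buf` are hypothetical, and `filter()` here always releases its input and then fails, mimicking a length check that rejects the data after taking ownership of it.

```java
/** Hypothetical model of ownership transfer; not Armeria or Netty code. */
final class OwnershipSketch {

    /** Minimal reference-counted buffer. */
    static final class Buf {
        int refCnt = 1;

        void release() {
            if (refCnt == 0) {
                throw new IllegalStateException("double release");
            }
            refCnt--;
        }
    }

    /** Takes ownership of the input, releases it, then fails. */
    static Buf filter(Buf in) {
        in.release();
        throw new IllegalArgumentException("max length exceeded");
    }

    /** Caller that also releases the input on failure, as in the previous logic described above. */
    static String callerAlsoReleases(Buf in) {
        try {
            filter(in);
            return "ok";
        } catch (IllegalArgumentException ex) {
            try {
                in.release();
                return "released-once";
            } catch (IllegalStateException doubleRelease) {
                return "double-release";
            }
        }
    }

    /** Caller that leaves the input alone because filter() owns it. */
    static String callerLeavesOwnership(Buf in) {
        try {
            filter(in);
            return "ok";
        } catch (IllegalArgumentException ex) {
            return "refCnt=" + in.refCnt;
        }
    }
}
```

Once ownership is handed to `filter()`, the caller touching the input again is what produces the second release in this model.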

@jrhee17 jrhee17 left a comment

👍 👍 👍

@ikhoon ikhoon merged commit c1d5475 into line:main Aug 8, 2024
15 checks passed
@ikhoon ikhoon deleted the fix-leak-in-HttpEncodedResponse branch August 8, 2024 06:18
3 participants