-
Notifications
You must be signed in to change notification settings - Fork 181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BREAKING] HTTP.jl 2.0: Overhaul internals by switching to use aws-crt libraries #1213
base: master
Are you sure you want to change the base?
Conversation
Made some progress here, but blocked on 2 issues I need to dig into:
|
Ok, turns out the segfault is due to Base.BufferStream |
Creates a mutable struct wrapper for each default resource so we can add appropraite finalizers. Also adds a convenience close_ method for each resource that will finalize the current resource and reset it to "null". This machinery is necessary for precompile workloads to appropriately do a total cleanup of all process resources (as discovered in JuliaWeb/HTTP.jl#1213)
Creates a mutable struct wrapper for each default resource so we can add appropraite finalizers. Also adds a convenience close_ method for each resource that will finalize the current resource and reset it to "null". This machinery is necessary for precompile workloads to appropriately do a total cleanup of all process resources (as discovered in JuliaWeb/HTTP.jl#1213)
close_all_clients!() | ||
close_default_aws_server_bootstrap!() | ||
close_default_aws_client_bootstrap!() | ||
close_default_aws_host_resolver!() | ||
close_default_aws_event_loop_group!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could a new event loop group, etc. be created instead of using the default? I wouldn't want this package to conflict with others because they both use default resources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, it's better that they do share the same event loop group. The point is that the default group should be able to saturate os resources, so it's better that packages/libraries coordinate tasks in the same group instead of having competing groups then possibly scheduling tasks in a non-optimal configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, but if an event loop group is shared, who is responsible for closing it? It could be closed by one user when another user still needs it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the names of these functions is confusing because AFAIU the aws calls are referencing/de-referencing them, so it closes when the counter reaches zero. So confusing to say close
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point, but if an event loop group is shared, who is responsible for closing it? It could be closed by one user when another user still needs it.
In my mind, (and it would probably be helpful to document this perhaps in the LibAwsIO README or something), there are 2 different scenarios for LibAwsIO usage:
- Casual, REPL usage, scripts, etc. In this case, you may load libraries (Postgres.jl, HTTP.jl, etc.) where you want things to "just work" and do sensible default things. That's currently the case w/ the machinery here. Sensible default resources are set up, they can be used, we're not too worried about cleaning things up because everything will be taken care of when the process exits.
- Private application. In this case, again, you load libraries or perhaps use LibAwsIO directly, but you may have unique, specific settings you want to set for these LibAwsIO resources. You're free to create your own (not too hard, but takes some manual setup/effort), and then you can set them in LibAwsIO so that any package that relies on them will use the same, manually set up resources you'd like. This is as opposed to needing to pass the resources manually, for example, to every HTTP call you may have in a huge, nested application. Those kinds of one-off custom resources should still be supported, so if you had a specific HTTP request that you wanted to use slightly different resource group, you could pass them to that call specifically.
So to circle back and answer the original question: if you're relying on the default shared resources, then you shouldn't be closing them or worrying about closing them. They're global shared. If you're in a situation where you need to close them, then you should probably be initializing them yourself and passing them or setting them as the defaults yourself, and then you're free to close them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the names of these functions is confusing because AFAIU the aws calls are referencing/de-referencing them, so it closes when the counter reaches zero. So confusing to say
close
?
Hmmmm.....perhaps. But we are specifically calling this close_*default*_...
which we are specifically interested in closing the default resources that have been set here. So yes, if you happen to have your own references to those resources, then calling these will prematurely finalize then and you wouldn't want to use them afterwards.
We could perhaps compose these a bit better with individual close
methods on, e.g. the EventLoopGroup
and then these close defaults could use that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the crt orchestrate this with reference counting? https://github.com/awslabs/aws-c-io/blob/3041dabfc13fe9bc9a0467e15aa1d5a09c7fc06f/include/aws/io/event_loop.h#L169-L174
It looks like these methods were just missed while overloading for BufferStream. There's also `readbytes!` where the current implementation will fallback to the `LibuvStream` implementation that is currently not threadsafe. What's the best approach there since the implementation is quite a bit more involved? Just duplicate the code but for BufferStream? Should we take the BufferStream lock and invoke the LibuvStream method? Open to ideas there. Also open to suggestions for having tests here? Not easy to simulate the data race of writing and calling readavailable. The fix here will unblock JuliaWeb/HTTP.jl#1213 (I'll probably do some compat shim there until this is fully released). Thanks to @oscardssmith for rubber ducking this issue with me. Probably most helpfully reviewed by @vtjnash. --------- Co-authored-by: Jameson Nash <[email protected]>
It looks like these methods were just missed while overloading for BufferStream. There's also `readbytes!` where the current implementation will fallback to the `LibuvStream` implementation that is currently not threadsafe. What's the best approach there since the implementation is quite a bit more involved? Just duplicate the code but for BufferStream? Should we take the BufferStream lock and invoke the LibuvStream method? Open to ideas there. Also open to suggestions for having tests here? Not easy to simulate the data race of writing and calling readavailable. The fix here will unblock JuliaWeb/HTTP.jl#1213 (I'll probably do some compat shim there until this is fully released). Thanks to @oscardssmith for rubber ducking this issue with me. Probably most helpfully reviewed by @vtjnash. --------- Co-authored-by: Jameson Nash <[email protected]> (cherry picked from commit ffc96bc)
It looks like these methods were just missed while overloading for BufferStream. There's also `readbytes!` where the current implementation will fallback to the `LibuvStream` implementation that is currently not threadsafe. What's the best approach there since the implementation is quite a bit more involved? Just duplicate the code but for BufferStream? Should we take the BufferStream lock and invoke the LibuvStream method? Open to ideas there. Also open to suggestions for having tests here? Not easy to simulate the data race of writing and calling readavailable. The fix here will unblock JuliaWeb/HTTP.jl#1213 (I'll probably do some compat shim there until this is fully released). Thanks to @oscardssmith for rubber ducking this issue with me. Probably most helpfully reviewed by @vtjnash. --------- Co-authored-by: Jameson Nash <[email protected]> (cherry picked from commit ffc96bc)
This isn't quite ready to merge yet, but starting the PR to get CI running and document what there is left to do:
[ ] Incorporate @IanButterworth's recent precompile work
[ ] Update documentation: manual and all inline docs; also write up a 1.X -> 2.0 migration guide
[ ] Update upstream AWS CRT libraries and fix anything required in our wrappers