Reduce string allocations for literals [skip CI] #1590
The remaining work here is dealing with the fallout from type mismatches from
I slightly reworked this: while we can emit C++ literals for generated (effectively static) strings, user-specified literals should still be emitted as mutable strings (e.g., so users can call mutating functions on them). I now detect the difference by checking whether a literal has a location.
I added a couple of other perf cleanups to this PR; please let me know if I should pull them into their own PR.
Main comment is that "magic" part: not a fan. Rest looks good.
@@ -1,5 +1,7 @@
// Copyright (c) 2020-2023 by the Zeek Project. See LICENSE for details.

#include "spicy/rt/sink.h"
#include <spicy/rt/sink.h>
Even though we still do not do this consistently, I would really like to include this with #include "..." to make clear that this is the header related to this .cc file. With that we get it, e.g., to sort before other headers, which helps make sure that the header is self-contained.
I think we had this discussion a few times already. :-) I won't fight it, I just don't like the inconsistency of doing it randomly for a few implementation files, but not most others.
When waiting for input we pass down strings for a possible error message and the triggering location. In generated code these are always literals. With this patch we no longer take them as owning strings, but instead as views into existing strings to minimize allocations. In the case of error messages, the low-level exception objects that get created already used string_views, so this also aligns the APIs. Closes #1589.
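To illustrate the difference between owning and viewing parameters, here is a minimal sketch. The function names are hypothetical and are not the Spicy runtime's API; they only report whether the parameter still refers to the caller's storage.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper (not part of Spicy): takes the message as a view.
// A string_view built from a character array just points at that array,
// so no copy of the characters is made.
inline bool viewRefersToCallerStorage(std::string_view msg, const char* original) {
    return msg.data() == original;
}

// Hypothetical helper (not part of Spicy): takes the message as an owning
// string. Passing a character array constructs a temporary std::string,
// which copies the characters into its own buffer.
inline bool stringRefersToCallerStorage(const std::string& msg, const char* original) {
    return msg.data() == original;
}
```

Called with the same character array, the first helper sees the caller's bytes directly, while the second sees a copy held by a temporary `std::string`.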
When emitting literals for HILTI strings (string ctors) we would previously explicitly force creation of `std::string`. This was almost always an unnecessary pessimisation over emitting string literals: even if the C++ code using them expects `std::string`, string literals convert to this type implicitly. At the same time, it made it impossible to make effective use of APIs accepting `std::string_view`. With this patch we now emit C++ string literals for HILTI string literals.
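The implicit conversions this relies on can be checked in a few lines. This is only a sketch with made-up function names, assuming C++17; it is not the code the generator emits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>

// A string literal decays to const char*, which converts implicitly to both types.
static_assert(std::is_convertible_v<const char*, std::string>);
static_assert(std::is_convertible_v<const char*, std::string_view>);

// Hypothetical consumers standing in for runtime APIs.
inline std::size_t lengthOwned(const std::string& s) { return s.size(); }
inline std::size_t lengthView(std::string_view s) { return s.size(); }
```

Passing `"spicy"` works for both consumers, but only `lengthOwned` has to materialize a `std::string`. Wrapping the literal as `std::string("spicy")` at the call site would force that construction even for `lengthView`.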
The safe iterator for bytes allocates dynamically, which can cause overhead. Use index-based or at least string iterators to reduce that overhead where possible.
For a chunk of significant size, it seems that would result in an extra malloc and copy of the input data.
When copying a full view into a Bytes instance, avoid potential reallocations and memcpy() by pre-allocating enough capacity in the underlying string.
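A rough sketch of the pre-allocation idea follows. It assumes, purely for illustration, that a view can be walked as a list of contiguous chunks; `copyChunks` and its parameters are hypothetical and not the runtime's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical: copy all chunks into one string. Reserving the total size
// first means the appends below never need to grow the buffer.
// If `capacity_changes` is non-null, it receives how often the capacity
// changed while appending (expected to be zero after the reserve).
inline std::string copyChunks(const std::vector<std::string_view>& chunks,
                              std::size_t* capacity_changes = nullptr) {
    std::size_t total = 0;
    for ( const auto& c : chunks )
        total += c.size();

    std::string out;
    out.reserve(total); // single up-front allocation (if any is needed)

    std::size_t changes = 0;
    for ( const auto& c : chunks ) {
        const auto before = out.capacity();
        out.append(c.data(), c.size());
        if ( out.capacity() != before )
            ++changes;
    }

    if ( capacity_changes )
        *capacity_changes = changes;

    return out;
}
```

Without the `reserve` call, each append may trigger a reallocation plus a copy of everything appended so far; with it, the data is copied once.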
With this PR the benchmark from #1589 shows identical performance for both cases for me. Overall runtime seems to improve as well.
Before:
After:
Closes #1589.
Closes #1591.