Skip to content

Commit

Permalink
Merge remote-tracking branch 'origin/topic/bbannier/issue-1589'
Browse files Browse the repository at this point in the history
  • Loading branch information
bbannier committed Dec 6, 2023
2 parents 0ccdf3f + cb43a7a commit 66984e2
Show file tree
Hide file tree
Showing 44 changed files with 523 additions and 361 deletions.
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -37,14 +37,14 @@ repos:
stages: ["commit"]

- repo: https://github.com/pre-commit/mirrors-clang-format
rev: 'v16.0.6'
rev: 'v17.0.6'
hooks:
- id: clang-format
types_or: ["c", "c++"]
stages: ["commit"]

- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
rev: v4.5.0
hooks:
- id: trailing-whitespace
exclude: '^tests/Baseline'
Expand Down Expand Up @@ -86,7 +86,7 @@ repos:
stages: ["commit"]

- repo: https://github.com/crate-ci/typos
rev: v1.16.13
rev: v1.16.23
hooks:
- id: typos
exclude: 'tests/.*/regexp/.*'
Expand Down
90 changes: 90 additions & 0 deletions CHANGES
Original file line number Diff line number Diff line change
@@ -1,3 +1,93 @@
1.10.0-dev.38 | 2023-12-06 12:28:27 +0100

* Avoid reallocating Bytes when appending from view (Arne Welzel, Corelight)

When copying a full view into a Bytes instance, avoid potential
reallocations and memcpy() by pre-allocating enough capacity in
the underlying string.

* Allocate vector of correct size right away when constructing Chunks. (Benjamin Bannier, Corelight)

* Avoid one extra std::string copy in Stream::append. (Arne Welzel, Corelight)

For a chunk of significant size, seems that would result in an extra
malloc and copy of the input data.

* Reduce safe iterator use in internal code for `Bytes`. (Benjamin Bannier, Corelight)

The safe iterator for bytes dynamically allocates which can cause
overhead. Use index-based or at least string iterators to reduce that
overhead where possible.

* Use unsafe iterators in `View::extract`. (Benjamin Bannier, Corelight)

* Avoid unneeded parameter copy in parse functions. (Benjamin Bannier, Corelight)

* Streamline `View::firstBlock`. (Benjamin Bannier, Corelight)

* Reduce allocations when connecting sinks by mime-type. (Benjamin Bannier, Corelight)

* Reduce allocations for `Sink` debug logging. (Benjamin Bannier, Corelight)

* Deprecate `builder::string`. (Benjamin Bannier, Corelight)

* Reduce allocations when creating exceptions. (Benjamin Bannier, Corelight)

* Reduce allocations in `builder::addAssert`. (Benjamin Bannier, Corelight)

* Reduce allocations in `ParserBuilder::waitForInput`. (Benjamin Bannier, Corelight)

* Reduce allocations in `Builder::startProfiler`. (Benjamin Bannier, Corelight)

* Reduce allocatons in `Builder::addDebug*` methods. (Benjamin Bannier, Corelight)

* Reduce allocations in `ParserBuilder::parseError`. (Benjamin Bannier, Corelight)

* Reducing copying when forwarding data to sinks. (Benjamin Bannier, Corelight)

* Reduce lookup overhead in indent/dedent logger functions. (Benjamin Bannier, Corelight)

* Emit C++ string literals for HILTI string literals. (Benjamin Bannier, Corelight)

When emitting literals for HILTI strings (string ctors) we would
previously explicitly force creation of `std::string`. This was almost
always an unnecessary pessimisation over emitting string literals since
even if their C++ uses expected `std::string` string literals can
convert to this type implicitly; at the same time it made it impossible
to make effective use of APIs accepting `std::string_view`.

With this patch we now emit C++ string literals for HILTI string
literals.

* Emit locations as generated strings. (Benjamin Bannier, Corelight)

* Avoid allocations when creating parsers. (Benjamin Bannier, Corelight)

* Use non-owning strings for `fmt`. (Benjamin Bannier, Corelight)

* Use non-owning strings for `printParserState`. (Benjamin Bannier, Corelight)

* GH-1591: Use non-owning strings in the logging framework. (Benjamin Bannier, Corelight)

Closes #1591.

* Add `to_string_for_print` for string literals. (Benjamin Bannier, Corelight)

* GH-1589: Reduce string allocations on hot parse path. (Benjamin Bannier, Corelight)

When waiting for input we pass down strings for a possible error message
and the triggering location. In generated code these are always
literals.

With this patch we do not take them as owning strings, but instead as
views into existing strings to minimize allocations. In the case of
error messages the created low-level exception objects already had used
string_views, so this also aligns the APIs.

Closes #1589.

* Bump pre-commit hooks (Benjamin Bannier, Corelight)

1.10.0-dev.10 | 2023-12-06 10:40:22 +0100

* Allow unsafe Vector iteration over underlying std::vector. (Arne Welzel, Corelight)
Expand Down
2 changes: 1 addition & 1 deletion VERSION
Original file line number Diff line number Diff line change
@@ -1 +1 @@
1.10.0-dev.10
1.10.0-dev.38
25 changes: 13 additions & 12 deletions hilti/runtime/include/debug-logger.h
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <string_view>

#include <hilti/rt/filesystem.h>
#include <hilti/rt/util.h>
Expand All @@ -18,20 +18,21 @@ class DebugLogger {
public:
DebugLogger(hilti::rt::filesystem::path output);

void print(const std::string& stream, const std::string& msg);
void enable(const std::string& streams);
void print(std::string_view stream, std::string_view msg);
void enable(std::string_view streams);

bool isEnabled(const std::string& stream) { return _streams.find(stream) != _streams.end(); }
bool isEnabled(std::string_view stream) { return _streams.find(stream) != _streams.end(); }

void indent(const std::string& stream) {
if ( isEnabled(stream) )
_streams[stream] += 1;
void indent(std::string_view stream) {
if ( auto s = _streams.find(stream); s != _streams.end() ) {
auto& indent = s->second;
indent += 1;
}
}

void dedent(const std::string& stream) {
if ( isEnabled(stream) ) {
auto& indent = _streams[stream];

void dedent(std::string_view stream) {
if ( auto s = _streams.find(stream); s != _streams.end() ) {
auto& indent = s->second;
if ( indent > 0 )
indent -= 1;
}
Expand All @@ -41,7 +42,7 @@ class DebugLogger {
hilti::rt::filesystem::path _path;
std::ostream* _output = nullptr;
std::unique_ptr<std::ofstream> _output_file;
std::map<std::string, integer::safe<uint64_t>> _streams;
std::map<std::string_view, integer::safe<uint64_t>> _streams;
};

} // namespace hilti::rt::detail
13 changes: 7 additions & 6 deletions hilti/runtime/include/exception.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
#include <ostream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

Expand All @@ -25,7 +26,7 @@ class Exception : public std::runtime_error {
/**
* @param desc message describing the situation
*/
Exception(const std::string& desc) : Exception(Internal(), "Exception", desc) {
Exception(std::string_view desc) : Exception(Internal(), "Exception", desc) {
#ifndef NDEBUG
_backtrace = Backtrace();
#endif
Expand Down Expand Up @@ -79,11 +80,11 @@ class Exception : public std::runtime_error {
protected:
enum Internal {};

Exception(Internal, const char* type, const std::string& desc);
Exception(Internal, const char* type, std::string_view desc);
Exception(Internal, const char* type, std::string_view desc, std::string_view location);

private:
Exception(Internal, const char* type, const std::string& what, std::string_view desc, std::string_view location);
Exception(Internal, const char* type, std::string_view what, std::string_view desc, std::string_view location);

std::string _description;
std::string _location;
Expand All @@ -95,7 +96,7 @@ inline std::ostream& operator<<(std::ostream& stream, const Exception& e) { retu
#define HILTI_EXCEPTION(name, base) \
class name : public ::hilti::rt::base { \
public: \
name(const std::string& desc) : base(Internal(), #name, desc) {} \
name(std::string_view desc) : base(Internal(), #name, desc) {} \
name(std::string_view desc, std::string_view location) : base(Internal(), #name, desc, location) {} \
virtual ~name(); /* required to create vtable, see hilti::rt::Exception */ \
protected: \
Expand All @@ -105,7 +106,7 @@ inline std::ostream& operator<<(std::ostream& stream, const Exception& e) { retu
#define HILTI_EXCEPTION_NS(name, ns, base) \
class name : public ns::base { \
public: \
name(const std::string& desc) : base(Internal(), #name, desc) {} \
name(std::string_view desc) : base(Internal(), #name, desc) {} \
name(std::string_view desc, std::string_view location) : base(Internal(), #name, desc, location) {} \
virtual ~name(); /* required to create vtable, see hilti::rt::Exception */ \
protected: \
Expand Down Expand Up @@ -237,7 +238,7 @@ class WouldBlock : public std::runtime_error {
* @param desc message describing the situation
* @param location string indicating the location of the operation that couldn't complete
*/
WouldBlock(const std::string& desc, const std::string& location);
WouldBlock(std::string_view desc, std::string_view location);
};

namespace exception {
Expand Down
5 changes: 3 additions & 2 deletions hilti/runtime/include/fmt.h
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,8 @@ std::string fmt(const char* fmt, const Args&... args) {

/** sprintf-style string formatting. */
template<typename... Args>
std::string fmt(const std::string& s, const Args&... args) {
return fmt(s.c_str(), args...);
std::string fmt(std::string_view s, const Args&... args) {
// In generated code `s` is always null-terminated.
return fmt(s.data(), args...);
}
} // namespace hilti::rt
20 changes: 10 additions & 10 deletions hilti/runtime/include/logging.h
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

#pragma once

#include <string>
#include <string_view>
#include <utility>

#include <hilti/rt/debug-logger.h>
Expand All @@ -18,10 +18,10 @@ namespace hilti::rt {
* anything that can happen during "normal" operation (which is almost
* everything).
*/
void fatalError(const std::string& msg) __attribute__((noreturn));
void fatalError(std::string_view msg) __attribute__((noreturn));

/** Reports a warning. */
void warning(const std::string& msg);
void warning(std::string_view msg);

/**
* Prints a string, or a runtime value, to a specific debug stream. This is a
Expand All @@ -42,39 +42,39 @@ namespace debug {

namespace detail {
/** Prints a debug message to a specific debug stream. */
inline void print(const std::string& stream, const char* msg) {
inline void print(std::string_view stream, const char* msg) {
if ( ::hilti::rt::detail::globalState()->debug_logger )
::hilti::rt::detail::globalState()->debug_logger->print(stream, msg);
}

/** Print a string to a specific debug stream with proper escaping. */
inline void print(const std::string& stream, const std::string_view& s) {
inline void print(std::string_view stream, std::string_view s) {
if ( ::hilti::rt::detail::globalState()->debug_logger )
::hilti::rt::detail::globalState()->debug_logger->print(stream, hilti::rt::escapeBytes(s, false));
}

template<typename T, typename std::enable_if_t<not std::is_convertible<T, std::string_view>::value>* = nullptr>
/** Prints the string representastion of a HILTI runtime value to a specific debug stream. */
inline void print(const std::string& stream, const T& t) {
inline void print(std::string_view stream, const T& t) {
if ( ::hilti::rt::detail::globalState()->debug_logger )
::hilti::rt::detail::globalState()->debug_logger->print(stream, hilti::rt::to_string_for_print(t));
}
} // namespace detail

/** Returns true if debug logging is enabled for a given stream. */
inline bool isEnabled(const std::string& stream) {
inline bool isEnabled(std::string_view stream) {
return ::hilti::rt::detail::globalState()->debug_logger &&
::hilti::rt::detail::globalState()->debug_logger->isEnabled(stream);
}

/** Increases the indentation level for a debug stream. */
inline void indent(const std::string& stream) {
inline void indent(std::string_view stream) {
if ( ::hilti::rt::detail::globalState()->debug_logger )
::hilti::rt::detail::globalState()->debug_logger->indent(stream);
}

/** Decreases the indentation level for a debug stream. */
inline void dedent(const std::string& stream) {
inline void dedent(const std::string_view stream) {
if ( ::hilti::rt::detail::globalState()->debug_logger )
::hilti::rt::detail::globalState()->debug_logger->dedent(stream);
}
Expand Down Expand Up @@ -103,7 +103,7 @@ inline void setLocation(const char* l = nullptr) { detail::tls_location = l; }
* arguments if nothing is going to get logged.
*/
template<typename T>
inline void print(const std::string& stream, T&& msg) {
inline void print(std::string_view stream, T&& msg) {
if ( ::hilti::rt::detail::globalState()->debug_logger &&
::hilti::rt::detail::globalState()->debug_logger->isEnabled(stream) )
::hilti::rt::debug::detail::print(stream, std::forward<T>(msg));
Expand Down
8 changes: 6 additions & 2 deletions hilti/runtime/include/types/bytes.h
Original file line number Diff line number Diff line change
Expand Up @@ -226,6 +226,9 @@ class Bytes : protected std::string {
/** Returns the bytes' data as a string instance. */
const std::string& str() const& { return *this; }

/** Returns the bytes' data as a string instance. */
std::string str() && { return std::move(*this); }

/** Returns an iterator representing the first byte of the instance. */
const_iterator begin() const { return const_iterator(0U, _control); }

Expand All @@ -248,8 +251,9 @@ class Bytes : protected std::string {
* @param n optional starting point, which must be inside the same instance
*/
const_iterator find(value_type b, const const_iterator& n = const_iterator()) const {
if ( auto i = Base::find(b, (n ? n - begin() : 0)); i != Base::npos )
return begin() + i;
auto beg = begin();
if ( auto i = Base::find(b, (n ? n - beg : 0)); i != Base::npos )
return beg + i;
else
return end();
}
Expand Down
11 changes: 6 additions & 5 deletions hilti/runtime/include/types/stream.h
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ class Chunk {
: _offset(o), _data(std::make_pair(n, d)) {}
Chunk(const Offset& o, Vector&& d) : _offset(o), _data(std::move(d)) {}
Chunk(const Offset& o, const View& d);
Chunk(const Offset& o, const std::string& s);
Chunk(const Offset& o, std::string_view sv);

template<int N>
Chunk(Offset o, std::array<Byte, N> d) : Chunk(_fromArray(o, std::move(d))) {}
Expand Down Expand Up @@ -1335,16 +1335,17 @@ class View final {
View extract(Byte* dst, uint64_t n) const {
_ensureValid();

auto p = unsafeBegin();

// Fast-path for when we're staying inside the initial chunk.
if ( auto chunk = _begin.chunk(); chunk && chunk->inRange(_begin.offset() + n) ) {
memcpy(dst, chunk->data(_begin.offset()), n);
return View(SafeConstIterator(_begin._chain, _begin.offset() + n, chunk), _end);
if ( auto chunk = p.chunk(); chunk && chunk->inRange(p.offset() + n) ) {
memcpy(dst, chunk->data(p.offset()), n);
return View(SafeConstIterator(p + n), _end);
}

if ( n > size() )
throw WouldBlock("end of stream view");

auto p = unsafeBegin();
for ( uint64_t i = 0; i < n; ++i )
dst[i] = *p++;

Expand Down
8 changes: 8 additions & 0 deletions hilti/runtime/include/types/string.h
Original file line number Diff line number Diff line change
Expand Up @@ -77,5 +77,13 @@ inline std::string detail::to_string_for_print<std::string_view>(const std::stri
return escapeUTF8(x, false, false, true);
}

// Specialization for string literals. Since `to_string_for_print` is not
// implemented with ADL like e.g., `to_string` provide an overload for string
// literals. This is needed since we cannot partially specialize
// `to_string_for_print`.
template<typename CharT, size_t N>
inline std::string to_string_for_print(const CharT (&x)[N]) {
return x;
}

} // namespace hilti::rt
2 changes: 1 addition & 1 deletion hilti/runtime/include/util.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@
namespace hilti::rt {

/** Reports an internal error and aborts execution. */
void internalError(const std::string& msg) __attribute__((noreturn));
void internalError(std::string_view msg) __attribute__((noreturn));

} // namespace hilti::rt

Expand Down
Loading

0 comments on commit 66984e2

Please sign in to comment.