Micro-optimizations: cache active block parser, ASCII fast path in isLetter #1120
Open
GromNaN wants to merge 2 commits into
Cache the active block parser to avoid calling end() on the parsers array
on every getActiveBlockParser() call. end() costs ~67ns; a direct property
read costs ~28ns. getActiveBlockParser() is called ~5x per line so this
compounds quickly. The cache is kept in sync in activateBlockParser() and
deactivateBlockParser().
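
The caching idea can be sketched roughly as follows. This is a minimal illustration, not the library's actual code: the class name `ParserStackSketch` and the use of plain `object` values are assumptions, while the property and method names mirror those in the description.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the class that owns the parser stack.
final class ParserStackSketch
{
    /** @var list<object> Stack of open block parsers, innermost last. */
    private array $activeBlockParsers = [];

    /** Cached copy of the last element of $activeBlockParsers. */
    private ?object $activeBlockParser = null;

    public function activateBlockParser(object $parser): void
    {
        $this->activeBlockParsers[] = $parser;
        $this->activeBlockParser    = $parser; // keep the cache in sync on push
    }

    public function deactivateBlockParser(): ?object
    {
        $popped = \array_pop($this->activeBlockParsers);

        // Refresh the cache once per pop instead of calling end() on every read.
        $last = \end($this->activeBlockParsers);
        $this->activeBlockParser = $last === false ? null : $last;

        return $popped;
    }

    public function getActiveBlockParser(): ?object
    {
        return $this->activeBlockParser; // plain property read, no end() call
    }
}
```

The trade-off is one extra assignment per push or pop in exchange for a cheaper read, which pays off only because reads are far more frequent than stack changes.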
Add an ASCII fast path to RegexHelper::isLetter(). The previous
implementation called preg_match('/[\pL]/u', $char) on every non-blank,
non-indented line to detect whether to skip block-start parsing. For
single-byte ASCII characters (the vast majority in Markdown), a direct
range comparison is ~60% faster than the regex.
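
A rough sketch of such a fast path, under stated assumptions: `isLetterSketch()` is a hypothetical free function, not the real `RegexHelper::isLetter()`, and the nullable string parameter is assumed for illustration. Single-byte ASCII input is decided by a byte-range check, and everything else falls back to the Unicode regex.

```php
<?php

declare(strict_types=1);

// Hypothetical free function illustrating the ASCII fast path.
function isLetterSketch(?string $character): bool
{
    if ($character === null || $character === '') {
        return false;
    }

    // Fast path: a single byte below 0x80 is ASCII, so a range check suffices.
    if (\strlen($character) === 1) {
        $byte = \ord($character);
        if ($byte < 0x80) {
            return ($byte >= 0x41 && $byte <= 0x5A)  // 'A'..'Z'
                || ($byte >= 0x61 && $byte <= 0x7A); // 'a'..'z'
        }
    }

    // Slow path: multi-byte input still goes through the Unicode letter class.
    return \preg_match('/[\pL]/u', $character) === 1;
}
```

Non-ASCII letters such as "é" take the slow path, so the result for them should be unchanged from the regex-only version.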
Micro-benchmarks (PHP 8.5.2, OPcache on, Xdebug off):
end($parsers) 67 ns → property read 28 ns (-58%)
preg_match Unicode 45 ns → char range 18 ns (-60%)
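
For readers who want to reproduce comparisons of this kind, a timing loop along these lines is one option. It is only a sketch: `nsPerCall()` is a hypothetical helper, not the harness used for the numbers above, and because each measurement includes closure-call overhead, the absolute figures will differ from those reported. Only the relative ordering is meaningful.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: average wall-clock cost of $fn in nanoseconds per call.
 */
function nsPerCall(callable $fn, int $iterations = 200_000): float
{
    $fn(); // warm-up call, not timed

    $start = \hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }

    return (\hrtime(true) - $start) / $iterations;
}

$parsers = [new stdClass(), new stdClass()];
$holder  = new stdClass();
$holder->active = $parsers[1]; // stands in for the cached property

$results = [
    'end($parsers)'  => nsPerCall(static function () use (&$parsers) { return \end($parsers); }),
    'property read'  => nsPerCall(static fn () => $holder->active),
    'preg_match \pL' => nsPerCall(static fn () => \preg_match('/[\pL]/u', 'a') === 1),
    'ctype_alpha'    => nsPerCall(static fn () => \ctype_alpha('a')),
];

foreach ($results as $label => $ns) {
    \printf("%-16s %8.1f ns/call\n", $label, $ns);
}
```

Running it with OPcache on and Xdebug off, as in the setup above, keeps the comparison closer to production conditions.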
`getActiveBlockParser()` called `end($this->activeBlockParsers)` on every call. `end()` costs ~67 ns; a direct property read costs ~28 ns. The method is called ~5× per line, so the overhead adds up. The active parser is now cached in `$this->activeBlockParser`, kept in sync in `activateBlockParser()` and `deactivateBlockParser()`.

`RegexHelper::isLetter()` called `preg_match('/[\pL]/u', $char)` on every non-blank, non-indented line to detect whether to skip block-start parsing. For single-byte ASCII characters (the vast majority in Markdown), `ctype_alpha()` costs ~12 ns vs ~45 ns for the Unicode regex.

Micro-benchmarks (PHP 8.5.2, OPcache on, `XDEBUG_MODE=off`), measured for `end($parsers)` per call and `preg_match` Unicode per call.

End-to-end on `sample.md` (27 KB) with 200 iterations, converter initialized once: