Lexer accidentally(?) does not use is_ascii_whitespace for literal whitespace in string continuations #136600
Labels
A-parser
Area: The parsing of Rust source code to an AST
C-bug
Category: This is a bug.
T-compiler
Relevant to the compiler team, which will review and decide on the PR/issue.
T-lang
Relevant to the language team, which will review and decide on the PR/issue.
#108403 proposed to fix this, but in this comment it was claimed that the current behavior was documented in the Reference. As far as I can see, that claim was incorrect: the page in question only describes the whitespace escapes \r, \t, and \n, whereas the fix was about literal whitespace in string continuations. https://doc.rust-lang.org/reference/expressions/literal-expr.html#string-continuation-escapes does now describe this behavior, but that text was only added later, in Jan 2024. Indeed, this PR shows that until Jun 13, 2022 the Reference documented skipping all whitespace.
The current behavior is covered by this ui test. It seems the behavior was first implemented the way it is now, then claimed to be canon, and then documented as canon. I'm not sure why only almost all ASCII whitespace is skipped rather than all Unicode whitespace, but it seems important to pick an existing whitespace set instead of relying on an old, flawed manual implementation of is_ascii_whitespace...
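To make the mismatch concrete, here is a minimal sketch comparing the two sets. It assumes the skipped set is exactly the four characters listed in the Reference section linked above (space, \t, \n, \r). The helper names `skipped_in_continuation` and `ascii_disagreements` are made up for illustration and are not the lexer's actual code.

```rust
/// Hypothetical stand-in for the set of literal whitespace characters
/// skipped after a string continuation, assumed from the Reference text:
/// U+0020, U+0009, U+000A, U+000D.
fn skipped_in_continuation(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// Collects every ASCII character on which the assumed continuation set
/// and `char::is_ascii_whitespace` disagree.
fn ascii_disagreements() -> Vec<char> {
    (0u8..=127)
        .map(char::from)
        .filter(|c| skipped_in_continuation(*c) != c.is_ascii_whitespace())
        .collect()
}
```

Under that assumption, `ascii_disagreements()` returns only form feed (U+000C), which `is_ascii_whitespace` accepts and the continuation set does not.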
Perhaps we can see a crater run at least...