Change lexer to treat 'e' after number as suffix unless it is followed by a valid exponent. #79912
Change lexer to treat 'e' after number as part of a suffix unless it is followed by a valid exponent.
r? @lcnr (rust-highfive has picked a reviewer for you, use r? to override)
r? @petrochenkov maybe, they are more knowledgeable about this. this also requires
It would be nice to move the exponent parsing into the parser, so that all tokens in proc macros appear as a number followed by a suffix, but this would be a conflation of concerns. This isn't really ready for review, so I'll close it and re-open once it's further along. TODOs for me
@derekdreery
That's pretty much the plan, see the discussion in #71322.
Yeah, sorry about over promising on that front -- life happened. I am intending to pick that work up eventually, no promises as to when this'll actually happen this time :)
@petrochenkov @matklad cool, I will have a look. It would probably be better to work from some existing code, so I can get a feel for how to write stuff in line with the rest of the code.
I'm going to write a little plan here for people to comment on:
EDIT TODO: think about pathological cases like Concern: How to handle
With "fine-grained" tokens the lexer will not produce float tokens at all, only integer tokens (possibly suffixed) and punctuation.
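One possible reading of the "fine-grained" idea, as a rough sketch only: the lexer emits decimal integer tokens (each with an optional suffix) and single-character punctuation, and leaves it to the parser to glue them into floats. The names `Token` and `lex_fine_grained` are hypothetical and are not taken from rustc or from #71322. The sketch only handles decimal numeric input; it ignores identifiers, strings, and non-decimal bases.

```rust
/// Hypothetical token type for this sketch: no float variant exists.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    /// Decimal digits (underscores allowed) plus an optional alphanumeric suffix.
    Int { digits: &'a str, suffix: &'a str },
    /// Any other non-whitespace character, e.g. `.`, `+` or `-`.
    Punct(char),
}

/// Splits numeric source text into integer and punctuation tokens only.
fn lex_fine_grained(src: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut rest = src;
    while let Some(c) = rest.chars().next() {
        if c.is_ascii_digit() {
            // Consume the run of digits and underscores.
            let digits_end = rest
                .find(|ch: char| !(ch.is_ascii_digit() || ch == '_'))
                .unwrap_or(rest.len());
            let after_digits = &rest[digits_end..];
            // Everything alphanumeric that follows is treated as the suffix,
            // so `5e3` becomes digits `5` with suffix `e3`.
            let suffix_end = after_digits
                .find(|ch: char| !(ch.is_ascii_alphanumeric() || ch == '_'))
                .unwrap_or(after_digits.len());
            tokens.push(Token::Int {
                digits: &rest[..digits_end],
                suffix: &after_digits[..suffix_end],
            });
            rest = &after_digits[suffix_end..];
        } else {
            // Whitespace is skipped; everything else is one punctuation token.
            if !c.is_whitespace() {
                tokens.push(Token::Punct(c));
            }
            rest = &rest[c.len_utf8()..];
        }
    }
    tokens
}
```

Under this sketch, `1.5e3` lexes as an integer `1`, a `.`, and an integer `5` with suffix `e3`, while `1e-3` lexes as an integer `1` with suffix `e`, a `-`, and an integer `3`. Deciding whether those sequences form a float would then be the parser's job.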
I think right now we can change behavior for cases returning For this
(Note that
@petrochenkov @matklad is there any documentation on how you want the lexer/parser to look after the update? If there is, I could help work towards it.
@derekdreery
@petrochenkov Ok that's what I'll make this PR 😄
Closing due to inactivity.
If I pick this up again I'll make a new PR.
…ustc_session, r=<try>

move some invalid exponent detection into rustc_session

This PR moves part of the exponent checks from `rustc_lexer`/`rustc_parser` into `rustc_session`. This change does not affect which programs are accepted by the compiler, or the diagnostics that are reported, with one main exception. That exception is that floats or ints with suffixes beginning with `e` are rejected *after* the token stream is passed to proc macros, rather than being rejected by the parser as was previously the case.

This gives proc macro authors more consistent access to numeric literals: currently a proc macro could interpret `1m` or `30s` but not `7eggs` or `3em`. After this change all are handled the same. The lexer will still reject input that contains `e` followed by a digit, `+`/`-`, or `_` when these are not followed by a valid integer literal (digits and `_`), but this doesn't affect macro authors who just want to access alphabetic suffixes.

This PR is a continuation of rust-lang#79912. It also solves exactly the same problem as rust-lang#111628.

Exponents that contain arbitrarily long runs of underscores are handled without read-ahead by tracking the exponent start in case of an invalid exponent, so the suffix start is correct. This is very much an edge case (the user would have to write something like `1e_______________23`) but nevertheless it is handled correctly.

Also adds tests for various edge cases and improves diagnostics marginally.

r: `@petrochenkov`, since they reviewed rust-lang#79912.
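The rule described in that commit message can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the code from `rustc_lexer`: the names `Tail` and `classify_tail` are hypothetical, and the function only looks at the text that follows the digits of a decimal literal.

```rust
/// How the text following a number's digits is classified in this sketch.
#[derive(Debug, PartialEq)]
enum Tail<'a> {
    /// A valid exponent such as `e10`, `E-3` or `e__7`, plus whatever follows it.
    Exponent { exponent: &'a str, suffix: &'a str },
    /// No exponent: the whole tail is a suffix (possibly empty), e.g. `eggs` or `em`.
    Suffix(&'a str),
    /// `e` is followed by a digit, sign or `_`, but no digit completes the exponent.
    InvalidExponent,
}

fn classify_tail(tail: &str) -> Tail<'_> {
    let bytes = tail.as_bytes();
    // Anything not starting with `e`/`E` is an ordinary suffix.
    if !matches!(bytes.first(), Some(b'e' | b'E')) {
        return Tail::Suffix(tail);
    }
    // Only a digit, a sign or an underscore after `e` commits to an exponent;
    // otherwise `e` is just the first letter of a suffix.
    match bytes.get(1) {
        Some(b'0'..=b'9' | b'+' | b'-' | b'_') => {}
        _ => return Tail::Suffix(tail),
    }
    let mut i = 1;
    if matches!(bytes[i], b'+' | b'-') {
        i += 1;
    }
    // Consume digits and underscores in one forward pass, remembering
    // whether at least one real digit was seen.
    let mut seen_digit = false;
    while let Some(&b) = bytes.get(i) {
        match b {
            b'0'..=b'9' => seen_digit = true,
            b'_' => {}
            _ => break,
        }
        i += 1;
    }
    if seen_digit {
        Tail::Exponent { exponent: &tail[..i], suffix: &tail[i..] }
    } else {
        Tail::InvalidExponent
    }
}
```

With this sketch, the tails of `7eggs` and `3em` come back as plain suffixes, the tail of `1e_______________23` is a valid exponent, and a tail such as `e+` is flagged as an invalid exponent. Treating a lone trailing `e` as a suffix is a choice made for this sketch.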
This fixes #67544. There will be some regressions in diagnostics that need fixing before merge. Before I sink more time into this, though, I would like feedback on whether the patch might be accepted.
It will fail CI because some compile-fail messages have changed, but I would still like feedback.
I expect if it does get merged, it will be after considerable discussion.
Also requires tests before merge.