diff --git a/src/expressions/literal-expr.md b/src/expressions/literal-expr.md
index 4eec37dcb..e5bc2dff4 100644
--- a/src/expressions/literal-expr.md
+++ b/src/expressions/literal-expr.md
@@ -8,11 +8,9 @@
> | [BYTE_LITERAL]\
> | [BYTE_STRING_LITERAL]\
> | [RAW_BYTE_STRING_LITERAL]\
-> | [INTEGER_LITERAL][^out-of-range]\
+> | [INTEGER_LITERAL]\
> | [FLOAT_LITERAL]\
> | `true` | `false`
->
-> [^out-of-range]: A value ≥ 2<sup>128</sup> is not allowed.
A _literal expression_ is an expression consisting of a single token, rather than a sequence of tokens, that immediately and directly denotes the value it evaluates to, rather than referring to it by name or some other evaluation rule.
@@ -54,7 +52,7 @@ A string literal expression consists of a single [BYTE_STRING_LITERAL] or [RAW_B
An integer literal expression consists of a single [INTEGER_LITERAL] token.
-If the token has a [suffix], the suffix will be the name of one of the [primitive integer types][numeric types]: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `u128`, `i128`, `usize`, or `isize`, and the expression has that type.
+If the token has a [suffix], the suffix must be the name of one of the [primitive integer types][numeric types]: `u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `u128`, `i128`, `usize`, or `isize`, and the expression has that type.
If the token has no suffix, the expression's type is determined by type inference:
@@ -96,10 +94,12 @@ The value of the expression is determined from the string representation of the
* If the radix is not 10, the first two characters are removed from the string.
+* Any suffix is removed from the string.
+
* Any underscores are removed from the string.
* The string is converted to a `u128` value as if by [`u128::from_str_radix`] with the chosen radix.
-If the value does not fit in `u128`, the expression is rejected by the parser.
+If the value does not fit in `u128`, it is a compiler error.
* The `u128` value is converted to the expression's type via a [numeric cast].
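
The steps above can be sketched in ordinary Rust. This is only an illustration under stated assumptions, not a description of how the compiler is implemented: `integer_literal_value` is a hypothetical helper, and it assumes the caller has already identified the suffix (passed as `suffix`, empty if there is none).

```rust
// Hypothetical helper for illustration only; not part of the compiler or the standard library.
// `token` is the full text of an INTEGER_LITERAL; `suffix` is its suffix ("" if there is none).
fn integer_literal_value(token: &str, suffix: &str) -> Option<u128> {
    // Choose the radix from the first two characters and drop the radix indicator, if any.
    let (radix, rest) = match token.get(..2) {
        Some("0b") => (2, &token[2..]),
        Some("0o") => (8, &token[2..]),
        Some("0x") => (16, &token[2..]),
        _ => (10, token),
    };
    // Remove any suffix from the string.
    let digits = rest.strip_suffix(suffix).unwrap_or(rest);
    // Remove any underscores from the string.
    let digits: String = digits.chars().filter(|&c| c != '_').collect();
    // Convert to `u128`; `None` here stands in for "does not fit in `u128`".
    u128::from_str_radix(&digits, radix).ok()
}
```

Under these assumptions, `integer_literal_value("0xff_u8", "u8")` yields `Some(255)`. For `128_i8` the helper yields the `u128` value 128, and the final numeric cast (`128u128 as i8`) then produces `-128`.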
@@ -111,9 +111,11 @@ If the value does not fit in `u128`, the expression is rejected by the parser.
## Floating-point literal expressions
-A floating-point literal expression consists of a single [FLOAT_LITERAL] token.
+A floating-point literal expression has one of two forms:
+ * a single [FLOAT_LITERAL] token
+ * a single [INTEGER_LITERAL] token which has a suffix and no radix indicator
-If the token has a [suffix], the suffix will be the name of one of the [primitive floating-point types][floating-point types]: `f32` or `f64`, and the expression has that type.
+If the token has a [suffix], the suffix must be the name of one of the [primitive floating-point types][floating-point types]: `f32` or `f64`, and the expression has that type.
If the token has no suffix, the expression's type is determined by type inference:
@@ -136,6 +138,8 @@ let x: f64 = 2.; // type f64
The value of the expression is determined from the string representation of the token as follows:
+* Any suffix is removed from the string.
+
* Any underscores are removed from the string.
* The string is converted to the expression's type as if by [`f32::from_str`] or [`f64::from_str`].
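
The same steps can be sketched for floating-point literal expressions. As before, this is a hedged illustration rather than the compiler's implementation: `float_literal_value` is a hypothetical helper, the suffix is assumed to be supplied by the caller, and the type parameter `T` stands in for the expression's type (`f32` or `f64`).

```rust
use std::str::FromStr;

// Hypothetical helper for illustration only; not part of the compiler or the standard library.
// `token` is the full text of the literal; `suffix` is its suffix ("" if there is none).
fn float_literal_value<T: FromStr>(token: &str, suffix: &str) -> Option<T> {
    // Remove any suffix from the string.
    let digits = token.strip_suffix(suffix).unwrap_or(token);
    // Remove any underscores from the string.
    let digits: String = digits.chars().filter(|&c| c != '_').collect();
    // Convert as if by `f32::from_str` or `f64::from_str`, depending on `T`.
    T::from_str(&digits).ok()
}
```

For instance, `float_literal_value::<f64>("12E+99_f64", "f64")` parses the string `12E+99`, and `float_literal_value::<f32>("5f32", "f32")` parses `5`, which matches the second form of floating-point literal expression (a suffixed integer literal with no radix indicator).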
diff --git a/src/tokens.md b/src/tokens.md
index 8f9bcb1f7..0067b647d 100644
--- a/src/tokens.md
+++ b/src/tokens.md
@@ -72,13 +72,13 @@ Literals are tokens used in [literal expressions].
#### Numbers
-| [Number literals](#number-literals)`*` | Example | Exponentiation | Suffixes |
-|----------------------------------------|---------|----------------|----------|
-| Decimal integer | `98_222` | `N/A` | Integer suffixes |
-| Hex integer | `0xff` | `N/A` | Integer suffixes |
-| Octal integer | `0o77` | `N/A` | Integer suffixes |
-| Binary integer | `0b1111_0000` | `N/A` | Integer suffixes |
-| Floating-point | `123.0E+77` | `Optional` | Floating-point suffixes |
+| [Number literals](#number-literals)`*` | Example | Exponentiation |
+|----------------------------------------|---------|----------------|
+| Decimal integer | `98_222` | `N/A` |
+| Hex integer | `0xff` | `N/A` |
+| Octal integer | `0o77` | `N/A` |
+| Binary integer | `0b1111_0000` | `N/A` |
+| Floating-point | `123.0E+77` | `Optional` |
`*` All number literals allow `_` as a visual separator: `1_234.0E+18f64`
@@ -86,17 +86,26 @@ Literals are tokens used in [literal expressions].
A suffix is a sequence of characters following the primary part of a literal (without intervening whitespace), of the same form as a non-raw identifier or keyword.
-Any kind of literal (string, integer, etc) with any suffix is valid as a token,
-and can be passed to a macro without producing an error.
+
+> **Lexer**\
+> SUFFIX : IDENTIFIER_OR_KEYWORD\
+> SUFFIX_NO_E : SUFFIX _not beginning with `e` or `E`_
+
+Any kind of literal (string, integer, etc.) with any suffix is valid as a token.
+
+A literal token with any suffix can be passed to a macro without producing an error.
The macro itself will decide how to interpret such a token and whether to produce an error or not.
+In particular, the `literal` fragment specifier for by-example macros matches literal tokens with arbitrary suffixes.
```rust
macro_rules! blackhole { ($tt:tt) => () }
+macro_rules! blackhole_lit { ($l:literal) => () }
blackhole!("string"suffix); // OK
+blackhole_lit!(1suffix); // OK
```
-However, suffixes on literal tokens parsed as Rust code are restricted.
+However, suffixes on literal tokens which are interpreted as literal expressions or patterns are restricted.
Any suffixes are rejected on non-numeric literal tokens,
and numeric literal tokens are accepted only with suffixes from the list below.
@@ -110,7 +119,7 @@ and numeric literal tokens are accepted only with suffixes from the list below.
> **Lexer**\
> CHAR_LITERAL :\
-> `'` ( ~\[`'` `\` \\n \\r \\t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) `'`
+> `'` ( ~\[`'` `\` \\n \\r \\t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) `'` SUFFIX?
>
> QUOTE_ESCAPE :\
> `\'` | `\"`
@@ -136,7 +145,7 @@ which must be _escaped_ by a preceding `U+005C` character (`\`).
> | ASCII_ESCAPE\
> | UNICODE_ESCAPE\
> | STRING_CONTINUE\
-> )\* `"`
+> )\* `"` SUFFIX?
>
> STRING_CONTINUE :\
> `\` _followed by_ \\n
@@ -196,7 +205,7 @@ following forms:
> **Lexer**\
> RAW_STRING_LITERAL :\
-> `r` RAW_STRING_CONTENT
+> `r` RAW_STRING_CONTENT SUFFIX?
>
> RAW_STRING_CONTENT :\
> `"` ( ~ _IsolatedCR_ )* (non-greedy) `"`\
@@ -233,7 +242,7 @@ r##"foo #"# bar"##; // foo #"# bar
> **Lexer**\
> BYTE_LITERAL :\
-> `b'` ( ASCII_FOR_CHAR | BYTE_ESCAPE ) `'`
+> `b'` ( ASCII_FOR_CHAR | BYTE_ESCAPE ) `'` SUFFIX?
>
> ASCII_FOR_CHAR :\
> _any ASCII (i.e. 0x00 to 0x7F), except_ `'`, `\`, \\n, \\r or \\t
@@ -253,7 +262,7 @@ _number literal_.
> **Lexer**\
> BYTE_STRING_LITERAL :\
-> `b"` ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )\* `"`
+> `b"` ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )\* `"` SUFFIX?
>
> ASCII_FOR_STRING :\
> _any ASCII (i.e 0x00 to 0x7F), except_ `"`, `\` _and IsolatedCR_
@@ -284,7 +293,7 @@ following forms:
> **Lexer**\
> RAW_BYTE_STRING_LITERAL :\
-> `br` RAW_BYTE_STRING_CONTENT
+> `br` RAW_BYTE_STRING_CONTENT SUFFIX?
>
> RAW_BYTE_STRING_CONTENT :\
> `"` ASCII* (non-greedy) `"`\
@@ -329,7 +338,7 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
> **Lexer**\
> INTEGER_LITERAL :\
> ( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )
-> INTEGER_SUFFIX?
+> SUFFIX_NO_E?
>
> DEC_LITERAL :\
> DEC_DIGIT (DEC_DIGIT|`_`)\*
@@ -350,10 +359,6 @@ literal_. The grammar for recognizing the two kinds of literals is mixed.
> DEC_DIGIT : \[`0`-`9`]
>
> HEX_DIGIT : \[`0`-`9` `a`-`f` `A`-`F`]
->
-> INTEGER_SUFFIX :\
-> `u8` | `u16` | `u32` | `u64` | `u128` | `usize`\
-> | `i8` | `i16` | `i32` | `i64` | `i128` | `isize`
An _integer literal_ has one of four forms:
@@ -369,11 +374,11 @@ An _integer literal_ has one of four forms:
(`0b`) and continues as any mixture (with at least one digit) of binary digits
and underscores.
-Like any literal, an integer literal may be followed (immediately, without any spaces) by an _integer suffix_, which must be the name of one of the [primitive integer types][numeric types]:
-`u8`, `i8`, `u16`, `i16`, `u32`, `i32`, `u64`, `i64`, `u128`, `i128`, `usize`, or `isize`.
+Like any literal, an integer literal may be followed (immediately, without any spaces) by a suffix as described above.
+The suffix may not begin with `e` or `E`, as that would be interpreted as the exponent of a floating-point literal.
See [literal expressions] for the effect of these suffixes.
-Examples of integer literals of various forms:
+Examples of integer literals which are accepted as literal expressions:
```rust
# #![allow(overflowing_literals)]
@@ -396,27 +401,27 @@ Examples of integer literals of various forms:
0usize;
-// These are too big for their type, but are still valid tokens
-
+// These are too big for their type, but are accepted as literal expressions.
128_i8;
256_u8;
+// This is an integer literal, accepted as a floating-point literal expression.
+5f32;
```
Note that `-1i8`, for example, is analyzed as two tokens: `-` followed by `1i8`.
-Examples of invalid integer literals:
-```rust,compile_fail
-// uses numbers of the wrong base
+Examples of integer literals which are not accepted as literal expressions:
-0b0102;
-0o0581;
-
-// bin, hex, and octal literals must have at least one digit
-
-0b_;
-0b____;
+```rust
+# #[cfg(FALSE)] {
+0invalidSuffix;
+123AFB43;
+0b010a;
+0xAB_CD_EF_GH;
+0b1111_f32;
+# }
```
#### Tuple index
@@ -442,9 +447,8 @@ let cat = example.01; // ERROR no field named `01`
let horse = example.0b10; // ERROR no field named `0b10`
```
-> **Note**: The tuple index may include an `INTEGER_SUFFIX`, but this is not
-> intended to be valid, and may be removed in a future version. See
-> for more information.
+> **Note**: Tuple indices may include certain suffixes, but this is not intended to be valid, and may be removed in a future version.
+> See for more information.
#### Floating-point literals
@@ -452,38 +456,32 @@ let horse = example.0b10; // ERROR no field named `0b10`
> FLOAT_LITERAL :\
> DEC_LITERAL `.`
> _(not immediately followed by `.`, `_` or an XID_Start character)_\
-> | DEC_LITERAL FLOAT_EXPONENT\
-> | DEC_LITERAL `.` DEC_LITERAL FLOAT_EXPONENT?\
-> | DEC_LITERAL (`.` DEC_LITERAL)?
-> FLOAT_EXPONENT? FLOAT_SUFFIX
+> | DEC_LITERAL `.` DEC_LITERAL SUFFIX_NO_E?\
+> | DEC_LITERAL (`.` DEC_LITERAL)? FLOAT_EXPONENT SUFFIX?
>
> FLOAT_EXPONENT :\
> (`e`|`E`) (`+`|`-`)?
> (DEC_DIGIT|`_`)\* DEC_DIGIT (DEC_DIGIT|`_`)\*
>
-> FLOAT_SUFFIX :\
-> `f32` | `f64`
-A _floating-point literal_ has one of three forms:
+A _floating-point literal_ has one of two forms:
* A _decimal literal_ followed by a period character `U+002E` (`.`). This is
optionally followed by another decimal literal, with an optional _exponent_.
* A single _decimal literal_ followed by an _exponent_.
-* A single _decimal literal_ (in which case a suffix is required).
Like integer literals, a floating-point literal may be followed by a
suffix, so long as the pre-suffix part does not end with `U+002E` (`.`).
-There are two valid _floating-point suffixes_: `f32` and `f64` (the names of the 32-bit and 64-bit [primitive floating-point types][floating-point types]).
+The suffix may not begin with `e` or `E` if the literal does not include an exponent.
See [literal expressions] for the effect of these suffixes.
-Examples of floating-point literals of various forms:
+Examples of floating-point literals which are accepted as literal expressions:
```rust
123.0f64;
0.1f64;
0.1f32;
12E+99_f64;
-5f32;
let x: f64 = 2.;
```
@@ -493,39 +491,16 @@ to call a method named `f64` on `2`.
Note that `-1.0`, for example, is analyzed as two tokens: `-` followed by `1.0`.
-#### Number pseudoliterals
-
-> **Lexer**\
-> NUMBER_PSEUDOLITERAL :\
-> DEC_LITERAL ( . DEC_LITERAL )? FLOAT_EXPONENT\
-> ( NUMBER_PSEUDOLITERAL_SUFFIX | INTEGER_SUFFIX )\
-> | DEC_LITERAL . DEC_LITERAL\
-> ( NUMBER_PSEUDOLITERAL_SUFFIX_NO_E | INTEGER SUFFIX )\
-> | DEC_LITERAL NUMBER_PSEUDOLITERAL_SUFFIX_NO_E\
-> | ( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL )\
-> ( NUMBER_PSEUDOLITERAL_SUFFIX_NO_E | FLOAT_SUFFIX )
->
-> NUMBER_PSEUDOLITERAL_SUFFIX :\
-> IDENTIFIER_OR_KEYWORD _not matching INTEGER_SUFFIX or FLOAT_SUFFIX_
->
-> NUMBER_PSEUDOLITERAL_SUFFIX_NO_E :\
-> NUMBER_PSEUDOLITERAL_SUFFIX _not beginning with `e` or `E`_
-
-Tokenization of numeric literals allows arbitrary suffixes as described in the grammar above.
-These values generate valid tokens, but are not valid [literal expressions], so are usually an error except as macro arguments.
+Examples of floating-point literals which are not accepted as literal expressions:
-Examples of such tokens:
-```rust,compile_fail
-0invalidSuffix;
-123AFB43;
-0b010a;
-0xAB_CD_EF_GH;
+```rust
+# #[cfg(FALSE)] {
2.0f80;
2e5f80;
2e5e6;
2.0e5e6;
1.3e10u64;
-0b1111_f32;
+# }
```
#### Reserved forms similar to number literals
@@ -536,7 +511,7 @@ Examples of such tokens:
> | OCT_LITERAL \[`8`-`9`​]\
> | ( BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) `.` \
> _(not immediately followed by `.`, `_` or an XID_Start character)_\
-> | ( BIN_LITERAL | OCT_LITERAL ) `e`\
+> | ( BIN_LITERAL | OCT_LITERAL ) (`e`|`E`)\
> | `0b` `_`\* _end of input or not BIN_DIGIT_\
> | `0o` `_`\* _end of input or not OCT_DIGIT_\
> | `0x` `_`\* _end of input or not HEX_DIGIT_\
@@ -549,7 +524,7 @@ Due to the possible ambiguity these raise, they are rejected by the tokenizer in
* An unsuffixed binary, octal, or hexadecimal literal followed, without intervening whitespace, by a period character (with the same restrictions on what follows the period as for floating-point literals).
-* An unsuffixed binary or octal literal followed, without intervening whitespace, by the character `e`.
+* An unsuffixed binary or octal literal followed, without intervening whitespace, by the character `e` or `E`.
* Input which begins with one of the radix prefixes but is not a valid binary, octal, or hexadecimal literal (because it contains no digits).
@@ -561,13 +536,13 @@ Examples of reserved forms:
0b0102; // this is not `0b010` followed by `2`
0o1279; // this is not `0o127` followed by `9`
0x80.0; // this is not `0x80` followed by `.` and `0`
-0b101e; // this is not a pseudoliteral, or `0b101` followed by `e`
-0b; // this is not a pseudoliteral, or `0` followed by `b`
-0b_; // this is not a pseudoliteral, or `0` followed by `b_`
-2e; // this is not a pseudoliteral, or `2` followed by `e`
-2.0e; // this is not a pseudoliteral, or `2.0` followed by `e`
-2em; // this is not a pseudoliteral, or `2` followed by `em`
-2.0em; // this is not a pseudoliteral, or `2.0` followed by `em`
+0b101e; // this is not a suffixed literal, or `0b101` followed by `e`
+0b; // this is not an integer literal, or `0` followed by `b`
+0b_; // this is not an integer literal, or `0` followed by `b_`
+2e; // this is not a floating-point literal, or `2` followed by `e`
+2.0e; // this is not a floating-point literal, or `2.0` followed by `e`
+2em; // this is not a suffixed literal, or `2` followed by `em`
+2.0em; // this is not a suffixed literal, or `2.0` followed by `em`
```
## Lifetimes and loop labels