Commit 4d1f046

feat(reference): Lexical Structure—Literals
1 parent 94b7788 commit 4d1f046

2 files changed

+354
-1
lines changed

docs_config.yml

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ docs_groups:
  - reference/lexical_structure/comments
  - reference/lexical_structure/keywords
  - reference/lexical_structure/identifiers
- # - reference/lexical_structure/literals
+ - reference/lexical_structure/literals
  # - reference/lexical_structure/operators
  # - reference/lexical_structure/delimiters
  # - constructs/bindings
Lines changed: 353 additions & 0 deletions
@@ -0,0 +1,353 @@
1+
---
2+
title: Literals
3+
---
4+
5+
Literals are single tokens that represent constant values of built-in Grain types.
6+
7+
## Strings
8+
9+
```ebnf
10+
STRING = '"' character* '"' ;
11+
12+
character =
13+
| ascii_escape
14+
| unicode_escape
15+
| unicode_character ;
16+
17+
ascii_escape =
18+
| "\" oct_digit oct_digit? oct_digit?
19+
| "\x" hex_digit hex_digit?
20+
| "\" ['b' 'f' 'n' 'r' 't' 'v' '"' '\'] ;
21+
22+
unicode_escape =
23+
| "\u" hex_digit hex_digit hex_digit hex_digit
24+
| "\u{" hex_digit hex_digit? hex_digit? hex_digit? hex_digit? hex_digit? "}" ;
25+
26+
unicode_character = <any valid unicode character> ;
27+
28+
oct_digit = ['0'-'7'] ;
29+
hex_digit = ['0'-'9' 'A'-'F' 'a'-'f'] ;
30+
```
31+
32+
A string is a sequence of characters and/or escape sequences surrounded by double quotation marks.
33+
34+
ASCII escapes can be written in octal, as a backslash (`\`) followed by 1-3 octal digits, or in hexadecimal, as a backslash and `x` followed by 1-2 hexadecimal digits. A number of character-specific escapes are also available.
35+
36+
| escape | value |
37+
| ------ | ------------------------------------- |
38+
| `\123` | An octal escape with 1-3 octal digits |
39+
| `\x4f` | A hex escape with 1-2 hex digits |
40+
| `\b` | Backspace |
41+
| `\f` | Form feed |
42+
| `\n` | Line feed (newline) |
43+
| `\r` | Carriage return |
44+
| `\t` | Tab |
45+
| `\v` | Vertical tab |
46+
| `\"` | Double quote |
47+
| `\\` | Backslash |
48+
49+
Unicode escapes can be written in a fixed-width form, `\u` followed by exactly four hexadecimal digits, or in a variable-width form, with 1-6 hexadecimal digits enclosed in braces after `\u`.
50+
51+
| escape | value |
52+
| ----------- | ----------------------------------------------------- |
53+
| `\u200D` | A unicode escape with exactly four hexadecimal digits |
54+
| `\u{1F926}` | A unicode escape with 1-6 hexadecimal digits |
55+
56+
Additionally, if a line in a string ends with a backslash (`\`) immediately followed by a newline, both the backslash and the newline are ignored. This allows very long strings to be split over multiple lines:
57+
58+
```grain
59+
"The quick \
60+
brown fox \
61+
jumps over the \
62+
lazy dog."
63+
```
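The escape rules above can be sketched as a small decoder. This is an illustrative Python sketch, not Grain's actual lexer: the names `SIMPLE_ESCAPES`, `ESCAPE`, and `decode_string_body` are hypothetical, and leaving unrecognized backslash sequences unchanged is an assumption, since the grammar above does not say how they are handled.

```python
import re

# Character-specific escapes from the table above.
SIMPLE_ESCAPES = {
    "b": "\b", "f": "\f", "n": "\n", "r": "\r",
    "t": "\t", "v": "\v", '"': '"', "\\": "\\",
}

# One alternative per escape form in the grammar, plus the
# backslash-newline line continuation.
ESCAPE = re.compile(
    r"\\(?:"
    r"(?P<oct>[0-7]{1,3})"
    r"|x(?P<hex>[0-9A-Fa-f]{1,2})"
    r"|u\{(?P<braced>[0-9A-Fa-f]{1,6})\}"
    r"|u(?P<fixed>[0-9A-Fa-f]{4})"
    r"|(?P<continuation>\n)"
    r'|(?P<simple>[bfnrtv"\\])'
    r")"
)


def decode_string_body(body):
    """Decode the text between the quotes of a string literal."""

    def replace(match):
        if match.group("oct") is not None:
            return chr(int(match.group("oct"), 8))
        for name in ("hex", "braced", "fixed"):
            if match.group(name) is not None:
                # chr() raises ValueError above U+10FFFF.
                return chr(int(match.group(name), 16))
        if match.group("continuation") is not None:
            return ""  # the backslash and the newline are both dropped
        return SIMPLE_ESCAPES[match.group("simple")]

    # Assumption: sequences matching no alternative are left unchanged.
    return ESCAPE.sub(replace, body)
```

For instance, `decode_string_body(r"\101\x4f")` yields `"AO"`, since octal `101` and hexadecimal `4f` are the code points of `A` and `O`.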
64+
65+
### Examples
66+
67+
```grain
68+
"Hello, world!"
69+
```
70+
71+
```grain
72+
"inner \"quote\""
73+
```
74+
75+
```grain
76+
"with unicode 💯🔥🌾"
77+
```
78+
79+
```grain
80+
"\u{1F926}\u{1F3FC}\u{200D}\u{2642}\u{FE0F}" // 🤦🏼‍♂️
81+
```
82+
83+
## Chars
84+
85+
```ebnf
86+
CHAR = "'" character "'" ;
87+
88+
character =
89+
| ascii_escape
90+
| unicode_escape
91+
| unicode_character ;
92+
93+
ascii_escape =
94+
| "\" oct_digit oct_digit? oct_digit?
95+
| "\x" hex_digit hex_digit?
96+
| "\" ['b' 'f' 'n' 'r' 't' 'v' "'" '\'] ;
97+
98+
unicode_escape =
99+
| "\u" hex_digit hex_digit hex_digit hex_digit
100+
| "\u{" hex_digit hex_digit? hex_digit? hex_digit? hex_digit? hex_digit? "}" ;
101+
102+
unicode_character = <any valid unicode character> ;
103+
104+
oct_digit = ['0'-'7'] ;
105+
hex_digit = ['0'-'9' 'A'-'F' 'a'-'f'] ;
106+
```
107+
108+
A char is a single character or escape sequence surrounded by single quotation marks.
109+
110+
ASCII escapes can be written in octal, as a backslash (`\`) followed by 1-3 octal digits, or in hexadecimal, as a backslash and `x` followed by 1-2 hexadecimal digits. A number of character-specific escapes are also available.
111+
112+
| escape | value |
113+
| ------ | ------------------------------------- |
114+
| `\123` | An octal escape with 1-3 octal digits |
115+
| `\x4f` | A hex escape with 1-2 hex digits |
116+
| `\b` | Backspace |
117+
| `\f` | Form feed |
118+
| `\n` | Line feed (newline) |
119+
| `\r` | Carriage return |
120+
| `\t` | Tab |
121+
| `\v` | Vertical tab |
122+
| `\'` | Single quote |
123+
| `\\` | Backslash |
124+
125+
Unicode escapes can be written in a fixed-width form, `\u` followed by exactly four hexadecimal digits, or in a variable-width form, with 1-6 hexadecimal digits enclosed in braces after `\u`.
126+
127+
| escape | value |
128+
| ----------- | ----------------------------------------------------- |
129+
| `\u200D` | A unicode escape with exactly four hexadecimal digits |
130+
| `\u{1F926}` | A unicode escape with 1-6 hexadecimal digits |
131+
132+
### Examples
133+
134+
```grain
135+
'H'
136+
```
137+
138+
```grain
139+
'\''
140+
```
141+
142+
```grain
143+
'🌾'
144+
```
145+
146+
```grain
147+
'\u{1F926}'
148+
```
149+
150+
## Numerics
151+
152+
The lexer recognizes numerics of the `Number`, `Int32`, `Int64`, `Float32`, `Float64`, `WasmI32`, `WasmI64`, `WasmF32`, and `WasmF64` types.
153+
154+
### Integers
155+
156+
```ebnf
157+
INTEGER = "-"? (binary | octal | decimal | hexadecimal) int_suffix? ;
158+
159+
decimal = dec_digit (dec_digit | "_")* ;
160+
binary = "0" ['b' 'B'] bin_digit (bin_digit | "_")* ;
161+
octal = "0" ['o' 'O'] oct_digit (oct_digit | "_")* ;
162+
hexadecimal = "0" ['x' 'X'] hex_digit (hex_digit | "_")* ;
163+
164+
dec_digit = ['0'-'9'] ;
165+
bin_digit = ['0'-'1'] ;
166+
oct_digit = ['0'-'7'] ;
167+
hex_digit = ['0'-'9' 'A'-'F' 'a'-'f'] ;
168+
169+
int_suffix = ['l' 'L' 'n' 'N'] ;
170+
```
171+
172+
Integers can be written in decimal (base 10), binary (base 2), octal (base 8), or hexadecimal (base 16). An integer without a prefix is decimal; a `0b` or `0B` prefix denotes a binary integer, `0o` or `0O` an octal integer, and `0x` or `0X` a hexadecimal integer. The numeric part must start with a digit, but underscores may appear anywhere after that digit to aid readability; they do not affect the integer's value.
173+
174+
A suffix (or lack of suffix) denotes the Grain type of the integer:
175+
176+
| Suffix | Type |
177+
| ------ | --------- |
178+
| none | `Number` |
179+
| `l` | `Int32` |
180+
| `L` | `Int64` |
181+
| `n` | `WasmI32` |
182+
| `N` | `WasmI64` |
183+
184+
#### Examples
185+
186+
```grain
187+
42
188+
```
189+
190+
```grain
191+
-5
192+
```
193+
194+
```grain
195+
// one billion
196+
1_000_000_000
197+
```
198+
199+
```grain
200+
// 42 in hexadecimal
201+
0x2A // or 0x2a, 0X2A, 0X2a
202+
```
203+
204+
```grain
205+
// 42 in octal
206+
0o52
207+
```
208+
209+
```grain
210+
// 42 in binary
211+
0b101010
212+
```
213+
214+
```grain
215+
// 0xDEC0DE in binary
216+
0b1101_1110_1100_0000_1101_1110
217+
```
218+
219+
```grain
220+
// Int64 literal
221+
65L
222+
```
223+
224+
```grain
225+
// WasmI32 literal
226+
987n
227+
```
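The integer grammar and suffix table above can be sketched as a regular expression plus a lookup. This is an illustrative Python sketch under stated assumptions, not the Grain lexer itself: `INTEGER`, `SUFFIX_TYPES`, and `parse_integer` are hypothetical names, and range checks for the fixed-width types are deliberately left out because the text above does not describe them.

```python
import re

# Mirrors: "-"? (binary | octal | decimal | hexadecimal) int_suffix?
INTEGER = re.compile(
    r"(?P<sign>-)?"
    r"(?:0[bB](?P<bin>[01][01_]*)"
    r"|0[oO](?P<oct>[0-7][0-7_]*)"
    r"|0[xX](?P<hex>[0-9A-Fa-f][0-9A-Fa-f_]*)"
    r"|(?P<dec>[0-9][0-9_]*))"
    r"(?P<suffix>[lLnN])?"
)

# Suffix-to-type mapping from the table above.
SUFFIX_TYPES = {
    None: "Number",
    "l": "Int32",
    "L": "Int64",
    "n": "WasmI32",
    "N": "WasmI64",
}


def parse_integer(text):
    """Return (value, type name) for an integer literal, or raise ValueError."""
    match = INTEGER.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer literal: {text!r}")
    for group, base in (("bin", 2), ("oct", 8), ("hex", 16), ("dec", 10)):
        digits = match.group(group)
        if digits is not None:
            # Underscores are purely visual, so drop them before converting.
            value = int(digits.replace("_", ""), base)
            break
    if match.group("sign"):
        value = -value
    return value, SUFFIX_TYPES[match.group("suffix")]
```

With this sketch, `parse_integer("0x2A")`, `parse_integer("0o52")`, and `parse_integer("0b101010")` all return `(42, "Number")`, while `parse_integer("65L")` returns `(65, "Int64")`.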
228+
229+
### Floats
230+
231+
```ebnf
232+
FLOAT =
233+
| "-"? dec_digit (dec_digit | "_")* "." (dec_digit | "_")* exponent? float_suffix?
234+
| "-"? dec_digit (dec_digit | "_")* exponent float_suffix?
235+
| "-"? "." dec_digit (dec_digit | "_")* exponent? float_suffix? ;
236+
237+
exponent = ['e' 'E'] ['+' '-']? dec_digit (dec_digit | '_')* ;
238+
239+
dec_digit = ['0'-'9'] ;
240+
241+
float_suffix = ['f' 'd' 'w' 'W'] ;
242+
```
243+
244+
A float is distinguished from an integer by the presence of a decimal point or an exponent. The numeric parts of a float must start with a digit, but underscores may appear throughout the numeric portions to aid readability; they do not affect the float's value.
245+
246+
A suffix (or lack of suffix) denotes the Grain type of the float:
247+
248+
| Suffix | Type |
249+
| ------ | --------- |
250+
| none | `Number` |
251+
| `f` | `Float32` |
252+
| `d` | `Float64` |
253+
| `w` | `WasmF32` |
254+
| `W` | `WasmF64` |
255+
256+
#### Examples
257+
258+
```grain
259+
1.23 // a floating-point number
260+
```
261+
262+
In the scientific notation form, the significand (the part of a floating-point number that contains the significant digits) is multiplied by 10 raised to the power of the provided exponent. For example,
263+
264+
```grain
265+
1.23e3 // 1.23 × 10^3, which is 1230
266+
```
267+
268+
The sign of the exponent can also be provided:
269+
270+
```grain
271+
1.23e+3 // 1230
272+
```
273+
274+
```grain
275+
1.23e-3 // 0.00123
276+
```
277+
278+
As with integers, underscores can be placed throughout the number to make it more readable:
279+
280+
```grain
281+
1_000.555_5e2
282+
```
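The three alternatives of the `FLOAT` rule and the suffix table can be sketched in the same way. This is an illustrative Python sketch, not Grain's implementation: `FLOAT`, `FLOAT_SUFFIX_TYPES`, and `parse_float` are hypothetical names, and the value is computed with Python's 64-bit `float`, so it does not model the precision of `Float32` or `WasmF32`.

```python
import re

EXPONENT = r"[eE][+-]?[0-9][0-9_]*"

# One alternative per line of the FLOAT rule above.
FLOAT = re.compile(
    r"(?P<number>-?(?:"
    rf"[0-9][0-9_]*\.[0-9_]*(?:{EXPONENT})?"  # digits "." digits? exponent?
    rf"|[0-9][0-9_]*{EXPONENT}"               # digits exponent
    rf"|\.[0-9][0-9_]*(?:{EXPONENT})?"        # "." digits exponent?
    r"))(?P<suffix>[fdwW])?"
)

# Suffix-to-type mapping from the table above.
FLOAT_SUFFIX_TYPES = {
    None: "Number",
    "f": "Float32",
    "d": "Float64",
    "w": "WasmF32",
    "W": "WasmF64",
}


def parse_float(text):
    """Return (value, type name) for a float literal, or raise ValueError."""
    match = FLOAT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a float literal: {text!r}")
    # Underscores are purely visual, so drop them before converting.
    value = float(match.group("number").replace("_", ""))
    return value, FLOAT_SUFFIX_TYPES[match.group("suffix")]
```

Note that `FLOAT.fullmatch("42")` is `None`: without a decimal point or an exponent, the text is an integer literal rather than a float.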
283+
284+
### Rationals
285+
286+
```ebnf
287+
RATIONAL = integer '/' integer ;
288+
289+
integer = "-"? (binary | octal | decimal | hexadecimal) ;
290+
291+
decimal = dec_digit (dec_digit | "_")* ;
292+
binary = "0" ['b' 'B'] bin_digit (bin_digit | "_")* ;
293+
octal = "0" ['o' 'O'] oct_digit (oct_digit | "_")* ;
294+
hexadecimal = "0" ['x' 'X'] hex_digit (hex_digit | "_")* ;
295+
296+
dec_digit = ['0'-'9'] ;
297+
bin_digit = ['0'-'1'] ;
298+
oct_digit = ['0'-'7'] ;
299+
hex_digit = ['0'-'9' 'A'-'F' 'a'-'f'] ;
300+
```
301+
302+
A rational literal is an integer numerator and an integer denominator separated by a slash (`/`). As with integers and floats, underscores may be placed throughout the numeric portions.
303+
304+
#### Examples
305+
306+
```grain
307+
1/3
308+
```
309+
310+
```grain
311+
-5/7
312+
```
313+
314+
```grain
315+
14/-0xf
316+
```
317+
318+
```grain
319+
10/1_000_000_000
320+
```
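The `RATIONAL` rule can be sketched by splitting at the slash and reading each side as an unsuffixed integer. This is an illustrative Python sketch with hypothetical names (`INT`, `parse_int`, `parse_rational`); it relies on Python's `fractions.Fraction`, which reduces to lowest terms and rejects a zero denominator. Neither behavior is stated by the text above, so treat both as assumptions of the sketch.

```python
import re
from fractions import Fraction

# Mirrors: integer = "-"? (binary | octal | decimal | hexadecimal)
INT = re.compile(
    r"(-)?(?:0[bB]([01][01_]*)"
    r"|0[oO]([0-7][0-7_]*)"
    r"|0[xX]([0-9A-Fa-f][0-9A-Fa-f_]*)"
    r"|([0-9][0-9_]*))"
)


def parse_int(text):
    """Parse one side of a rational literal (no type suffix allowed)."""
    match = INT.fullmatch(text)
    if match is None:
        raise ValueError(f"not an integer: {text!r}")
    sign, *bodies = match.groups()
    for digits, base in zip(bodies, (2, 8, 16, 10)):
        if digits is not None:
            value = int(digits.replace("_", ""), base)
            return -value if sign else value
    raise AssertionError("unreachable: one alternative always matches")


def parse_rational(text):
    """Parse `numerator/denominator` into a Fraction."""
    numerator, slash, denominator = text.partition("/")
    if not slash:
        raise ValueError(f"not a rational literal: {text!r}")
    return Fraction(parse_int(numerator), parse_int(denominator))
```

For example, `parse_rational("14/-0xf")` gives `Fraction(-14, 15)`, because `0xf` is 15 and `Fraction` moves the sign to the numerator.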
321+
322+
## Booleans
323+
324+
Boolean literals consist of the values `true` and `false`.
325+
326+
```ebnf
327+
TRUE = "true" ;
328+
FALSE = "false" ;
329+
```
330+
331+
### Examples
332+
333+
```grain
334+
true
335+
```
336+
337+
```grain
338+
false
339+
```
340+
341+
## Void
342+
343+
The `void` literal is the only value of the `Void` type.
344+
345+
```ebnf
346+
VOID = "void" ;
347+
```
348+
349+
### Examples
350+
351+
```grain
352+
void
353+
```
