This repository was archived by the owner on Feb 18, 2025. It is now read-only.

Commit b65a165

Merge pull request #34 from bergus/patch-1
Escape hex characters in order to preserve unicode literals.
2 parents e3f921c + a4db99f commit b65a165

1 file changed: +8 -7 lines changed

EscapedChars.md

Lines changed: 8 additions & 7 deletions
@@ -9,20 +9,20 @@ in the ES2015 specification. The characters included in the list are the followi
 
 |Character | Why escape it?
 |-----------|--------------|
-| `^` | So that `new RegExp(RegExp.escape('^') + "a")` will match `"^a"` rather than the `^` being treated as a negation or start of sentencecontrol construct. |
+| `^` | So that `new RegExp(RegExp.escape('^') + "a")` will match `"^a"` rather than the `^` being treated as a negation or start of sentence control construct. |
 | `$` | So that `new RegExp("a" + RegExp.escape('$'))` will match `"a$"` rather than the `$` being treated as a end of sentence control construct. |
 | `\` | So that `new RegExp(RegExp.escape("\\"))` won't throw a type error and instead match `"\\"`, and more generally that `\` won't be treated as an escape control construct. |
 | `.` | So that `new RegExp(RegExp.escape("."))` won't be matched against single characters like `"a"` but instead against an actual dot ("."), and more generally that `.` won't be treated as an "any character" control construct. |
 | `*` | So that `new RegExp(RegExp.escape("*"))` won't throw a type error but instead match against an actual star ("*"), and more generally that `*` won't be treated as a "zero or more times" quantifier. |
 | `+` | So that `new RegExp(RegExp.escape("+"))` won't throw a type error but instead match against an actual plus sign ("+"), and more generally that `+` won't be treated as a "one or more times" quantifier. |
-| `?` | So that `new RegExp(RegExp.escape("?"))` won't throw a type error but instead match against an actual question mark sign ("?"), and more generally that `?` won't be treated as a "once or not at all" quantifier. |
+| `?` | So that `new RegExp(RegExp.escape("?"))` won't throw a type error but instead match against an actual question mark sign ("?"), and more generally that `?` won't be treated as a "once or not at all" quantifier. Also that `new RegExp("("+RegExp.escape("?=")+")")` will match a literal question mark followed by an equals sign, instead of introducint a lookahead, and more generally that `?` won't make groups become assertions or non-capturing. |
 | `(` | So that `new RegExp(RegExp.escape("("))` won't throw a type error but instead match against an actual opening parenthesis ("("), and more generally that `(` won't be treated as a "start of a capturing group" logical operator. |
 | `)` | So that `new RegExp(RegExp.escape(")"))` won't throw a type error but instead match against an actual closing parenthesis (")"), and more generally that `(` won't be treated as a "end of a capturing group" logical operator. |
 | `[` | So that `new RegExp(RegExp.escape("["))` won't throw a type error but instead match against an actual opening bracket ("["), and more generally that `[` won't be treated as a "start of a character class" construct. |
 | `]` | This construct is needed to allow escaping inside character classes. `new RegExp("]")` is perfectly valid but we want to allow `new RegExp("["+RegExp.escape("]...")+"]")` in which the `]` needs to be taken literally (and not as the closing "end of character class" character. |
 | `{` | So that `new RegExp("a" + RegExp.escape("{1,2}"))` will not match `"aaa"`, and more generally that `{` is taken literally and not as a quantifier. |
 | `}` | So that `new RegExp("a" + RegExp.escape("{1,2}"))` will not match `"aaa"`, and more generally that `}` is taken literally and not as a quantifier. |
-| `|` | So that `|` will be treated literally and `new RegExp(Regxp.escape("a|b"))` will produce a string that matches `"a|b"` instead of the | being treated as a logical "or" operator. |
+| `|` | So that `|` will be treated literally and `new RegExp(Regxp.escape("a|b"))` will produce a string that matches `"a|b"` instead of the `|` being treated as the alternative operator. |
 
 
 ### "Safe with extra escape set" Proposal.
@@ -31,21 +31,22 @@ This proposal additionally escapes `-` for context sensitive inside-character-cl
 
 |Character | Why escape it?
 |-----------|--------------|
-| `-` | This construct is needed to allow escaping inside character classes. `new RegExp("-")` is perfectly valid but we want to allow `new RegExp("[a"+RegExp.escape("-")+"b]")` in which the `-` needs to be taken literally (and not as a character range character. |
+| `-` | This construct is needed to allow escaping inside character classes. `new RegExp("-")` is perfectly valid but we want to allow `new RegExp("[a"+RegExp.escape("-")+"b]")` in which the `-` needs to be taken literally (and not as a character range character). |
 
 And __only at the start__ of strings:
 
 |Character | Why escape it?
 |-----------|--------------|
 | `0-9` | So that in `new RegExp("(foo)\\1" + RegExp.escape(1))` the back reference will still treat the first group and not the 11th and the `1` will be taken literally - see [this issue](https://github.com/benjamingr/RegExp.escape/issues/17) for more details. |
+| `0-9a-fA-F` | So that `new RegExp("\\u41" + RegExp.escape("B"))` will not match the letter "Л" (`\u041B`) but rather the sequence "AB", or more generally that a leading hexadecimal character may not continue a preceding escape sequence - see [this issue](https://github.com/benjamingr/RegExp.escape/issues/29) for more details. |
 
-Note that if we ever introduce named capturing groups to a subclass of the default `RegExp` those would also need to escape those characters. On escaping hex digits see https://github.com/benjamingr/RegExp.escape/issues/29 .
+Note that if we ever introduce named capturing groups to a subclass of the default `RegExp` those would also need to escape those characters.
 
 ### Extended "Safe" Proposal
 
 This proposal escapes a maximal set of characters and ensures compatibility with edge cases like passing the result to `eval`.
 
 |Character | Why escape it?
 |-----------|--------------|
-| `/` | So that `eval("/"+RegExp.espace("/")+"/")` will produce as a valid regular expression. More generally so that regular expressions will be passable to `eval` if sent from elsewhere with `/`. Note that [data](https://github.com/benjamingr/RegExp.escape/tree/master/data) indicates this is not a common use case. |
-| [`WhiteSpace`](http://www.ecma-international.org/ecma-262/6.0/index.html#table-32) | So that `eval("/"+RegExp.espace("/")+"/")` will produce as a valid regular expression. More generally so that regular expressions will be passable to `eval` if sent from elsewhere with `/`. Note that [data](https://github.com/benjamingr/RegExp.escape/tree/master/data) indicates this is not a common use case. |
+| `/` | So that `eval("/"+RegExp.espace("/")+"/")` will produce a valid regular expression. More generally so that regular expressions will be passable to `eval` if sent from elsewhere with `/`. Note that [data](https://github.com/benjamingr/RegExp.escape/tree/master/data) indicates this is not a common use case. |
+| [`WhiteSpace`](http://www.ecma-international.org/ecma-262/6.0/index.html#table-32) | So that `eval("/"+RegExp.espace("\r\n")+"/")` will produce a valid regular expression. More generally so that regular expressions will be passable to `eval` if sent from elsewhere with `/`. Also improves readability of the `escape` output. See [this issue](https://github.com/benjamingr/RegExp.escape/issues/30) for more details. Note that [data](https://github.com/benjamingr/RegExp.escape/tree/master/data) indicates this is not a common use case. |
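
Reviewer note: the two "only at the start of strings" rows can be checked with a small JavaScript sketch. Everything named here is an assumption made for illustration: `escapeLeadingHex` is a hypothetical helper, not the proposal's algorithm, and the `\xHH` output format is one possible choice of escape. The patterns are built without the `u` flag, and the Unicode case uses a four-digit-style prefix (`\u041` plus `B`) so that the concatenation actually forms `\u041B`.

```javascript
// Hypothetical helper for illustration only; NOT the proposal's algorithm.
// It rewrites a single leading hex digit (0-9, a-f, A-F) as \xHH and
// leaves the rest of the string untouched.
function escapeLeadingHex(s) {
  return s.replace(/^[0-9a-fA-F]/, (c) => "\\x" + c.charCodeAt(0).toString(16));
}

// Back-reference case: appending "1" after \1.
const naiveBackref = new RegExp("(foo)\\1" + "1");                  // source: (foo)\11
const safeBackref = new RegExp("(foo)\\1" + escapeLeadingHex("1")); // source: (foo)\1\x31

console.log(naiveBackref.test("foofoo1")); // false: \11 is no longer "\1 then 1"
console.log(safeBackref.test("foofoo1"));  // true: \1 repeats "foo", \x31 is "1"

// Unicode-escape case: appending "B" after an incomplete \u escape.
const naiveUnicode = new RegExp("\\u041" + "B");                  // source: \u041B
const safeUnicode = new RegExp("\\u041" + escapeLeadingHex("B")); // source: \u041\x42

console.log(naiveUnicode.test("\u041B")); // true: the "B" completed the escape
console.log(safeUnicode.test("\u041B"));  // false: the "B" stays a separate literal
```

Escaping only the first character is enough for these two cases because a preceding escape sequence can only absorb characters that directly follow it.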
