align \s with White_Space #37
Comments
I would personally be okay with this since the
I think trim() should trim White_Space, and I think that developers would expect that. I understand that U+FEFF is in White_Space probably as a BOM shortcut for the lexer, but I don't understand why outside of the lexer we should have a 4% discrepancy between trim()/
Concern from @bmeck: There are special uses of U+FEFF that some people may handle with
Note: the preferred character to join words, since Unicode 3.2 (2002), is U+2060 WORD JOINER. It behaves the same as U+FEFF in line break, word break, and grapheme cluster break, but does not have the "BOM" semantics. See
Discussion: The regex flag only affects regular expressions, therefore we cannot / will not change the behavior of String.trim().
This is no longer a goal for this proposal.
FWIW, jmespath-community/jmespath.test#11 (comment) uncovered real-world divergence between independent implementations of a specification that in the case of JavaScript was caused by the difference between regular expression
ES regex \s is almost, but not quite, the same as \p{White_Space}.

\s (CharacterClassEscape :: s) is defined as “Return the CharSet containing all characters corresponding to a code point on the right-hand side of the WhiteSpace or LineTerminator productions.”

On the other hand, \p{White_Space} is the Unicode White_Space property. See the list of code point ranges at the top of https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt

These are deceptively similar; each set contains 25 code points. However:
- \s contains U+FEFF ZERO WIDTH NO-BREAK SPACE, which is a format control (gc=Cf), not a space character.
- \p{White_Space} contains U+0085 NEXT LINE (NEL), which is missing from the ES LineTerminator list.

This is confusing and non-standard.

Under the new flag, we should change \s to be the same as \p{White_Space}.

I am not proposing that we change the ES lexer's WhiteSpace or LineTerminator definitions.
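
The discrepancy described above can be checked by brute force. The following is a minimal sketch, assuming an engine that supports Unicode property escapes under the u flag and ships reasonably current Unicode data; the function name diffWhitespaceSets and the field names of its result are made up for illustration and are not part of the proposal.

```javascript
// Brute-force comparison of \s and \p{White_Space} over every code point.
// diffWhitespaceSets and its result fields are illustrative names only.
function diffWhitespaceSets() {
  const legacy = /^\s$/u;                 // ES CharacterClassEscape :: s
  const property = /^\p{White_Space}$/u;  // Unicode White_Space property
  const fmt = (cp) => "U+" + cp.toString(16).toUpperCase().padStart(4, "0");

  const result = { countS: 0, countProperty: 0, onlyInS: [], onlyInProperty: [] };
  for (let cp = 0; cp <= 0x10ffff; cp++) {
    const ch = String.fromCodePoint(cp);
    const inS = legacy.test(ch);
    const inProperty = property.test(ch);
    if (inS) result.countS++;
    if (inProperty) result.countProperty++;
    // Record the code points that belong to exactly one of the two sets.
    if (inS && !inProperty) result.onlyInS.push(fmt(cp));
    if (!inS && inProperty) result.onlyInProperty.push(fmt(cp));
  }
  return result;
}

console.log(diffWhitespaceSets());
```

If the engine's data matches the description in this issue, both counts should be 25, with U+FEFF as the only code point unique to \s and U+0085 as the only one unique to \p{White_Space}.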