Groups
- Overview
- Grouping elements for quantifiers
- Remembering parts of the match
- Nesting groups
- Non-capturing groups
- Named groups
Groups are used in a regex for one (or both) of two purposes:
- To group a number of elements together so a quantifier can be applied to the whole group.
- To "remember" part of the text matched by the regex so we can extract it later by indexing into the object returned by RegExp.exec().
A quantifier passed to group() as its second argument applies to the whole group rather than to a single element. For example, this very simple regex matches a sentence made up of one or more words, each followed by whitespace, then a final word and a full stop:
const regex = new RegexBuilder()
    .group(r => r
        .wordCharacter(RegexQuantifier.oneOrMore)
        .whitespace(RegexQuantifier.oneOrMore),
        RegexQuantifier.oneOrMore
    )
    .wordCharacter(RegexQuantifier.oneOrMore)
    .text(".")
    .buildRegex();
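To see what the quantifier on the group changes, here is a minimal sketch using raw JavaScript regexes. The grouped pattern is an assumption about what the builder above emits (the actual output of buildRegex() is not shown on this page), and the sample sentence is made up.

```javascript
// Assumed raw equivalent of the builder example: (\w+\s+)+\w+\.
// The + after the closing parenthesis repeats the whole "word + whitespace" unit.
const sentenceRegex = /(\w+\s+)+\w+\./;

// Same elements without the group: each + applies only to the element before it,
// so this pattern can match at most two words.
const ungroupedRegex = /\w+\s+\w+\./;

const sentence = "the quick brown fox.";
const groupedMatch = sentenceRegex.exec(sentence);
const ungroupedMatch = ungroupedRegex.exec(sentence);

console.log(groupedMatch[0]);   // "the quick brown fox."
console.log(ungroupedMatch[0]); // "brown fox."
```

The grouped version consumes every word in the sentence, while the ungrouped version can only reach the last two.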
Say we want to match a person's name (two consecutive words, each beginning with a capital letter) and then greet them by their first name. We could build a regex like this, with a group around the first name:
const regex = new RegexBuilder()
    .wordBoundary()
    .group(r => r
        .uppercaseLetter()
        .lowercaseLetter(RegexQuantifier.oneOrMore)
    )
    .whitespace()
    .uppercaseLetter()
    .lowercaseLetter(RegexQuantifier.oneOrMore)
    .wordBoundary()
    .buildRegex();
We can then extract the first name from a successful match like this:
const match = regex.exec(inputString);
// exec() returns null when the input contains no match
const firstName = match ? match[1] : null;
Note that captured groups are indexed from 1, not 0: match[0] always holds the whole matched string.
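The greeting scenario can be sketched end to end with a raw regex. The pattern below is an assumed equivalent of the builder example, and greet() is a hypothetical helper written for illustration, not part of RegexToolbox.

```javascript
// Assumed raw equivalent of the builder example above.
const nameRegex = /\b([A-Z][a-z]+)\s[A-Z][a-z]+\b/;

// greet() is a hypothetical helper, not part of RegexToolbox.
function greet(inputString) {
  const match = nameRegex.exec(inputString);
  if (match === null) {
    // exec() returns null when nothing matches
    return "Hello, stranger!";
  }
  // match[0] is the whole match; match[1] is the first capturing group
  return `Hello, ${match[1]}!`;
}

console.log(greet("please welcome Ada Lovelace")); // "Hello, Ada!"
console.log(greet("nobody here"));                 // "Hello, stranger!"
```

Checking for null before indexing avoids a TypeError on input that contains no name.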
As with raw regexes, RegexBuilder allows you to nest groups to arbitrary depth. Capturing groups are numbered in the order in which they start, so match[1] refers to the first group opened, match[2] to the second, and so on. For example:
const regex = new RegexBuilder()
    .wordBoundary()
    .group(r1 => r1 // start of group 1
        .group(r2 => r2 // start of group 2
            .uppercaseLetter()
        ) // end of group 2
        .lowercaseLetter(RegexQuantifier.oneOrMore),
        RegexQuantifier.oneOrMore
    ) // end of group 1
    .wordBoundary()
    .buildRegex();
const match = regex.exec("sorry Dave, I can't let you do that");
const name = match[1]; // "Dave"
const initial = match[2]; // "D"
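The numbering rule is easier to see with two nested pairs side by side. This sketch uses a raw regex with a made-up pattern and input (not taken from RegexToolbox). Each group's index follows the position of its opening parenthesis, counting from left to right.

```javascript
// Hypothetical pattern with two outer groups, each containing one inner group.
// Opening parentheses, left to right: 1 = first name, 2 = its initial,
// 3 = last name, 4 = its initial.
const nestedRegex = /\b(([A-Z])[a-z]+)\s(([A-Z])[a-z]+)\b/;

const nestedMatch = nestedRegex.exec("please welcome Ada Lovelace");

const first = nestedMatch[1];                      // "Ada"
const last = nestedMatch[3];                       // "Lovelace"
const initials = nestedMatch[2] + nestedMatch[4];  // "AL"

console.log(first, last, initials);
```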
Non-capturing groups can be used to apply a quantifier to a section of the regex, but their contents cannot be extracted later from the object returned by RegExp.exec(). This is useful when a regex contains more than one group and you don't want a group that exists purely for a quantifier to disrupt the indices of your capturing groups.
Example:
const regex = new RegexBuilder()
    .nonCapturingGroup(r => r
        .letter()
        .digit()
    )
    .buildRegex();
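The effect on group indices can be shown with a pair of raw regexes. This is a sketch with a made-up pattern and input, and it assumes the builder's non-capturing group corresponds to the standard (?:...) syntax.

```javascript
const input = "a1b2-XYZ";

// Capturing version: the quantified letter/digit group takes index 1,
// pushing the part we actually want to index 2.
const capturing = /([a-z]\d)+-([A-Z]+)/.exec(input);

// Non-capturing version: (?:...) still groups for the quantifier,
// but does not take an index, so the wanted part is at index 1.
const nonCapturing = /(?:[a-z]\d)+-([A-Z]+)/.exec(input);

console.log(capturing[2]);        // "XYZ"
console.log(nonCapturing[1]);     // "XYZ"
console.log(nonCapturing.length); // 2 (whole match plus one group)
```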
Named groups enable you to use meaningful names rather than array indices to retrieve captured group values.
Example:
const regex = new RegexBuilder()
    .namedGroup("firstName", r => r
        .uppercaseLetter()
        .lowercaseLetter(RegexQuantifier.oneOrMore)
    )
    .buildRegex();
const match = regex.exec("say hello to Mark");
const firstName = match.groups.firstName;
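Named groups pay off most when a regex has several captures. The following sketch uses a raw regex with the standard (?<name>...) syntax; the pattern, the lastName group, and the input string are assumptions made for illustration.

```javascript
// Hypothetical two-group pattern using standard named-group syntax.
const fullNameRegex = /\b(?<firstName>[A-Z][a-z]+)\s(?<lastName>[A-Z][a-z]+)\b/;

const nameMatch = fullNameRegex.exec("say hello to Grace Hopper");

// groups is only present on a successful match, so guard against null first.
const given = nameMatch ? nameMatch.groups.firstName : null;
const family = nameMatch ? nameMatch.groups.lastName : null;

console.log(given, family);          // "Grace" "Hopper"
// Named groups are still numbered, so index access keeps working.
console.log(nameMatch[1] === given); // true
```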