Commit d69199e

Add regular expression matching algorithm.
1 parent c96bbdf commit d69199e

4 files changed, +244 -0 lines changed

README.md (+2)

@@ -76,6 +76,7 @@ a set of rules that precisely define a sequence of operations.
   * `A` [Z Algorithm](src/algorithms/string/z-algorithm) - substring search (pattern matching)
   * `A` [Rabin Karp Algorithm](src/algorithms/string/rabin-karp) - substring search
   * `A` [Longest Common Substring](src/algorithms/string/longest-common-substring)
+  * `A` [Regular Expression Matching](src/algorithms/string/regular-expression-matching)
 * **Searches**
   * `B` [Linear Search](src/algorithms/search/linear-search)
   * `B` [Binary Search](src/algorithms/search/binary-search)
@@ -147,6 +148,7 @@ algorithm is an abstraction higher than a computer program.
   * `A` [Integer Partition](src/algorithms/math/integer-partition)
   * `A` [Maximum Subarray](src/algorithms/sets/maximum-subarray)
   * `A` [Bellman-Ford Algorithm](src/algorithms/graph/bellman-ford) - finding shortest path to all graph vertices
+  * `A` [Regular Expression Matching](src/algorithms/string/regular-expression-matching)
 * **Backtracking** - similarly to brute force, try to generate all possible solutions, but each time you generate next solution you test
 if it satisfies all conditions, and only then continue generating subsequent solutions. Otherwise, backtrack, and go on a
 different path of finding a solution. Normally the DFS traversal of state-space is being used.

@@ -0,0 +1,73 @@
# Regular Expression Matching

Given an input string `s` and a pattern `p`, implement regular
expression matching with support for `.` and `*`.

- `.` Matches any single character.
- `*` Matches zero or more of the preceding element.

The matching should cover the **entire** input string (not partial).

**Note**

- `s` may be empty and contains only lowercase letters `a-z`.
- `p` may be empty and contains only lowercase letters `a-z` and the special characters `.` and `*`.

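The two rules above can be read as a direct recursion on the pattern. The sketch below is only an illustration of the semantics: `isMatch` is a hypothetical helper, not the function added in this commit (which uses a dynamic programming table), and it assumes a well-formed pattern in which `*` always follows a letter or `.`. Without memoization this kind of recursion may take exponential time on some inputs.

```javascript
// Hypothetical top-down sketch of the matching rules; not the committed implementation.
// Assumes a well-formed pattern: '*' never comes first and never follows another '*'.
function isMatch(s, p) {
  // An empty pattern matches only an empty string.
  if (p.length === 0) {
    return s.length === 0;
  }

  // Does the first pattern element match the first string character?
  const firstMatches = s.length > 0 && (p[0] === '.' || p[0] === s[0]);

  if (p[1] === '*') {
    // Either use the starred element zero times (drop 'x*' from the pattern),
    // or consume one matching character and keep the same pattern.
    return isMatch(s, p.slice(2)) || (firstMatches && isMatch(s.slice(1), p));
  }

  // Plain element: it must match, and the remainders must match too.
  return firstMatches && isMatch(s.slice(1), p.slice(1));
}

console.log(isMatch('aa', 'a')); // false
console.log(isMatch('aab', 'c*a*b')); // true
```

The "zero times" branch and the "consume one character" branch correspond to the two options for `*` listed above.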
## Examples

**Example #1**

Input:

```
s = 'aa'
p = 'a'
```

Output: `false`

Explanation: `a` does not match the entire string `aa`.

**Example #2**

Input:

```
s = 'aa'
p = 'a*'
```

Output: `true`

Explanation: `*` means zero or more of the preceding element, `a`.
Therefore, by repeating `a` once, it becomes `aa`.

**Example #3**

Input:

```
s = 'ab'
p = '.*'
```

Output: `true`

Explanation: `.*` means "zero or more (`*`) of any character (`.`)".

**Example #4**

Input:

```
s = 'aab'
p = 'c*a*b'
```

Output: `true`

Explanation: `c*` matches zero occurrences of `c`, `a*` matches `aa`
(`a` repeated once), and `b` matches `b`. Therefore the pattern matches `aab`.

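Because `.` and `*` carry the same meaning in JavaScript's built-in regular expressions, the four examples above can be cross-checked by anchoring the pattern with `^` and `$` so that it has to cover the whole string. This is only a sanity check, under the assumption that `p` contains nothing but `a-z`, `.` and `*` in a well-formed order; `matchesWithNativeRegExp` is a hypothetical helper and not part of this commit.

```javascript
// Hypothetical cross-check using the built-in RegExp engine.
// Assumes `pattern` contains only a-z, '.' and '*' and is well formed.
function matchesWithNativeRegExp(string, pattern) {
  // ^ and $ force the pattern to cover the entire string.
  return new RegExp(`^${pattern}$`).test(string);
}

const examples = [
  ['aa', 'a'],
  ['aa', 'a*'],
  ['ab', '.*'],
  ['aab', 'c*a*b'],
];

examples.forEach(([string, pattern]) => {
  console.log(`s = '${string}', p = '${pattern}':`, matchesWithNativeRegExp(string, pattern));
});
```

The four results should agree with the outputs listed in the examples: `false`, `true`, `true`, `true`.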
## References

- [YouTube](https://www.youtube.com/watch?v=l3hda49XcDE&list=PLLXdhg_r2hKA7DPDsunoDZ-Z769jWn4R8&index=71&t=0s)
- [LeetCode](https://leetcode.com/problems/regular-expression-matching/description/)

@@ -0,0 +1,34 @@
import regularExpressionMatching from '../regularExpressionMatching';

describe('regularExpressionMatching', () => {
  it('should match regular expressions in a string', () => {
    expect(regularExpressionMatching('', '')).toBeTruthy();
    expect(regularExpressionMatching('a', 'a')).toBeTruthy();
    expect(regularExpressionMatching('aa', 'aa')).toBeTruthy();
    expect(regularExpressionMatching('aab', 'aab')).toBeTruthy();
    expect(regularExpressionMatching('aab', 'aa.')).toBeTruthy();
    expect(regularExpressionMatching('aab', '.a.')).toBeTruthy();
    expect(regularExpressionMatching('aab', '...')).toBeTruthy();
    expect(regularExpressionMatching('a', 'a*')).toBeTruthy();
    expect(regularExpressionMatching('aaa', 'a*')).toBeTruthy();
    expect(regularExpressionMatching('aaab', 'a*b')).toBeTruthy();
    expect(regularExpressionMatching('aaabb', 'a*b*')).toBeTruthy();
    expect(regularExpressionMatching('aaabb', 'a*b*c*')).toBeTruthy();
    expect(regularExpressionMatching('', 'a*')).toBeTruthy();
    expect(regularExpressionMatching('xaabyc', 'xa*b.c')).toBeTruthy();
    expect(regularExpressionMatching('aab', 'c*a*b*')).toBeTruthy();
    expect(regularExpressionMatching('mississippi', 'mis*is*.p*.')).toBeTruthy();
    expect(regularExpressionMatching('ab', '.*')).toBeTruthy();

    expect(regularExpressionMatching('', 'a')).toBeFalsy();
    expect(regularExpressionMatching('a', '')).toBeFalsy();
    expect(regularExpressionMatching('aab', 'aa')).toBeFalsy();
    expect(regularExpressionMatching('aab', 'baa')).toBeFalsy();
    expect(regularExpressionMatching('aabc', '...')).toBeFalsy();
    expect(regularExpressionMatching('aaabbdd', 'a*b*c*')).toBeFalsy();
    expect(regularExpressionMatching('mississippi', 'mis*is*p*.')).toBeFalsy();
    expect(regularExpressionMatching('ab', 'a*')).toBeFalsy();
    expect(regularExpressionMatching('abba', 'a*b*.c')).toBeFalsy();
    expect(regularExpressionMatching('abba', '.*c')).toBeFalsy();
  });
});

@@ -0,0 +1,135 @@
const ZERO_OR_MORE_CHARS = '*';
const ANY_CHAR = '.';

/**
 * Dynamic programming approach.
 *
 * @param {string} string
 * @param {string} pattern
 * @return {boolean}
 */
export default function regularExpressionMatching(string, pattern) {
  /*
   * Let's initialize the dynamic programming matrix for this string and pattern.
   * Pattern characters go on top (as columns) and string characters go on the
   * left side of the table (as rows). The extra first row and first column
   * stand for the empty string and the empty pattern.
   *
   * Example:
   *
   *     a * b . b
   *   - - - - - -
   * a - - - - - -
   * a - - - - - -
   * b - - - - - -
   * y - - - - - -
   * b - - - - - -
   */
  const matchMatrix = Array(string.length + 1).fill(null).map(() => {
    return Array(pattern.length + 1).fill(null);
  });

  // Let's fill the top-left cell with true. This means that the empty
  // string '' matches the empty pattern ''.
  matchMatrix[0][0] = true;

  // Let's fill the first row of the matrix with false. This means that
  // the empty string can't match any non-empty pattern.
  //
  // Example:
  // string: ''
  // pattern: 'a.z'
  //
  // The one exception is patterns like a*b* that do match the empty string.
  // Note: this assumes a well-formed pattern in which '*' never comes first,
  // so columnIndex - 2 never points outside the row.
  for (let columnIndex = 1; columnIndex <= pattern.length; columnIndex += 1) {
    const patternIndex = columnIndex - 1;

    if (pattern[patternIndex] === ZERO_OR_MORE_CHARS) {
      matchMatrix[0][columnIndex] = matchMatrix[0][columnIndex - 2];
    } else {
      matchMatrix[0][columnIndex] = false;
    }
  }

  // Let's fill the first column with false. This means that the empty pattern
  // can't match any non-empty string.
  //
  // Example:
  // string: 'ab'
  // pattern: ''
  for (let rowIndex = 1; rowIndex <= string.length; rowIndex += 1) {
    matchMatrix[rowIndex][0] = false;
  }

  // Now let's go through every letter of the pattern and every letter of
  // the string and compare them one by one.
  for (let rowIndex = 1; rowIndex <= string.length; rowIndex += 1) {
    for (let columnIndex = 1; columnIndex <= pattern.length; columnIndex += 1) {
      // Take into account the fact that the matrix contains one extra column and row.
      const stringIndex = rowIndex - 1;
      const patternIndex = columnIndex - 1;

      if (pattern[patternIndex] === ZERO_OR_MORE_CHARS) {
        /*
         * If the current pattern character is the special '*' character, we have
         * two options:
         *
         * 1. Since '*' allows its previous char to be absent from the string, we
         * need to check if the string matches the pattern without the '*' char and
         * without the char that goes before '*'. That means going two positions
         * left in the same row.
         *
         * 2. Since '*' allows its previous char to appear in the string many times, we
         * need to check if the char before '*' is the same as the current string char
         * (or is '.'). If so, the current string matches the current pattern provided
         * that the string WITHOUT its current char matches the same pattern. That means
         * going one position up in the same column.
         */
        if (matchMatrix[rowIndex][columnIndex - 2] === true) {
          matchMatrix[rowIndex][columnIndex] = true;
        } else if (
          (
            pattern[patternIndex - 1] === string[stringIndex] ||
            pattern[patternIndex - 1] === ANY_CHAR
          ) &&
          matchMatrix[rowIndex - 1][columnIndex] === true
        ) {
          matchMatrix[rowIndex][columnIndex] = true;
        } else {
          matchMatrix[rowIndex][columnIndex] = false;
        }
      } else if (
        pattern[patternIndex] === string[stringIndex] ||
        pattern[patternIndex] === ANY_CHAR
      ) {
        /*
         * If the current pattern char is the same as the current string char,
         * or it may be any character (the pattern contains the '.' char),
         * we need to check if there was a match for the pattern and for the
         * string WITHOUT their current chars. This means that we may copy
         * the top-left diagonal value.
         *
         * Example:
         *
         *   a b
         * a 1 -
         * b - 1
         */
        matchMatrix[rowIndex][columnIndex] = matchMatrix[rowIndex - 1][columnIndex - 1];
      } else {
        /*
         * If the pattern char and the string char are different, we may
         * treat this case as "no match".
         *
         * Example:
         *
         *   a b
         * a - -
         * c - 0
         */
        matchMatrix[rowIndex][columnIndex] = false;
      }
    }
  }

  return matchMatrix[string.length][pattern.length];
}
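
To make the matrix described in the comments above concrete, the following sketch rebuilds the same table with the three cases condensed and prints it for the fourth README example. `buildMatchTable` is a hypothetical helper written only for illustration. It assumes a well-formed pattern (no leading `*`) and is not part of this commit.

```javascript
// Hypothetical helper: builds a (string.length + 1) x (pattern.length + 1) table
// using the same recurrences as the function above, in condensed form.
// Assumes a well-formed pattern in which '*' never comes first.
function buildMatchTable(string, pattern) {
  const table = Array.from({ length: string.length + 1 }, () => (
    Array(pattern.length + 1).fill(false)
  ));

  // The empty string matches the empty pattern.
  table[0][0] = true;

  for (let row = 0; row <= string.length; row += 1) {
    for (let col = 1; col <= pattern.length; col += 1) {
      const patternChar = pattern[col - 1];

      if (patternChar === '*') {
        // Zero occurrences: look two columns to the left in the same row.
        const zeroTimes = col >= 2 && table[row][col - 2];
        // One more occurrence: look one row up in the same column.
        const prevChar = pattern[col - 2];
        const oneMore = row > 0
          && (prevChar === '.' || prevChar === string[row - 1])
          && table[row - 1][col];
        table[row][col] = zeroTimes || oneMore;
      } else if (row > 0 && (patternChar === '.' || patternChar === string[row - 1])) {
        // Characters agree: copy the top-left diagonal value.
        table[row][col] = table[row - 1][col - 1];
      }
    }
  }

  return table;
}

// Rows: '', 'a', 'a', 'b'. Columns: '', 'c', '*', 'a', '*', 'b'.
const table = buildMatchTable('aab', 'c*a*b');
table.forEach((row) => console.log(row.map((cell) => (cell ? 1 : 0)).join(' ')));
// 1 0 1 0 1 0
// 0 0 0 1 1 0
// 0 0 0 0 1 0
// 0 0 0 0 0 1
```

The bottom-right cell is the final answer: `1` here, so `aab` matches `c*a*b`. The first row shows the "exception" case, where `c*` and `c*a*` both match the empty string.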
