Commit 40cd5b1

Merge branch 'main' of https://github.com/contextlab/leetcode-solutions into main
2 parents 09f4819 + ad90060 commit 40cd5b1

2 files changed: +112 -0 lines changed

README.md

+1
@@ -22,6 +22,7 @@ Each day (ideally) we'll attempt the daily [leetcode](https://leetcode.com) prob
| July 11, 2024 | [1190](https://leetcode.com/problems/reverse-substrings-between-each-pair-of-parentheses/description/?envType=daily-question) | [Click here](https://github.com/ContextLab/leetcode-solutions/tree/main/problems/1190) | 🟡 Medium |
| July 12, 2024 | [1717](https://leetcode.com/problems/maximum-score-from-removing-substrings/description/?envType=daily-question) | [Click here](https://github.com/ContextLab/leetcode-solutions/tree/main/problems/1717) | 🟡 Medium |
| July 13, 2024 | [2751](https://leetcode.com/problems/robot-collisions/description/?envType=daily-question) | [Click here](https://github.com/ContextLab/leetcode-solutions/tree/main/problems/2751) | 🔴 Hard |
| July 14, 2024 | [726](https://leetcode.com/problems/number-of-atoms/description/?envType=daily-question) | [Click here](https://github.com/ContextLab/leetcode-solutions/tree/main/problems/726) | 🔴 Hard |

# Join our discussion!

problems/726/jeremymanning.md

+111
@@ -0,0 +1,111 @@
# [Problem 726: Number of Atoms](https://leetcode.com/problems/number-of-atoms/description/?envType=daily-question)

## Initial thoughts (stream-of-consciousness)
- We are not dealing with super long formulas, so we can afford to be a little inefficient (if needed)
- I think we should start by "tokenizing" the formula by splitting it into:
    - Elements (capital letter followed by 0+ lowercase letters)
    - Numbers (consecutive sequences of digits; convert these to integers)
    - Parentheses
- Tokenizing will allow us to do bookkeeping more easily
- I also think it'd be worth doing a first pass to identify the positions of all the parentheses. In $O(n)$ time we could start a counter at 0 and then move through the string character by character. Each time we hit a "(" we increment the counter and add the position to a hash table (key: counter value? position?; value: could either be the position of the matching closing parenthesis, or a list where the first element is the position of the open parenthesis and the second element is the position of the matching close parenthesis). If we hit a ")" we decrement the counter and update the hash table accordingly. This will let us easily do recursion later:
    - When we're doing the main processing, if we hit "(" we can get from the hash table the entire contents (up to its matching ")"), run our helper counting function on that sub-string, and then add it to our running total. (The running total, btw, should also be stored in a hash table. Aside: I'm not sure if dicts can be added, or how that works if they don't have exactly the same keys; need to figure this out...)
    - Once we've finished processing the content inside the parentheses, we can skip ahead to after the parentheses
    - This will save a lot of time, because we won't have to keep scanning forward (potentially recursively) to match up parentheses
    - Note: the "helper" function (i.e., the function called recursively) will need to have an "offset" parameter (default: 0) to enable us to avoid needing to re-compute the parenthesis matching each time we enter a new recursion depth. E.g. something like `close_pos = parens[i + offset] - offset`. And then if we encounter nested parentheses, we'd need to pass in `offset = i + offset` to the recursion call.
- Then I think the basic approach is straightforward:
    - Tokenize the string
    - Create a hash table for the parentheses pairings
    - Start a hash table with the atom counts:
        - This could either be created during the tokenization process (e.g., whenever an element is found, add a key for that element and initialize its count to 0), or we could just initialize the hash table to an empty dict and add new elements as needed if they haven't already been accounted for.
    - Set `current` to `{}` (used to process digits)
    - Then go through each token one by one:
        - If we encounter an element (`x`):
            - Add `current` to the running totals
            - Update `current` to `{x: 1}`
        - If we encounter a number (`i`):
            - Multiply every value in `current` by `i`
            - Add `current` to the running totals
            - Reset `current` to `{}`
        - If we encounter an opening parenthesis:
            - Add `current` to the running totals (otherwise the pending element would be overwritten)
            - `current = helper(<get contents of parens>, offset=i + offset)`
    - At the end of the helper function, add `current` to the total and then return the total counts (a dict)
    - Finally, put the output in the right format:
        - Let's say that `counts` is the element-wise counts
        - `counts = sorted([[key, val] for key, val in counts.items()], key=lambda x: x[0])`
        - `return ''.join([f'{x[0]}{x[1]}' if x[1] > 1 else x[0] for x in counts])`
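The first-pass parenthesis matching described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the author's implementation: it uses a stack of open positions in place of the depth counter, keys the table by the position of each "(", and the name `match_parens` is hypothetical.

```python
def match_parens(formula):
    """Return {open_index: close_index} for every "(...)" pair in formula.

    Assumes the parentheses are balanced, as in a valid chemical formula.
    """
    pairs = {}
    open_stack = []  # positions of "(" that have not been closed yet
    for i, c in enumerate(formula):
        if c == "(":
            open_stack.append(i)
        elif c == ")":
            # the most recently opened "(" is the one this ")" closes
            pairs[open_stack.pop()] = i
    return pairs


pairs = match_parens("K4(ON(SO3)2)2")
```

With a table like this, hitting a "(" at position `i` gives the matching close position as `pairs[i]` in constant time, so the main pass can jump straight past the group once it has been counted.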
## Refining the problem, round 2 thoughts
- Some helper functions are needed:
    - Tokenize the formula: take in the formula and return a list of tokens
        - This might have some tricky parts to it
        - What I'm imagining is that we initialize `t` (current token) to an empty string and then go through character by character (current character: `c`):
            - If `c in "()"`:
                - if `len(t) > 0`, append `t` to the current list of parsed tokens first (converting it to an `int` if it's a number); otherwise the pending token would be lost
                - append `c` to the current list of parsed tokens
                - set `t = ''`
            - If `c` is a capital letter:
                - if `len(t) > 0`:
                    - if `t[0]` is a digit:
                        - `t = int(t)`
                        - append `t` to the current list of parsed tokens
                    - otherwise (`t` is an element name), append `t` to the current list of parsed tokens
                    - reset `t` to `''`
                - set `t = c`
            - If `c` is a lowercase letter, `t += c`
            - If `c` is a digit:
                - If `len(t) > 0`:
                    - If `t[-1]` is also a digit, `t += c`
                    - Else:
                        - Append `t` to the current list of parsed tokens
                        - Set `t = c`
                - Otherwise `t = c`
            - At the end, make sure to add `t` to the list of tokens if it's not empty. (If it's a number, convert to an `int` first.)
            - Then just return the list of parsed tokens
    - Parenthesis matching function
        - A potentially tricky case could arise, whereby the "depth" for several parenthesis pairs is the same. E.g., for the formula "X(XX)XXX(XX)XXXX..." both parenthesis pairs have the same depth. I think a hash table is still the "right" way to handle parenthesis matching, but instead of using 2-element lists of ints, maybe we should instead use lists of 2-element lists. Then as we use each new pair, we'll just dequeue it from the front of that entry in the hash table so that we don't need to continually match up the current position with all of the entries.
    - Add two dicts, potentially with mismatched keys: take in two count dicts and return a single "merged" count dict
    - Multiply a dict by a constant: take in a count dict and an integer and return a new count dict with updated values
    - Main helper function: take in a list of tokens and an offset (default: 0) and return a count dict
- I might be missing an edge case... but if not, nothing here is too crazy. There are just a bunch of pieces to this problem (more than the usual short solutions).
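For the two dict helpers listed above, one possible sketch follows. Plain dicts cannot be added with `+` (that raises a `TypeError`), but `collections.Counter` supports `+` and treats keys missing from one side as zero. The function names `add_counts` and `scale_counts` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def add_counts(a, b):
    """Merge two count dicts, summing the values of shared keys."""
    # Counter addition drops non-positive totals, which does not matter
    # here because atom counts are always positive.
    return dict(Counter(a) + Counter(b))


def scale_counts(counts, k):
    """Return a new count dict with every value multiplied by k."""
    return {element: n * k for element, n in counts.items()}


merged = add_counts({"H": 2, "O": 1}, {"O": 1, "Mg": 1})
doubled = scale_counts({"O": 1, "H": 1}, 2)
```

Returning new dicts rather than mutating the inputs keeps the helpers safe to use on the running totals and on `current` alike.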
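The main helper could take several shapes. The sketch below is one assumption-laden variant, not the plan from the notes: instead of an `offset` parameter it passes `start` and `end` indices into a single shared token list, and it matches parentheses by token index. All function names here (`match_token_parens`, `count_tokens`, `format_counts`) are hypothetical, and the input is assumed to be a token list of element names, `int` multipliers, and `"("` / `")"` strings.

```python
from collections import Counter


def match_token_parens(tokens):
    """Map the index of each "(" token to the index of its matching ")"."""
    pairs, stack = {}, []
    for i, tok in enumerate(tokens):
        if tok == "(":
            stack.append(i)
        elif tok == ")":
            pairs[stack.pop()] = i
    return pairs


def count_tokens(tokens, pairs=None, start=0, end=None):
    """Count the atoms in tokens[start:end] and return a Counter."""
    if pairs is None:
        pairs = match_token_parens(tokens)
    if end is None:
        end = len(tokens)
    total = Counter()
    i = start
    while i < end:
        if tokens[i] == "(":
            close = pairs[i]
            group = count_tokens(tokens, pairs, i + 1, close)  # recurse into the group
            i = close + 1  # skip ahead to just past the matching ")"
        else:  # an element name
            group = Counter({tokens[i]: 1})
            i += 1
        # an optional integer multiplier may follow an element or a ")"
        if i < end and isinstance(tokens[i], int):
            for element in group:
                group[element] *= tokens[i]
            i += 1
        total.update(group)  # Counter.update adds counts
    return total


def format_counts(counts):
    """Join element names in sorted order, omitting counts of 1."""
    return "".join(f"{name}{n}" if n > 1 else name for name, n in sorted(counts.items()))


tokens = ["K", 4, "(", "O", "N", "(", "S", "O", 3, ")", 2, ")", 2]
result = format_counts(count_tokens(tokens))
```

Passing index bounds means the parenthesis table is computed once and never needs to be re-based, which is the same goal the `offset` parameter in the notes is after.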
## Attempted solution(s)
```python
class Solution:
    def countOfAtoms(self, formula: str) -> str:
        def tokenize(formula):  # double check logic here
            digits = '0123456789'
            lowercase = 'abcdefghijklmnopqrstuvwxyz'
            uppercase = lowercase.upper()
            parentheses = '()'

            tokens = []
            t = ''
            for c in formula:
                if c in parentheses:
                    # flush any pending token before recording the parenthesis,
                    # otherwise e.g. the "Mg" in "Mg(OH)2" would be dropped
                    if len(t) > 0:
                        tokens.append(int(t) if t[0] in digits else t)
                    tokens.append(c)
                    t = ''
                elif c in uppercase:  # note for later: need to fix this...
                    if len(t) > 0:
                        if t[0] in digits:
                            t = int(t)
                        tokens.append(t)
                    t = c
                elif c in lowercase:
                    t += c
                else:  # c is a digit
                    if len(t) > 0:
                        if t[-1] in digits:
                            t += c
                        else:
                            tokens.append(t)
                            t = c
                    else:
                        t = c

            # flush the final token (numbers are stored as ints)
            if len(t) > 0:
                if t[0] in digits:
                    tokens.append(int(t))
                else:
                    tokens.append(t)

            return tokens
```
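One way to double-check the hand-written tokenizer is to compare its output against a regular-expression tokenizer on a few formulas. The sketch below is a hypothetical cross-check, not part of the commit; `TOKEN_RE` and `tokenize_re` are illustrative names, and it assumes the formula contains only ASCII letters, digits, and parentheses.

```python
import re

# element name | run of digits | single parenthesis
TOKEN_RE = re.compile(r"[A-Z][a-z]*|\d+|[()]")


def tokenize_re(formula):
    """Split a formula into element names, int multipliers, and parentheses."""
    return [int(tok) if tok.isdigit() else tok for tok in TOKEN_RE.findall(formula)]


tokens = tokenize_re("Mg(OH)2")
```

Formulas where an element or number sits directly before a parenthesis, such as `"Mg(OH)2"` or `"K4(ON(SO3)2)2"`, are useful inputs for this comparison.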
