|
| 1 | +--- |
| 2 | +id: lfu-cache |
| 3 | +title: LFU Cache |
| 4 | +sidebar_label: 0460 - LFU Cache |
| 5 | +tags: |
| 6 | +- Hash Table |
| 7 | +- Linked List |
| 8 | +- Ordered Set |
| 9 | +description: "This is a solution to the LFU Cache problem on LeetCode." |
| 10 | +--- |
| 11 | + |
| 12 | +## Problem Description |
| 13 | +Design and implement a data structure for a Least Frequently Used (LFU) cache. |
| 14 | + |
| 15 | +Implement the `LFUCache` class: |
| 16 | + |
| 17 | +- `LFUCache(int capacity)` Initializes the object with the `capacity` of the data structure. |
| 18 | +- `int get(int key)` Gets the value of the `key` if the key exists in the cache. Otherwise, returns -1. |
| 19 | +- `void put(int key, int value)` Updates the value of the `key` if present, or inserts the `key` if not already present. When the cache reaches its `capacity`, it should invalidate and remove the **least frequently used** key before inserting a new item. For this problem, when there is a **tie** (i.e., two or more keys with the same frequency), the **least recently used** `key` is invalidated. |
| 20 | + |
| 21 | +To determine the least frequently used key, a **use counter** is maintained for each key in the cache. The key with the smallest **use counter** is the least frequently used key. |
| 22 | + |
| 23 | +When a key is first inserted into the cache, its **use counter** is set to `1` (due to the `put` operation). The **use counter** for a key in the cache is incremented whenever a `get` or `put` operation is called on it. |
| 24 | + |
| 25 | +The functions `get` and `put` must each run in O(1) average time complexity. |
| 26 | + |
| 27 | + |
| 28 | +### Examples |
| 29 | + |
| 30 | +**Example 1:** |
| 31 | + |
| 32 | +``` |
| 33 | +Input-["LFUCache", "put", "put", "get", "put", "get", "get", "put", "get", "get", "get"] |
| 34 | +[[2], [1, 1], [2, 2], [1], [3, 3], [2], [3], [4, 4], [1], [3], [4]] |
| 35 | +Output-[null, null, null, 1, null, -1, 3, null, -1, 3, 4] |
| 36 | +Explanation: |
| 37 | +// cnt(x) = the use counter for key x |
| 38 | +// cache=[] will show the last used order for tiebreakers (leftmost element is most recent) |
| 39 | +LFUCache lfu = new LFUCache(2); |
| 40 | +lfu.put(1, 1); // cache=[1,_], cnt(1)=1 |
| 41 | +lfu.put(2, 2); // cache=[2,1], cnt(2)=1, cnt(1)=1 |
| 42 | +lfu.get(1); // return 1 |
| 43 | + // cache=[1,2], cnt(2)=1, cnt(1)=2 |
| 44 | +lfu.put(3, 3); // 2 is the LFU key because cnt(2)=1 is the smallest, invalidate 2. |
| 45 | + // cache=[3,1], cnt(3)=1, cnt(1)=2 |
| 46 | +lfu.get(2); // return -1 (not found) |
| 47 | +lfu.get(3); // return 3 |
| 48 | + // cache=[3,1], cnt(3)=2, cnt(1)=2 |
| 49 | +lfu.put(4, 4); // Both 1 and 3 have the same cnt, but 1 is LRU, invalidate 1. |
| 50 | + // cache=[4,3], cnt(4)=1, cnt(3)=2 |
| 51 | +lfu.get(1); // return -1 (not found) |
| 52 | +lfu.get(3); // return 3 |
| 53 | + // cache=[3,4], cnt(4)=1, cnt(3)=3 |
| 54 | +lfu.get(4); // return 4 |
| 55 | + // cache=[4,3], cnt(4)=2, cnt(3)=3 |
| 56 | +``` |
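
To make the eviction rule in the example concrete, here is a minimal brute-force sketch that replays the same calls. It is only a reference for the semantics, not the O(1) solution described later: eviction scans every key. The class name `NaiveLFU` and its fields (`cnt`, `last_used`, `clock`) are hypothetical names chosen for this illustration.

```python
class NaiveLFU:
    """Brute-force reference: O(n) eviction, used only to replay the example."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}     # key -> value
        self.cnt = {}        # key -> use counter
        self.last_used = {}  # key -> logical timestamp of the most recent use
        self.clock = 0

    def _touch(self, key):
        # Every get/put on a key bumps its counter and refreshes its recency.
        self.clock += 1
        self.cnt[key] = self.cnt.get(key, 0) + 1
        self.last_used[key] = self.clock

    def get(self, key):
        if key not in self.values:
            return -1
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key not in self.values and len(self.values) == self.capacity:
            # Smallest use counter first; ties are broken by the oldest use.
            victim = min(self.values, key=lambda k: (self.cnt[k], self.last_used[k]))
            for table in (self.values, self.cnt, self.last_used):
                del table[victim]
        self.values[key] = value
        self._touch(key)


lfu = NaiveLFU(2)
lfu.put(1, 1)
lfu.put(2, 2)
results = [lfu.get(1)]
lfu.put(3, 3)                      # evicts key 2 (smallest counter)
results += [lfu.get(2), lfu.get(3)]
lfu.put(4, 4)                      # keys 1 and 3 tie on counter; 1 is older
results += [lfu.get(1), lfu.get(3), lfu.get(4)]
print(results)  # [1, -1, 3, -1, 3, 4]
```

The printed list matches the non-null entries of the expected output above.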
| 57 | + |
| 58 | + |
| 59 | +### Constraints |
| 60 | +- `1 <= capacity <= 10^4` |
| 61 | +- `0 <= key <= 10^5` |
| 62 | +- `0 <= value <= 10^9` |
| 63 | +- At most `2 * 10^5` calls will be made to `get` and `put`. |
| 64 | + |
| 65 | +## Solution for LFU Cache |
| 66 | + |
| 67 | +### Approach |
| 68 | +#### HashMap and LinkedHashSet: |
| 69 | +##### Intuition |
| 70 | +We need to maintain all the keys, values and frequencies. Without invalidation (removing from the data structure when it reaches capacity), they can be maintained by a HashMap `<Integer, Pair<Integer, Integer>>`, keyed by the original `key` and valued by the `frequency`-`value` pair. |
| 71 | + |
| 72 | +With invalidation, we also need to track the current minimum frequency and delete particular keys. Hence, we can group the keys with the same frequency together and maintain another HashMap `<Integer, Set>`, keyed by the frequency and valued by the set of `keys` that have that frequency. This way, if we know the minimum frequency, we can access the candidate keys for deletion directly. |
| 73 | + |
| 74 | +Also note that in the case of a tie, we're required to find the least recently used key and invalidate it, so the keys within each Set must be kept in order of use. A TreeSet would add an extra O(log(N)) cost per operation. Instead, we can pair a hash table with a linked list, which supports both removing an arbitrary key and finding the least recently used key in constant time. Fortunately, LinkedHashSet does exactly this. Whenever a key is inserted or updated, we put it at the end of the LinkedHashSet for its new frequency, so the first key in the LinkedHashSet for the minimum frequency is the one to invalidate. |
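
As a rough illustration of the LinkedHashSet behaviour relied on here, the sketch below uses Python's `collections.OrderedDict` as a stand-in (an assumption of this example, since Python has no LinkedHashSet; the stored values are unused). It shows the three operations the approach needs: append at the end, remove an arbitrary key, and take the oldest key from the front.

```python
from collections import OrderedDict

keys = OrderedDict()          # stand-in for LinkedHashSet<Integer>; values are unused
for k in (7, 3, 9):
    keys[k] = None            # add at the end (most recently used position)

del keys[3]                   # remove an arbitrary key in O(1) average time
keys[7] = None                # 7 is already present, so its position is unchanged
oldest = next(iter(keys))     # peek at the least recently used key
evicted, _ = keys.popitem(last=False)  # remove that key from the front
print(oldest, evicted, list(keys))     # 7 7 [9]
```

Re-adding an existing key does not move it, which mirrors `LinkedHashSet.add`; the algorithm therefore removes a key from its old frequency set before appending it to the new one.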
| 75 | + |
| 76 | +The original operations can be transformed into operations on the 2 HashMaps, keeping them in sync and maintaining the minimum frequency. |
| 77 | + |
| 78 | +Since C++ lacks LinkedHashSet, we need a workaround: store a list of key-value pairs per frequency instead of the LinkedHashSet, and keep each key's frequency together with its list iterator in another unordered_map to maintain the connection. The idea is the same, but the bookkeeping is slightly more involved. Another workaround would be to implement your own doubly linked list, as you would for an LRU cache. |
| 79 | + |
| 80 | +##### Algorithm |
| 81 | +To make things simpler, assume we have 4 member variables: |
| 82 | + |
| 83 | +1. HashMap`<Integer, Pair<Integer, Integer>>` cache, keyed by the original key and valued by the frequency-value pair. |
| 84 | +2. HashMap`<Integer, LinkedHashSet<Integer>>` frequencies, keyed by frequency and valued by the set of keys that have the same frequency. |
| 85 | +3. `int minf`, which is the minimum frequency at any given time. |
| 86 | +4. `int capacity`, which is the `capacity` given in the input. |
| 87 | + |
| 88 | +It's also convenient to have a private utility function `insert` that inserts a key-value pair with a given frequency. |
| 89 | + |
| 90 | +##### void insert(int key, int frequency, int value) |
| 91 | + |
| 92 | +1. Insert frequency-value pair into cache with the given key. |
| 93 | +2. Get the LinkedHashSet corresponding to the given frequency (default to empty Set) and insert the given key. |
| 94 | + |
| 95 | +##### int get(int key) |
| 96 | + |
| 97 | +1. If the given key is not in the cache, return -1, otherwise go to step 2. |
| 98 | +2. Get the frequency and value from the cache. |
| 99 | +3. Get the LinkedHashSet associated with frequency from frequencies and remove the given key from it, since the usage of the current key is increased by this function call. |
| 100 | +4. If minf == frequency and the above LinkedHashSet is empty, that means there are no more elements used minf times, so increase minf by 1. To save some space, we can also delete the entry frequency from the frequencies hash map. |
| 101 | +5. Call insert(key, frequency + 1, value), since the current key's usage has increased from this function call. |
| 102 | +6. Return value |
| 103 | + |
| 104 | +##### void put(int key, int value) |
| 105 | +1. If capacity <= 0, exit. |
| 106 | +2. If the given key exists in cache, update the value in the original frequency-value pair (don't call insert here), and then increment the frequency by calling get(key). Exit the function. |
| 107 | +3. If cache.size() == capacity, get the first (least recently used) key in the LinkedHashSet corresponding to minf in frequencies, and remove it from cache and the LinkedHashSet. |
| 108 | +4. If we didn't exit the function in step 2, it means that this element is a new one, so the minimum frequency cannot possibly be greater than one. Set minf to 1. |
| 109 | +5. Call insert(key, 1, value) |
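
The steps for `insert`, `get`, and `put` above can be transcribed almost line by line. The following is a minimal sketch under stated assumptions: it uses `collections.OrderedDict` in place of LinkedHashSet and a plain tuple in place of `Pair`, and the class name `LFUCacheSketch` is hypothetical. It is meant to mirror the described steps, not to replace the solutions in the next section.

```python
from collections import OrderedDict


class LFUCacheSketch:
    def __init__(self, capacity):
        self.cache = {}        # key -> (frequency, value)
        self.frequencies = {}  # frequency -> OrderedDict of keys, oldest first
        self.minf = 0
        self.capacity = capacity

    def _insert(self, key, frequency, value):
        # Record the pair and append the key to the end of its frequency group.
        self.cache[key] = (frequency, value)
        self.frequencies.setdefault(frequency, OrderedDict())[key] = None

    def get(self, key):
        if key not in self.cache:
            return -1
        frequency, value = self.cache[key]
        keys = self.frequencies[frequency]
        del keys[key]
        if not keys:
            # No key is used `frequency` times any more; drop the empty group.
            del self.frequencies[frequency]
            if self.minf == frequency:
                self.minf += 1
        self._insert(key, frequency + 1, value)
        return value

    def put(self, key, value):
        if self.capacity <= 0:
            return
        if key in self.cache:
            frequency, _ = self.cache[key]
            self.cache[key] = (frequency, value)  # update value only
            self.get(key)                         # then bump the frequency
            return
        if len(self.cache) == self.capacity:
            keys = self.frequencies[self.minf]
            victim, _ = keys.popitem(last=False)  # least recently used key
            del self.cache[victim]
            if not keys:
                del self.frequencies[self.minf]
        self.minf = 1
        self._insert(key, 1, value)


lfu = LFUCacheSketch(2)
lfu.put(1, 1)
lfu.put(2, 2)
trace = [lfu.get(1)]
lfu.put(3, 3)
trace += [lfu.get(2), lfu.get(3)]
lfu.put(4, 4)
trace += [lfu.get(1), lfu.get(3), lfu.get(4)]
print(trace)  # [1, -1, 3, -1, 3, 4]
```

Replaying Example 1 yields the same non-null results as the expected output. Note that `minf` only needs to be incremented by one in `get`, because the key that emptied the group immediately moves to `frequency + 1`.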
| 110 | + |
| 111 | + |
| 112 | +## Code in Different Languages |
| 113 | + |
| 114 | +<Tabs> |
| 115 | +<TabItem value="cpp" label="C++"> |
| 116 | + <SolutionAuthor name="@Shreyash3087"/> |
| 117 | + |
| 118 | +```cpp |
| 119 | +class LFUCache { |
| 120 | + // key: frequency, value: list of original key-value pairs that have the same frequency. |
| 121 | + unordered_map<int, list<pair<int, int>>> frequencies; |
| 122 | + // key: original key, value: pair of frequency and the iterator to the corresponding entry in the |
| 123 | + // frequencies map's list. |
| 124 | + unordered_map<int, pair<int, list<pair<int, int>>::iterator>> cache; |
| 125 | + int capacity; |
| 126 | + int minf; |
| 127 | + |
| 128 | + void insert(int key, int frequency, int value) { |
| 129 | + frequencies[frequency].push_back({key, value}); |
| 130 | + cache[key] = {frequency, --frequencies[frequency].end()}; |
| 131 | + } |
| 132 | + |
| 133 | +public: |
| 134 | + LFUCache(int capacity) : capacity(capacity), minf(0) {} |
| 135 | + |
| 136 | + int get(int key) { |
| 137 | + const auto it = cache.find(key); |
| 138 | + if (it == cache.end()) { |
| 139 | + return -1; |
| 140 | + } |
| 141 | + const int f = it->second.first; |
| 142 | + const auto iter = it->second.second; |
| 143 | + const pair<int, int> kv = *iter; |
| 144 | + frequencies[f].erase(iter); |
| 145 | + if (frequencies[f].empty()){ |
| 146 | + frequencies.erase(f); |
| 147 | + |
| 148 | + if(minf == f) { |
| 149 | + ++minf; |
| 150 | + } |
| 151 | + } |
| 152 | + |
| 153 | + insert(key, f + 1, kv.second); |
| 154 | + return kv.second; |
| 155 | + } |
| 156 | + |
| 157 | + void put(int key, int value) { |
| 158 | + if (capacity <= 0) { |
| 159 | + return; |
| 160 | + } |
| 161 | + const auto it = cache.find(key); |
| 162 | + if (it != cache.end()) { |
| 163 | + it->second.second->second = value; |
| 164 | + get(key); |
| 165 | + return; |
| 166 | + } |
| 167 | + if (capacity == cache.size()) { |
| 168 | + cache.erase(frequencies[minf].front().first); |
| 169 | + frequencies[minf].pop_front(); |
| 170 | + |
| 171 | + if(frequencies[minf].empty()) { |
| 172 | + frequencies.erase(minf); |
| 173 | + } |
| 174 | + } |
| 175 | + |
| 176 | + minf = 1; |
| 177 | + insert(key, 1, value); |
| 178 | + } |
| 179 | +}; |
| 180 | +``` |
| 181 | +</TabItem> |
| 182 | +<TabItem value="java" label="Java"> |
| 183 | + <SolutionAuthor name="@Shreyash3087"/> |
| 184 | + |
| 185 | +```java |
| 186 | +class LFUCache { |
| 187 | + // key: original key, value: frequency and original value. |
| 188 | + private Map<Integer, Pair<Integer, Integer>> cache; |
| 189 | + // key: frequency, value: All keys that have the same frequency. |
| 190 | + private Map<Integer, LinkedHashSet<Integer>> frequencies; |
| 191 | + private int minf; |
| 192 | + private int capacity; |
| 193 | + |
| 194 | + private void insert(int key, int frequency, int value) { |
| 195 | + cache.put(key, new Pair<>(frequency, value)); |
| 196 | + frequencies.putIfAbsent(frequency, new LinkedHashSet<>()); |
| 197 | + frequencies.get(frequency).add(key); |
| 198 | + } |
| 199 | + |
| 200 | + public LFUCache(int capacity) { |
| 201 | + cache = new HashMap<>(); |
| 202 | + frequencies = new HashMap<>(); |
| 203 | + minf = 0; |
| 204 | + this.capacity = capacity; |
| 205 | + } |
| 206 | + |
| 207 | + public int get(int key) { |
| 208 | + Pair<Integer, Integer> frequencyAndValue = cache.get(key); |
| 209 | + if (frequencyAndValue == null) { |
| 210 | + return -1; |
| 211 | + } |
| 212 | + final int frequency = frequencyAndValue.getKey(); |
| 213 | + final Set<Integer> keys = frequencies.get(frequency); |
| 214 | + keys.remove(key); |
| 215 | + if (keys.isEmpty()) { |
| 216 | + frequencies.remove(frequency); |
| 217 | + |
| 218 | + if (minf == frequency) { |
| 219 | + ++minf; |
| 220 | + } |
| 221 | + } |
| 222 | + final int value = frequencyAndValue.getValue(); |
| 223 | + insert(key, frequency + 1, value); |
| 224 | + return value; |
| 225 | + } |
| 226 | + |
| 227 | + public void put(int key, int value) { |
| 228 | + if (capacity <= 0) { |
| 229 | + return; |
| 230 | + } |
| 231 | + Pair<Integer, Integer> frequencyAndValue = cache.get(key); |
| 232 | + if (frequencyAndValue != null) { |
| 233 | + cache.put(key, new Pair<>(frequencyAndValue.getKey(), value)); |
| 234 | + get(key); |
| 235 | + return; |
| 236 | + } |
| 237 | + if (capacity == cache.size()) { |
| 238 | + final Set<Integer> keys = frequencies.get(minf); |
| 239 | + final int keyToDelete = keys.iterator().next(); |
| 240 | + cache.remove(keyToDelete); |
| 241 | + keys.remove(keyToDelete); |
| 242 | + if (keys.isEmpty()) { |
| 243 | + frequencies.remove(minf); |
| 244 | + } |
| 245 | + } |
| 246 | + minf = 1; |
| 247 | + insert(key, 1, value); |
| 248 | + } |
| 249 | +} |
| 250 | +``` |
| 251 | + |
| 252 | +</TabItem> |
| 253 | +<TabItem value="python" label="Python"> |
| 254 | + <SolutionAuthor name="@Shreyash3087"/> |
| 255 | + |
| 256 | +```python |
| 257 | +import collections |
| 258 | + |
| 259 | +class Node: |
| 260 | + def __init__(self, key, val): |
| 261 | + self.key = key |
| 262 | + self.val = val |
| 263 | + self.freq = 1 |
| 264 | + self.prev = self.next = None |
| 265 | + |
| 266 | +class DLinkedList: |
| 267 | + |
| 268 | + def __init__(self): |
| 269 | + self._sentinel = Node(None, None) # dummy node |
| 270 | + self._sentinel.next = self._sentinel.prev = self._sentinel |
| 271 | + self._size = 0 |
| 272 | + |
| 273 | + def __len__(self): |
| 274 | + return self._size |
| 275 | + |
| 276 | + def append(self, node): |
| 277 | + node.next = self._sentinel.next |
| 278 | + node.prev = self._sentinel |
| 279 | + node.next.prev = node |
| 280 | + self._sentinel.next = node |
| 281 | + self._size += 1 |
| 282 | + |
| 283 | + def pop(self, node=None): |
| 284 | + if self._size == 0: |
| 285 | + return |
| 286 | + |
| 287 | + if not node: |
| 288 | + node = self._sentinel.prev |
| 289 | + |
| 290 | + node.prev.next = node.next |
| 291 | + node.next.prev = node.prev |
| 292 | + self._size -= 1 |
| 293 | + |
| 294 | + return node |
| 295 | + |
| 296 | +class LFUCache: |
| 297 | + def __init__(self, capacity): |
| 298 | + self._size = 0 |
| 299 | + self._capacity = capacity |
| 300 | + |
| 301 | + self._node = dict() # key: Node |
| 302 | + self._freq = collections.defaultdict(DLinkedList) |
| 303 | + self._minfreq = 0 |
| 304 | + |
| 305 | + |
| 306 | + def _update(self, node): |
| 307 | + |
| 308 | + freq = node.freq |
| 309 | + |
| 310 | + self._freq[freq].pop(node) |
| 311 | + if self._minfreq == freq and not self._freq[freq]: |
| 312 | + self._minfreq += 1 |
| 313 | + |
| 314 | + node.freq += 1 |
| 315 | + freq = node.freq |
| 316 | + self._freq[freq].append(node) |
| 317 | + |
| 318 | + def get(self, key): |
| 319 | + if key not in self._node: |
| 320 | + return -1 |
| 321 | + |
| 322 | + node = self._node[key] |
| 323 | + self._update(node) |
| 324 | + return node.val |
| 325 | + |
| 326 | + def put(self, key, value): |
| 327 | + |
| 328 | + if self._capacity == 0: |
| 329 | + return |
| 330 | + |
| 331 | + if key in self._node: |
| 332 | + node = self._node[key] |
| 333 | + self._update(node) |
| 334 | + node.val = value |
| 335 | + else: |
| 336 | + if self._size == self._capacity: |
| 337 | + node = self._freq[self._minfreq].pop() |
| 338 | + del self._node[node.key] |
| 339 | + self._size -= 1 |
| 340 | + |
| 341 | + node = Node(key, value) |
| 342 | + self._node[key] = node |
| 343 | + self._freq[1].append(node) |
| 344 | + self._minfreq = 1 |
| 345 | + self._size += 1 |
| 346 | +``` |
| 347 | +</TabItem> |
| 348 | +</Tabs> |
| 349 | + |
| 350 | +## Complexity Analysis |
| 351 | + ### Time Complexity: $O(1)$ |
| 352 | + > **Reason:** Each call performs only a constant number of HashMap/(Linked)HashSet operations, each of which takes O(1) time on average. |
| 353 | + ### Space Complexity: $O(N)$ |
| 354 | + > **Reason:** We save all the key-value pairs as well as all the keys with frequencies in the 2 HashMaps (plus the LinkedHashSets), so there are at most $min(N, capacity)$ keys and values at any given time, where $N$ is the total number of operations. |
| 355 | + |
| 356 | +## References |
| 357 | + |
| 358 | +- **LeetCode Problem**: [LFU Cache](https://leetcode.com/problems/lfu-cache/description/) |
| 359 | + |
| 360 | +- **Solution Link**: [LFU Cache](https://leetcode.com/problems/lfu-cache/solutions/) |
| 361 | + |