# CacheToolsUtils Reference

This module provides the following cache wrappers, suitable for use with
`cachetools`:

- Some classes provide actual storage or an API to actual storage.
  For this purpose a cache is basically a key-value store, aka a dictionary,
  possibly with some constraints on keys (type, size) and values (size,
  serialization).

- Other classes add features on top of these, such as using a prefix so that
  a storage can be shared without collisions, or keeping usage and efficiency
  statistics.

Install with `pip install CacheToolsUtils` or any other relevant means.

## LockedCache

A cache with a lock, that can be shared between threads.
Although there is a `lock` option in the `cachetools` `cached` decorator, it
works at the function level, thus does not work properly if a cache is shared
between functions.

```python
import threading
import cachetools
import CacheToolsUtils as ctu

lcache = ctu.LockedCache(cachetools.TTLCache(...), threading.Lock())
```
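
For instance, a minimal sketch of several threads sharing one locked cache
through `cached` (the cache sizes are placeholder values):

```python
import threading
import cachetools
import CacheToolsUtils as ctu

# one lock-protected cache, safe to share between threads
lcache = ctu.LockedCache(cachetools.TTLCache(maxsize=1024, ttl=60), threading.Lock())

@cachetools.cached(cache=lcache)
def square(n: int) -> int:
    return n * n

# concurrent calls hit the same shared cache without racing
threads = [threading.Thread(target=square, args=(i % 4,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```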

## PrefixedCache

Add a key prefix to an underlying cache to avoid key collisions.

```python
ct_base = cachetools.TTLCache(maxsize=1048576, ttl=600)
foo_cache = ctu.PrefixedCache(ct_base, "foo.")
bla_cache = ctu.PrefixedCache(ct_base, "bla.")

@cachetools.cached(cache=foo_cache)
def foo(…):
    return …

@cachetools.cached(cache=bla_cache)
def bla(…):
    return …
```

## AutoPrefixedCache

Add a counter-based prefix to an underlying cache to avoid key collisions.

```python
ttl_cache = cachetools.TTLCache(maxsize=1048576, ttl=120)

@cachetools.cached(cache=ctu.AutoPrefixedCache(ttl_cache))
def foo(…):
    return …

@cachetools.cached(cache=ctu.AutoPrefixedCache(ttl_cache))
def bla(…):
    return …
```

## StatsCache

Keep usage statistics; the cache hit rate is available through `hits()`.

```python
scache = ctu.StatsCache(cache)
```
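
For instance, a minimal sketch reading the hit rate after a few calls
(the underlying cache and its sizes are placeholders):

```python
import cachetools
import CacheToolsUtils as ctu

scache = ctu.StatsCache(cachetools.TTLCache(maxsize=1024, ttl=60))

@cachetools.cached(cache=scache)
def double(n: int) -> int:
    return 2 * n

double(1)
double(1)  # second call with the same key is a cache hit
double(2)
print(scache.hits())  # hit rate over the lookups so far
```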

## TwoLevelCache

Two-level cache, for instance a local in-memory `cachetools` cache for the
first level and a larger shared `redis` or `memcached` distributed cache for
the second level.
Whether such a setup brings performance benefits is an open question.

```python
cache = ctu.TwoLevelCache(cachetools.TTLCache(…), ctu.RedisCache(…))
```

The two levels' configurations should be consistent with one another for the
setup to make sense. For instance, with two TTL-ed stores the secondary should
have a longer TTL than the primary.

There is an additional `resilient` boolean option to the constructor to
ignore errors on the second-level cache, falling back on the first-level
cache alone if the second one fails. Note that this does not mean that
the system would recover if the second level comes back online later, because
there is no provision to manage reconnections and the like at this level.
The second level may manage that on its own, though.
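
For instance, a sketch combining a short-lived local first level with a
longer-lived Redis second level, ignoring second-level failures (connection
and size parameters are placeholders):

```python
import cachetools
import redis
import CacheToolsUtils as ctu

local = cachetools.TTLCache(maxsize=1024, ttl=60)                # fast, short TTL
shared = ctu.RedisCache(redis.Redis(host="localhost"), ttl=600)  # shared, longer TTL

# with resilient=True, errors on the Redis level are ignored and the
# local level keeps serving on its own
cache = ctu.TwoLevelCache(local, shared, resilient=True)
```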

## EncryptedCache

A wrapper to add a hash and encryption layer on bytes key-values.
The design is write-only, i.e. the cache contents _cannot_ be extracted
with the _secret_ alone:

- keys are _hashed_ to fixed-size keys, thus cannot be recovered simply;
- values are encrypted depending on the actual key value, thus cannot be
  recovered without the key.

Hashing is based on _SHA3_; encryption uses _Salsa20_, _AES-128-CBC_ or _ChaCha20_.
Note that the value length is more or less leaked.

```python
cache = ctu.EncryptedCache(actual_cache, secret=b"super secret stuff you cannot guess",
                           hsize=16, csize=0)
```

Hash size `hsize` can be extended up to _32_ bytes; the key collision
probability is about $2^{-4\,hsize}$.
An optional value checksum can be triggered by setting `csize`.

The point of this class is to bring security to cached data on distributed
systems such as Redis. There is not much point in encrypting in-memory caches.
All of this is very nice, but it costs cycles, thus money: do you really want
to pay for them?

When used with `cached`, the key is expected to be simple bytes for encryption.
Consider `ToBytesCache` to trigger byte conversions.
The output is also bytes, which may or may not suit the underlying cache; consider
`BytesCache` if necessary, or use the `raw` option on `RedisCache`.

```python
actual = redis.Redis(…)
red = ctu.RedisCache(actual, raw=True)
enc = ctu.EncryptedCache(red, b"…")
cache = ctu.ToBytesCache(enc)

@ctu.cached(cache=ctu.PrefixedCache(cache, "foo."))
def foo(what, ever):
    return …
```

## ToBytesCache and BytesCache

`ToBytesCache` maps keys and values to bytes;
`BytesCache` handles bytes keys and values and maps them to strings.
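
For instance, a minimal sketch of `ToBytesCache` over an in-memory cache,
assuming the conversion happens on the way to the underlying store:

```python
import cachetools
import CacheToolsUtils as ctu

under = cachetools.LRUCache(maxsize=1024)
bcache = ctu.ToBytesCache(under)

bcache["hello"] = "world"  # key and value converted to bytes underneath
print(list(under.keys()))  # the underlying cache only ever sees bytes
```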

## MemCached

Basic wrapper, possibly with JSON key encoding thanks to the `JsonSerde` class.
Also adds a `hits()` method to compute the cache hit ratio with data taken from
the memcached server.

```python
import pymemcache as pmc

mc_base = pmc.Client(server="localhost", serde=ctu.JsonSerde())
cache = ctu.MemCached(mc_base)

@cachetools.cached(cache=cache)
def poc(…):
    return …
```

Keep in mind MemCached limitations: keys are limited to 250-byte strings in
which some characters cannot be used, e.g. spaces, which suggests some encoding
such as base64, further reducing the effective key size; values are limited to
1 MiB by default.
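
To illustrate the encoding overhead (this is not necessarily the class's
internal scheme), base64 inflates keys by a third:

```python
import base64

key = "user data:42"                          # contains a space, illegal as-is
safe = base64.b64encode(key.encode("utf-8"))  # b'dXNlciBkYXRhOjQy'
# 4/3 inflation: the 250-byte limit leaves roughly 187 bytes of original key
print(len(key), len(safe))                    # 12 16
```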

## PrefixedMemCached

Wrapper with a prefix.
A specific class is needed because of necessary key encoding.

```python
pcache = ctu.PrefixedMemCached(mc_base, prefix="pic.")
```

## RedisCache

TTL'ed Redis wrapper; the default TTL is 10 minutes, with JSON key/value
serialization.
Also adds a `hits()` method to compute the cache hit ratio with data taken
from the Redis server.

```python
import redis

rd_base = redis.Redis(host="localhost")
cache = ctu.RedisCache(rd_base, ttl=60)
```

Redis stores arbitrary bytes. Keys and values can be up to 512 MiB.
Keeping keys under 1 KiB seems reasonable.
The `raw` option allows skipping the serialization step, if you know that
keys and values are scalars.
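
For instance, a sketch with `raw` enabled, assuming scalar keys and values
(connection parameters are placeholders):

```python
import redis
import CacheToolsUtils as ctu

# skip JSON serialization: keys and values are passed to Redis as-is
rcache = ctu.RedisCache(redis.Redis(host="localhost"), ttl=60, raw=True)
rcache["greeting"] = "hello"
print(rcache["greeting"])  # value as returned by the Redis client
```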

## PrefixedRedisCache

Wrapper with a prefix *and* a TTL.
A specific class is needed because of key encoding and value
serialization and deserialization.

```python
pcache = ctu.PrefixedRedisCache(rd_base, "pac.", ttl=3600)
```

## Functions cacheMethods and cacheFunctions

These utility functions create a prefixed cache around methods of an object
or functions in the global scope.
The first parameter is the actual cache, the second parameter is the object
or scope, the `opts` named parameter passes additional options to `cached`,
and the remaining keyword arguments map function names to prefixes.

```python
# add cache to obj.get_data and obj.get_some
ctu.cacheMethods(cache, obj, get_data="1.", get_some="2.")

# add cache to some_func
ctu.cacheFunctions(cache, globals(), opts={"key": ctu.json_key}, some_func="f.")
```
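
As a fuller sketch, a hypothetical object whose getter methods are wrapped
(the class and cache below are placeholders):

```python
import cachetools
import CacheToolsUtils as ctu

class Store:
    def get_data(self, key: str) -> str:
        return f"data for {key}"  # stand-in for an expensive lookup
    def get_some(self, n: int) -> int:
        return n * n

cache = cachetools.TTLCache(maxsize=1024, ttl=60)
obj = Store()

# wrap both methods around prefixed views of the same cache
ctu.cacheMethods(cache, obj, get_data="d.", get_some="s.")
obj.get_data("x")  # computed, then cached under prefix "d."
obj.get_data("x")  # served from the cache
```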

## Decorator cached

This is an extension of the `cachetools` `cached` decorator, with two additions:

- `cache_in` tests whether these function parameters are in the cache
- `cache_del` removes the corresponding cache entry

```python
import CacheToolsUtils as ctu

cache = ...

@ctu.cached(cache=cache)
def acme(what: str, count: int) -> str:
    return ...

print(acme("hello", 3))
assert acme.cache_in("hello", 3)
print(acme("hello", 3))
acme.cache_del("hello", 3)
assert not acme.cache_in("hello", 3)
```