APCng storage - removed APCUIterator usage while collecting metrics, removed metrics collecting lag (#96)
* APCng storage - removed APCUIterator usage while collecting metrics, removed metrics collecting lag
Signed-off-by: Rastusik <[email protected]>
* fixed tests for APCng storage changes with no need for TTL for meta info generation
Signed-off-by: Rastusik <[email protected]>
* allowed phpstan extension installer plugin in composer, so CI pipeline can work again
Signed-off-by: Rastusik <[email protected]>
* APCng - some little optimisations for persistent memory workloads (e.g. swoole), better concurrent data updates, fixed tests so they work with floats correctly
Signed-off-by: Rastusik <[email protected]>
* small fixes for phpstan runner to pass
Signed-off-by: Rastusik <[email protected]>
* APCng store fix - all data have a fixed TTL of 0, so no global APCu setting can make the metrics expire
Signed-off-by: Rastusik <[email protected]>
Signed-off-by: Rastusik <[email protected]>
README.APCng.md
-2
@@ -62,5 +62,3 @@ Without going into excruciating detail (you can read the source for that!), the
The approach `APCng` takes is to keep a "metadata cache" which stores an array of all the metadata keys, so instead of doing a scan of APCu looking for all matching keys, we just need to retrieve one key, deserialize it (which turns out to be slow), and retrieve all the metadata keys listed in the array. Once we've done that, there is some fancy handwaving which is used to deterministically generate possible sub-keys for each metadata item, based on LabelNames, etc. Not all of these keys exist, but it's quicker to attempt to fetch them and fail, than it is to run another APCUIterator looking for a specific pattern.
Summaries, as mentioned before, have a third nested APCUIterator in them, looking for all readings w/o expired TTLs that match a pattern. Again, slow. Instead, we store a "map", similar to the metadata cache, but this one is temporally-keyed: one key per second, which lists how many samples were collected in that second. Once this is done, an expensive APCUIterator match is no longer needed, as all possible keys can be deterministically generated and checked, by retrieving each key for the past 600 seconds (if it exists), extracting the sample-count from the key, and then generating all the APCu keys which would refer to each observed sample.
-
-There is the concept of a metadata cache TTL (default: 1 second) which offers a trade-off of performance vs responsiveness. If a collect() call is made and then a new metric is subsequently tracked, the new metric won't show up in subsequent collect() calls until the metadata cache TTL is expired. By keeping this TTL short, we avoid hammering APCu too heavily (remember, deserializing that metainfo cache array is nearly as slow as calling APCUIterator -- it just doesn't slow down as you add more keys to APCu). However we want to cap how long a new metric remains "hidden" from the Prometheus scraper. For best performance, adjust the TTL as high as you can based on your specific use-case. For instance if you're scraping every 10 seconds, then a reasonable TTL could be anywhere from 5 to 10 seconds, meaning a 50 to 100% chance that the metric won't appear in the next full scrape, but it will be there for the following one. Note that the data is tracked just fine during this period - it's just not visible yet, but it will be! You can set the TTL to zero to disable the cache. This will return to `APC` engine behavior, with no delay between creating a metric and being able to collect() it. However, performance will suffer, as the metainfo cache array will need to be deserialized from APCu each time collect() is called -- which might be okay if collect() is called infrequently and you simply must have zero delay in reporting newly-created metrics.
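The temporally-keyed map that the README context above describes for Summaries can be illustrated with a short sketch. This is not the library's PHP code: it is a language-neutral illustration in Python, with a plain dict standing in for APCu. The key layout (`<metric>:bucket:<second>` and `<metric>:sample:<second>:<index>`) and the function names `record_sample` and `collect_samples` are hypothetical; only the 600-second lookback comes from the README text.

```python
WINDOW_SECONDS = 600  # lookback window mentioned in the README text


def record_sample(store, metric, value, now):
    """Store one observation and bump the per-second sample counter.

    `store` is a plain dict standing in for APCu (an assumption of this sketch).
    """
    second = int(now)
    bucket_key = f"{metric}:bucket:{second}"  # hypothetical key layout
    index = store.get(bucket_key, 0)
    store[f"{metric}:sample:{second}:{index}"] = value
    store[bucket_key] = index + 1


def collect_samples(store, metric, now, window=WINDOW_SECONDS):
    """Gather recent samples by generating every possible key, with no key scan."""
    samples = []
    current = int(now)
    for second in range(current - window + 1, current + 1):
        count = store.get(f"{metric}:bucket:{second}")
        if count is None:
            continue  # nothing was observed in this second
        for index in range(count):
            value = store.get(f"{metric}:sample:{second}:{index}")
            if value is not None:  # tolerate samples that have disappeared
                samples.append(value)
    return samples
```

The point of the design is that `collect_samples` performs at most one lookup per second in the window plus one per recorded sample, so its cost does not grow with the number of unrelated keys in the store, which is the problem with pattern matching over all keys.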