Cache journals#508
Conversation
Introduce rundetection/cache.py to manage a shared Valkey client and provide JSON get/set helpers with disable-on-error behavior. Refactor EnginX rules to read journal files from a configurable directory, coerce cached JSON back to int keys, and use the Valkey cache (with configurable TTL) to avoid rebuilding the run-number->cycle map on every call. Add unit tests for the cache helpers and EnginX caching behavior and include sample journal XML fixtures. Also expose ARCHIVE_ROOT from env in run_detection.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #508 +/- ##
==========================================
+ Coverage 93.83% 94.89% +1.06%
==========================================
Files 18 19 +1
Lines 1152 1333 +181
==========================================
+ Hits 1081 1265 +184
+ Misses 71 68 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR introduces a shared Valkey-backed caching utility for run-detection and applies it to EnginX journal parsing so the derived run-number → cycle mapping can be reused across runs/processes with a TTL, falling back to journal XML parsing when Valkey is unavailable.
Changes:
- Add
rundetection.cacheValkey helper with lazy client creation, JSON get/set, TTL writes, and “disable-on-error” fallback behavior. - Update EnginX journal mapping build to read/write the mapping via Valkey (configurable TTL + journal directory), with tests validating cached vs. uncached paths.
- Extend EnginX test journal fixtures and add new unit tests for the cache module.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
rundetection/cache.py |
New Valkey cache helper for JSON get/set and lazy client creation with fallback behavior. |
rundetection/rules/enginx_rules.py |
Adds Valkey-based caching for EnginX run-number → cycle mapping; makes journal dir/TTL configurable. |
rundetection/run_detection.py |
Adds ARCHIVE_ROOT env-derived constant (currently unused). |
test/test_cache.py |
New unit tests for Valkey cache helpers (client creation, disable-on-error, JSON get/set). |
test/rules/test_enginx_rules.py |
Adds tests covering cached mapping, cache miss parsing + cache population, and TTL-disabled behavior. |
test/test_data/e2e_data/NDXENGINX/Instrument/logs/journal/journal_21_1.xml |
Additional journal fixture for mapping parsing tests. |
test/test_data/e2e_data/NDXENGINX/Instrument/logs/journal/journal_22_1.xml |
Additional journal fixture for mapping parsing tests. |
pyproject.toml |
Adds valkey dependency (and reorders/updates dependency list). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Close and suppress errors on Valkey client shutdown, include cache keys in error logs, and avoid repeated connection attempts by marking the client disabled. Add in-process memoization for EnginX run->cycle mappings (_ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE) with helper functions to read, cache and clear the mapping; avoid caching empty mappings and emit warnings when journal paths are missing, not directories, or contain no XML. Update build_enginx_run_number_cycle_map to prefer Valkey but fall back to the in-process cache and only persist non-empty mappings. Add and adjust unit tests to cover cache disable/close behavior, logging, and EnginX journal memoization; also include minor docstring and formatting cleanups across modules.
Split and clean up EnginX journal parsing: introduce _ENGINX_RUN_NAME_PREFIX constant and helper functions (_add_enginx_run_to_cycle_map, _walk_enginx_journal_node, _enginx_journal_files) to centralize prefix handling, recursive walking, and file discovery. _enginx_journal_files now returns None on missing/invalid directories or no XML files; _read_enginx_run_number_cycle_map uses these helpers and parses journal XMLs into the run->cycle mapping. Tests updated to use a REPEATED_BUILD_CALL_COUNT constant for repeated-call assertions and to match the refactored behavior.
Introduce a _Closable Protocol and import typing.cast to ensure the Valkey client is treated as closable before calling close(). Adjust imports to include Protocol and cast, and tidy a docstring formatting. This change satisfies the type checker while preserving existing lazy client behavior and error handling.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (4)
rundetection/run_detection.py:70
- The
producerdocstring has the:return:field inlined into the summary text, which makes the docstring hard to scan and inconsistent with other docstrings in the repo. Please format the docstring so the summary and:return:(and any params) are on separate lines.
"""
Return a context managed pika producer channel :return:
BlockingChannel.
"""
rundetection/run_detection.py:190
- The
process_notificationsdocstring now inlines:param/:return:into the summary (wrapped across lines). Please revert to the standard docstring layout with summary first, then a blank line, then:param ...:and:return:entries on their own lines for clarity and consistency.
"""
Produce messages until the notification queue is empty :param
notification_queue: The notification queue :return: None.
"""
rundetection/run_detection.py:203
- The
notify_failuresdocstring now inlines:param/:return:into the summary text. Please restore the previous multi-line formatting (summary line(s) first, then parameter/return fields on separate lines) to keep docstrings readable and consistent.
"""
Produce Failure messages until the failure queue is empty :param
failure_queue: The failure Queue :return: None.
"""
rundetection/run_detection.py:259
- The
maindocstring has been collapsed into a single line that mixes the summary and:return:field. Please format it as a standard multi-line docstring (summary on its own line, then:return:on a separate line) to match the project’s usual style.
def main() -> None:
"""Entry point for run detection :return: None."""
|
@Dagonite Why have they been squashed onto less lines? |
Pasarus
left a comment
There was a problem hiding this comment.
I'm going to request a few changes here, there is a lot of overcomplication of naming. A lot of concepts and implementations that are overly complicated and with poor readability, are just not necessary or actively the opposite of what we would want from this software.
I am going to re-review the entire thing when the suggestions and comments have been responded to and changes made.
|
|
||
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_KEY = "run_detection:enginx:run_number_cycle_map" | ||
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_TTL_SECONDS = 60 * 60 | ||
| _ENGINX_RUN_NAME_PREFIX = "ENGINX" |
There was a problem hiding this comment.
| _ENGINX_RUN_NAME_PREFIX = "ENGINX" | |
| ENGINX_PREFIX = "ENGINX" |
| expires_at: float | ||
|
|
||
|
|
||
| _ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE: dict[Path, _EnginxRunNumberCycleMapCacheEntry] = {} |
There was a problem hiding this comment.
| _ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE: dict[Path, _EnginxRunNumberCycleMapCacheEntry] = {} | |
| ENGINX_CACHE: dict[Path, EnginxCacheEntry] = {} |
|
|
||
|
|
||
| @dataclass(slots=True) | ||
| class _EnginxRunNumberCycleMapCacheEntry: |
There was a problem hiding this comment.
| class _EnginxRunNumberCycleMapCacheEntry: | |
| class EnginxCacheEntry: |
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_KEY = "run_detection:enginx:run_number_cycle_map" |
There was a problem hiding this comment.
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_KEY = "run_detection:enginx:run_number_cycle_map" | |
| ENGINX_CACHE_KEY = "run_detection:enginx:run_number_cycle_map" |
| logger = logging.getLogger(__name__) | ||
|
|
||
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_KEY = "run_detection:enginx:run_number_cycle_map" | ||
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_TTL_SECONDS = 60 * 60 |
There was a problem hiding this comment.
| ENGINX_RUN_NUMBER_CYCLE_MAP_CACHE_TTL_SECONDS = 60 * 60 | |
| ENGINX_CACHE_TTL_SECONDS = 60 * 60 |
| return state.client | ||
|
|
||
|
|
||
| def _disable_cache(exc: Exception) -> None: |
There was a problem hiding this comment.
Can you explain what the point of this function is? Are we expecting a world in which this operates without a valkey cache?If so would it not make more sense to have a very simple environment variable with on or off then everytime we go to cache check this global variable and do it if we have it else don't? or check if the valkey env vars are set at all and if not fail.
This "silent" failing is actually the opposite of the behavior we would want, as it bypasses all our observability and alerting structure.
| disabled: bool = False | ||
|
|
||
|
|
||
| class _Closable(Protocol): |
There was a problem hiding this comment.
Can you explain why we are using Protocol here? It seems excessive for our needs, and adds a concept we haven't used elsewhere within the project.
Introduce VALKEY_CACHE_ENABLED env var to allow opting out of the Valkey cache and treat several literal values as disabled. Refactor Valkey client lifecycle: add _valkey_cache_enabled check, rename _disable_cache to _disable_valkey_cache, simplify client teardown, and switch some exception logging from exception() to warning() while preserving graceful degradation on Valkey errors. Improve JSON serialization/logging paths in cache_get_json/cache_set_json. Refactor EnginX journal caching: add in-process EnginxCacheEntry, break out _get_cached/_cache functions, handle malformed ENGINX entry names robustly, prefer local in-process cache before Valkey, and adjust related tests to assert new behaviors and logging.
It's because my editor's global formatter is modifying the docstrings. Is there a repo level formatter config? |
|
Closing and moving to #516. |


Closes #509.
Description
Adds a shared Valkey cache helper to run-detection. The helper handles lazy client creation,
VALKEY_URLconfiguration, JSON serialization, TTL-based writes, and graceful fallback when Valkey is unavailable.The EnginX journal lookup now uses this cache to store the derived run-number-to-cycle map, keyed as
run_detection:enginx:run_number_cycle_map. On startup or lookup, run-detection first tries to read the cached map; if it is missing or Valkey is unavailable, it parses the journal XML files and writes the derived map back to Valkey with a TTL.Local testing
Start Valkey:
Run from repo root:
Expected output: