fix(masking): mask at handler level instead of repointing streams #17692
Draft
kyungsoo-datahub wants to merge 6 commits into
Repointing handler streams at the wrapped sys.stderr deadlocked logging under celery, whose stderr proxy re-enters the logging system, silently dropping all output. A filter on the root logger also never saw child logger records, so they went out unmasked. Attach the masking filter to existing handlers (masking records in place without touching streams), drop the stream-repointing, and remove the filter from handlers symmetrically on teardown.
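The difference between a logger-level and a handler-level filter can be reproduced with the standard library alone. The sketch below is illustrative only: `MaskingFilter`, `SECRET`, and `emit_from_child` are hypothetical names, not the PR's code, and the "masking" is a plain string replace.

```python
import io
import logging

SECRET = "hunter2"  # hypothetical secret value, for the demo only


class MaskingFilter(logging.Filter):
    """Hypothetical filter that redacts a known secret in place."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message once, mask it, and clear args so the handler
        # formats the already-masked text.
        record.msg = record.getMessage().replace(SECRET, "***REDACTED***")
        record.args = None
        return True  # never drop the record, only rewrite it


def emit_from_child(attach_to_handler: bool) -> str:
    """Log a secret from a child logger and return what the handler wrote."""
    buf = io.StringIO()
    handler = logging.StreamHandler(buf)
    root = logging.getLogger()
    root.addHandler(handler)
    # Attach the filter either to the handler or to the root logger itself.
    target = handler if attach_to_handler else root
    mask = MaskingFilter()
    target.addFilter(mask)
    try:
        logging.getLogger("demo.child").warning("password=%s", SECRET)
    finally:
        # Remove exactly what was added, so the demo leaves no global state.
        target.removeFilter(mask)
        root.removeHandler(handler)
    return buf.getvalue()
```

With `attach_to_handler=False` the secret reaches the stream, because a logger's filters only run for records logged directly on that logger, not for records propagated up from its children. With `attach_to_handler=True` the handler's filter runs for every record the handler receives, and the stream is never touched.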
…eration Repeat install now re-attaches the filter to handlers added since the first install (masking is fail-open, so a missed handler leaks). Iterate loggerDict and handler lists defensively for multi-threaded workers, and document that masking covers handlers present at install time. Add tests for repeat-install coverage and teardown symmetry.
…eakage Masking is process-global (a filter on every handler + a singleton registry + a bootstrap flag). Tests that install it (ingest CLI, initialize_secret_masking, SecretStr config validation) don't all tear it down, so a later test's captured logs got masked (e.g. 'test_view.' -> '***REDACTED***'). Autouse fixture calls shutdown_secret_masking + resets the registry after each test.
Problem
In datahub-executor (celery worker mode), enabling secret masking silenced all logs from the worker process. After an ingestion with secrets, celery/root logs stopped appearing — the ingestion subprocess kept logging, which was misleading.
Root causes
install_masking_filter() repointed the stdout/stderr handler streams at the wrapped sys.stderr (via _update_existing_handlers). Under celery, sys.stderr is a proxy that re-enters logging, so the handler looped until a RecursionError that the wrapper silently swallowed; every line was dropped. Teardown restored sys.stderr but not the handlers, so it stayed broken.
Fix
_update_existing_handlers(): no stream repointing, so no celery recursion.
sys.stdout/sys.stderr wrapper only for raw print() output.
datahub.masking.* loggers (they bypass masking by design).
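For the raw print() path, a stream wrapper along these lines would be enough. This is a hedged sketch, not the PR's wrapper: `MaskingStream` and `print_through_wrapper` are hypothetical, and secrets are matched by plain substring replacement per write call.

```python
import io
from contextlib import redirect_stdout
from typing import Iterable, TextIO


class MaskingStream:
    """Hypothetical wrapper that redacts known secrets in written text."""

    def __init__(self, wrapped: TextIO, secrets: Iterable[str]) -> None:
        self._wrapped = wrapped
        self._secrets = list(secrets)

    def write(self, text: str) -> int:
        # Mask per write call; a secret split across two writes is missed.
        for secret in self._secrets:
            text = text.replace(secret, "***REDACTED***")
        return self._wrapped.write(text)

    def flush(self) -> None:
        self._wrapped.flush()


def print_through_wrapper(message: str, secret: str) -> str:
    """Print a message with stdout wrapped and return what was written."""
    buf = io.StringIO()
    with redirect_stdout(MaskingStream(buf, [secret])):
        print(message)  # raw print(), no logging involved
    return buf.getvalue()
```

Logging handlers keep their original streams in this arrangement, so the wrapper only ever sees direct writes such as print(), and a proxy that re-enters logging is never placed underneath a handler.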