fix: prevent NaN from division by zero in TFT interpretation + fix headless test backend #2207
Open
haoyu-haoyu wants to merge 1 commit into sktime:main
Two fixes:
1. TFT interpretation functions divide attention/variable-importance
tensors by their sum/max without guarding against zero denominators.
When all attention values are below 1e-5 (set to NaN at line 799),
the downstream sums become zero, producing NaN that propagates
through all interpretation statistics.
Add .clamp(min=1e-8) to 5 denominator expressions:
- line 852: masked attention renormalization
- line 942: attention weights in plot_interpretation
- line 959: variable importance in make_selection_plot
- line 1009: attention_occurrences normalization
- line 1025: final attention sum normalization
2. Tests crash on Windows/headless CI with TclError because matplotlib
defaults to TkAgg backend. Add matplotlib.use("Agg") to conftest.py
so all test-generated plots use the non-interactive backend.
Fixes sktime#2140
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
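
The denominator guard described in fix 1 can be sketched in isolation. This is a minimal illustration, not the code from the patched file: the `normalize` helper and the example tensor are hypothetical, and only `.clamp(min=1e-8)` is taken from the commit message.

```python
import torch


def normalize(weights: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Divide each row by its sum, keeping the denominator at or above eps."""
    # Hypothetical helper: mirrors the `.clamp(min=1e-8)` guard named in the commit.
    denom = weights.sum(-1, keepdim=True).clamp(min=eps)
    return weights / denom


# Second row stands in for a sample whose attention weights all vanished.
attention = torch.tensor([[0.2, 0.6, 0.2], [0.0, 0.0, 0.0]])

# Without the guard the all-zero row becomes 0/0 = NaN.
unguarded = attention / attention.sum(-1, keepdim=True)

# With the guard the all-zero row becomes 0 / 1e-8 = 0.
guarded = normalize(attention)

print(torch.isnan(unguarded).any().item())
print(torch.isnan(guarded).any().item())
```

Rows with a non-zero sum are unaffected as long as that sum is larger than `eps`, so the guard only changes the degenerate case.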
7fd5d0e to 1923915
Summary
Two independent fixes bundled together:
1. TFT Division by Zero → NaN Propagation
The TFT interpretation methods divide attention/variable-importance tensors by their sum or max. When attention values fall below 1e-5 (deliberately set to NaN at line 799), downstream sums become zero, causing 0/0 = NaN that silently corrupts all interpretation statistics.

5 locations fixed by adding .clamp(min=1e-8) to the denominator:
- masked_op(attention, ..., op="sum")
- attention.sum(-1) (plot_interpretation)
- values.sum(-1) (make_selection_plot)
- attention_occurrences.max()
- interpretation["attention"].sum()

2. Matplotlib Backend for Headless Testing (fixes #2140)
Tests calling model.plot_prediction() crash on Windows/headless CI because plt.subplots() defaults to TkAgg backend, which requires Tkinter.

Fix: Add matplotlib.use("Agg") at the top of tests/conftest.py.

Test plan
- pytest tests/ -x -q passes on Linux and Windows
- model.plot_prediction() works in headless environments

Fixes #2140
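
The backend change in fix 2 might look roughly like the following. This is a sketch under assumptions: the actual contents of tests/conftest.py are not shown in this PR text, and the `test_plot_runs_headless` smoke test is hypothetical, included only to show that plotting works without a display.

```python
# Sketch of the top of tests/conftest.py (actual file contents are an assumption).
import matplotlib

# Select the non-interactive Agg backend before pyplot is imported,
# so figure creation never needs Tkinter or a display.
matplotlib.use("Agg")

import matplotlib.pyplot as plt  # noqa: E402


def test_plot_runs_headless():
    """Hypothetical smoke test: create, render, and close a figure."""
    fig, ax = plt.subplots()
    ax.plot([0, 1], [0, 1])
    fig.canvas.draw()  # forces a full render with the Agg canvas
    plt.close(fig)
```

Placing the call in conftest.py means pytest runs it once at collection time, before any test module imports pyplot.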