Releases: biaslab/Publication_Spectral_Subtraction
A Probabilistic Generative Model for Spectral Speech Enhancement Update
v1.1.1 — IEEE OJ-SP reviewer revision
Implements the changes requested by reviewers during the IEEE Open Journal of Signal Processing review cycle for
"A Probabilistic Generative Model for Spectral Speech Enhancement" (Hidalgo-Araya et al.).
Highlights
- Composite metrics. Hu & Loizou (2008) `CSIG`/`CBAK`/`COVL` are now computed alongside DNSMOS P.835 on every evaluation. Enable with `--composite`. DNSMOS columns are renamed `SIG`→`DSIG`, `BAK`→`DBAK`, `OVRL`→`DOVRL` so the two metric families stay visually parallel in every CSV and table.
- WFB ablation. New `configurations/SEMHearingAid_uFB/` isolates the warped filter bank's contribution by setting `apcoefficient = 0.0`. All paper tables now include the ablation row.
- One-command reproduction for reviewers: `julia --project=. scripts/run_paper_results.jl` runs the full pipeline (resample → WFB preprocess → evaluate all four configurations with `--composite --checkpoint-interval 50` → regenerate benchmark tables). Idempotent and resume-safe.
New scripts
| Script | Output |
|---|---|
| `scripts/run_paper_results.jl` | Full end-to-end orchestrator |
| `scripts/generate_latex_tables.jl` | `tables/tab_comparison_with_params.tex`, `tab_wfb_ablation.tex`, `tab_metrics_quadrants.tex`, `tab_per_env_delta.tex` |
| `scripts/plot_parameter_evolution.jl` | `figures/parameter_evolution_bus_7p5dB_band13.{png,pdf,jpeg}` (posteriors for s, n, ξ, w̃ with ±1σ ribbons) |
| `scripts/generate_results_md.jl` | `RESULTS.md`: browser-readable summary with figure + Tables 3–6 |
Infrastructure
- `microsoft/DNS-Challenge` is now a pinned shallow submodule at `python_modules/DNSMOS/` (`git clone --recursive` is all reviewers need).
- `install_python_deps.py` additionally installs `pysepm` (composite metric backend).
- `Metrics.jl` prepends `python_modules/` to `sys.path` at init so `composite_wrapper` and `dnsmos_wrapper` import on a fresh clone regardless of pip-install state.
- `run_evaluation.jl` truncates reference/processed signal pairs to the shorter length, avoiding PESQ/CSIG/CBAK/COVL crashes on the 1–16 samples SEM synthesis occasionally emits past the WFB boundary.
- `.gitignore` catches `databases/**/clean_testset_wav` and `noisy_testset_wav` symlinks plus Python `__pycache__` directories.
- README reorganised with a top-level "Reviewers: reproduce every number in the paper with one command" callout and a Hu & Loizou 2008 citation.
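The length-alignment step is simple enough to illustrate. The project does this in Julia inside `run_evaluation.jl`; the snippet below is only a Python sketch of the same idea, and the function name `truncate_to_shorter` and the sample counts are made up for the example.

```python
def truncate_to_shorter(reference, processed):
    """Trim both signals to the length of the shorter one.

    Intrusive metrics compare the two signals sample by sample, so a few
    trailing extra samples on one side are dropped before scoring.
    """
    n = min(len(reference), len(processed))
    return reference[:n], processed[:n]


# Hypothetical signals: the processed one carries 7 extra trailing samples.
reference = [0.0] * 16000
processed = [0.0] * 16007

ref, proc = truncate_to_shorter(reference, processed)
print(len(ref), len(proc))
```

Slicing works the same way on lists and NumPy arrays, so the helper can be dropped in front of any metric call that expects equal-length inputs.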
Reproduction parity
The revised pipeline reproduces the historical lock runs (run_03_12_2025_*) to 4 decimal places on PESQ
and within ~0.03 on DNSMOS (ONNX-runtime jitter across CPU permutations; no modelling change). Runs
without --composite produce CSVs that are column-compatible with the prior release once the DNSMOS
rename is applied.
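For readers who want to check parity on their own machine, a comparison along these lines may help. It is a minimal sketch, not the project's tooling: it assumes per-file scores are already loaded into dictionaries, reads "4 decimal places" as an absolute difference below 5e-5, and uses invented values throughout.

```python
PESQ_TOL = 5e-5      # assumed reading of "matches to 4 decimal places"
DNSMOS_TOL = 0.03    # tolerance quoted in the notes above
DNSMOS_KEYS = ("DSIG", "DBAK", "DOVRL")


def within_parity(new, old):
    """Return True if a new run matches a lock run within the stated tolerances."""
    if abs(new["PESQ"] - old["PESQ"]) > PESQ_TOL:
        return False
    return all(abs(new[key] - old[key]) <= DNSMOS_TOL for key in DNSMOS_KEYS)


# Invented scores, for illustration only.
lock_run = {"PESQ": 2.5000, "DSIG": 3.10, "DBAK": 3.80, "DOVRL": 2.90}
rerun = {"PESQ": 2.50001, "DSIG": 3.12, "DBAK": 3.79, "DOVRL": 2.91}
drifted = {"PESQ": 2.50001, "DSIG": 3.12, "DBAK": 3.79, "DOVRL": 3.00}

print(within_parity(rerun, lock_run))
print(within_parity(drifted, lock_run))
```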
Breaking change
DNSMOS column headers in per-file and summary CSVs changed from SIG / BAK / OVRL to DSIG / DBAK / DOVRL. If you have downstream tooling that reads pre-v1.1.1 CSVs by column name, either rename at the
reader or re-run the evaluations (the numerical values are unchanged).
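Renaming at the reader can be a few lines. The sketch below uses Python's standard `csv` module and the header mapping stated above; the helper name `read_rows_with_new_headers` and the sample columns `file` and `PESQ` are assumptions for illustration, since the real CSVs may carry different or additional columns.

```python
import csv
import io

# Mapping from the release notes: pre-v1.1.1 DNSMOS header -> new header.
DNSMOS_RENAME = {"SIG": "DSIG", "BAK": "DBAK", "OVRL": "DOVRL"}


def read_rows_with_new_headers(handle):
    """Yield CSV rows as dicts, renaming old DNSMOS columns on the fly.

    Files that already use the new headers pass through unchanged.
    """
    for row in csv.DictReader(handle):
        yield {DNSMOS_RENAME.get(name, name): value for name, value in row.items()}


# Hypothetical pre-v1.1.1 per-file CSV with invented values.
legacy_csv = io.StringIO("file,PESQ,SIG,BAK,OVRL\nexample.wav,2.5,3.1,3.8,2.9\n")
rows = list(read_rows_with_new_headers(legacy_csv))
print(rows[0])
```

Because only the headers are remapped, the values are passed through untouched, which matches the note that the numbers themselves did not change.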
Diff summary
- 12 commits, all prefixed "reviewers' feedback:"
- 11 scripts / files added, 0 removed
- Branch: `reviewers-feedback` → `main`
A Probabilistic Generative Model for Spectral Speech Enhancement Update
Re-release to obtain a DOI for Zenodo.
A Probabilistic Generative Model for Spectral Speech Enhancement
Release version of the publication "A Probabilistic Generative Model for Spectral Speech Enhancement". Update release