Releases: biaslab/Publication_Spectral_Subtraction

A Probabilistic Generative Model for Spectral Speech Enhancement Update

23 Apr 21:26

v1.1.1 — IEEE OJ-SP reviewer revision

Implements the changes requested by reviewers during the IEEE Open Journal of Signal Processing
review cycle for "A Probabilistic Generative Model for Spectral Speech Enhancement"
(Hidalgo-Araya et al.).

Highlights

  • Composite metrics. Hu & Loizou (2008) CSIG / CBAK / COVL are now computed alongside DNSMOS
    P.835 on every evaluation. Enable with --composite. DNSMOS columns are renamed SIG → DSIG,
    BAK → DBAK, OVRL → DOVRL so the two metric families stay visually parallel in every CSV and table.
  • WFB ablation. New configurations/SEMHearingAid_uFB/ isolates the
    warped filter bank's contribution by setting apcoefficient = 0.0. All paper tables now include the
    ablation row.
  • One-command reproduction for reviewers: julia --project=. scripts/run_paper_results.jl runs the
    full pipeline (resample → WFB preprocess → evaluate all four configurations with --composite --checkpoint-interval 50 → regenerate benchmark tables). Idempotent and resume-safe.
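
For readers unfamiliar with the composite measures, the sketch below shows how CSIG, CBAK and COVL are formed as linear combinations of PESQ, LLR, WSS and segmental SNR, using the regression coefficients published by Hu & Loizou (2008). It is an illustration only: the repository delegates this computation to pysepm, whose implementation may differ in detail, and the function names and input values here are hypothetical.

```python
def clip_mos(value: float) -> float:
    """Limit a score to the 1-5 MOS range used by the composite measures."""
    return min(5.0, max(1.0, value))


def composite_scores(pesq: float, llr: float, wss: float, seg_snr: float) -> dict:
    """Combine four objective measures into CSIG, CBAK and COVL.

    Coefficients follow Hu & Loizou (2008). The inputs are assumed to be
    per-utterance values that have already been computed elsewhere.
    """
    csig = 3.093 - 1.029 * llr + 0.603 * pesq - 0.009 * wss
    cbak = 1.634 + 0.478 * pesq - 0.007 * wss + 0.063 * seg_snr
    covl = 1.594 + 0.805 * pesq - 0.512 * llr - 0.007 * wss
    return {
        "CSIG": clip_mos(csig),
        "CBAK": clip_mos(cbak),
        "COVL": clip_mos(covl),
    }


# Hypothetical input values, chosen only to exercise the formulas.
scores = composite_scores(pesq=2.5, llr=0.5, wss=40.0, seg_snr=5.0)
print(scores)
```

Because all three scores share PESQ and WSS terms, they tend to move together; the DNSMOS D-prefixed columns provide an independent, non-intrusive view of the same three dimensions.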

New scripts

Script                                Output
scripts/run_paper_results.jl          Full end-to-end orchestrator
scripts/generate_latex_tables.jl      tables/tab_comparison_with_params.tex, tab_wfb_ablation.tex,
                                      tab_metrics_quadrants.tex, tab_per_env_delta.tex
scripts/plot_parameter_evolution.jl   figures/parameter_evolution_bus_7p5dB_band13.{png,pdf,jpeg}
                                      (posteriors for s, n, ξ, with ±1σ ribbons)
scripts/generate_results_md.jl        RESULTS.md (browser-readable summary with figure + Tables 3–6)

Infrastructure

  • microsoft/DNS-Challenge is now a pinned shallow submodule at python_modules/DNSMOS/ (git clone --recursive is all reviewers need).
  • install_python_deps.py additionally installs pysepm (composite metric backend).
  • Metrics.jl prepends python_modules/ to sys.path at init so composite_wrapper and
    dnsmos_wrapper import on a fresh clone regardless of pip-install state.
  • run_evaluation.jl truncates each reference/processed signal pair to the shorter length, avoiding
    PESQ/CSIG/CBAK/COVL crashes caused by the 1–16 extra samples that SEM synthesis occasionally emits past the WFB boundary.
  • .gitignore catches databases/**/clean_testset_wav and noisy_testset_wav symlinks plus Python
    __pycache__ directories.
  • README reorganised with a top-level "Reviewers: reproduce every number in the paper with one command"
    callout and a Hu & Loizou 2008 citation.
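
The length-alignment step described for run_evaluation.jl can be sketched as follows. This is a Python illustration of the idea, not the repository's Julia code; the function name and the sample counts are hypothetical.

```python
from typing import Sequence, Tuple


def truncate_to_shorter(
    reference: Sequence[float], processed: Sequence[float]
) -> Tuple[Sequence[float], Sequence[float]]:
    """Trim both signals to the length of the shorter one.

    Intrusive metrics compare the two signals sample by sample, so a
    processed signal that runs a few samples long must be cut back
    before scoring.
    """
    n = min(len(reference), len(processed))
    return reference[:n], processed[:n]


# Hypothetical example: the processed signal carries 7 trailing samples.
reference = [0.0] * 16000
processed = [0.0] * 16007
ref_aligned, proc_aligned = truncate_to_shorter(reference, processed)
print(len(ref_aligned), len(proc_aligned))
```

Truncating rather than zero-padding keeps the reference untouched, so only the surplus tail of the processed signal is discarded.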

Reproduction parity

The revised pipeline reproduces the historical lock runs (run_03_12_2025_*) to 4 decimal places on PESQ
and within ~0.03 on DNSMOS (ONNX-runtime jitter across CPU permutations; no modelling change). Runs
without --composite produce CSVs that are column-compatible with the prior release once the DNSMOS
rename is applied.
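
A parity check of this kind could be expressed as below: PESQ must agree after rounding to 4 decimal places, while DNSMOS scores may differ by up to 0.03. The helper names, the metric keys chosen and the numeric values are hypothetical; the thresholds are the ones stated above.

```python
def matches_to_decimals(a: float, b: float, decimals: int = 4) -> bool:
    """True when both values round to the same number at the given precision."""
    return round(a, decimals) == round(b, decimals)


def within_tolerance(a: float, b: float, tol: float = 0.03) -> bool:
    """True when the absolute difference does not exceed tol."""
    return abs(a - b) <= tol


def parity_report(old: dict, new: dict) -> dict:
    """Compare a historical run against a new run, metric by metric."""
    report = {}
    for name in old:
        if name == "PESQ":
            report[name] = matches_to_decimals(old[name], new[name])
        else:
            # DNSMOS columns tolerate small runtime-dependent jitter.
            report[name] = within_tolerance(old[name], new[name])
    return report


# Hypothetical summary values for two runs.
historical = {"PESQ": 2.1234, "DOVRL": 2.95}
revised = {"PESQ": 2.12341, "DOVRL": 2.97}
print(parity_report(historical, revised))
```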

Breaking change

DNSMOS column headers in per-file and summary CSVs changed from SIG / BAK / OVRL to DSIG / DBAK / DOVRL. If you have downstream tooling that reads pre-v1.1.1 CSVs by column name, either rename at the
reader or re-run the evaluations (the numerical values are unchanged).
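
Renaming at the reader can be as small as the sketch below, which maps the old DNSMOS headers to the new ones while loading a pre-v1.1.1 CSV. The sample columns (file, PESQ) and values are hypothetical; only the three header renames come from this release.

```python
import csv
import io

# Old DNSMOS header -> new header, as introduced in v1.1.1.
DNSMOS_RENAMES = {"SIG": "DSIG", "BAK": "DBAK", "OVRL": "DOVRL"}


def read_with_new_names(handle) -> list:
    """Read a CSV and rename legacy DNSMOS columns to the D-prefixed names.

    Matching is on the exact header, so composite columns such as CSIG
    are left untouched.
    """
    reader = csv.DictReader(handle)
    return [
        {DNSMOS_RENAMES.get(key, key): value for key, value in row.items()}
        for row in reader
    ]


# Hypothetical pre-v1.1.1 per-file CSV content.
legacy_csv = io.StringIO(
    "file,PESQ,SIG,BAK,OVRL\n"
    "p232_001.wav,2.41,3.10,3.85,2.92\n"
)
rows = read_with_new_names(legacy_csv)
print(rows[0])
```

For a file on disk, pass the handle from open(path, newline="") instead of the in-memory buffer.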

Diff summary

  • 12 commits, all prefixed "reviewers' feedback:"
  • 11 scripts / files added, 0 removed
  • Branch: reviewers-feedback → main

A Probabilistic Generative Model for Spectral Speech Enhancement Update

28 Feb 01:00

A Probabilistic Generative Model for Spectral Speech Enhancement

03 Dec 12:57

Release version of the publication "A Probabilistic Generative Model for Spectral Speech Enhancement". Update release