Skip to content

Replace alevinqc with qcatch for simpleaf QC#520

Open
an-altosian wants to merge 9 commits intonf-core:devfrom
an-altosian:dev
Open

Replace alevinqc with qcatch for simpleaf QC#520
an-altosian wants to merge 9 commits intonf-core:devfrom
an-altosian:dev

Conversation

@an-altosian
Copy link
Contributor

  • Update simpleaf modules to 0.19.5
  • Add qcatch module from local modules repo
  • Add qcatch chemistry mappings to protocols.json
  • Update simpleaf subworkflow to use QCATCH
  • Add skip_qcatch parameter for optional QC
  • Remove alevinqc module and bin/alevin_qc.r
  • Update test config and snapshots

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/scrnaseq branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

- Update simpleaf modules to 0.19.5
- Add qcatch module from local modules repo
- Add qcatch chemistry mappings to protocols.json
- Update simpleaf subworkflow to use QCATCH
- Add skip_qcatch parameter for optional QC
- Remove alevinqc module and bin/alevin_qc.r
- Update test config and snapshots

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@an-altosian
Copy link
Contributor Author

an-altosian commented Feb 14, 2026

I am almost done with this PR, just need to grab a dataset to test. This can be a breaking change because I replaced alevinqc with qcatch

@nf-core-bot
Copy link
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

The qcatch module is not yet in the official nf-core modules repo,
so having it under modules/nf-core/ with a placeholder git_sha caused
the nf-core linter to crash with IndexError. Moving it to modules/local/
avoids the version check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@DongzeHE DongzeHE requested review from fmalmeida, grst and nictru and removed request for fmalmeida and grst February 14, 2026 01:03
@fmalmeida
Copy link
Contributor

fmalmeida commented Feb 14, 2026

Hi there,

Thanks for the PR.

Do we have an issue that was tracking the request?

And the testing dataset can be the same that was being used for alevin, no?

Copy link
Contributor

@fmalmeida fmalmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding some ideas :)

prefix = task.ext.prefix ?: "${meta.id}"

"""
export MPLCONFIGDIR=./tmp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These env variables maybe can be set directly in the nextflow.config as env there so all python modules already use it by default.

Instead of adding in each, no?

@an-altosian
Copy link
Contributor Author

an-altosian commented Feb 14, 2026

Thank you @fmalmeida! I will address your comments. BTW, I also mde a PR for nf-core qcatch module nf-core/modules#10032 . I will update there as well.

For the test dataset, as qcatch does cell calling internally (we reverse engineered cellranger's cell calling algorithm) and exports a filtered count matrix, it only works on real datasets. I tried to use the test dataset we have, but it gave me error saying insufficient data for cell calling. Therefore, I end up with doing the same thing we did for cellblender.

In the nf-core/module, I have tests using real data we used in the qcatch github repo. As we will pull the module directly from nf-core/modules, I think we are good even if we don't have tests for qcatch here.

… config

- Replace modules/local/qcatch.nf with modules/nf-core/qcatch
- Move env vars (MPLCONFIGDIR, NUMBA, etc.) to nextflow.config
- Update simpleaf subworkflow import path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@an-altosian
Copy link
Contributor Author

All comments resolved. I will find a dataset to test.

an-altosian and others added 2 commits February 14, 2026 15:42
The global env vars (TMPDIR, MPLCONFIGDIR, etc.) added for qcatch were
breaking other processes: piscem crashed with SIGABRT due to TMPDIR=./tmp,
and multiqc plots failed to generate due to MPLCONFIGDIR=./tmp.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@an-altosian
Copy link
Contributor Author

The failing test seems to be related to STAR.

@an-altosian an-altosian marked this pull request as ready for review February 14, 2026 22:31
@an-altosian
Copy link
Contributor Author

Tested. Ready to merge

@fmalmeida
Copy link
Contributor

Thank you @fmalmeida! I will address your comments. BTW, I also mde a PR for nf-core qcatch module nf-core/modules#10032 . I will update there as well.

For the test dataset, as qcatch does cell calling internally (we reverse engineered cellranger's cell calling algorithm) and exports a filtered count matrix, it only works on real datasets. I tried to use the test dataset we have, but it gave me error saying insufficient data for cell calling. Therefore, I end up with doing the same thing we did for cellblender.

In the nf-core/module, I have tests using real data we used in the qcatch github repo. As we will pull the module directly from nf-core/modules, I think we are good even if we don't have tests for qcatch here.

Maybe we should focus on merging the nf-core/modules first, because there we will have automated testing for the module. So we then bring it here to this PR ... this would help with the testings, since the pipeline small test cannot be used.

Any other comments @grst and @nictru ?

- Update qcatch module to 0.2.10 (pip-based install with scikit-image)
- Add remove_doublets pipeline parameter for Scrublet doublet detection
- Replace beforeScript with global env vars in nextflow.config
- Register qcatch as nf-core module in modules.json
- Update test snapshots for deterministic 0.2.10 outputs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@DongzeHE
Copy link
Member

DongzeHE commented Feb 21, 2026

I want to discuss one thing: qcatch provides emptydrop and doublet removal. Therefore, cellbench is not needed when running simpleaf mode. How to we want to expose it? BTW, the qcatch report is pretty cool. Check this demo out https://combine-lab.github.io/QCatch/demo/demo.html

…itespace

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fmalmeida
Copy link
Contributor

fmalmeida commented Feb 25, 2026

When you say provide, is that an option or something that is always done?

If the latter then yeah, we would end up having to remove cellbender from alevin runs.

If it is optional, then we can make the PR simpler and split the work where here we just bring qcatch and keep cellbender as the key module for this, opening a new issue to have this bigger discussion around how to best handle it.

@grst any takes?

@an-altosian
Copy link
Contributor Author

an-altosian commented Feb 25, 2026

The empty droplet removal is mandatory (thanks to our reviewer No.2 ;) ) and the doublet correction is an optional. I added a parameter in my PR called remove_doublets to turn this on.

@grst
Copy link
Member

grst commented Feb 25, 2026

cellbender is anyway optional and disabled by default.
If possible, I would still like to give users the option to use cellbender with simpleaf, in case they believe it's superior for whatever reason.

btw, is QCatch specific to simpleaf of would that also work with the other aligners in the pipeline?

@an-altosian
Copy link
Contributor Author

It is designed for only simpleaf and alevin-fry currently. Extending the support to other tools are our future plans.

The updated qcatch container (with scikit-image) produces different
h5ad/seurat checksums for Sample_X due to changed package metadata.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants