Replace alevinqc with qcatch for simpleaf QC #520
an-altosian wants to merge 9 commits into nf-core:dev from
- Update simpleaf modules to 0.19.5
- Add qcatch module from local modules repo
- Add qcatch chemistry mappings to protocols.json
- Update simpleaf subworkflow to use QCATCH
- Add skip_qcatch parameter for optional QC
- Remove alevinqc module and bin/alevin_qc.r
- Update test config and snapshots

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
I am almost done with this PR; I just need to grab a dataset to test. This may be a breaking change because I replaced alevinqc with qcatch. |
|
Warning Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.5.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation. |
The qcatch module is not yet in the official nf-core modules repo, so having it under modules/nf-core/ with a placeholder git_sha caused the nf-core linter to crash with IndexError. Moving it to modules/local/ avoids the version check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Hi there, thanks for the PR. Is there an issue tracking this request? Also, the test dataset can be the same one that was used for alevin, no? |
modules/local/qcatch.nf
Outdated
    prefix = task.ext.prefix ?: "${meta.id}"

    """
    export MPLCONFIGDIR=./tmp
Maybe these env variables can be set directly in nextflow.config under `env`, so all Python modules pick them up by default, instead of adding them in each module, no?
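A minimal sketch of what this suggestion could look like, assuming the variable is set through Nextflow's `env` config scope. Only `MPLCONFIGDIR` and the `./tmp` value come from the diff above; this is not the PR's actual config. Note that the `env` scope applies to every process in the pipeline, not only the Python ones.

```groovy
// nextflow.config (sketch, not the PR's actual config)
env {
    // Exported into the environment of every task in the pipeline
    MPLCONFIGDIR = './tmp'
}
```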
|
Thank you @fmalmeida! I will address your comments. BTW, I also made a PR for the nf-core qcatch module, nf-core/modules#10032, and I will update it there as well. For the test dataset: since qcatch does cell calling internally (we reverse engineered cellranger's cell calling algorithm) and exports a filtered count matrix, it only works on real datasets. I tried the test dataset we have, but it gave me an error saying there was insufficient data for cell calling. Therefore, I ended up doing the same thing we did for cellbender. In nf-core/modules, I have tests using the real data we used in the qcatch GitHub repo. As we will pull the module directly from nf-core/modules, I think we are good even if we don't have tests for qcatch here. |
… config
- Replace modules/local/qcatch.nf with modules/nf-core/qcatch
- Move env vars (MPLCONFIGDIR, NUMBA, etc.) to nextflow.config
- Update simpleaf subworkflow import path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
All comments resolved. I will find a dataset to test. |
The global env vars (TMPDIR, MPLCONFIGDIR, etc.) added for qcatch were breaking other processes: piscem crashed with SIGABRT due to TMPDIR=./tmp, and multiqc plots failed to generate due to MPLCONFIGDIR=./tmp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
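One way to keep such variables away from piscem and multiqc is to set them for a single process only. The following is a sketch under assumptions, not the change made in this PR: it assumes the process is named `QCATCH` and uses a `withName` selector with `beforeScript`. With containers, `beforeScript` runs on the host side, so whether the export reaches the tool depends on the container engine and would need to be verified.

```groovy
// nextflow.config (sketch; the process name QCATCH is assumed)
process {
    withName: 'QCATCH' {
        // Runs before the task script of this process only,
        // so other processes keep their default environment
        beforeScript = 'export MPLCONFIGDIR=./tmp'
    }
}
```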
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
The failing test seems to be related to STAR. |
|
Tested. Ready to merge |
Maybe we should focus on merging the |
- Update qcatch module to 0.2.10 (pip-based install with scikit-image)
- Add remove_doublets pipeline parameter for Scrublet doublet detection
- Replace beforeScript with global env vars in nextflow.config
- Register qcatch as nf-core module in modules.json
- Update test snapshots for deterministic 0.2.10 outputs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
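A sketch of how the two parameters named in these commits, `skip_qcatch` and `remove_doublets`, might be declared. The default values shown are assumptions and are not taken from the PR.

```groovy
// nextflow.config (sketch; default values are assumed)
params {
    skip_qcatch     = false  // set to true to skip the qcatch QC step
    remove_doublets = false  // set to true to enable Scrublet doublet detection
}
```

Under these assumptions, a boolean parameter can be switched on from the command line, for example `nextflow run . -profile test,docker --outdir <OUTDIR> --skip_qcatch`.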
|
I want to discuss one thing: qcatch provides empty droplet removal and doublet removal, so cellbender is not needed when running in simpleaf mode. How do we want to expose it? BTW, the qcatch report is pretty cool. Check this demo out: https://combine-lab.github.io/QCatch/demo/demo.html |
…itespace Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
When you say provide, is that an option or something that is always done? If the latter then yeah, we would end up having to remove cellbender from alevin runs. If it is optional, then we can make the PR simpler and split the work where here we just bring qcatch and keep cellbender as the key module for this, opening a new issue to have this bigger discussion around how to best handle it. @grst any takes? |
|
The empty droplet removal is mandatory (thanks to our reviewer No.2 ;) ) and the doublet correction is optional. I added a parameter in my PR called |
|
cellbender is optional anyway and disabled by default. btw, is QCatch specific to simpleaf, or would it also work with the other aligners in the pipeline? |
|
It is currently designed only for simpleaf and alevin-fry. Extending support to other tools is part of our future plans. |
The updated qcatch container (with scikit-image) produces different h5ad/seurat checksums for Sample_X due to changed package metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PR checklist
- `nf-core pipelines lint`
- `nextflow run . -profile test,docker --outdir <OUTDIR>`
- `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).