Add Poisson-Gauss Mixture model for Guide Assignment #709

Merged: 19 commits, Feb 20, 2025

@stefanpeidli (Collaborator) commented Feb 11, 2025

PR Checklist

Description of changes

  • Implements a Poisson-Gauss mixture model with numpyro
  • Adds simulated data to the doc notebooks, which allows comparing guide assignment methods
  • Adds appropriate tests
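
To make the idea behind the model concrete, here is a minimal pure-Python sketch of a two-component Poisson-Gauss mixture. It is not the code from this PR: it assumes that background guide counts follow a Poisson distribution and real signal follows a Gaussian on log2-transformed counts, it extends the Poisson term to non-integer values through the gamma function, and it uses made-up fixed parameters instead of fitting them with numpyro. All function and parameter names are hypothetical.

```python
import math


def poisson_logpmf(value: float, rate: float) -> float:
    """Poisson log-mass, extended to real values via lgamma (assumption of this sketch)."""
    return value * math.log(rate) - rate - math.lgamma(value + 1.0)


def gauss_logpdf(value: float, mean: float, std: float) -> float:
    """Log-density of a Normal(mean, std) distribution."""
    z = (value - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def signal_probability(
    log2_count: float,
    weight_signal: float = 0.2,  # made-up mixture weight of the Gaussian component
    poisson_rate: float = 0.5,   # made-up background rate
    gauss_mean: float = 5.0,     # made-up signal mean
    gauss_std: float = 1.0,      # made-up signal standard deviation
) -> float:
    """Posterior probability that a value belongs to the Gaussian (signal) component."""
    log_bg = math.log(1.0 - weight_signal) + poisson_logpmf(log2_count, poisson_rate)
    log_sig = math.log(weight_signal) + gauss_logpdf(log2_count, gauss_mean, gauss_std)
    # Log-sum-exp keeps the normalisation numerically stable.
    top = max(log_bg, log_sig)
    log_norm = top + math.log(math.exp(log_bg - top) + math.exp(log_sig - top))
    return math.exp(log_sig - log_norm)


for raw_count in (1, 2, 64):
    prob = signal_probability(math.log2(raw_count))
    print(raw_count, "signal" if prob > 0.5 else "background")
```

In a fitted model the weight, rate, mean and standard deviation would be inferred from the data per guide; the thresholding at 0.5 is likewise only an illustrative choice.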

Technical details

  • Warning: the mixture model is still slow when run on CPU (i.e. without CUDA-enabled jax). There may be room for efficiency improvements here.
  • Currently, the way we hand back guide assignments is not uniform: the threshold method gives a matrix in layers, while the max method gives a column in obs. I followed the latter for the mixture model, concatenating any multi-assignments with "+". Long term, we should allow both ways for all methods IMO.
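
The two return styles mentioned above can be related with a small sketch. It assumes a binary cells-by-guides matrix as the "matrix" form and a list of strings as the "column" form; the helper name and the use of "negative" for cells without any guide are assumptions for illustration, not the pertpy API.

```python
def matrix_to_column(
    assignment_matrix: list[list[int]],
    guide_names: list[str],
    negative_label: str = "negative",  # assumed label for unassigned cells
) -> list[str]:
    """Collapse a binary cells x guides matrix into one label per cell."""
    column = []
    for row in assignment_matrix:
        # Collect every guide flagged for this cell, in guide order.
        hits = [guide for guide, flag in zip(guide_names, row) if flag]
        # Multi-assignments are joined with "+", as described in the PR.
        column.append("+".join(hits) if hits else negative_label)
    return column


matrix = [
    [1, 0, 0],  # single assignment
    [0, 1, 1],  # two guides assigned
    [0, 0, 0],  # no guide assigned
]
print(matrix_to_column(matrix, ["g1", "g2", "g3"]))
# → ['g1', 'g2+g3', 'negative']
```

Going the other way (column to matrix) would only require splitting each label on "+", which is one reason a shared internal representation could serve both output styles.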

stefanpeidli and others added 13 commits September 30, 2024 17:13
Key additions:
- Added a base abstract class "MixtureModel" with numpyro
- Added a first mixture model "Poisson_Gauss_Mixture"
- New function "assign_mixture_model" in GuideAssignment class
* Set legend anchor as parameter (#660)

* Fix missing space

* Remove explicit anndata in dependencies (#666)

* Incorporate use case tutorials (#665)

* Fixed DEG layer retrieval

* Use-case tutorial icons

* Restructure tutorial page

* Subgroup tutorials

* Improve KNN label_transfer in PerturbationSpace (#658)

* Add uncertainty score in KNN label_transfer in PerturbationSpace
Certainty is quantified as the fraction of nearest neighbors belonging to the classified (i.e. the most abundant) label compared to the total number of nearest neighbors.

* Update pre-commit-config.yaml
Replaces yanked dependency of mypy "types-pkg-resources" with "types-setuptools" as recommended: https://pypi.org/project/types-pkg-resources/

* Improve label imputation in PerturbationSpace class
Key changes:
- Now uses KNN graph in adata: saves cost and increases consistency
- Vectorized operations instead of expensive for loop
- Distance weighting for KNN imputation
- Quantifies uncertainty as local KNN label entropy
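
As a rough illustration of the last two points, the following pure-Python sketch transfers a label from a cell's nearest neighbors using distance weighting and reports the entropy of the weighted label distribution as uncertainty. Inverse-distance weights and natural-log Shannon entropy are assumptions of this sketch; the commit message does not specify either, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def transfer_label(
    neighbor_labels: list[str],
    neighbor_distances: list[float],
    eps: float = 1e-12,  # avoids division by zero for identical points
) -> tuple[str, float]:
    """Return the weighted-majority label and the entropy of the label distribution."""
    weights: dict[str, float] = defaultdict(float)
    for label, distance in zip(neighbor_labels, neighbor_distances):
        # Closer neighbors contribute more to their label's vote.
        weights[label] += 1.0 / (distance + eps)
    total = sum(weights.values())
    probs = {label: weight / total for label, weight in weights.items()}
    best = max(probs, key=probs.get)
    # Entropy is 0 when all neighbors agree and grows with disagreement.
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0.0)
    return best, entropy


label, uncertainty = transfer_label(["KO_A", "KO_A", "control"], [0.1, 0.2, 0.9])
print(label, round(uncertainty, 3))
```

The earlier fraction-based certainty corresponds to the unweighted case: the share of neighbors that carry the winning label.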

* Fixed plotting for mixscape.plot_barplot and sccoda.plot_effects_barplot (#667)

* Augur scsim warnings (#670)

* Augur scsim warnings

Signed-off-by: zethson <[email protected]>

* Submodules

Signed-off-by: zethson <[email protected]>

---------

Signed-off-by: zethson <[email protected]>

* Add PerturbationDataValidator (#672)

* Augur scsim warnings

Signed-off-by: zethson <[email protected]>

* Submodules

Signed-off-by: zethson <[email protected]>

* Add super draft of pertpy validator

Signed-off-by: zethson <[email protected]>

* Polish

Signed-off-by: zethson <[email protected]>

* Polish

Signed-off-by: zethson <[email protected]>

* Nested try

Signed-off-by: zethson <[email protected]>

* validator in test

Signed-off-by: zethson <[email protected]>

* try uv for rtd

Signed-off-by: zethson <[email protected]>

* rtd uv

Signed-off-by: zethson <[email protected]>

* rtd uv

Signed-off-by: zethson <[email protected]>

* rtd uv fix

Signed-off-by: zethson <[email protected]>

* mb sphinx fix for validator

Signed-off-by: zethson <[email protected]>

* docs

Signed-off-by: zethson <[email protected]>

* remove PerturbationValidator from docs

Signed-off-by: zethson <[email protected]>

* remove PerturbationValidator from docs

Signed-off-by: zethson <[email protected]>

---------

Signed-off-by: zethson <[email protected]>

* Latest OS for RTD

* Remove curator again

Signed-off-by: zethson <[email protected]>

* Fix jax random array (#686)

* Fix jax random array

Signed-off-by: zethson <[email protected]>

* Fix further jax warnings

Signed-off-by: zethson <[email protected]>

* Fix edger

Signed-off-by: zethson <[email protected]>

* Fix choice

Signed-off-by: zethson <[email protected]>

---------

Signed-off-by: zethson <[email protected]>

* Switch to formulaic-contrasts (#682)

* Switch to formulaic-contrasts

* Cleanup

* removing design matrix workaround (#691)

Co-authored-by: Emma Dann <[email protected]>

* Fix PyDESeq2

* Update tests

* fix typo in gitignore

* Remove contrast dataclass, which isn't used anywhere

* Fix edgeR rpy2 tests (#692)

* fix broken rpy2 edger tests

* updated edger tests

* Fix tests (scipy)

Signed-off-by: zethson <[email protected]>

* submodule

Signed-off-by: zethson <[email protected]>

* Remove unused code

Signed-off-by: zethson <[email protected]>

* type hints

Signed-off-by: zethson <[email protected]>

---------

Signed-off-by: zethson <[email protected]>
Co-authored-by: Emma Dann <[email protected]>
Co-authored-by: zethson <[email protected]>

* Release 0.9.5

Signed-off-by: zethson <[email protected]>

* Prepare 0.10.0

Signed-off-by: zethson <[email protected]>

* Added Mixscape seeds and test (#683)

Co-authored-by: Lukas Heumos <[email protected]>

* Fix probability data type (#696)

Signed-off-by: Lukas Heumos <[email protected]>

* Optimize MeanVarDistributionDistance (#697)

* Fix probability data type

Signed-off-by: Lukas Heumos <[email protected]>

* Optimize mean_var distance

Signed-off-by: Lukas Heumos <[email protected]>

---------

Signed-off-by: Lukas Heumos <[email protected]>

* Optimize test speed (#699)

* Try buildjet

Signed-off-by: Lukas Heumos <[email protected]>

* Try buildjet large

Signed-off-by: Lukas Heumos <[email protected]>

* speed up predict_differential_prioritization

Signed-off-by: Lukas Heumos <[email protected]>

* speed up tests

Signed-off-by: Lukas Heumos <[email protected]>

---------

Signed-off-by: Lukas Heumos <[email protected]>

* Lower bound for scikit-learn (#701)

Signed-off-by: Lukas Heumos <[email protected]>

* Fix type annotation

Signed-off-by: Lukas Heumos <[email protected]>

* Fix empty figure returns when show=True in plotting functions (#703)

* Removed show parameter

* Adapt plotting API for Augur, Coda, Dialogue

* Adapted plotting API for Milo, Mixscape, scgen

* Add joblib

* Remove joblib

---------

Co-authored-by: Lukas Heumos <[email protected]>

* Fix scikit-learn indentation

Signed-off-by: Lukas Heumos <[email protected]>

---------

Signed-off-by: zethson <[email protected]>
Signed-off-by: Lukas Heumos <[email protected]>
Co-authored-by: Lilly May <[email protected]>
Co-authored-by: Lukas Heumos <[email protected]>
Co-authored-by: Gregor Sturm <[email protected]>
Co-authored-by: Emma Dann <[email protected]>
@stefanpeidli stefanpeidli added the enhancement New feature or request label Feb 11, 2025
@stefanpeidli stefanpeidli requested a review from Zethson February 11, 2025 11:01
@stefanpeidli stefanpeidli self-assigned this Feb 11, 2025
@github-actions github-actions bot added the chore label Feb 11, 2025
@Zethson (Member) left a comment

Thank you very very much!

  1. Could you please open a corresponding PR to https://github.com/scverse/pertpy-tutorials for your submodule update?
  2. Please ensure that you have type hints and return types everywhere.

This is great work!

@Zethson (Member) commented Feb 19, 2025

@stefanpeidli could you please also add a comparison notebook to https://github.com/theislab/pertpy-reproducibility/tree/main/benchmark that verifies your implementation against another one, for example that of crispat?
We need it to justify that our implementation behaves the same way.
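
One simple check such a comparison notebook could start from is the per-cell agreement between two sets of assignments. The sketch below is only an assumption about how that could look: it takes two plain lists of labels (not the actual pertpy or crispat outputs), treats "+"-joined multi-assignments as order-insensitive, and uses hypothetical function names.

```python
def normalize(label: str) -> str:
    """Sort the guides of a '+'-joined label so 'g2+g1' equals 'g1+g2'."""
    return "+".join(sorted(label.split("+")))


def agreement(labels_a: list[str], labels_b: list[str]) -> float:
    """Fraction of cells for which both methods report the same assignment."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both label lists must cover the same cells.")
    if not labels_a:
        raise ValueError("Cannot compare empty label lists.")
    matches = sum(normalize(a) == normalize(b) for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


ours = ["g1", "g2+g1", "negative"]
reference = ["g1", "g1+g2", "g3"]
print(agreement(ours, reference))  # two of three cells agree
```

A full benchmark would likely also look at where the two methods disagree, for example per guide, rather than at a single summary number.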

stefanpeidli and others added 2 commits February 20, 2025 14:41
Review comments by @Zethson

Co-authored-by: Lukas Heumos <[email protected]>
- Added lots of type hints and return types
- Improved naming of variables
- Added and removed a few comments
- Added user warnings if a guide is not expressed at all
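
A warning of the kind mentioned in the last point could look roughly like the sketch below. It assumes guide counts are available as a plain mapping from guide name to per-cell counts; the function name and the message text are hypothetical and not taken from the PR.

```python
import warnings


def find_unexpressed_guides(counts: dict[str, list[int]]) -> list[str]:
    """Warn about, and return, guides that have zero counts in every cell."""
    silent = [guide for guide, values in counts.items() if sum(values) == 0]
    for guide in silent:
        # UserWarning lets callers filter or escalate it with the warnings module.
        warnings.warn(
            f"Guide {guide!r} has no counts in any cell.",
            UserWarning,
            stacklevel=2,
        )
    return silent


guide_counts = {"g1": [0, 3, 12], "g2": [0, 0, 0]}
print(find_unexpressed_guides(guide_counts))
```

Returning the list in addition to warning makes it easy for the caller to skip those guides before fitting a model to them.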
@stefanpeidli (Collaborator, Author) commented

@Zethson I think I need appropriate rights to create a new feature branch for this in pertpy-reproducibility.

Previously the data had shape (N, 1). It is now flattened with ravel, and the numpyro plates were changed accordingly for correct batching.
@stefanpeidli (Collaborator, Author) commented

See associated PR for notebook changes: scverse/pertpy-tutorials#50

We changed "Negative" to "negative" :)
@stefanpeidli stefanpeidli requested a review from Zethson February 20, 2025 16:40
@Zethson Zethson changed the title from "Implement new guide assignment method (Issue: #657)" to "Add Poisson-Gauss Mixture model for Guide Assignment" Feb 20, 2025
Signed-off-by: Lukas Heumos <[email protected]>
Signed-off-by: Lukas Heumos <[email protected]>
@Zethson Zethson merged commit b94443b into main Feb 20, 2025
2 of 3 checks passed
Labels: chore, enhancement
2 participants