sc.pp.pca doesn't work after importing sparrow #209

Open
csangara opened this issue Dec 5, 2024 · 1 comment

Comments

@csangara
Member

csangara commented Dec 5, 2024

Hi guys,

I have a strange issue and I'm wondering whether you could provide some insight. Below is a minimal example with the PBMC dataset, but I get the same issue with any AnnData object.

import scanpy as sc
adata = sc.read_10x_mtx(
    "data/filtered_gene_bc_matrices/hg19/",  # the directory with the `.mtx` file
    var_names="gene_symbols",
    cache=True
)
sc.pp.normalize_total(adata, target_sum=1e4)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata)
sc.pp.scale(adata)
sc.pp.pca(adata)

The code above runs PCA almost instantly:

computing PCA
    with n_comps=50
    finished (0:00:00)

However, if I import sparrow first and then run PCA:

import sparrow
sc.pp.pca(adata)

PCA gets stuck at

computing PCA
    with n_comps=50

without any error message. I once left it running for an hour and it never finished.

When I used the debugger, it seems to be stuck at the svds function, but if I "Step Into" the function, the rest of the code runs normally.
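
For a hang like this with no error message, one way to see where the process is sitting without stepping through it is the standard-library faulthandler module. The sketch below is only an illustration: the helper name dump_stack_if_stuck and the 60-second timeout are made up for this example and are not part of scanpy or sparrow.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stack_if_stuck(timeout_s=60.0, file=sys.stderr):
    """Dump the Python stack of every thread to `file` each time the
    wrapped code has been running for another `timeout_s` seconds.

    Hypothetical helper for illustration; `file` must be a real file
    object (it needs a file descriptor).
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=file)
    try:
        yield
    finally:
        # Always cancel the timer, whether the body finished or raised.
        faulthandler.cancel_dump_traceback_later()


# Intended usage in the failing session (not executed here):
#
#     import sparrow
#     with dump_stack_if_stuck(60):
#         sc.pp.pca(adata)
```

If the call hangs, the periodic dumps show the Python frames of all threads, which should confirm whether it is really waiting inside svds or somewhere below it. Frames inside compiled code are not shown, only the Python-level callers.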

My packages are scanpy==1.10.3 anndata==0.10.9 umap==0.5.6 numpy==1.26.4 scipy==1.14.1 pandas==2.2.3 scikit-learn==1.5.2 statsmodels==0.14.4 igraph==0.11.6 pynndescent==0.5.13

However, with a fresh install of Harpy, I don't have this issue anymore. (scanpy==1.10.4 anndata==0.11.1 umap==0.5.7 numpy==1.26.4 scipy==1.12.0 pandas==2.2.3 scikit-learn==1.5.2 statsmodels==0.14.4 igraph==0.11.8 pynndescent==0.5.13)
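
To make the difference between the two environments easier to see, here is a small sketch that parses the two version strings quoted above and prints only the packages that differ. The function names are hypothetical; the version strings are copied from this comment.

```python
def parse_versions(line):
    """Turn 'pkg==1.2.3 other==4.5' into {'pkg': '1.2.3', 'other': '4.5'}."""
    return dict(item.split("==", 1) for item in line.split())


def diff_versions(first, second):
    """Return {package: (version_in_first, version_in_second)} where they differ."""
    names = sorted(set(first) | set(second))
    return {
        name: (first.get(name), second.get(name))
        for name in names
        if first.get(name) != second.get(name)
    }


# Version strings as reported in this issue.
hanging = parse_versions(
    "scanpy==1.10.3 anndata==0.10.9 umap==0.5.6 numpy==1.26.4 scipy==1.14.1 "
    "pandas==2.2.3 scikit-learn==1.5.2 statsmodels==0.14.4 igraph==0.11.6 "
    "pynndescent==0.5.13"
)
working = parse_versions(
    "scanpy==1.10.4 anndata==0.11.1 umap==0.5.7 numpy==1.26.4 scipy==1.12.0 "
    "pandas==2.2.3 scikit-learn==1.5.2 statsmodels==0.14.4 igraph==0.11.8 "
    "pynndescent==0.5.13"
)

changed = diff_versions(hanging, working)
for name, (old, new) in changed.items():
    print(f"{name}: {old} -> {new}")
```

Five packages differ between the two lists (anndata, igraph, scanpy, scipy, umap); numpy and scikit-learn are identical in both.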

This is probably not very reproducible, but this has plagued me for a day, so I'm wondering if you have any ideas on what the cause could be.

Thanks!
Sai

@ArneDefauw
Collaborator

I could reproduce somewhat similar behaviour, but for me it was unrelated to whether sparrow was imported.

For me it looked to be related to the installed scipy version. With scipy version 1.12.0, sc.pp.pca took approximately 1 minute to run, while with scipy version 1.14.0 it took less than a second.

Strangely enough, though, for you scipy version 1.12.0 was the faster one.
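
Since the two observations point in opposite directions, it may help to time svds on its own in each environment, outside of scanpy. The following is a rough sketch under stated assumptions: the matrix is synthetic random data rather than the PBMC dataset, the shape and k=50 are arbitrary choices, and best_time is a hypothetical helper.

```python
import time


def best_time(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over `repeats` calls of fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


try:
    import numpy as np
    import scipy
    from scipy.sparse.linalg import svds
except ImportError:
    print("numpy/scipy not installed; skipping the svds timing")
else:
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 300))  # synthetic stand-in, not real data
    elapsed = best_time(lambda: svds(X, k=50))
    print(f"scipy {scipy.__version__}: svds(k=50), best of 3 = {elapsed:.3f} s")
```

Running this once in a plain session and once after `import sparrow`, in both environments, would show whether the slowdown follows the scipy version, the import, or both.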

Also, solving the environment.yml of harpy results in installation of scipy version 1.12.0, but I could not find a library in the environment that pins scipy to this version.
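
One way to look for the source of the pin is to list every installed distribution that declares a requirement on scipy, together with its version specifier. This is a sketch using only the standard library; the function names are made up for illustration. It only reads pip-style metadata, so if the environment was solved by conda, constraints that live only in conda package recipes would not show up here.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        return ""
    # Treat runs of '-', '_' and '.' as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def find_requirers(target):
    """List (distribution name, requirement string) pairs that depend on `target`."""
    target = requirement_name(target)
    hits = []
    for dist in metadata.distributions():
        for req in dist.requires or []:
            if requirement_name(req) == target:
                hits.append((dist.metadata["Name"], req))
    return sorted(hits, key=lambda hit: str(hit[0]).lower())


for name, req in find_requirers("scipy"):
    print(f"{name}: {req}")
```

Any line in the output with an upper bound (for example a `<` specifier) would be a candidate for what holds scipy back.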
