Image based QC prior to aggregate_profiles #215

Open
kvshams opened this issue Jul 20, 2022 · 4 comments

Comments

@kvshams

kvshams commented Jul 20, 2022

Is there any QC procedure that can be run prior to aggregating wells?

In my case, images with sparsely populated regions create an artifact, because the cells there become larger and larger. I want to exclude those images from the aggregation step.

Or is there a way to convert the entire db into one data frame that includes all features and metadata? That would be more convenient for QC: I could exclude the identified outliers from the db and then perform the downstream aggregation and analysis.

@gwaybio
Member

gwaybio commented Jul 20, 2022

Pycytominer doesn't perform any QC at the moment.

You might consider looking into bioprofiling.jl. IIRC they have some QC ability.

You can also look into this paper, which proposes some QC ideas (not yet implemented in pycytominer, see rohban-lab/Image-based-cell-profiling-enhancement-via-data-cleaning-methods#1), including one which may be helpful for adjusting for cell density.

Pycytominer does have functionality to load the full db (SQLite) here:

class SingleCells(object):

@kvshams
Author

kvshams commented Jul 21, 2022

@gwaybio Thank you for pointing out this insightful method. This may be a naive request: how do I create the single-cell data frame after loading the db?
sc = SingleCells('sqlite:///Data/database.sqlite') # by default this uses strata=['Metadata_Plate', 'Metadata_Well']
That is, how can I create a data frame containing the raw single-cell-level data for QC from the SQLite output created by ingest (I used the ingest function to combine data that was processed in parallel)?

@gwaybio
Member

gwaybio commented Jul 21, 2022

We have a function inside the SingleCells class to merge single cells. See https://pycytominer.readthedocs.io/en/latest/pycytominer.cyto_utils.html#pycytominer.cyto_utils.cells.SingleCells.merge_single_cells

However, we recently recognized some memory issues in this function (see #195), which we're working to solve by moving away from SQLite to parquet (#213).

We'd welcome any insights and experience you have with this method.
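For intuition about what a merged single-cell table looks like, here is a minimal sketch that uses only Python's built-in sqlite3 module. It is not pycytominer's implementation. It assumes the usual ingest layout (an Image table plus per-compartment tables such as Cells and Nuclei, keyed by TableNumber, ImageNumber and ObjectNumber), and it makes the simplifying assumption that ObjectNumber matches across compartments. The toy rows and feature names are hypothetical.

```python
import sqlite3

# Toy in-memory database standing in for the SQLite file produced by ingest.
# Table and column names follow the assumed layout described above.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE Image (TableNumber TEXT, ImageNumber INTEGER,
                    Metadata_Plate TEXT, Metadata_Well TEXT);
CREATE TABLE Cells (TableNumber TEXT, ImageNumber INTEGER,
                    ObjectNumber INTEGER, Cells_AreaShape_Area REAL);
CREATE TABLE Nuclei (TableNumber TEXT, ImageNumber INTEGER,
                     ObjectNumber INTEGER, Nuclei_AreaShape_Area REAL);
INSERT INTO Image VALUES ('t1', 1, 'plate1', 'A01'), ('t1', 2, 'plate1', 'A02');
INSERT INTO Cells VALUES ('t1', 1, 1, 100.0), ('t1', 1, 2, 120.0), ('t1', 2, 1, 900.0);
INSERT INTO Nuclei VALUES ('t1', 1, 1, 30.0), ('t1', 1, 2, 35.0), ('t1', 2, 1, 60.0);
""")

# One row per cell: image-level metadata plus features from each compartment.
query = """
SELECT i.Metadata_Plate, i.Metadata_Well,
       c.TableNumber, c.ImageNumber, c.ObjectNumber,
       c.Cells_AreaShape_Area, n.Nuclei_AreaShape_Area
FROM Cells AS c
JOIN Nuclei AS n
  ON n.TableNumber = c.TableNumber
 AND n.ImageNumber = c.ImageNumber
 AND n.ObjectNumber = c.ObjectNumber
JOIN Image AS i
  ON i.TableNumber = c.TableNumber
 AND i.ImageNumber = c.ImageNumber
ORDER BY c.ImageNumber, c.ObjectNumber
"""
single_cells = [dict(row) for row in conn.execute(query)]

for row in single_cells:
    print(row)
```

If pandas is available, passing the same query and connection to pandas.read_sql should return the result as a DataFrame. On real plates the table can be very large, which is the memory concern mentioned above.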

@gwaybio
Member

gwaybio commented Aug 17, 2022

@kvshams - I wanted to provide an update that the merge_single_cells() functionality is now working well. It now takes 15 minutes to merge whereas previously it was taking several hours.

This might help you to design methods for image QC prior to aggregating. Thanks!
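As one possible starting point for such a QC step, the sketch below counts cells per image, flags images that fall below a cell-count threshold, and computes a per-well median from the remaining images only. It is an illustration under stated assumptions, not a pycytominer feature: the threshold MIN_CELLS_PER_IMAGE, the toy data, and the choice of cell count as the QC metric are all hypothetical, and the table layout (Image and Cells keyed by TableNumber and ImageNumber) is assumed.

```python
import sqlite3
from statistics import median

MIN_CELLS_PER_IMAGE = 3  # hypothetical threshold; tune per assay

# Toy in-memory database: two images from the same well, one of them sparse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Image (TableNumber TEXT, ImageNumber INTEGER,
                    Metadata_Plate TEXT, Metadata_Well TEXT);
CREATE TABLE Cells (TableNumber TEXT, ImageNumber INTEGER,
                    ObjectNumber INTEGER, Cells_AreaShape_Area REAL);
INSERT INTO Image VALUES ('t1', 1, 'plate1', 'A01'), ('t1', 2, 'plate1', 'A01');
INSERT INTO Cells VALUES
  ('t1', 1, 1, 100.0), ('t1', 1, 2, 110.0), ('t1', 1, 3, 120.0),
  ('t1', 2, 1, 900.0);
""")

# Step 1: count cells per image. Images with no segmented cells have no rows
# in Cells, so they never appear here and are excluded implicitly.
counts = conn.execute(
    "SELECT TableNumber, ImageNumber, COUNT(*) FROM Cells "
    "GROUP BY TableNumber, ImageNumber"
).fetchall()
passing = {(t, i) for t, i, n in counts if n >= MIN_CELLS_PER_IMAGE}
flagged = sorted((t, i) for t, i, n in counts if n < MIN_CELLS_PER_IMAGE)

# Step 2: aggregate per well, using only cells from images that passed QC.
rows = conn.execute("""
SELECT i.Metadata_Plate, i.Metadata_Well,
       c.TableNumber, c.ImageNumber, c.Cells_AreaShape_Area
FROM Cells AS c
JOIN Image AS i
  ON i.TableNumber = c.TableNumber AND i.ImageNumber = c.ImageNumber
""").fetchall()

areas_by_well = {}
for plate, well, table, image, area in rows:
    if (table, image) in passing:
        areas_by_well.setdefault((plate, well), []).append(area)

well_medians = {key: median(values) for key, values in areas_by_well.items()}

print("flagged images:", flagged)
print("per-well median area after QC:", well_medians)
```

In this toy data the sparse image contributes a single very large cell. Without the filter the well median would be 115.0; with it, the median is 110.0.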


2 participants