`scs` is a program that allows for the selection of spatially annotated points within a dataset using arbitrary bounding boxes.
given a dataset as input, once selection is complete, it will generate as output a TSV file with two columns, `cell` and `within`, which specify the relevant cell identifier and whether that cell is within the defined bounding box(es).
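as a rough illustration of consuming that output, the sketch below collects the identifiers of selected cells using only the python standard library. it assumes the output has a header line naming the `cell` and `within` columns, and since the encoding of `within` is not stated here, it accepts a few common truthy spellings; both are assumptions to check against real `scs` output. the sample data is made up.

```python
import csv
import io

# Assumption: the README does not say how `within` is encoded, so several
# common truthy spellings are accepted; adjust to match real scs output.
TRUTHY = {"1", "true", "t", "yes", "y"}


def read_selection(handle):
    """Return the set of cell identifiers whose `within` value is truthy."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        row["cell"]
        for row in reader
        if row["within"].strip().lower() in TRUTHY
    }


# Hypothetical output, inlined so the example runs without a real file.
example = "cell\twithin\ncell_a\ttrue\ncell_b\tfalse\ncell_c\ttrue\n"
selected = read_selection(io.StringIO(example))
print(sorted(selected))  # ['cell_a', 'cell_c']
```

for a real run, pass an open file handle for the output TSV (for example `open('output.tsv', newline='')`) in place of the inlined string.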
to compile this project, you will need to install a rust compiler, instructions for which can be found here
pressing `h` will bring up a small help menu explaining controls.
`scs` takes two arguments: the input directory path (mandatory) and an output path (optional; output goes to standard out if no path is provided).
note: it is strongly recommended to compile `scs` on the release profile to achieve consistent performance.
example:
cargo run --release 'input/public' 'output.tsv'
a jupyter notebook (`scripts/demo.ipynb`) is provided to demonstrate how to use `scs` within the context of a scanpy-driven analysis.
`scs` expects the input directory to have the following structure:
- a top-level `cells.tsv` file with three columns: `cell` (specifying a cell identifier), `x` (specifying the cell's x coordinate), and `y` (specifying the cell's y coordinate)
- a directory called `feat`, which contains feature files
  - each feature file should have the name `{FEATURE_NAME}.tsv` (e.g. `Yap1.tsv` for the `Yap1` feature)
  - each feature file should have two columns: `cell` (specifying a cell identifier) and `expr` (specifying the associated value of that feature for that cell)
  - any cell not present in a feature file is assumed to have a value of `0` for that feature (feature data is often very sparse, so this can keep feature files small)
- all input TSV files should contain a header line
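the layout above can be sketched with the python standard library alone. the helper names (`write_input_dir`, `read_feature`) and the toy data are hypothetical and are not part of `scs` or its scripts; the second helper only illustrates the "missing means 0" rule and says nothing about how `scs` itself loads files.

```python
import csv
import tempfile
from pathlib import Path


def write_input_dir(root, cells, features):
    """Write `cells.tsv` and one `feat/{FEATURE_NAME}.tsv` per feature.

    cells:    {cell_id: (x, y)}
    features: {feature_name: {cell_id: value}}
    """
    root = Path(root)
    feat_dir = root / "feat"
    feat_dir.mkdir(parents=True, exist_ok=True)

    with open(root / "cells.tsv", "w", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(["cell", "x", "y"])  # every input TSV needs a header
        for cell, (x, y) in cells.items():
            writer.writerow([cell, x, y])

    for name, values in features.items():
        with open(feat_dir / f"{name}.tsv", "w", newline="") as fh:
            writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
            writer.writerow(["cell", "expr"])
            for cell, value in values.items():
                if value != 0:  # zeros are implied, so they are skipped
                    writer.writerow([cell, value])


def read_feature(root, name, cell_ids):
    """Read one feature file, filling in 0 for cells it does not list."""
    with open(Path(root) / "feat" / f"{name}.tsv", newline="") as fh:
        present = {
            row["cell"]: float(row["expr"])
            for row in csv.DictReader(fh, delimiter="\t")
        }
    return {cell: present.get(cell, 0.0) for cell in cell_ids}


# Hypothetical toy data: three cells and one sparse feature.
cells = {"cell_a": (0.0, 1.5), "cell_b": (2.0, 3.0), "cell_c": (4.5, 0.5)}
features = {"Yap1": {"cell_a": 2.0, "cell_b": 0, "cell_c": 7.0}}

out_dir = Path(tempfile.mkdtemp()) / "data"
write_input_dir(out_dir, cells, features)
yap1 = read_feature(out_dir, "Yap1", cells)
print(yap1)  # {'cell_a': 2.0, 'cell_b': 0.0, 'cell_c': 7.0}
```

here `cell_b` never appears in `feat/Yap1.tsv`, yet reading the feature back still yields a value of 0 for it.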
the `scripts` directory contains scripts that can generate input directories for `scs` using common existing data formats (such as seurat).
`scripts/seurat.r` generates an input directory from a seurat object saved as an RDS file.
requires the `Seurat`, `data.table`, and `purrr` packages to be installed.
arguments:
- input RDS file path
- desired output directory
- centroid FOV to use for x/y coordinates (optional, defers to seurat's defaults)
- assay to use for features (optional, defers to seurat's defaults)
- layer to use from selected assay for features (optional, defers to seurat's defaults)
example:
Rscript 'scripts/seurat.r' 'input/so.rds' 'input/data' # uses seurat's default values for FOV, assay, and layer
Rscript 'scripts/seurat.r' 'input/so.rds' 'input/data' 'centroids.specific' 'vizgen' 'data' # specific values
`scripts/from_anndata.py` generates an input directory from an anndata object saved as an h5ad file.
note: expects spatial coordinates to be in an `obsm` slot, not as a set of `obs` columns.
requires the `anndata` and `pandas` packages to be installed.
arguments:
- input h5ad file path
- desired output directory
- `obsm` index to use for spatial coordinates (optional, defaults to `spatial`)
- layer index to use (optional, defaults to using `X` matrix)
example:
python 'scripts/from_anndata.py' 'input/adata.h5ad' 'input/data' # uses default values
python 'scripts/from_anndata.py' 'input/adata.h5ad' 'input/data' 'spatial_centroids' 'counts' # specific values
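if a dataset stores its coordinates as `obs` columns instead, they need to be moved into `obsm` before running the script. the sketch below shows one way to do that; the helper name and the `x_centroid` / `y_centroid` column names are hypothetical, and a small stand-in object is used so the example runs without `anndata` installed (only `numpy` is needed).

```python
from types import SimpleNamespace

import numpy as np


def obs_columns_to_obsm(adata, x_col, y_col, key="spatial"):
    """Stack two per-cell coordinate columns into an (n_cells, 2) obsm entry."""
    coords = np.column_stack([adata.obs[x_col], adata.obs[y_col]])
    adata.obsm[key] = coords.astype(float)
    return adata


# Stand-in object so the sketch runs without anndata; with a real AnnData,
# `adata.obs` is a pandas DataFrame and the call looks the same.
fake = SimpleNamespace(
    obs={"x_centroid": [0.0, 2.0, 4.5], "y_centroid": [1.5, 3.0, 0.5]},
    obsm={},
)
obs_columns_to_obsm(fake, "x_centroid", "y_centroid")
print(fake.obsm["spatial"].shape)  # (3, 2)
```

with a real object, the result would be saved back to an h5ad file, and the chosen key (here `spatial`, which matches the script's default) passed as the third argument if it differs from the default.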