examples and api documents #19

Open
pcasl opened this issue Jan 22, 2025 · 5 comments
Assignees
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@pcasl

pcasl commented Jan 22, 2025

It is great to find that SPM has a Python version!

I am trying to move my previous scripts from Matlab to Python, but I am not sure what the API is or how to use it in Python. For example, in Matlab, I can import DICOM files with the following script:

spm('defaults','fmri');
spm_jobman('initcfg');
matlabbatch{1}.spm.util.import.dicom.data = spm_input_dir(input);
matlabbatch{1}.spm.util.import.dicom.root = 'flat';
matlabbatch{1}.spm.util.import.dicom.outdir = {output};
matlabbatch{1}.spm.util.import.dicom.protfilter = '.*';
matlabbatch{1}.spm.util.import.dicom.convopts.format = 'nii';
matlabbatch{1}.spm.util.import.dicom.convopts.meta = 0;
matlabbatch{1}.spm.util.import.dicom.convopts.icedims = 0;
spm_jobman('run',matlabbatch);

But I am not sure which function to call and which parameters to pass in Python. It would be great if you could give more examples, such as how to import DICOM files, run registration and segmentation, and so on.

Thank you !

@tierneytim
Collaborator

Hi @pcasl, thanks for the interest.
Right now we only have documentation for individual function calls, as the batch scripts are a little trickier to translate. I believe the functionality is there, but it will take a bit more time to document it fully and cover the many edge cases that arise. I'll discuss this with the team at the next dev meeting and see if we can get someone assigned to it.

@tierneytim
Collaborator

Hi @pcasl, we discussed your issue today and have assigned @johmedr and @arthurmitchell96 to help. Hopefully, we will have some documentation for you soon.

@johmedr
Collaborator

johmedr commented Jan 28, 2025

Hi @pcasl, thank you for the interest!

@arthurmitchell96 is going to build an example of using the batch in Python.

Here are some explanations on using the SPM batch system in Python. Please let us know if you have questions or encounter any issues!

Note that you can open the batch GUI from Python (assuming it is run locally) using spm_jobman.

Base concepts

Just to get started: in Matlab, matlabbatch is a cell array of structs. Cell arrays are lists in Python, while normal arrays are numpy.array (or array.array), but not lists. Struct arrays can be created by passing all of your Structs (e.g., s1, s2, s3) to a StructArray (e.g., StructArray(s1, s2, s3)). You can also create a StructArray of size n,m using StructArray(n,m). One difference from Matlab is that struct arrays have a static size (indexing outside of that size will raise an IndexError).
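
To make the cell/array distinction concrete, here is a minimal sketch that uses only the standard library. It does not touch the Struct or StructArray classes; the file names and values are made up for illustration, and array.array stands in for a numeric Matlab array (numpy.array would play the same role).

```python
from array import array

# Matlab cell array {'a.nii', 'b.nii'} corresponds to a Python list
scans = ['a.nii', 'b.nii']

# Matlab numeric array [6 18 30] corresponds to a typed array, not a list
onsets = array('d', [6, 18, 30])

# The two kinds of container are distinct Python types
assert isinstance(scans, list)
assert not isinstance(onsets, list)
```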

Some syntactic considerations

For using the batch system in Python, we haven't yet decided on the nicest/cleanest syntax. One big thing here is that Python requires you to explicitly instantiate all of the intermediate indexes/structures that you would usually omit in Matlab, making a naive translation of batch statements very bloated. For instance, the already-heavy Matlab statement:

matlabbatch{1}.cfg_basicio.file_dir.dir_ops.cfg_mkdir.name = 'GLM';

would naively translate to the heavier Python statement:

matlabbatch = [
      Struct(cfg_basicio=
            Struct(file_dir=
                  Struct(dir_ops= 
                        Struct(cfg_mkdir=
                              Struct(name='GLM')
                        )
                  )
            )
      )
]

I think we need to provide the user with a cleaner way to implement this.
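
One possible direction, sketched here with plain dicts rather than the real Struct class: a small helper that creates the intermediate levels from a dotted path. Note that set_field is a hypothetical name and not part of the package, and whether the resulting dicts can be passed where a Struct is expected is an assumption that would need checking.

```python
def set_field(struct, path, value):
    """Set struct[a][b][c] = value for path 'a.b.c', creating levels as needed."""
    keys = path.split('.')
    node = struct
    for key in keys[:-1]:
        # Reuse the nested dict if it exists, otherwise create it
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return struct


# One batch item, filled in with two fields that share the same parents
matlabbatch = [{}]
set_field(matlabbatch[0], 'cfg_basicio.file_dir.dir_ops.cfg_mkdir.name', 'GLM')
set_field(matlabbatch[0], 'cfg_basicio.file_dir.dir_ops.cfg_mkdir.parent', ['/data'])
```

This keeps one line per Matlab statement, at the cost of moving the field names into a string.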

Practical solution ...

... for constructing a batch in Python

A workaround which I prefer (as it should still work with future versions) is to directly use Matlab statements to create this variable. This avoids all of the trouble of converting types (e.g. cellstr to a Python list of strings) and lets the Python wrapper do that job for you. To do this, you first create a Python string that contains your Matlab instructions (the trick for cleanliness and brevity is to use a multiline f-string):

batchstr = f"""
% Output Directory
%--------------------------------------------------------------------------
matlabbatch{{1}}.cfg_basicio.file_dir.dir_ops.cfg_mkdir.parent = cellstr({data_path});
matlabbatch{{1}}.cfg_basicio.file_dir.dir_ops.cfg_mkdir.name = 'GLM';

% Model Specification
%--------------------------------------------------------------------------
matlabbatch{{2}}.spm.stats.fmri_spec.dir = cellstr(fullfile({data_path},'GLM'));
matlabbatch{{2}}.spm.stats.fmri_spec.timing.units = 'scans';
matlabbatch{{2}}.spm.stats.fmri_spec.timing.RT = 7;
matlabbatch{{2}}.spm.stats.fmri_spec.sess.scans = cellstr({f});
matlabbatch{{2}}.spm.stats.fmri_spec.sess.cond.name = 'active';
matlabbatch{{2}}.spm.stats.fmri_spec.sess.cond.onset = 6:12:84;
matlabbatch{{2}}.spm.stats.fmri_spec.sess.cond.duration = 6;

matlabbatch
""" 

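Note that doubling up the cell array indexes is necessary to escape the curly bracket character in the f-string. The single curly brackets around data_path will format the Python variable data_path. Since the value is inserted into the Matlab code verbatim, it has to form a valid Matlab expression once substituted (for instance, a path needs to carry its own single quotes). The last line allows returning the matlabbatch variable. Then, to get the variable in Python, you can just do: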

matlabbatch = Runtime.call('eval', batchstr)
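
As a rough illustration of the brace escaping and substitution described above, the sketch below builds a short batch string in pure Python. The helper matlab_str and the path are hypothetical; the helper only wraps a Python string in single quotes and doubles any embedded quote, which is how Matlab char literals escape them.

```python
def matlab_str(text):
    """Format a Python string as a Matlab char literal ('' escapes a quote)."""
    return "'" + text.replace("'", "''") + "'"


data_path = "/data/sub-01"  # hypothetical path, for illustration only

# {{1}} becomes {1} in the output; {matlab_str(...)} is evaluated by Python
batchstr = f"""
matlabbatch{{1}}.cfg_basicio.file_dir.dir_ops.cfg_mkdir.parent = cellstr({matlab_str(data_path)});
matlabbatch{{1}}.cfg_basicio.file_dir.dir_ops.cfg_mkdir.name = 'GLM';
matlabbatch
"""
print(batchstr)
```

Printing the string before evaluating it is a cheap way to check that what Matlab receives is what you intended.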

I think this syntax makes sense because we usually get the matlabbatch statements from somewhere (e.g., using the batch GUI), and what we want is to fill in some of the fields. As in Matlab, if the matlabbatch statements contain things like dependencies (cfg_dep), you'll need to run spm_jobman('initcfg') before evaluating the batchstr to avoid an error (e.g. Unrecognized function or variable 'cfg_dep').
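
If it is unclear whether a given batch string falls into that case, a simple text check can flag it before evaluation. This is only a sketch: needs_initcfg is a hypothetical helper, and its comment stripping is crude (it cuts each line at the first % character, so a % inside a Matlab string would be misread).

```python
import re


def needs_initcfg(batchstr):
    """Return True if any non-comment part of the Matlab code mentions cfg_dep."""
    for line in batchstr.splitlines():
        code = line.split('%', 1)[0]  # crude: drop everything after a % marker
        if re.search(r'\bcfg_dep\b', code):
            return True
    return False


example = "matlabbatch{2}.spm.stats.fmri_spec.dir = cfg_dep('Make Directory');"
print(needs_initcfg(example))
```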

... for using a batch created with the GUI

You can also get your matlabbatch variable from the mat-file created when saving the batch as a variable:

matlabbatch = Runtime.call('load', 'path/to/my_batch.mat')['matlabbatch']

Or from the m-file created when saving the batch as a script:

with open('path/to/my_batch_job.m', 'r') as batch:
      batchstr = batch.read()
matlabbatch = Runtime.call('eval', batchstr + '; matlabbatch')

Or even simpler if 'my_batch_job.m' is in the Matlab path:

matlabbatch = Runtime.call('eval', 'my_batch_job; matlabbatch')

All of these syntaxes should give the same variable. You can then run the job with

spm_jobman('run', matlabbatch) 

or, if the job takes any inputs:

spm_jobman('run', matlabbatch, *inputs) 

At this point, what you run in Python or Matlab is up to you: you can treat your Python notebook as a Matlab script and just have a big eval at the end (or evalin if you want to declare workspace variables), or convert everything to Python. I think it really depends on the use case.

In the end, converting between the two isn't really computationally expensive unless you have a very large, nested structure (e.g., bouncing an SPM.mat back and forth will not be instantaneous), but even then, it should be relatively fast compared to the actual data processing (which runs at Matlab speed).

@balbasty
Contributor

@johmedr I've played with alternative Struct/Cell/Array classes that could make it easier to build batches, using a syntax closer to the one we use in matlab.

Here's a notebook with the prototype. There's a batch example at the end of the notebook:

matlabbatch = CellArray()

matlabbatch[0].spm.util["import"].dicom.data = 'dir'
matlabbatch[0].spm.util["import"].dicom.root = 'flat'
matlabbatch[0].spm.util["import"].dicom.outdir = cell('output')
matlabbatch[0].spm.util["import"].dicom.protfilter = '.*'
matlabbatch[0].spm.util["import"].dicom.convopts.format = 'nii'
matlabbatch[0].spm.util["import"].dicom.convopts.meta = 0
matlabbatch[0].spm.util["import"].dicom.convopts.icedims = 0

The StructArray/CellArray/NumArray classes are numpy arrays that automatically resize themselves if out-of-bound elements are queried (similar to matlab's behaviour). Uninitialized elements are DelayedArrays that transform themselves into StructArray/CellArray/NumArray based on the type of indexing that's applied. There's then a hacky logic so that delayed arrays warn their parents that they have determined their type and can be "finalised".
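
For readers who have not opened the notebook, the resize-on-write idea can be illustrated with a toy list subclass. This is not the prototype's implementation (which is numpy-based and uses delayed arrays); GrowingList is a made-up name, and it only handles assignment with a non-negative integer index.

```python
class GrowingList(list):
    """List that pads itself with None when assigning past the end."""

    def __setitem__(self, index, value):
        if isinstance(index, int) and index >= len(self):
            # Grow so that `index` becomes a valid position
            self.extend([None] * (index + 1 - len(self)))
        super().__setitem__(index, value)


cells = GrowingList()
cells[2] = 'c'  # like Matlab's cells{3} = 'c' on an empty cell array
print(cells)    # [None, None, 'c']
```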

The main issue is that we don't have two different types of bracket to differentiate "cells of struct" from "struct array". I've tried to hijack __call__ to implement Matlab's {} but it's very flimsy. In the meantime I've added as_cell/as_struct/as_array properties to more robustly provide type hints, so we can do something like a[1].b.as_cell[2].c = x for Matlab's a(2).b{3}.c = x.

It's a prototype; I am sure there are lots of corner cases I haven't found.

It's somewhat related to this issue, but we can also discuss this in a new issue, if it sounds interesting.

Cheers
Yael

@johmedr
Collaborator

johmedr commented Jan 30, 2025

Hi @balbasty,

Thanks for sharing this, it is amazing! I am opening a new issue (#20) to discuss integrating your type system into the repo.

Cheers,
Johan

@johmedr johmedr added enhancement New feature or request documentation Improvements or additions to documentation labels Mar 20, 2025
5 participants