Handle automatic chunks duration for SC2 #3721

Open · wants to merge 61 commits into main

Conversation

@yger (Collaborator) commented Feb 25, 2025

When using a large number of cores, we might want to automatically set the chunk_size as a function of the available RAM and/or the number of cores. I know that currently, in SI, the preprocessing nodes can consume more RAM than chunk_size, but at least this will avoid swapping and automatically adjust the sizes.
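
As a rough illustration of the idea (a sketch only, not the PR's implementation; psutil, the memory_limit fraction and the byte-count formula are assumptions made for the example):

import psutil

def auto_chunk_size(recording, n_jobs, memory_limit=0.25):
    # Keep the n_jobs concurrent chunks, together, within a fraction of the available RAM.
    usable_bytes = psutil.virtual_memory().available * memory_limit
    bytes_per_frame = recording.get_num_channels() * recording.get_dtype().itemsize
    return int(usable_bytes / (n_jobs * bytes_per_frame))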

@yger added the sortingcomponents (Related to sortingcomponents module) and sorters (Related to sorters module) labels Feb 25, 2025
@yger changed the title from "Handle automatic RAM allocation for chunks" to "Handle automatic chunks duration for SC2" Feb 25, 2025
@zm711 (Collaborator) left a comment

Couple more cosmetic questions.

    n_jobs = int(min(n_jobs, memory_usage // ram_requested))
    job_kwargs.update(dict(n_jobs=n_jobs))
else:
    print("psutil is required to use only a fraction of available memory")
Collaborator

Are these prints or warnings, in general?

Collaborator Author

I'll use the warnings package instead.
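
For instance (a sketch of the intended change, not the actual diff):

import warnings

warnings.warn("psutil is required to use only a fraction of available memory")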

if recording.get_total_memory_size() < memory_usage:
    recording = recording.save_to_memory(format="memory", shared=True, **job_kwargs)
else:
    print("Recording too large to be preloaded in RAM...")
Collaborator

same for these prints.

Collaborator Author

same here

@@ -73,6 +75,8 @@ class Spykingcircus2Sorter(ComponentsBasedSorter):
"matched_filtering": "Boolean to specify whether circus 2 should detect peaks via matched filtering (slightly slower)",
"cache_preprocessing": "How to cache the preprocessed recording. Mode can be memory, file, zarr, with extra arguments. In case of memory (default), \
memory_limit will control how much RAM can be used. In case of folder or zarr, delete_cache controls if cache is cleaned after sorting",
"chunk_preprocessing": "How much RAM (approximately) should be devoted to load data chunks. memory_limit will control how much RAM can be used\
Collaborator

I think you need to say what units of RAM this works with. I'm confused by the way it is currently written; I think users will need a bit more info to really understand how to use this.
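
Judging from the "fraction of available memory" message in the earlier snippet, memory_limit looks like a fraction of the available RAM; if so, a hypothetical call might look like this (parameter names follow the docstrings above; the values and nesting are illustrative, not confirmed by the PR):

from spikeinterface.sorters import run_sorter

params = dict(
    cache_preprocessing=dict(mode="memory", memory_limit=0.25),
    chunk_preprocessing=dict(memory_limit=0.25),
)
sorting = run_sorter("spykingcircus2", recording, **params)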

@b-grimaud (Contributor)

With a few modifications, it can process a minute long recording in 8h45.

  • In the main sorting function, it would often crash at estimate_templates. I added a call to get_optimal_n_jobs with the same ram_requested as in clustering/circus.py.
  • In cases where the requested RAM is high and/or the memory limit is low, int rounds n_jobs down to 0, which raises an error. I think get_optimal_n_jobs should ensure that n_jobs is at least 1, or at least emit a specific warning (see the sketch after this list).
  • In clustering/circus.py, I also had crashes at remove_duplicates_via_matching. Passing the same job_kwargs as those used in estimate_templates does the trick, but another call to get_optimal_n_jobs with a proper estimate of the requested amount of RAM could be more appropriate.
  • In clustering_tools.py, detect_mixtures sets n_jobs to 1 in its main loop. I'm guessing this part can't be parallelized?
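
A guard of the kind suggested in the second bullet could look like this (a sketch only; get_optimal_n_jobs is the PR's helper and its exact body may differ):

n_jobs = max(1, int(min(n_jobs, memory_usage // ram_requested)))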

@yger (Collaborator Author) commented Apr 7, 2025

OK, good to know, I'll update the PR. Meanwhile, I'll also make:

  • a PR to bypass the estimate_templates() call needed at the end of the clustering. In fact, templates could be inferred (at the cost of slight smoothing) from the SVD components that have been precomputed and are saved in RAM (see the sketch after this list).
  • a PR to use the new graph-based clustering that we developed with Sam. Such a clustering, taking the whole probe into account, would not need to deal with duplicates, so detect_mixtures() should not be needed anymore. That being said, I think this function could also be skipped in your case (and in recent versions of circus2): because the final merging has been improved, keeping some duplicates should not hurt too much, and you are welcome to give it a try. How many units does this function remove in your case?
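
A rough sketch of the first idea (the names, shapes, and the availability of per-unit mean SVD features and the temporal SVD basis are all assumptions made for the example):

import numpy as np

# svd_mean: (num_units, num_channels, n_components) mean SVD features per unit
# svd_basis: (n_components, num_samples) temporal SVD components
# templates: (num_units, num_samples, num_channels), slightly smoothed by the SVD truncation
templates = np.einsum("uck,ks->usc", svd_mean, svd_basis)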

Thanks again for the feedback

@b-grimaud (Contributor)

Thanks for the update!

detect_mixtures removes 6 units out of 3705, so about 0.2%. This seems entirely acceptable.

Labels: sorters (Related to sorters module), sortingcomponents (Related to sortingcomponents module)
Projects: None yet
3 participants