
Commit c0fdd76

adamltyson authored and gitbook-bot committed
GitBook: [master] 55 pages modified
1 parent 451623d

File tree

6 files changed: +160 −4 lines changed

SUMMARY.md (+2 −1)

```diff
@@ -21,7 +21,8 @@
 * [Cell candidate detection](user-guide/usage/cell-candidate-detection.md)
 * [Cell candidate classification](user-guide/usage/cell-candidate-classification.md)
 * [Historical options](user-guide/usage/historical-options.md)
-* [Training the network](user-guide/untitled-1.md)
+* [Training the network](user-guide/untitled-1/README.md)
+* [Using supplied training data](user-guide/untitled-1/using-supplied-training-data.md)
 * [Visualisation](user-guide/visualisation.md)
 * [Group-level analysis](user-guide/group-level-analysis/README.md)
 * [Summarising multiple cell counts](user-guide/group-level-analysis/untitled-2.md)
```

installation/setting-up-your-gpu.md (+1 −1)

```diff
@@ -6,7 +6,7 @@ description: How to speed up cellfinder by using your GPU
 
 ## Introduction
 
-**cellfinder** will run quite happily on your CPU, but the machine learning parts \(classifying cell candidates as cells or artefacts, and [Training the network](../user-guide/untitled-1.md)\) **run much faster** using a GPU.
+**cellfinder** will run quite happily on your CPU, but the machine learning parts \(classifying cell candidates as cells or artefacts, and [Training the network](../user-guide/untitled-1/)\) **run much faster** using a GPU.
 
 #### Requirements
 
```

user-guide/data-requirements.md (+1 −1)

```diff
@@ -12,7 +12,7 @@ cellfinder was written to analyse certain kinds of whole brain microscopy datasets
 
 For registration, you only need a single channel, but this is ideally a "background" channel, i.e. one with only autofluorescence, and no other strong signal. Typically we acquire the "signal" channels with red or green filters, and then the "background" channel with blue filters.
 
-For cell detection, you will need two channels, the "signal" channel, and the "background" channel. The signal channel should contain brightly labelled cells \(e.g. from staining or viral injections\). The models supplied with cellfinder were trained on whole-cell labels, so if you have e.g. a nuclear marker, they will need to be retrained \(see [Training the network](untitled-1.md)\). However, realistically, the network will need to be retrained for every new application.
+For cell detection, you will need two channels, the "signal" channel, and the "background" channel. The signal channel should contain brightly labelled cells \(e.g. from staining or viral injections\). The models supplied with cellfinder were trained on whole-cell labels, so if you have e.g. a nuclear marker, they will need to be retrained \(see [Training the network](untitled-1/)\). However, realistically, the network will need to be retrained for every new application.
 
 ### Image structure
 
```

user-guide/getting-started.md (+1 −1)

```diff
@@ -37,5 +37,5 @@ If you have any spaces in your file-path, please enclose it in quotation marks
 
 ### Retraining the machine learning network to classify cells
 
-The deep learning network included with cellfinder to classify cells as real cells or artefacts was trained on a very specific dataset. You will very likely need to retrain this if the classification is incorrect on your data. See [Training the network](untitled-1.md).
+The deep learning network included with cellfinder to classify cells as real cells or artefacts was trained on a very specific dataset. You will very likely need to retrain this if the classification is incorrect on your data. See [Training the network](untitled-1/).
 
```

user-guide/untitled-1/README.md (new file, +104)

# Training the network

Cellfinder includes a pretrained network for cell candidate classification. This will likely need to be retrained for different applications. Rather than generating training data blindly, the aim is to reduce the amount of hands-on time by only generating training data where cellfinder classified a cell candidate incorrectly.

## Generate training data

To generate training data, you will need:

* The cellfinder output file, `cell_classification.xml` \(but `cells.xml` can also work\)
* The raw data used initially for cellfinder

To generate training data for a single brain, use `cellfinder_curate`:

```bash
cellfinder_curate signal_images background_images cell_classification.xml
```

### Arguments

* Signal images
* Background images
* `cell_classification.xml` file

{% hint style="info" %}
You must also specify the pixel sizes, see [Specifying pixel size](../usage/specifying-pixel-size.md)
{% endhint %}

**Optional**

* `-o` or `--output` Output directory for curation results. If this is not given, then the directory containing `cell_classification.xml` will be used.
* `--symbol` Marker symbol \(Default: `ring`\)
* `--marker-size` Marker size \(Default: `15`\)
* `--opacity` Marker opacity \(Default: `0.6`\)
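As an illustration, the optional flags above can be combined like this \(the input paths are hypothetical; the flags are the ones documented above, and the pixel sizes must still be given as described in the hint\):

```shell
# Hypothetical example: curate one brain, writing results to a chosen
# directory and tweaking how the markers are drawn in napari.
cellfinder_curate \
    /data/brain1/signal \
    /data/brain1/background \
    /data/brain1/output/cell_classification.xml \
    -o /data/brain1/curation \
    --symbol cross \
    --marker-size 10 \
    --opacity 0.8
```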
A [napari](https://napari.org/) window will then open, showing two tabs on the left-hand side:

* `Image` Selecting this allows you to change the contrast limits, to better visualise cells
* `Cell candidates` This shows the cell candidates to be curated. Cell candidates previously classified as cells are shown in yellow, and artifacts in blue.

By selecting the `Cell candidates` tab and then the cell selecting tool \(arrow at the top\), cell candidates can be selected \(either individually, or many by dragging the cursor\). There are then four keyboard commands:

* `C` Confirm the classification result, and add this to the training set
* `T` Toggle the classification result \(i.e. change the classification\), and add this to the training set
* `Alt+Q` Save the results to an xml file
* `Alt+E` Finish curating the training dataset. This will carry out three operations:
  * Extract cubes around these points, into two directories \(`cells` and `non_cells`\)
  * Generate a yaml file pointing to these files for use with `cellfinder_train` \(see below\)
  * Close the viewer

Once a `yaml` file has been generated, you can proceed to training. However, it is likely useful to generate `yaml` files from additional datasets.
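The generated yaml file is what points `cellfinder_train` at the extracted cubes. As a rough, hypothetical sketch of its shape \(the `cube_dir` and `cell_def` field names appear in the "Using supplied training data" page; the surrounding structure is an assumption, so inspect a file generated by `cellfinder_curate` rather than writing one from scratch\):

```yaml
# Hypothetical sketch of a training yaml -- check a generated file
# for the authoritative schema.
data:
  - cube_dir: /data/brain1/curation/cells       # cubes extracted around cells
    cell_def: /data/brain1/curation/cells.xml   # assumed cell-definition file
  - cube_dir: /data/brain1/curation/non_cells   # cubes around artefacts
    cell_def: /data/brain1/curation/non_cells.xml
```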
## Start training

You can then use these yaml files for training.

_N.B. If you have any yaml files from previous versions of cellfinder, they will continue to work, but are not documented here. Just use them as you would the files from `cellfinder_curate`._

```bash
cellfinder_train -y yaml_1.yml yaml_2.yml -o /path/to/output/directory/
```

### Arguments

* `-y` or `--yaml` The path to the yaml files defining training data
* `-o` or `--output` Output directory for the training results \(the trained model or model weights\)

**Optional**

* `--continue-training` Continue training from an existing trained model. If no model or model weights are specified, this will continue from the included model.
* `--trained-model` Path to a trained model to continue training
* `--model-weights` Path to existing model weights to continue training
* `--network-depth` ResNet depth \(based on [He et al. \(2015\)](https://arxiv.org/abs/1512.03385)\). Choose from \(18, 34, 50, 101 or 152\). In theory, a deeper network should classify better, at the expense of a larger model and longer training time. Default: 50
* `--batch-size` Batch size for training \(how many cell candidates to process at once\). Default: 16
* `--epochs` How many times to use each sample for training. Default: 1000
* `--test-fraction` What fraction of data to keep for validation. Default: 0.1
* `--learning-rate` Learning rate for training the model
* `--no-augment` Do not use data augmentation
* `--save-weights` Only store the model weights, and not the full model. Useful to save storage space.
* `--no-save-checkpoints` Do not save the model after each training epoch. Useful to save storage space, if you are happy to wait for the chosen number of epochs to complete. Each model file can be large, and if you don't have much training data, they can be generated quickly.
* `--tensorboard` Log to `output_directory/tensorboard`. Use `tensorboard --logdir output_directory/tensorboard` to view.
* `--save-progress` Save training progress to a `.csv` file \(`output_directory/training.csv`\).
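For example, to fine-tune the included model on two curated datasets while keeping only the weights and logging progress \(file names are hypothetical; the flags are those documented above\):

```shell
# Hypothetical example: continue training from the bundled model,
# saving only weights and logging progress for later inspection.
cellfinder_train \
    -y brain1.yml brain2.yml \
    -o /path/to/training_output \
    --continue-training \
    --epochs 100 \
    --save-weights \
    --tensorboard \
    --save-progress
```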
### Further help

All `cellfinder_train` options can be found by running:

```bash
cellfinder_train -h
```
user-guide/untitled-1/using-supplied-training-data.md (new file, +51)

---
description: How to retrain the network using the supplied data
---

# Using supplied training data

cellfinder is released with a pre-trained cell candidate classification network, trained on approximately 100,000 manually annotated cell candidates \(with a roughly 50/50 split between cells and non-cells\).

This data was acquired using [serial two-photon tomography](https://www.nature.com/articles/nmeth.1854). While you will likely need to retrain the network for your own data, we make the data available for a few reasons:

* You might want to use this data to test the training, or to assess how much training data you may need
* You might want to retrain a different network \(i.e. a different ResNet depth\) than the supplied 50-layer one
* You might want to retrain the network using a mixture of this data \(of which there is a lot\) and your own data \(of which you may not be able to generate as much\)

The data is available [here](https://gin.g-node.org/cellfinder/training_data/raw/master/serial2p.tar.gz). To retrain the network using just this data, download the data, extract the tar archive, and then follow these steps:
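On Linux or macOS, the download and extraction might look like this \(assuming `wget` and `tar` are available; the archive is large, so allow time and disk space\):

```shell
# Download the supplied training data and unpack it; extraction
# creates the serial2p/ directory used in the steps below.
wget https://gin.g-node.org/cellfinder/training_data/raw/master/serial2p.tar.gz
tar -xzf serial2p.tar.gz
```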
{% hint style="info" %}
If you're using Windows, you will need to edit `training.yml` so that the paths \(in each `cube_dir` and `cell_def` entry\) match Windows paths \(i.e. backslashes\)
{% endhint %}

* Activate your conda environment:

```text
conda activate cellfinder
```

* Navigate to the training data directory:

```text
cd serial2p
```

* Start training:

```text
cellfinder_train -y training.yml -o training_output
```

The training will likely take a few minutes to get going; once the network starts, you should see something like this:

```text
Epoch 1/100
   1/6050 [..............................] - ETA: 0s - loss: 0.9579 - accuracy:
   2/6050 [..............................] - ETA: 1:33:47 - loss: 3.1335 - accur
   3/6050 [..............................] - ETA: 3:10:17 - loss: 2.6173 - accur
   4/6050 [..............................] - ETA: 4:03:42 - loss: 2.2663 - accur
   5/6050 [..............................] - ETA: 4:30:16 - loss: 2.0002 - accur
```

0 commit comments