Commit 834b200

yuminguw and claude committed
Add CLAUDE_CTFIRE.md with vertex convention and pipeline notes
Documents the hard-won lessons from this session so future work avoids the same pitfalls: vertex index convention (X[v] not X[v-1]), coordinate layout (col 0 = row, col 1 = col), trimxfv behavior, pipeline data flow, and the fiber overlay function.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 461db21 commit 834b200

1 file changed

Lines changed: 97 additions & 0 deletions

CLAUDE_CTFIRE.md

@@ -0,0 +1,97 @@
# CTFire Python Conversion: Developer Notes

Critical conventions and hard-won lessons for working on `src/ctfire_py/`.

---

## Vertex Index Convention

**Vertex `v` is stored at `X[v]`: a direct 0-based numpy index, no subtraction.**

The C++ backend (`extend_xlink_native`, `fiberproc_native`) uses 0-based vertex indices throughout, exactly matching numpy array positions. `extend_xlink` outputs vertices starting from index 0 with `min_v=0`.

After the Python `trimxfv` compacts the array, new indices are also 0-based (position in the sorted `vertices_used` list). All Python code that accesses vertex coordinates must use `X[v]`, not `X[v-1]`.
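
As a rough illustration of why the direct index matters, the sketch below compares `X[v]` with `X[v - 1]` on a made-up three-vertex array. The array contents and the `fiber` list are hypothetical and are not taken from the pipeline.

```python
import numpy as np

# Hypothetical vertex array: one row per vertex, columns are (row, col, channel).
X = np.array([
    [10.0, 20.0, 1.0],   # vertex 0
    [11.0, 21.0, 1.0],   # vertex 1
    [12.0, 22.0, 1.0],   # vertex 2
])

# Hypothetical fiber: an ordered list of 0-based vertex ids.
fiber = [0, 1, 2]

# Correct: index directly with the vertex id.
coords = X[fiber]

# Wrong under the 0-based convention: v - 1 maps vertex 0 to index -1,
# which numpy silently wraps to the last row instead of raising an error.
wrong = X[[v - 1 for v in fiber]]
```

Because negative indices wrap around in numpy, the off-by-one pattern produces plausible-looking but wrong coordinates and never fails loudly.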

### Modules that access vertex coordinates

| File | Correct pattern |
|---|---|
| `utils/trimxfv.py` | `X[list(vertices_used)]` (direct index) |
| `fiber_processing/curvealign_filter.py` | `v1_idx = int(v)` (no `- 1`) |
| `fiber_processing/fiber2beam.py` | new vertex start = `N_verts` (not `N_verts + 1`) |
| `test_fire_2d.py::plot_fiber_overlay` | `X_arr[v0]` (no `- 1`) |

### Modules with a pre-existing off-by-one (not yet fixed)

These modules subtract 1 before indexing. They were written assuming 1-based indices and have worked only because vertex 0 is a phantom in the `process_fibers` output. They affect statistics and angle calculations, not the overlay:

- `fiber_analysis/network_stats.py`: `v1_idx = v1 - 1`
- `fiber_analysis/fiber_angles.py`: `v1 = fv[0] - 1`

---

## `trimxfv.py`: How It Works

`trimxfv` compacts X/F/V after filtering removes some fibers:

1. Collect `vertices_used = sorted(unique v from all remaining fibers)`
2. `X_trimmed = X[list(vertices_used), :]` picks rows by direct index
3. `old_to_new[old_v] = new_idx` gives the 0-based renumbering
4. Fiber vertex lists are remapped through `old_to_new`
5. `R_trimmed = R[list(vertices_used)]` uses the same direct indexing
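
The five steps can be sketched roughly as follows. This is a minimal illustration, not the code in `utils/trimxfv.py`: the function name `trimxfv_sketch`, the representation of `F` as a list of vertex-id lists, and the treatment of `R` as a per-vertex 1-D array are assumptions, and the `V` structure is left out.

```python
import numpy as np

def trimxfv_sketch(X, F, R):
    """Compact X and R to the vertices still referenced by the fibers in F.

    Assumed shapes: X is (n_vertices, 3), F is a list of fibers (each a list
    of 0-based vertex ids), R is a (n_vertices,) array of per-vertex values.
    """
    # Step 1: sorted unique vertex ids across all remaining fibers.
    vertices_used = sorted({int(v) for fiber in F for v in fiber})

    # Step 2: pick rows by direct index (no "- 1").
    X_trimmed = X[list(vertices_used), :]

    # Step 3: map each old id to its position in vertices_used.
    old_to_new = {old_v: new_idx for new_idx, old_v in enumerate(vertices_used)}

    # Step 4: remap every fiber through old_to_new.
    F_trimmed = [[old_to_new[int(v)] for v in fiber] for fiber in F]

    # Step 5: same direct indexing for R.
    R_trimmed = R[list(vertices_used)]
    return X_trimmed, F_trimmed, R_trimmed


# Hypothetical input: 6 vertices, of which 0, 2 and 3 are no longer used.
X = np.arange(18, dtype=float).reshape(6, 3)
R = np.arange(6, dtype=float) * 10.0
F = [[1, 4], [4, 5]]

X_t, F_t, R_t = trimxfv_sketch(X, F, R)
```

After the call, `F_t` refers to rows of `X_t` directly, so downstream code can keep using `X_t[v]` without any offset.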

**The historical bug (now fixed):** the code previously used `X[[v-1 for v in vertices_used]]`. For *consecutive* vertices this accidentally worked (the off-by-one was masked by the phantom at position 0). For *non-consecutive* vertices (e.g., after aggressive filtering), it loaded the wrong row for each vertex, causing 200–600 pixel jumps in fiber centerlines.
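
The wrong-row effect for non-consecutive vertices can be reproduced on toy data. The sketch below uses a hypothetical array in which row `i` encodes vertex `i`; it only shows the index shift and does not attempt to reproduce the phantom-vertex masking seen with the real pipeline output.

```python
import numpy as np

# Hypothetical array where row i holds the coordinates of vertex i.
X = np.array([[float(i), float(i) * 100.0] for i in range(8)])

vertices_used = [1, 2, 5, 7]  # non-consecutive ids, as left by filtering

correct = X[vertices_used]                   # current pattern
buggy = X[[v - 1 for v in vertices_used]]    # old pattern

# Collect the vertex ids for which the old pattern loaded a different row.
mismatched = [
    v for v, c, b in zip(vertices_used, correct, buggy)
    if not np.array_equal(c, b)
]
```

Here every used vertex receives the coordinates of the vertex id one below it, which may belong to an unrelated fiber.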

---

## Coordinate Layout in Vertex Arrays

`X[:, 0]` = **row** (image y), `X[:, 1]` = **col** (image x), `X[:, 2]` = channel (always 1 for 2D).

This is confirmed empirically: plotting nucleation points as `scatter(x=xlink[:,1], y=xlink[:,0])` (matplotlib convention) places dots on the actual fiber structures. Swapping the two columns misaligns the dots.
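
A small helper can make the row/col swap explicit at plotting time. The function name `to_plot_xy` and the sample points are hypothetical; only the column layout comes from the notes above.

```python
import numpy as np

def to_plot_xy(points):
    """Split (row, col, ...) points into matplotlib-style x and y arrays."""
    points = np.asarray(points)
    x = points[:, 1]  # col -> horizontal axis
    y = points[:, 0]  # row -> vertical axis
    return x, y

# Hypothetical nucleation points in (row, col, channel) layout.
xlink = np.array([[47, 481, 1], [100, 20, 1]])
x, y = to_plot_xy(xlink)

# With matplotlib, these would typically be drawn over the image with
# ax.imshow(img) followed by ax.scatter(x, y).
```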

---

## Pipeline Data Flow

```
extend_xlink → Xz/Fz (0-based, min_v=0, max_v=len-1)

check_danglers → trimxfv → Xz2/Fz2 (0-based, compacted)

process_fibers (C++) → Xa/Fa (0-based, C++ never renumbers)

fiberbreak → trimxfv → Xc/Fc (0-based, compacted)

curvealign_filter → trimxfv → Xf/Ff (0-based, final filtered set)
```

`Xf`/`Ff` are the correct inputs for the fiber overlay.
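
One way to guard these invariants between stages is a small check on each X/F pair. The helper below is a hypothetical sketch (the name `check_zero_based` and the list-of-lists form of `F` are assumptions) and is not part of the existing pipeline.

```python
import numpy as np

def check_zero_based(X, F, compacted=False):
    """Raise ValueError if fiber vertex ids do not index X directly."""
    used = sorted({int(v) for fiber in F for v in fiber})
    if not used:
        return
    if used[0] < 0 or used[-1] >= len(X):
        raise ValueError(
            f"vertex ids {used[0]}..{used[-1]} out of range for {len(X)} rows"
        )
    # After a trimxfv step, every row of X should be referenced by a fiber.
    if compacted and used != list(range(len(X))):
        raise ValueError("array is not compacted: some rows are unused")

# Hypothetical compacted output: 3 vertices, all referenced.
Xf = np.zeros((3, 3))
Ff = [[0, 1], [1, 2]]
check_zero_based(Xf, Ff, compacted=True)  # passes silently
```

The `compacted=True` form would apply after the `trimxfv` stages; the `process_fibers` output keeps unused slots, so only the range check is meaningful there.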

---

## Fiber Overlay (`plot_fiber_overlay` in `test_fire_2d.py`)

- Background: normalize image to `[0, 1]` with `img / img.max()`. Do **not** use histogram equalization; it makes the background look unrealistic/saturated.
- Centerlines: 1-pixel-thick Bresenham lines via `skimage.draw.line`.
- Colors: HSV colormap cycling over `n_fibers`.
- Access pattern: `X_arr[v, 0]` = row, `X_arr[v, 1]` = col, with no index offset.
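
The overlay recipe can be approximated with numpy and the standard library alone. The sketch below is not `plot_fiber_overlay`: the function name `overlay_sketch` is hypothetical, a dense-sampling rasterizer stands in for `skimage.draw.line`, and `colorsys.hsv_to_rgb` stands in for the HSV colormap. It assumes a 2D grayscale image.

```python
import colorsys
import numpy as np

def overlay_sketch(img, X_arr, F):
    """Return an RGB float image with one colored centerline per fiber."""
    img = np.asarray(img, dtype=float)
    peak = img.max()
    gray = img / peak if peak > 0 else img      # normalize to [0, 1]
    rgb = np.stack([gray, gray, gray], axis=-1)

    n_fibers = max(len(F), 1)
    for k, fiber in enumerate(F):
        color = colorsys.hsv_to_rgb(k / n_fibers, 1.0, 1.0)  # cycle the hue
        for v0, v1 in zip(fiber[:-1], fiber[1:]):
            r0, c0 = X_arr[v0, 0], X_arr[v0, 1]  # col 0 = row, col 1 = col
            r1, c1 = X_arr[v1, 0], X_arr[v1, 1]
            # Stand-in rasterizer: sample the segment densely and round.
            n = int(max(abs(r1 - r0), abs(c1 - c0))) + 1
            rr = np.rint(np.linspace(r0, r1, n)).astype(int)
            cc = np.rint(np.linspace(c0, c1, n)).astype(int)
            inside = (
                (rr >= 0) & (rr < rgb.shape[0])
                & (cc >= 0) & (cc < rgb.shape[1])
            )
            rgb[rr[inside], cc[inside]] = color
    return rgb

# Hypothetical 5x5 image with a single diagonal fiber.
img = np.full((5, 5), 200.0)
X_arr = np.array([[0.0, 0.0, 1.0], [4.0, 4.0, 1.0]])
out = overlay_sketch(img, X_arr, [[0, 1]])
```

Vertex ids index `X_arr` directly throughout, matching the access pattern in the last bullet.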

---

## C++ Backend Notes

- **`findlocmax_native`**: outputs `xlink[:,0]` = row, `xlink[:,1]` = col (verified empirically). The column naming in the C++ source (`i`=col, `j`=row) is misleading because the flat array is passed row-major from Python, making `i` iterate rows.
- **`fiberproc_native / trimxfv_cpp`**: explicitly does **not** renumber vertices. Unused vertex slots remain; their `V[v].f` is empty.
- **`extend_xlink_native`**: the 2D constructor is called as `ExtendXLink(sizey=J=height, sizez=I=width, ...)`. Image is accessed row-major (`image[p[0]*sizex + p[1]]`).
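
The row-major access pattern can be checked from the Python side with plain numpy. This is only an illustration of the flat layout, assuming the stride equals the image width; it makes no claim about the C++ source beyond the formula quoted above.

```python
import numpy as np

height, width = 3, 4
img = np.arange(height * width).reshape(height, width)

# ravel() on a C-contiguous array yields the elements in row-major order.
flat = img.ravel()

row, col = 2, 1
value = flat[row * width + col]  # same element as img[row, col]
```

With this layout the first index of a point selects the row, which is consistent with `xlink[:,0]` being the row coordinate.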

---

## Test Images

- `tests/test_images/real1.tif`: 512×512 grayscale, range [0, 255]. Bright pixel peak at (row=47, col=481).
- Synthetic image: generated in `test_fire_2d.py::create_synthetic_fiber_image`.

## Parameters

`thresh_im2=5` gives dense extraction (142 filtered fibers for real1.tif). `thresh_im2=50` is more selective (~25–73 fibers). Very low thresholds pick up background noise as fiber-like structures.
