Skip to content

Add a StereoDispCombiner#2731

Draft
Hckjs wants to merge 64 commits into
mainfrom
stereo_combiner
Draft

Add a StereoDispCombiner#2731
Hckjs wants to merge 64 commits into
mainfrom
stereo_combiner

Conversation

@Hckjs
Copy link
Copy Markdown
Contributor

@Hckjs Hckjs commented Apr 2, 2025

This is basically the StereoCombiner already implemented here
in magic-cta-pipe.

It combines the mono DISP reconstruction by

  1. calculating combinations of 2 telescopes for every array event
  2. calculating the minimum distance for the 4 possible SIGN pairs for each combination
  3. calculating the weighted average for the minimum distance SIGN pair (fov lon/lat) per combination
  4. calculating the weighted average of all combinations for an array event with the summed weights from 3).

Update:
The StereoDispCombiner has essentially been generalized so that it can be configured to reproduce both the magic-cta-pipe and the EventDisplay behavior.

  • The number N of telescope combinations was introduced as a traitlet.
  • In addition, it is now possible to choose how many of the best telescopes, ranked by their weights, should participate in the averaging.
  • Furthermore, a check analogous to the one in EventDisplay was introduced to reject subarray events with two participating telescopes whose shower axes are nearly parallel.

For each telescope combination of size n_tel_combinations, all possible DISP sign assignments are then evaluated, and the sign combination that minimizes the Sum of Squared Errors (SSE) between the participating telescopes is selected. The resulting per-combination FoV positions are then combined using the selected telescope weights, and a weighted mean FoV direction is computed for each subarray event.

If a subarray event has an valid telescope multiplicity smaller than n_tel_combinations (after applying n_best_tels and any angular-difference cuts), but at least two telescopes remain, the reconstruction is performed using all available telescopes of that subarray event. In this case, only one combination is formed, with a size equal to the event multiplicity, and the optimal DISP sign assignment is determined accordingly. Single-telescope events are handled separately using just the mono reconstruction.

Some hints for reviewing are here

@Hckjs Hckjs requested review from LukasBeiske, kosack and maxnoe April 2, 2025 16:50
@Hckjs Hckjs changed the title Adding a StereoDispCombiner Add a StereoDispCombiner Apr 2, 2025
Comment thread src/ctapipe/image/statistics.py Outdated
Comment thread src/ctapipe/reco/stereo_combination.py
Comment thread src/ctapipe/reco/stereo_combination.py Outdated
@Hckjs Hckjs force-pushed the stereo_combiner branch from 8ecf073 to e998c78 Compare May 1, 2025 18:28
@LukasBeiske LukasBeiske force-pushed the stereo_combiner branch 3 times, most recently from 47d2798 to 8902c7f Compare May 15, 2025 11:45
Copy link
Copy Markdown
Member

@kosack kosack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a predicted disp.sign_score that is not used in this method, which seems odd. Is that the reason why the pair-wise average including both disps is made rather than a simple weighted average over predictions? Is there citation for the MARS method, by the way? i'd be interested to see if simpler methods were tested.

Comment thread src/ctapipe/reco/stereo_combination.py Outdated
@LukasBeiske
Copy link
Copy Markdown
Contributor

LukasBeiske commented May 15, 2025

We have a predicted disp.sign_score that is not used in this method, which seems odd. Is that the reason why the pair-wise average including both disps is made rather than a simple weighted average over predictions? Is there citation for the MARS method, by the way? i'd be interested to see if simpler methods were tested.

This sign_score is mostly meant for performance plots (sign_score = np.abs(2 * sign_proba - 1.0) where sign_proba is the output of the sign classifier).
The weighted average over telescope-wise (alt, az) predictions is what the StereoMeanCombiner does for the disp reconstruction, but this method here seemed to perform better in a master thesis in Dortmund some years ago. We can add some performance plots here once the remaining issues are figured out.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented May 15, 2025

The weighted average over telescope-wise (alt, az) predictions is what the StereoMeanCombiner does for the disp reconstruction, but this method was proven to perform better in a master thesis in Dortmund some years ago.

It's also easy to argue why: especially at low energies, disp sign is not really reconstructible (random guess, sign_score ~ 0.5). But with multiple telescopes, it's quite easy to see which option is actually correct and make the weighted average over the correct locations.

There are still a couple of different options for stereo disp, but most of them do not actually use the sign prediction, that is only used for mono as far as I know.

The MAGIC MARS method is explained in a short paragraph here:

Foreach of them, the disp identifies two possible reconstructed
source positions (head-tail ambiguity). At this stage for
each stereoscopic event, melibea evaluates all four possible
combinations of position pairs, and chooses the closest pair,
but only if its distance is smaller than a certain value. If none
of the pairs satisfies this condition, the event is rejected. The
stereo-reconstructed position is determined as the weighted
average of the chosen pair of positions.

https://cbpf.br/icrc2013/papers/icrc2013-0773.pdf

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Nov 3, 2025

We have a predicted disp.sign_score that is not used in this method, which seems odd.

Not really, this is more of a book-keeping for doing lower-level performance plots. It could be used for a quality query or similar, but AFAIK, this was not yet done anywhere and would be a completely new development that needs to be tested. Would be a nice study though.

@Hckjs
Copy link
Copy Markdown
Contributor Author

Hckjs commented Nov 3, 2025

We have a predicted disp.sign_score that is not used in this method, which seems odd.

Not really, this is more of a book-keeping for doing lower-level performance plots. It could be used for a quality query or similar, but AFAIK, this was not yet done anywhere and would be a completely new development that needs to be tested. Would be a nice study though.

During the quantitative analysis of the StereoDispCombiner in its current form, i noticed that particularly high-energy events with a high scatter radius have an almost parallel reconstructed shower axis (often events with only 2 valid telescopes). This worsens performance compared to the StereoMeanCombiner at high energies because it isnt just averaging but choosing the complete wrong side. However, since this occurs mostly at high energies, the sign scores are also significantly higher, which I tested as a weighting factor in the selection of the minimum distance and was able to achieve significantly better performance for high-energy events. So far, I have only tested this on small data sets, but I will upload a few comparison plots soon.

@Hckjs
Copy link
Copy Markdown
Contributor Author

Hckjs commented Nov 17, 2025

combiner_theta2_reco_lon_lat_zen_20_az_0_ac_full_array.pdf
combiner_irfs_sens_zen_20_az_0_ac_full_array.pdf

The StereoDispCombinerSS is using the mentioned sign_score weightings on the minimum distance calculation. The effect doesnt seem to be as high as expected in the beginning...
I've also tried a KMeans and a DBScan approach having a quite similar performance compared to the DispCombiner. However they are slower because of a way less efficient table-wise processing.

@Hckjs Hckjs added enhancement algorithm module:reco issues related to ctapipe.reco labels Nov 26, 2025
@ctao-sonarqube
Copy link
Copy Markdown

Quality Gate failed Quality Gate failed

Failed conditions
1 New issue
3.5% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE SonarQube for IDE

Comment thread src/ctapipe/reco/stereo_combination.py Outdated
Comment thread src/ctapipe/reco/stereo_combination.py Outdated
@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Nov 27, 2025

I'm not quite sure I understand the logic of weighting distances, which should be related to the absolute value of disp, with the sign score. Especially since we are trying out all combinations, effectively ignoring the predicted sign.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 1, 2025

Sorry for the above message, I forgot about the comment of near-parallel high energy showers. So penalizing choosing what the classifier thinks is a clearly wrong sign is good I think.

I'm not sure though that weighting the distance achieves this goal.

For your comparison plots, it might help to include large percentiles to see differences. It expect this to mostly affect rare events, so the effect is probably only seen in something like a 90 % or 95 % containment, not 68 %.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Dec 1, 2025

Could you also maybe expand a bit what datasets you are using here and how they have been processed?
Maybe also add the HillasReconstructor to the comparison?

Comment thread src/ctapipe/reco/stereo_combination.py Outdated
Comment thread src/ctapipe/reco/stereo_combination.py Outdated
Comment thread src/ctapipe/reco/stereo_combination.py Outdated
Comment thread src/ctapipe/reco/stereo_combination.py
Hckjs added 2 commits February 4, 2026 09:04
 - reduce .copy() when unecessary
 - remove indirection in __call__
@Hckjs
Copy link
Copy Markdown
Contributor Author

Hckjs commented Feb 4, 2026

Update:

  • The algorithm has been generalized by n_tel_combinations, allowing the EventDisplay behavior to be reproduced.
    • A configurable min_ang_diff cut was added to reject events with nearly parallel shower axes (for multiplicity = 2).
    • A configurable n_best_tels trait was added to limit the stereo combination to the N telescopes with the highest weights

Open points for discussion

  • I set the property trait of the StereoDispCombiner to config=False, since it is not meant to be user-configurable (as discussed). At the same time, .from_name() in the Reconstructor subclasses still works and raises a proper error if, for example, a user assigns the StereoDispCombiner to the EnergyReconstructor via the config-system. I am not fully sure whether this is in line with the intended architecture, as we currently do not have other non-configurable traits.

  • For event-wise processing, I integrated the StereoQualityQuery. I am unsure whether it makes sense to define an additional QualityQuery for table-wise processing as well; if so, we may need to revisit the naming.

  • Since the is_valid field was removed from the DISP container, I added an explicit double check:

    if not dl2.geometry[self.prefix].is_valid:
        continue
    if self.prefix not in dl2.disp:
        raise RuntimeError(
            f"No valid DISP reconstruction parameter found for "
            f"prefix='{self.prefix}'. "
            f"Make sure to apply the DispReconstructor before using the "
            f"StereoDispCombiner or adapt the prefix accordingly."
        )

@Hckjs Hckjs marked this pull request as ready for review February 4, 2026 08:48
@ctao-sonarqube
Copy link
Copy Markdown

ctao-sonarqube Bot commented Feb 4, 2026

@Hckjs Hckjs requested review from kosack and maxnoe February 4, 2026 09:24
@Hckjs
Copy link
Copy Markdown
Contributor Author

Hckjs commented Apr 3, 2026

The StereoDispCombiner table-wise implementation in predict_table() became somewhat more complex through this generalization, so I wanted to add a few notes on the reasoning behind the implementation to make the review a bit easier. The event-wise processing was comparatively straightforward, so I leave that aside here.

With the introduction of the n_tel_combinations trait, one difficulty in the broadcasting-based table-wise implementation was that the dimension of the combinations becomes variable for subarray events with multiplicities smaller than n_tel_combinations. I considered several approaches for handling this:

  • Variable-length arrays: in principle possible, but since we do not have a direct dependency on packages that support simple broadcasting and vectorized operations on variable-length arrays, I wanted to avoid this route.
  • Masked arrays + padding: also possible in principle. I did look into this approach and thought about what the final implementation would need to look like, but I came to the conclusion that it would make the already fairly complex code significantly harder to maintain.
  • A small for-loop over multiplicities < n_tel_combinations: this is the approach I ultimately chose. It allows me to reuse parts of the event-wise processing, keeps the overall code more maintainable, and since n_tel_combinations is not expected to be chosen very large anyway, the loop remains small. In practice, values larger than about 5 are not really advisable because the computation time increases rapidly. For example, with n_tel_combinations = 5, the loop only has to cover multiplicities 4, 3, and 2, so the additional overhead remains manageable.

A few additional notes on _compute_altaz_for_valid() for easier understanding:

  • The main idea was to construct an array that allows me to compute the minimum SSE for all combinations via broadcasting. For this, I use the index_tel_combs array, which essentially contains all telescope combinations in sequence, with telescope indices of the input table as elements.
  • To make this work, I needed a few helper functions for the computation itself, as well as mappings to assign the results back to the corresponding subarray events, most notably combs_to_array_indices.
  • Beforehand, I generate a combs_array, which precomputes all k-combination patterns up to max_multiplicity. This is then later used with the actual tel_indices to construct the final index_tel_combs array.
  • As mentioned above, this is only done for multiplicities >= n_tel_combinations. Afterwards, I loop over all smaller multiplicities, since in those cases each event has only a single combination with size equal to its multiplicity.
  • Finally, single-telescope events are simply filled with the mono predictions.

Hope that helps a bit

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Apr 7, 2026

Variable-length arrays: in principle possible, but since we do not have a direct dependency on packages that support simple broadcasting and vectorized operations on variable-length arrays, I wanted to avoid this route.

I'm not sure that this is the right approach here, but we already do something like this in the number of islands code, where we use the internal representation of a scipy csr sparse matrix, which essentially is an array of variable length arrays.

@maxnoe
Copy link
Copy Markdown
Member

maxnoe commented Apr 7, 2026

You could write a numba function that takes a single 1d array of all disp values and an array of multiplicities per stereo event. You can then index into the 1d array of disp values.

is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the array dimensions are inferred from the index arrays.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_array.html

return stereo_fov_lon, stereo_fov_lat, True

if (
multiplicity == 2
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is two a special case here? Don't we need to check min_ang_diff for each pair and check if we got any pairs left after that check?

@Hckjs
Copy link
Copy Markdown
Contributor Author

Hckjs commented Apr 18, 2026

Why is two a special case here? Don't we need to check min_ang_diff for each pair and check if we got any pairs left after that check?

In a first quick check, it looked like the largest performance losses related to near-parallel main shower axes occur mainly for multiplicity n=2. At the same time, I had found an old comment in the EventDisplay code suggesting that this was also only checked for multiplicity 2, although I realized now that this comment is outdated.

After looking through the EventDisplay implementation more carefully, I think you are probably right that it makes more sense to apply this to all events. In EventDisplay, the code computes the mean angle over all pairwise telescope-axis combinations and then applies a cut on this averaged angle. So in that implementation the full event is either kept or rejected; it does not just remove the individual 2-telescope combinations with small angular separation.

I also noticed that EventDisplay uses a more L1-like minimum-distance criterion, i.e. a weighted mean pairwise distance, rather than the SSE that is currently implemented in this PR. The SSE penalizes outliers more strongly, which can be beneficial when the correct sign assignment leads to a tight cluster of telescope-wise solutions. On the other hand, the Euclidean distance criterion is probably somewhat more robust against poorly reconstructed main shower axes.

In addition, for multiplicities m>5, EventDisplay uses the Hillas-intersection result as a starting point and then chooses for each telescope the DISP sign whose solution lies closest to that Hillas-intersection estimate. This also seems broadly consistent with the results from Lukas Nickel’s Master’s thesis, where the geometric/Hillas reconstruction performs better at higher multiplicities.

It also looks like EventDisplay does not apply any explicit n_best_telescopes selection, at least not in the DISP combination logic I checked.

To reproduce the EventDisplay behavior more exactly for benchmarking studies, I would otherwise like to implement these three points:

  • Averaged angle cut
  • weighted mean pairwise distance instead of SSE (make it optional maybe for performance testing)
  • Make use of HillesReconstructor estimate on m > 5

I am just not sure how quickly I will manage that over the next days or weeks, since I am currently finishing my thesis.

You could write a numba function that takes a single 1d array of all disp values and an array of multiplicities per stereo event. You can then index into the 1d array of disp values.

is the standard CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. If the shape parameter is not supplied, the array dimensions are inferred from the index arrays.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_array.html

That looks like a good alternative. I am just not sure yet whether, and by how much, it would actually be faster in practice. I would need to test that first.

Thanks for the comments!

@Hckjs Hckjs marked this pull request as draft April 20, 2026 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

algorithm enhancement module:reco issues related to ctapipe.reco

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants