feat: AdePT physics list plugin for e+/e-/gamma offload to GPU [WIP]#1606
wdconinc wants to merge 58 commits into AIDASoft:master
Test Results 18 files 18 suites 6h 13m 49s ⏱️ For more details on these failures, see this check. Results for commit a70d61a. ♻️ This comment has been updated with latest results.
Yes, I noticed when filing the PR that the unrelated DDCore change snuck in. I'll remove it when I am back at a computer.
cc @SeverinDiederichs (FYI)
When AdePT's callUserTrackingAction=false (the default for performance), GPU-produced hits and hadronic secondaries carry trackID/parentID=0 from the dummy HostTrackData. This caused two classes of errors:

1. 'No Equivalent particle for track:0' (from Geant4ParticleMap::particleID)
   GPU hits have trackID=0, and when Geant4Output2ROOT tries to remap them to final particle IDs, it calls particleID(0), which fails.
   Fix: in Geant4ParticleHandler::endEvent(), after rebaseSimulatedTracks, add m_equivalentTracks[0] pointing to the primary particle (g4id=1) so that all GPU hits with dummy trackID=0 are correctly attributed.

2. Hadronic secondaries returned from GPU with parentID=0, breaking the MC truth parent chain.
   Fix: Geant4AdePTUserParticleHandler::begin() remaps particle.g4Parent from 0 to the entering primary's G4 track ID.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
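The effect of the first fix can be illustrated with a toy model of the ID remapping. This is only a sketch in Python, not the DD4hep C++ code: the function `particle_id` and the two plain dictionaries are hypothetical stand-ins for Geant4ParticleMap::particleID, the particle map, and m_equivalentTracks.

```python
def particle_id(g4_track_id, particle_ids, equivalent_tracks):
    """Toy lookup: map a G4 track ID to an output particle ID.

    particle_ids      : {g4 track ID -> output particle ID} for stored particles
    equivalent_tracks : {g4 track ID -> g4 track ID of a stored particle}
    Both containers are hypothetical simplifications for illustration.
    """
    if g4_track_id in particle_ids:
        return particle_ids[g4_track_id]
    if g4_track_id in equivalent_tracks:
        return particle_ids[equivalent_tracks[g4_track_id]]
    raise KeyError(f"No Equivalent particle for track:{g4_track_id}")


# Primary (G4 track 1) is stored as output particle 0.
particle_ids = {1: 0, 2: 1}
equivalent_tracks = {5: 2}

# The fix described above: attribute the dummy trackID=0 to the primary.
equivalent_tracks[0] = 1
print(particle_id(0, particle_ids, equivalent_tracks))
```

Without the extra entry for key 0, the same lookup raises the error quoted in the commit message.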
Together with apt-sim/AdePT#546 this now offloads tracks to my (very crappy) GPU.
…ePTPhysics

LastNParticlesOnCPU: when the in-flight count drops below this threshold the remaining particles are leaked back to Geant4/HepEm on CPU, terminating the GPU transport loop early. Setting this to a small value (e.g. 10-100) avoids launching many near-empty kernels during the long shower tail. Default 0 preserves the previous behaviour (always finish on GPU).

SpeedOfLight: debug/benchmark mode that kills all e-/e+/gamma immediately without tracking them (equivalent to setting their mean free path to zero). Useful for measuring geometry or non-EM overhead in isolation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document and expose the property in the example steering file with a comment explaining its effect on GPU kernel launch efficiency. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In ddsim-mt (PR#1240), Geant4ParticleHandler is created inside
__setupGeneratorActions, which runs as a UserInitialization callback
during G4RunManager::Initialize() -- after setupPhysics has returned.
The previous approach of looking up the handler via
kernel.generatorAction().get('ParticleHandler')
inside setup_physics no longer works: the object either doesn't exist
yet or its adopt() method is not available on the returned wrapper.
The FIXME in ParticleHandler.py noted that setupUserParticleHandler was
not extensible: it hardcoded only Geant4TCUserParticleHandler and
Geant4TVUserParticleHandler and called exit(1) for anything else.
Add an 'else' branch that supports arbitrary DDG4 action plugin class
names: create the action and call part.adopt(user) without any special
tracker-region configuration. This allows plugins such as
Geant4AdePTUserParticleHandler to be registered simply via:
runner.part.userParticleHandler = "Geant4AdePTUserParticleHandler"
Update AdePTSteeringFile.py to use this clean mechanism in place of the
monkey-patch workaround introduced in the previous commit.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Juan Miguel Carceller <jmcarcell@users.noreply.github.com>
Co-authored-by: sss <sss@karma>
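The generic fallback in setupUserParticleHandler described above could look roughly like the following. This is a hedged sketch rather than the code in ParticleHandler.py: the callables `create_action`, `adopt`, and `configure_tracker_region` are hypothetical placeholders for the DDG4 calls, and only the two handler class names are taken from the commit message.

```python
# Handlers that need the special tracker-region configuration (from the commit text).
TRACKER_REGION_HANDLERS = (
    "Geant4TCUserParticleHandler",
    "Geant4TVUserParticleHandler",
)


def setup_user_particle_handler(name, create_action, adopt, configure_tracker_region):
    """Create the named user particle handler and attach it to the particle handler.

    All three callables are hypothetical stand-ins for the real DDG4 operations.
    """
    if not name:
        return None  # nothing requested
    user = create_action(name)
    if name in TRACKER_REGION_HANDLERS:
        # Known handlers keep their tracker-region setup.
        configure_tracker_region(user)
    # New 'else' behaviour: any other plugin name is simply adopted as-is,
    # instead of terminating with exit(1).
    adopt(user)
    return user


adopted = []
setup_user_particle_handler(
    "Geant4AdePTUserParticleHandler",
    create_action=lambda n: {"name": n},
    adopt=adopted.append,
    configure_tracker_region=lambda user: user.update(configured=True),
)
print(adopted)
```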
Halton sequences are low-discrepancy sequences that fill phase space more evenly than pseudo-random points, so the error on integrated quantities falls off roughly as 1/N (up to logarithmic factors) rather than the 1/sqrt(N) of standard uniform point picking with PRNGs. It's not cheating statistics, since you lose the Poisson statistical properties between two consecutive events. This technique is often referred to as RQMC, randomized quasi-Monte Carlo.

This adds scrambled Halton sequence support to the isotropic generators (where inter-event statistics are not considered since they don't represent real experimental running conditions). The scrambling uses Cranley-Patterson rotation, which is sufficient to remove correlations in three-dimensional phase space sampling. The sequences are scrambled with the random seed, so different runs with different seeds will produce different sequences. This also allows statistical treatment to determine the errors on aggregate quantities (see note in ddsim help).

The various distributions are modified to take a sampler function that can use either PRNG or Halton sequences. For FFbar this is not possible, since it uses an accept/reject algorithm that only works for PRNG.
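As an illustration of the technique (not the code added to the generators), a Halton sequence with a Cranley-Patterson rotation can be sketched in a few lines of Python. The function names, the choice of bases 2 and 3, and the mapping to an isotropic direction are assumptions made for this example.

```python
import math
import random


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def scrambled_halton(n_points, seed, bases=(2, 3)):
    """Yield Halton points shifted by a seed-dependent Cranley-Patterson rotation."""
    rng = random.Random(seed)
    shifts = [rng.random() for _ in bases]  # one random shift per dimension
    for i in range(1, n_points + 1):
        yield tuple((radical_inverse(i, b) + s) % 1.0 for b, s in zip(bases, shifts))


def isotropic_direction(u, v):
    """Map two uniform numbers in [0, 1) to a unit vector, uniform on the sphere."""
    cos_theta = 1.0 - 2.0 * u
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * v
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


directions = [isotropic_direction(u, v) for u, v in scrambled_halton(1000, seed=42)]
mean_z = sum(d[2] for d in directions) / len(directions)
print(f"mean z over 1000 directions: {mean_z:+.4f}")
```

Repeating the run with several seeds gives independent estimates of an aggregate quantity, whose spread can then be used as its statistical error.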
Add basic MT functionality tests with 1, 2, and 4 threads, file-based generator tests (HepMC3, EDM4hep), and a comparison script framework for validating ST vs MT equivalence.

Tests verify:
- MT mode runs without crashes
- Different thread counts (1, 2, 4) work correctly
- File-based input generators work in MT mode
- Backward compatibility with -j 1 (single-threaded)

Fix double-save bug in EDM4hep/LCIO/ROOT output for ST mode

In single-threaded mode, events were saved twice because setupEDM4hepOutput/setupLCIOOutput/setupROOTOutput hardcoded shared=True. Fixed by making the shared flag conditional on NumberOfThreads > 1.

Fix SIGSEGV crash in MT mode: make EventSeeder shared

setupEventSeeder() was called once per worker thread, creating multiple EventSeeder instances with shared=False. During cleanup this caused conflicts/double-free leading to SIGSEGV. Fixed by creating EventSeeder with shared=True (one instance shared across all workers) and guarding against duplicate creation.

Add tests for G4Gun and GPS with macroFile

These tests document that G4Gun and GPS with macroFile work in ST mode but not in MT mode (macros execute during global init before worker threads exist). Generator setup is guarded with numberOfThreads == 1.

fix: additional DDTest changes
Fixes heap corruption and SIGSEGV crashes when using ROOT output in multi-threaded mode.

Root cause: Multiple Geant4 worker threads were accessing ROOT I/O objects concurrently. ROOT's I/O system is not thread-safe by default, causing heap corruption during multi-threaded writes that manifested during exit in TFile::WriteStreamerInfo / TROOT::CloseFiles.

Changes:
1. Call ROOT.EnableThreadSafety() before any ROOT objects are created when numberOfThreads > 1 (MT mode).
2. Add static std::mutex s_rootMutex to Geant4Output2ROOT and protect all ROOT I/O operations with std::lock_guard:
   - commit(): TTree::Fill() and branch operations
   - closeOutput(): file Write() and Close()
   - beginRun(): file creation and opening
   - fill(): branch Fill() operations

The mutex ensures full serialization of ROOT I/O across all worker threads, preventing concurrent access to TFile/TTree/TBranch objects even with ROOT::EnableThreadSafety() in place.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
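The locking pattern itself is independent of ROOT. As a language-neutral illustration, here is a Python analogue of a single class-level lock shared by all output instances; the real change uses a static std::mutex with std::lock_guard in C++, and the class `SerializedOutput` and its `sink` list are hypothetical.

```python
import threading


class SerializedOutput:
    """Toy output action: every instance shares one lock for all I/O."""

    _io_lock = threading.Lock()  # plays the role of the static s_rootMutex

    def __init__(self, sink):
        self._sink = sink  # shared resource, standing in for the output file

    def commit(self, event):
        # Only one worker at a time may touch the shared sink.
        with SerializedOutput._io_lock:
            self._sink.append(event)


def worker(output, worker_id, n_events):
    for i in range(n_events):
        output.commit((worker_id, i))


sink = []
threads = [
    threading.Thread(target=worker, args=(SerializedOutput(sink), w, 1000))
    for w in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(sink))
```

Because the lock lives on the class rather than the instance, per-worker output objects still serialize against each other, which is the property the commit relies on.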
As a defense-in-depth measure alongside the primary fix in AdePT's HostTrackDataMapper (cpuAncestorG4id propagation), force any track whose G4 track ID is in the GPU-assigned range (>= INT_MAX/2, counting down from INT_MAX) to have G4PARTICLE_ABOVE_ENERGY_THRESHOLD set in particle.reason. This ensures such tracks enter the m_particleMap if-branch in Geant4ParticleHandler::end() rather than the else-branch that walks the parent chain and emits 'FATAL: No real particle parent present' when the chain is broken by an unregistered GPU-assigned parent ID. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
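The range check and the forced flag amount to a small piece of bit arithmetic. The sketch below assumes a 32-bit signed INT_MAX and uses a placeholder bit value for G4PARTICLE_ABOVE_ENERGY_THRESHOLD; the real value is defined in DD4hep and may differ, and the helper names are hypothetical.

```python
INT_MAX = 2**31 - 1  # assumption: 32-bit signed int, as on common platforms

# Placeholder bit for illustration only; the real flag value lives in DD4hep.
G4PARTICLE_ABOVE_ENERGY_THRESHOLD = 1 << 1


def is_gpu_assigned_id(track_id):
    """GPU-assigned IDs count down from INT_MAX, so they sit in the upper half."""
    return track_id >= INT_MAX // 2


def force_keep_if_gpu(reason, track_id):
    """Set the 'above energy threshold' bit for tracks with GPU-assigned IDs."""
    if is_gpu_assigned_id(track_id):
        reason |= G4PARTICLE_ABOVE_ENERGY_THRESHOLD
    return reason


print(is_gpu_assigned_id(INT_MAX - 3), is_gpu_assigned_id(42))
```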
When AdePT returns GPU-processed tracks to CPU, the returned tracks carry GPU-assigned track IDs (counting down from INT_MAX). When G4HepEm then handles these tracks and produces hadronic secondaries (e.g. photo-nuclear products W182/W183/W184/W186, neutrons, protons) inline -- before the GPU track PostUserTrackingAction fires -- those secondaries have a GPU-range parentID that is not yet registered in m_particleMap/m_equivalentTracks, causing "FATAL: No real particle parent present" errors.

Fix in Geant4AdePTUserParticleHandler:
- In begin(): if track->GetParentID() is GPU-range, resolve g4Parent by looking up the parent in m_trackCache (populated when the GPU parent itself began). This replaces the GPU parent ID with the CPU ancestor ID.
- In end() fallback: if track->GetParentID() is GPU-range, apply the same cache-based resolution rather than blindly using GetParentID(), which would undo the begin()-time fix for hadronic secondaries not in m_trackCache.
- In end(): GPU-assigned track IDs are forced into m_particleMap (if-branch) via G4PARTICLE_ABOVE_ENERGY_THRESHOLD so they are always registered.
- Update cache on end() instead of erasing, so a second end() call (for tracks that re-enter the GPU region) can still restore correct state.

Fix in Geant4ParticleHandler (core, minimal):
- Add cycle detection (std::set<int> visited) in the m_equivalentTracks walk in end() and rebaseSimulatedTracks() to prevent infinite loops if a self-referential entry is created by any remaining edge case.

Also update AdePTSteeringFile.py to use adequate slot sizes (10M) for realistic simulation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
(Copilot got ahead of itself, merged ddsim-mt into this, and pushed it up. I'll revert this.)
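The cache-based parent resolution and the cycle guard described above can be combined in one short walk. This is a simplified sketch, not the handler code: `track_cache` is modelled here as a plain dictionary from a GPU-range track ID to its parent ID, which is an assumption about what m_trackCache provides, and the function name is hypothetical.

```python
GPU_ID_MIN = (2**31 - 1) // 2  # assumption: GPU-assigned IDs are >= INT_MAX/2


def resolve_cpu_parent(parent_id, track_cache):
    """Follow cached parents until a CPU-range ID is reached.

    track_cache: {GPU-range track ID -> parent ID}, a hypothetical simplification.
    A visited set guards against self-referential or cyclic entries.
    """
    visited = set()
    while parent_id >= GPU_ID_MIN and parent_id in track_cache:
        if parent_id in visited:
            break  # cycle detected: stop instead of looping forever
        visited.add(parent_id)
        parent_id = track_cache[parent_id]
    return parent_id


INT_MAX = 2**31 - 1
# GPU track INT_MAX-1 descends from GPU track INT_MAX, which descends from CPU track 7.
track_cache = {INT_MAX: 7, INT_MAX - 1: INT_MAX}
print(resolve_cpu_parent(INT_MAX - 1, track_cache))
```

A parent ID that is GPU-range but missing from the cache is returned unchanged, which corresponds to the case the end() fallback has to handle separately.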

BEGINRELEASENOTES
ENDRELEASENOTES
This PR adds an AdePT physics list plugin for DD4hep (similar to the celeritas physics list plugin in https://github.com/celeritas-project/celeritas/).
Notes on AdePT integration approach:
- AdePT is provided as a Geant4AdePTPhysics action, which must be added with a helper function to setupUserPhysics in DDsim. This is added through the steering file.
- callUserTrackingAction=true is required, since we need a Geant4AdePTUserParticleHandler to 'repair' the track/particle after it comes back from the GPU to the CPU. This is also added by the steering file.
Notes on DD4hep core changes:
- SiD.xml gets an EcalRegion for testing.
- m_currTrack: https://github.com/wdconinc/DD4hep/blob/a70d61afc812dc027990f29f8a351274fae11d59/DDG4/src/Geant4ParticleHandler.cpp#L349-L350
- DDSim/Helper/ParticleHandler.py gets the passthrough of a generic particle handler.