Description
The scenario:
Five threads concurrently call clfftBakePlan with identically configured FFT plan handles.
Immediate symptoms:
The `assert (NULL == p);` at line 218 of repo.cpp (commit c59712e) triggers.
Then occasionally a crash with a nullptr later on.
The cause:
The function `FFTAction::compileKernels` compiles kernels, but only if they are not already cached. The problem is that the cache query is not protected by a mutex (line 713 in c59712e):

    if( fftRepo.getclProgram( this->getGenerator(), this->getSignatureData(), program, q_device, fftPlan->context ) == CLFFT_INVALID_PROGRAM )

- Five threads concurrently try to `compileKernels` for the first time.
- All threads query the `fftRepo` at the same time.
- All threads get a `CLFFT_INVALID_PROGRAM` return code.
- Consequently, all five threads assume that the kernel has not been cached and compile the kernel.
- All threads call `fftRepo.setclProgram` with the same parameters.
The first call will set the program, the next calls will trigger the assert.
The fix:
Any query to the cache followed by a set on the same cache must be a single atomic operation. Here a scopedLock held across both calls would do the trick.
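To illustrate the idea, here is a minimal sketch of a cache whose query and set happen under one lock. It does not use the real clFFT types: `ProgramCache`, `getOrCompile`, and `countCompilations` are hypothetical names, `int` stands in for `cl_program`, and a `std::mutex` with `std::lock_guard` stands in for clFFT's scopedLock.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the program repository; NOT the real FFTRepo API.
class ProgramCache {
public:
    using Program = int;  // placeholder for cl_program

    // Look up `key`; if it is absent, build the program and store it.
    // The lock is held across both the query and the set, so at most one
    // thread ever compiles a given key.
    Program getOrCompile(const std::string& key,
                         const std::function<Program()>& compile) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = programs_.find(key);
        if (it != programs_.end()) {
            return it->second;  // already cached by an earlier caller
        }
        Program p = compile();
        bool inserted = programs_.emplace(key, p).second;
        assert(inserted);  // same intent as assert(NULL == p): slot must be empty
        (void)inserted;    // silence unused-variable warnings under NDEBUG
        return p;
    }

private:
    std::mutex mutex_;
    std::map<std::string, Program> programs_;
};

// Starts `threadCount` threads that all request the same key and returns
// how many times the compile callback actually ran.
int countCompilations(int threadCount) {
    ProgramCache cache;
    std::atomic<int> compilations{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < threadCount; ++i) {
        workers.emplace_back([&] {
            cache.getOrCompile("same-plan-signature",
                               [&] { return ++compilations; });
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return compilations.load();
}
```

With five threads, `countCompilations(5)` runs the compile callback once, and the other four callers find the cached entry. Holding the lock during compilation also serializes compiles for unrelated keys; whether that cost is acceptable in clFFT is a design choice this sketch does not settle.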
I could prepare a PR, but can only take the time to do so if the PR has a chance of being merged into the code. Is this repository still being maintained? Also, I'd like the fix to be integrated with vcpkg.