`cudaMalloc` shouldn't be used to allocate temporary memory for a CUDA kernel. It is very slow, and it synchronizes the device, which is catastrophic if you are running multiple kernels at the same time.
There needs to be a sub-allocator that reserves a block of device memory once at the start of the program and hands out temporary storage from that block.
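
To make the request concrete, here is a minimal sketch of one possible sub-allocator: a bump (arena) allocator over a single pre-allocated block. The name `ScratchArena` and its methods are hypothetical and not part of this project or of CUDA. The class only does pointer arithmetic, so it is written as plain host-side C++; the assumption is that `base` would come from one `cudaMalloc` call made at startup.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bump (arena) sub-allocator over one pre-allocated block.
// It never calls into the CUDA runtime, so handing out scratch memory
// cannot trigger a device synchronization.
class ScratchArena {
public:
    // base: start of the block (assumed to come from a single cudaMalloc
    // at program start); capacity: size of that block in bytes.
    ScratchArena(void* base, std::size_t capacity)
        : base_(static_cast<std::uint8_t*>(base)),
          capacity_(capacity),
          offset_(0) {}

    // Returns a pointer inside the block, or nullptr if the request does
    // not fit. alignment must be a power of two. Offsets are aligned
    // relative to base, so base itself is assumed to be at least as
    // aligned as the requested alignment.
    void* allocate(std::size_t bytes, std::size_t alignment = 256) {
        std::size_t start = (offset_ + alignment - 1) & ~(alignment - 1);
        if (start > capacity_ || bytes > capacity_ - start) {
            return nullptr;
        }
        offset_ = start + bytes;
        return base_ + start;
    }

    // Releases all temporary allocations at once. The caller must make
    // sure no kernel is still using memory handed out by this arena.
    void reset() { offset_ = 0; }

    // Bytes consumed so far, including alignment padding.
    std::size_t used() const { return offset_; }

private:
    std::uint8_t* base_;
    std::size_t capacity_;
    std::size_t offset_;
};
```

In use, the program would call `cudaMalloc` once, wrap the returned pointer in a `ScratchArena`, call `allocate` before each kernel launch, and call `reset` once the kernels that used the scratch memory have finished. This sketch is not thread-safe and cannot free individual allocations; a real implementation running several kernels concurrently would probably want one arena per stream or a free-list design instead.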