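There seems to be a slowdown in kernel launches between CUDA-Q v13.0 and 14.0(2).
The script below, which times 1000 calls each of cudaq.get_state and cudaq.observe on a trivial 4-qubit Hadamard kernel, produced the following timings.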
import time

import cudaq
from cudaq import spin

cudaq.set_target("nvidia", option="fp64")


@cudaq.kernel
def trivial(n: int):
    q = cudaq.qvector(n)
    for i in range(n):
        h(q[i])


H = spin.z(0)  # the simplest possible observable

# ---- Warm-up: trigger JIT for both code paths ----
for _ in range(5):
    cudaq.get_state(trivial, 4)
    cudaq.observe(trivial, H, 4).expectation()

N = 1000

# ---- Measure get_state ----
t0 = time.time()
for _ in range(N):
    cudaq.get_state(trivial, 4)
t_state_total = time.time() - t0
t_state_per = t_state_total / N

# ---- Measure observe ----
t0 = time.time()
for _ in range(N):
    cudaq.observe(trivial, H, 4).expectation()
t_obs_total = time.time() - t0
t_obs_per = t_obs_total / N

print(f"cudaq: {cudaq.__version__.split()[2]}")
print(f"target: nvidia fp64")
print(f"kernel: trivial 4-qubit Hadamard")
print(f"calls: {N} per op")
print()
print(f" get_state total: {t_state_total:7.2f} s per call: {t_state_per*1000:7.3f} ms")
print(f" observe total: {t_obs_total:7.2f} s per call: {t_obs_per*1000:7.3f} ms")
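When comparing per-call overhead across two installs, a single timed batch can be skewed by background noise. The helper below is a minimal sketch of a more robust measurement, not part of the original report: it uses time.perf_counter (a monotonic, high-resolution clock) and reports the median and minimum over several batches. The name per_call_ms and its parameters are hypothetical, and the demo uses a stand-in workload so it runs without cudaq installed.

```python
import statistics
import time


def per_call_ms(fn, calls=1000, repeats=5, warmup=5):
    """Return (median, best) per-call wall time in ms over `repeats` batches.

    `fn` is any zero-argument callable; this helper is illustrative only.
    """
    # Warm-up calls are executed but not timed (e.g. to trigger JIT).
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        for _ in range(calls):
            fn()
        elapsed = time.perf_counter() - t0
        samples.append(elapsed / calls * 1000.0)  # ms per call for this batch

    return statistics.median(samples), min(samples)


if __name__ == "__main__":
    # Stand-in workload (assumption): replace with the call under test.
    med, best = per_call_ms(lambda: sum(range(100)), calls=1000, repeats=5)
    print(f"median: {med:.4f} ms  best: {best:.4f} ms")
```

In the reproduction script, the callable could be lambda: cudaq.get_state(trivial, 4) or lambda: cudaq.observe(trivial, H, 4).expectation(), run once under each CUDA-Q version.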
Required prerequisites
Describe the bug
Steps to reproduce the bug
Expected behavior
Per-call times for get_state and observe on 14.0(2) should be comparable to those on 13.0.
Is this a regression? If it is, put the last known working version (or commit) here.
13.0
Environment
Suggestions
No response