
Problems using pytorch stream #4340

Open
ninono12345 opened this issue Jan 26, 2025 · 0 comments
Hello, when I was working with TensorRT 8.6 I built an engine for inference in Python.

Example inputs and engine loader:
```python
im_patches = torch.randn(batch, 3, 288, 288)
train_feat = torch.randn(batch, 256, 18, 18)
target_labels = torch.randn(1, batch, 18, 18)
train_ltrb = torch.randn(batch, 4, 18, 18)
input_shapes2 = [im_patches, train_feat, target_labels, train_ltrb]

def load_engine3(path):
    with open(path, 'rb') as f, \
         trt.Runtime(trt.Logger(trt.Logger.WARNING)) as trt_runtime, \
         trt_runtime.deserialize_cuda_engine(f.read()) as engine, \
         engine.create_execution_context() as context:
        return engine, context
```
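For comparison, here is roughly the same loader written without the context-manager chain, so the runtime, engine, and context stay alive after the function returns (a sketch, assuming the standard TensorRT Python API, not my exact code):

```python
def load_engine_no_ctx(path):
    # Sketch only: returns the runtime/engine/context directly, so they are
    # not destroyed when a `with` block exits.
    import tensorrt as trt  # lazy import so the file parses without TensorRT
    logger = trt.Logger(trt.Logger.WARNING)
    runtime = trt.Runtime(logger)
    with open(path, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()
    return runtime, engine, context
```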

```python
def run_infer2(self, inputs):
    output1 = torch.empty((1, 1, 18, 18), dtype=torch.float32).to("cuda")
    output2 = torch.empty((1, 1, 4, 18, 18), dtype=torch.float32).to("cuda")

    bindings = [inputs[0].data_ptr(),
                inputs[1].data_ptr(),
                inputs[2].data_ptr(),
                inputs[3].data_ptr(),
                output1.data_ptr(),
                output2.data_ptr()]

    print("run_infer2")

    t0 = time.time()
    stream = torch.cuda.Stream("cuda")
    context.execute_async_v2(bindings=bindings, stream_handle=stream.cuda_stream)
    t1 = time.time()

    stream.synchronize()

    return output1, output2
```

I did this to avoid having to create a separate stream with PyCUDA or another library.

Now, in TensorRT 10, `execute_async_v2` has been removed, so I updated my code:

```python
def run_infer2(context, inputs):
    output1 = torch.empty((1, 1, 18, 18), dtype=torch.float32).to("cuda")
    output2 = torch.empty((1, 1, 4, 18, 18), dtype=torch.float32).to("cuda")

    context.set_tensor_address("im_patches", inputs[0].data_ptr())
    context.set_tensor_address("train_feat", inputs[1].data_ptr())
    context.set_tensor_address("target_labels", inputs[2].data_ptr())
    context.set_tensor_address("train_ltrb", inputs[3].data_ptr())
    context.set_tensor_address("scores_raw", output1.data_ptr())
    context.set_tensor_address("bbox_preds", output2.data_ptr())

    t0 = time.time()
    stream = torch.cuda.Stream("cuda")
    current_stream = torch.cuda.current_stream()
    torch.cuda.synchronize()
    context.execute_async_v3(stream_handle=current_stream.cuda_stream)
    torch.cuda.synchronize()
    t1 = time.time()
    print("context.execute_v3 time: ", t1 - t0)

    stream.synchronize()

    return output1, output2
```

But now I get this error:

```
[01/26/2025-17:29:16] [TRT] [W] Using default stream in enqueueV3() may lead to performance issues due to additional calls to cudaStreamSynchronize() by TensorRT to ensure correct synchronization. Please use non-default stream instead.
[01/26/2025-17:29:16] [TRT] [E] IExecutionContext::enqueueV3: Error Code 1: Cuda Runtime (an illegal memory access was encountered)
Traceback (most recent call last):
  File "D:\Tomo\tracking_tomp\pytracking\infer2.py", line 185, in <module>
    outt = run_infer2(context, input_shapes2)
  File "D:\Tomo\tracking_tomp\pytracking\infer2.py", line 165, in run_infer2
    torch.cuda.synchronize()
  File "C:\Users\Tomas\AppData\Roaming\Python\Python310\site-packages\torch\cuda\__init__.py", line 954, in synchronize
    return torch._C._cuda_synchronize()
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

[01/26/2025-17:29:18] [TRT] [E] [graphContext.h::nvinfer1::rt::MyelinGraphContext::~MyelinGraphContext::84] Error Code 1: Myelin ([::0] Error 201 destroying event '000001CC98F40A70'.)
```
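For reference, the `execute_async_v3` pattern I am aiming for, with binding and enqueue all on one explicit non-default stream, would look roughly like this (a sketch only; the tensor names are the ones from my engine above, and I have not verified that this avoids the crash):

```python
def run_infer_v3_sketch(context, inputs, outputs, stream):
    # Sketch: bind all tensor addresses and enqueue on one explicit
    # non-default stream. Every tensor must stay alive (referenced) until
    # stream.synchronize() returns.
    import torch  # lazy import so the file parses without PyTorch
    in_names = ["im_patches", "train_feat", "target_labels", "train_ltrb"]
    out_names = ["scores_raw", "bbox_preds"]
    with torch.cuda.stream(stream):  # make `stream` the current stream
        for name, t in zip(in_names, inputs):
            context.set_tensor_address(name, t.data_ptr())
        for name, t in zip(out_names, outputs):
            context.set_tensor_address(name, t.data_ptr())
        context.execute_async_v3(stream_handle=stream.cuda_stream)
    stream.synchronize()  # wait for inference before reading the outputs
    return outputs
```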

Thank you

TensorRT version: 10.7
Windows 10
Nvidia drivers: 561.19
Python 3.10
CUDA: 12.4
Pytorch: 2.5.1 CUDA 12.4
