
[Documentation] Memory Leak in TensorRTProvider example #23901

Open
axbycc-mark opened this issue Mar 5, 2025 · 3 comments
Labels
documentation (improvements or additions to documentation; typically submitted using template)

Comments

@axbycc-mark

Describe the documentation issue

According to the C API, an OrtTensorRTProviderOptionsV2 created with CreateTensorRTProviderOptions must be released with ReleaseTensorRTProviderOptions.

  /** \brief Create an OrtTensorRTProviderOptionsV2
   *
   * \param[out] out Newly created ::OrtTensorRTProviderOptionsV2. Must be released with OrtApi::ReleaseTensorRTProviderOptions
   *
   * \snippet{doc} snippets.dox OrtStatus Return Value
   */
  ORT_API2_STATUS(CreateTensorRTProviderOptions, _Outptr_ OrtTensorRTProviderOptionsV2** out);

Here is the current example code from the ONNX Runtime TensorRT execution provider documentation. Note the call to CreateTensorRTProviderOptions without a corresponding call to ReleaseTensorRTProviderOptions.

Ort::SessionOptions session_options;

const auto& api = Ort::GetApi();
OrtTensorRTProviderOptionsV2* tensorrt_options;
Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&tensorrt_options));

std::vector<const char*> option_keys = {
    "device_id",
    "trt_max_workspace_size",
    "trt_max_partition_iterations",
    "trt_min_subgraph_size",
    "trt_fp16_enable",
    "trt_int8_enable",
    "trt_int8_use_native_calibration_table",
    "trt_dump_subgraphs",
    // below options are strongly recommended !
    "trt_engine_cache_enable",
    "trt_engine_cache_path",
    "trt_timing_cache_enable",
    "trt_timing_cache_path",
};
std::vector<const char*> option_values = {
    "1",
    "2147483648",
    "10",
    "5",
    "1",
    "1",
    "1",
    "1",
    "1",
    "1",
    "/path/to/cache",
    "1",
    "/path/to/cache", // can be same as the engine cache folder
};

Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(tensorrt_options,
                                                    option_keys.data(), option_values.data(), option_keys.size()));


cudaStream_t cuda_stream;
cudaStreamCreate(&cuda_stream);
// this implicitly sets "has_user_compute_stream"
// (the docs pass "cuda_options" here, which is undeclared; it should be tensorrt_options)
Ort::ThrowOnError(api.UpdateTensorRTProviderOptionsWithValue(tensorrt_options, "user_compute_stream", cuda_stream));

session_options.AppendExecutionProvider_TensorRT_V2(*tensorrt_options);
// the code below can be used to print all options
OrtAllocator* allocator;
char* options;
Ort::ThrowOnError(api.GetAllocatorWithDefaultOptions(&allocator));
Ort::ThrowOnError(api.GetTensorRTProviderOptionsAsString(tensorrt_options, allocator, &options));

Page / URL

https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html

@axbycc-mark added the documentation label on Mar 5, 2025
@axbycc-mark (Author)

I think the char* options at the very end is also a memory leak. It needs allocator->Free(allocator, options).

@yuslepukhin (Member)

Contributions are welcome. Use C++ instead of C, and if C++ API is lacking, let us know.

@axbycc-mark (Author)

It's unclear how to use C++ here, since the official documentation mixes the C and C++ APIs. I looked into the C++ API and there does not seem to be a wrapper type for the TensorRT provider options.

I guess the real fix is to update the C++ API and then update the documentation after that? We could have an Ort::TensorRTProviderOptions class that wraps the OrtTensorRTProviderOptionsV2* and automatically frees it in its destructor, as well as having member functions to set options.
