Commit ef073fd

Add session and run option workload_type for applications to set efficient mode. (#21781)
### Description
This PR adds the session and run option workload_type, a knob that lets applications enable or disable the processor's efficient performance mode.

### Motivation and Context
Efficient mode is co-engineered with processor vendors to let applications voluntarily be serviced at a more energy-efficient performance level. Long-running, latency-insensitive applications can use it to reduce energy consumption.
1 parent e952774 commit ef073fd

2 files changed

+10
-0
lines changed

include/onnxruntime/core/session/onnxruntime_run_options_config_keys.h

+5
@@ -49,3 +49,8 @@ static const char* const kOrtRunOptionsConfigQnnRpcControlLatency = "qnn.rpc_con
 // If the value is set to -1, cuda graph capture/replay is disabled in that run.
 // User are not expected to set the value to 0 as it is reserved for internal use.
 static const char* const kOrtRunOptionsConfigCudaGraphAnnotation = "gpu_graph_id";
+
+// Specify the type of workload for this run.
+// “Default”: OS determines the scheduling priority and processor performance to service this workload. [Default]
+// “Efficient”: OS treats this workload is efficiency oriented with low scheduling priority and efficient processor performance.
+static const char* const kOrtRunOptionsWorkloadType = "run.workload_type";

include/onnxruntime/core/session/onnxruntime_session_options_config_keys.h

+5
@@ -279,3 +279,8 @@ static const char* const kOrtSessionOptionsMlasGemmFastMathArm64Bfloat16 = "mlas
 // Refer to MatMulNBits op schema for more details.
 // If not provided, default is 4.
 static const char* const kOrtSessionOptionsQDQMatMulNBitsAccuracyLevel = "session.qdq_matmulnbits_accuracy_level";
+
+// Specify the type of workload for this session.
+// “Default”: OS determines the scheduling priority and processor performance to service this workload. [Default]
+// “Efficient”: OS treats this workload is efficiency oriented with low scheduling priority and efficient processor performance.
+static const char* const kOrtSessionOptionsWorkloadType = "session.workload_type";
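
The two keys added above are plain string config entries, so an application would opt in by storing "Efficient" under the session-level or run-level key. The following is a minimal, dependency-free sketch of that idea, not the ONNX Runtime implementation: `ConfigEntries`, `IsDocumentedWorkloadType`, and `SetWorkloadType` are hypothetical names introduced here for illustration, and the map merely stands in for the config store the runtime keeps internally. With the ONNX Runtime C++ API, the corresponding calls would presumably be `AddConfigEntry(key, value)` on `Ort::SessionOptions` and `Ort::RunOptions`.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Key strings copied verbatim from the two headers changed in this commit.
static const char* const kOrtSessionOptionsWorkloadType = "session.workload_type";
static const char* const kOrtRunOptionsWorkloadType = "run.workload_type";

// Hypothetical stand-in for a key/value config store (assumption for this
// sketch; the real runtime keeps its own store behind the options objects).
using ConfigEntries = std::map<std::string, std::string>;

// Returns true only for the two values named in the header comments.
inline bool IsDocumentedWorkloadType(const std::string& value) {
  return value == "Default" || value == "Efficient";
}

// Hypothetical helper: stores a workload type under `key`, rejecting any
// value the header comments do not document.
inline void SetWorkloadType(ConfigEntries& entries, const char* key,
                            const std::string& value) {
  if (!IsDocumentedWorkloadType(value)) {
    throw std::invalid_argument("unsupported workload type: " + value);
  }
  entries[key] = value;
}
```

A long-running, latency-insensitive application could call `SetWorkloadType(session_entries, kOrtSessionOptionsWorkloadType, "Efficient")` once when configuring the session, or use `kOrtRunOptionsWorkloadType` with a per-run store to choose the mode for an individual run. How the runtime resolves a conflict between the two keys is not stated in this commit, so the sketch does not model it.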
