# DirectML Execution Provider (Preview)

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning on Windows. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers.

When used standalone, the DirectML API is a low-level DirectX 12 library and is suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications. The seamless interoperability of DirectML with Direct3D 12, together with its low overhead and conformance across hardware, makes DirectML ideal for accelerating machine learning when high performance is desired and the reliability and predictability of results across hardware are critical.

The *DirectML Execution Provider* is an optional component of ONNX Runtime that uses DirectML to accelerate inference of ONNX models. The DirectML execution provider is capable of greatly improving evaluation time of models using commodity GPU hardware, without sacrificing broad hardware support or requiring vendor-specific extensions to be installed.

The DirectML Execution Provider is currently in preview.

## Table of contents

- [DirectML Execution Provider (Preview)](#directml-execution-provider-preview)
  - [Table of contents](#table-of-contents)
  - [Minimum requirements](#minimum-requirements)
  - [Building from source](#building-from-source)
  - [Using the DirectML execution provider](#using-the-directml-execution-provider)
    - [`OrtSessionOptionsAppendExecutionProvider_DML` function](#ortsessionoptionsappendexecutionprovider_dml-function)
    - [`OrtSessionOptionsAppendExecutionProviderEx_DML` function](#ortsessionoptionsappendexecutionproviderex_dml-function)
    - [ONNX opset support](#onnx-opset-support)
    - [Multi-threading and supported session options](#multi-threading-and-supported-session-options)
  - [Samples](#samples)
  - [See also](#see-also)

## Minimum requirements

The DirectML execution provider requires any DirectX 12 capable device. Almost all commercially available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:

* NVIDIA Kepler (GTX 600 series) and above
* AMD GCN 1st Gen (Radeon HD 7000 series) and above
* Intel Haswell (4th-gen core) HD Integrated Graphics and above

DirectML is compatible with Windows 10, version 1709 (10.0.16299; RS3, "Fall Creators Update") and newer.

## Building from source

For general information about building onnxruntime, see [BUILD.md](../../BUILD.md).

Requirements for building the DirectML execution provider:

1. Visual Studio 2017 toolchain (see [cmake configuration instructions](../../BUILD.md))
2. [The Windows 10 SDK (10.0.18362.0) for Windows 10, version 1903](https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk) (or newer)

To build onnxruntime with the DML EP included, supply the `--use_dml` parameter to `build.bat`, e.g.:

    build.bat --config RelWithDebInfo --build_shared_lib --parallel --use_dml

The DirectML execution provider supports building for both x64 (default) and x86 architectures.
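For example, assuming the build script's `--x86` flag selects a 32-bit build (check `build.bat --help` for the flags available in your checkout), an x86 build might be invoked as:

```shell
build.bat --config RelWithDebInfo --build_shared_lib --parallel --use_dml --x86
```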

Note that building onnxruntime with the DirectML execution provider enabled causes the DirectML redistributable package to be automatically downloaded as part of the build. This package contains a pre-release version of DirectML, and its use is governed by a license whose text may be found as part of the NuGet package.

## Using the DirectML execution provider

When using the [C API](../C_API.md) with a DML-enabled build of onnxruntime (see [Building from source](#building-from-source)), the DirectML execution provider can be enabled using one of the two factory functions included in `include/onnxruntime/core/providers/dml/dml_provider_factory.h`.

### `OrtSessionOptionsAppendExecutionProvider_DML` function

Creates a DirectML Execution Provider which executes on the hardware adapter with the given `device_id`, also known as the adapter index. The device ID corresponds to the enumeration order of hardware adapters as given by [IDXGIFactory::EnumAdapters](https://docs.microsoft.com/windows/win32/api/dxgi/nf-dxgi-idxgifactory-enumadapters). A `device_id` of 0 always corresponds to the default adapter, which is typically the primary display GPU installed on the system. A negative `device_id` is invalid.

    OrtStatus* OrtSessionOptionsAppendExecutionProvider_DML(
        _In_ OrtSessionOptions* options,
        int device_id
    );
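As an illustrative sketch (not part of the onnxruntime sources; the helper function name and exact include paths are placeholders for your build layout), enabling the provider on the default adapter might look like:

```cpp
#include "onnxruntime_c_api.h"
#include "dml_provider_factory.h"  // from include/onnxruntime/core/providers/dml

// Append the DirectML execution provider on the default adapter.
// device_id 0 is the first adapter enumerated by IDXGIFactory::EnumAdapters.
void enable_dml(OrtSessionOptions* session_options) {
  OrtStatus* status = OrtSessionOptionsAppendExecutionProvider_DML(session_options, 0);
  if (status != nullptr) {
    // A non-null OrtStatus indicates failure; inspect and release it here.
  }
}
```

Remember that the session options must also have memory pattern optimization disabled and sequential execution selected, as described under "Multi-threading and supported session options" below.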
| 67 | + |
| 68 | +### `OrtSessionOptionsAppendExecutionProviderEx_DML` function |
| 69 | + |
| 70 | +Creates a DirectML Execution Provider using the given DirectML device, and which executes work on the supplied D3D12 command queue. The DirectML device and D3D12 command queue must have the same parent [ID3D12Device](https://docs.microsoft.com/windows/win32/api/d3d12/nn-d3d12-id3d12device), or an error will be returned. The D3D12 command queue must be of type `DIRECT` or `COMPUTE` (see [D3D12_COMMAND_LIST_TYPE](https://docs.microsoft.com/windows/win32/api/d3d12/ne-d3d12-d3d12_command_list_type)). If this function succeeds, the inference session once created will maintain a strong reference on both the `dml_device` and `command_queue` objects. |
| 71 | + |
| 72 | + OrtStatus* OrtSessionOptionsAppendExecutionProviderEx_DML( |
| 73 | + _In_ OrtSessionOptions* options, |
| 74 | + _In_ IDMLDevice* dml_device, |
| 75 | + _In_ ID3D12CommandQueue* cmd_queue |
| 76 | + ); |
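To sketch how the pieces fit together (assuming the Windows 10 SDK's D3D12 and DirectML headers; error handling, adapter selection, and `Release` calls are omitted for brevity):

```cpp
#include <d3d12.h>
#include <directml.h>
#include "dml_provider_factory.h"  // from include/onnxruntime/core/providers/dml

// Sketch: create a D3D12 device, a DirectML device with the same parent,
// and a COMPUTE command queue, then hand the latter two to the session options.
void enable_dml_ex(OrtSessionOptions* session_options) {
  ID3D12Device* d3d12_device = nullptr;
  D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&d3d12_device));

  IDMLDevice* dml_device = nullptr;
  DMLCreateDevice(d3d12_device, DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dml_device));

  D3D12_COMMAND_QUEUE_DESC queue_desc = {};
  queue_desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;  // DIRECT is also permitted

  ID3D12CommandQueue* cmd_queue = nullptr;
  d3d12_device->CreateCommandQueue(&queue_desc, IID_PPV_ARGS(&cmd_queue));

  OrtSessionOptionsAppendExecutionProviderEx_DML(session_options, dml_device, cmd_queue);
}
```

Because the DirectML device is created from the same `ID3D12Device` as the queue, the parent-device requirement above is satisfied.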

**See also**

* [DMLCreateDevice function](https://docs.microsoft.com/windows/win32/api/directml/nf-directml-dmlcreatedevice)
* [ID3D12Device::CreateCommandQueue method](https://docs.microsoft.com/windows/win32/api/d3d12/nf-d3d12-id3d12device-createcommandqueue)
* [Direct3D 12 programming guide](https://docs.microsoft.com/windows/win32/direct3d12/directx-12-programming-guide)

### ONNX opset support

The DirectML execution provider currently supports ONNX opset 9 ([ONNX v1.4](https://github.com/onnx/onnx/releases/tag/v1.4.0)). Evaluating models which require a higher opset version is not supported and may produce unexpected results.

### Multi-threading and supported session options

The DirectML execution provider does not support the use of memory pattern optimizations or parallel execution in onnxruntime. When supplying session options during InferenceSession creation, these options must be disabled or an error will be returned.

If using the onnxruntime C API, you must call the `DisableMemPattern` and `SetSessionExecutionMode` functions to set the options required by the DirectML execution provider.

See [include/onnxruntime/core/session/onnxruntime_c_api.h](../../include/onnxruntime/core/session/onnxruntime_c_api.h).

    OrtStatus*(ORT_API_CALL* DisableMemPattern)(_Inout_ OrtSessionOptions* options)NO_EXCEPTION;

    OrtStatus*(ORT_API_CALL* SetSessionExecutionMode)(_Inout_ OrtSessionOptions* options, ExecutionMode execution_mode)NO_EXCEPTION;

If creating the onnxruntime InferenceSession object directly, you must set the appropriate fields on the `onnxruntime::SessionOptions` struct. Specifically, `execution_mode` must be set to `ExecutionMode::ORT_SEQUENTIAL`, and `enable_mem_pattern` must be `false`.
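A minimal sketch of that C API sequence (assuming the `OrtApi` table obtained from `OrtGetApiBase()->GetApi(ORT_API_VERSION)`; the helper function name is illustrative and error handling is omitted):

```cpp
#include "onnxruntime_c_api.h"

// Sketch: create session options configured as the DirectML execution
// provider requires, ready for the DML provider to be appended.
OrtSessionOptions* make_dml_compatible_options(const OrtApi* ort) {
  OrtSessionOptions* session_options = nullptr;
  ort->CreateSessionOptions(&session_options);
  ort->DisableMemPattern(session_options);                        // no memory pattern optimization
  ort->SetSessionExecutionMode(session_options, ORT_SEQUENTIAL);  // no parallel execution
  return session_options;
}
```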

Additionally, as the DirectML execution provider does not support parallel execution, it does not support multi-threaded calls to `Run` on the same inference session. That is, for an inference session using the DirectML execution provider, only one thread may call `Run` at a time. Multiple threads are permitted to call `Run` simultaneously if they operate on different inference session objects.
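One way to respect this constraint from application code (a sketch, not taken from the onnxruntime sources; the wrapper and mutex are hypothetical names) is to serialize `Run` calls on a shared session behind a mutex:

```cpp
#include <mutex>
#include <utility>

// Serialize Run() calls on a single session shared between threads:
// only one thread may be inside Run() at a time for a DML-backed session.
std::mutex run_mutex;

template <typename Session, typename... Args>
auto run_serialized(Session& session, Args&&... args) {
  std::lock_guard<std::mutex> lock(run_mutex);
  return session.Run(std::forward<Args>(args)...);
}
```

Sessions that do not share state need no such guard, since distinct inference session objects may be run concurrently.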

## Samples

A complete sample of onnxruntime using the DirectML execution provider can be found under [samples/c_cxx/fns_candy_style_transfer](../../samples/c_cxx/fns_candy_style_transfer).

## See also

[DirectML documentation (docs.microsoft.com)](https://docs.microsoft.com/en-us/windows/win32/direct3d12/dml)