@@ -16,6 +16,9 @@ Table of Contents
* [Run the plugin as administrator](#run-the-plugin-as-administrator)
* [Verify plugin registration](#verify-plugin-registration)
* [Testing the plugin](#testing-the-plugin)
+ * [Issues with media workloads on multi-GPU setups](#issues-with-media-workloads-on-multi-gpu-setups)
+ * [Workaround for QSV and VA-API](#workaround-for-qsv-and-va-api)
+
## Introduction
@@ -242,3 +245,64 @@ We can test the plugin is working by deploying an OpenCL image and running `clin
---- ------ ---- ---- -------
Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient gpu.intel.com/i915.
```
+
+
+ ## Issues with media workloads on multi-GPU setups
+
+ Unlike the 3D and compute APIs, or the oneVPL media API, the QSV
+ (MediaSDK) and VA-API media APIs do not offer device discovery
+ functionality for applications. Nor is there any mechanism (e.g. an
+ environment variable) with which the default device could be overridden.
+
+ As a result, most (if not all) media applications using VA-API or QSV
+ fail to locate the correct GPU device file unless it is the first one
+ ("renderD128"), or the device file name is explicitly specified with an
+ application option.
+
+ Kubernetes device plugins expose only the requested number of device
+ files, and their names match the host device file names (for several
+ reasons unrelated to media). Therefore, on multi-GPU hosts, the only
+ GPU device file mapped into the media container may be something other
+ than "renderD128", and media applications using VA-API or QSV need to
+ be told explicitly which one to use.
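
The "requested number of device files" above follows from the container's GPU resource request. A hedged sketch of such a request (the resource name `gpu.intel.com/i915` matches the scheduling message quoted earlier; the Pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: media-workload         # placeholder name
spec:
  containers:
  - name: media
    image: my-media-image      # placeholder image
    resources:
      limits:
        gpu.intel.com/i915: 2  # two GPU device files mapped into the container
```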
+
+ These options differ from application to application. Relevant FFmpeg
+ options are documented here:
+ * VA-API: https://trac.ffmpeg.org/wiki/Hardware/VAAPI
+ * QSV: https://github.com/Intel-Media-SDK/MediaSDK/wiki/FFmpeg-QSV-Multi-GPU-Selection-on-Linux
+
+
+ ### Workaround for QSV and VA-API
+
+ The [render device](render-device.sh) shell script locates and outputs
+ the correct device file name. It can be added to the container and used
+ to provide the device file name to the application.
+
+ Use it either from another script invoking the application, or
+ directly from the Pod YAML command line. In the latter case, it can
+ either append the device file name to the end of the given command
+ line, like this:
+
+ ```bash
+ command: ["render-device.sh", "vainfo", "--display", "drm", "--device"]
+
+ => /usr/bin/vainfo --display drm --device /dev/dri/renderDXXX
+ ```
+
+ Or inline, like this:
+
+ ```bash
+ command: ["/bin/sh", "-c",
+ "vainfo --device $(render-device.sh 1) --display drm"
+ ]
+ ```
+
+ If the device file name is needed for multiple commands, one can use a shell variable:
+
+ ```bash
+ command: ["/bin/sh", "-c",
+ "dev=$(render-device.sh 1) && vainfo --device $dev && <more commands>"
+ ]
+ ```
+
+ With argument N, the script outputs the name of the Nth suitable GPU
+ device file; this can be used when more than one GPU resource was requested.
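
The script itself is not reproduced in this change. As a rough sketch only (not the actual `render-device.sh`), the Nth-device selection it describes could look like the following; the `DRI_DIR` variable and `nth_render_device` function name are illustrative assumptions, and the real script may apply extra suitability checks:

```shell
#!/bin/sh
# Hypothetical sketch of Nth-render-device selection; NOT the actual
# render-device.sh. DRI_DIR is overridable to make the logic testable.
DRI_DIR=${DRI_DIR:-/dev/dri}

# Print the path of the Nth render device node (default: the first one).
nth_render_device() {
    n=${1:-1}
    i=0
    for dev in "$DRI_DIR"/renderD*; do
        [ -e "$dev" ] || continue
        i=$((i + 1))
        if [ "$i" -eq "$n" ]; then
            echo "$dev"
            return 0
        fi
    done
    return 1   # fewer than N render device nodes present
}
```

For example, `nth_render_device 2` would print the second render node found under `DRI_DIR`, and fail when fewer than two are present.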