Remove MLCommandEncoder #546

Merged: 2 commits, Feb 8, 2024
157 changes: 6 additions & 151 deletions index.bs
@@ -669,7 +669,7 @@ Note: The group is <a href="https://github.com/webmachinelearning/webnn/issues/8

Unlike WebGPU, this API does not intrinsically support custom shader authoring; and as a result is not prone to timing attacks that rely on shader caches, or other persistent data. The API builds upon pre-existing shaders and lower level primitives of the browser or the underlying OS. Web developers who interface with {{GPUDevice}} are expected to be aware of <a href="https://gpuweb.github.io/gpuweb/#privacy-user-agent-state">WebGPU compilation cache considerations</a>.

The WebGPU API identifies <a href="https://gpuweb.github.io/gpuweb/#privacy-machine-artifacts">machine-specific artifacts</a> as a privacy consideration. Given the WebNN API defines means to record an ML workload onto a WebGPU-compatible {{GPUCommandBuffer}}, compute unit scheduling may under certain circumstances introduce a fingerprint. However, similarly to WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.
The WebGPU API identifies <a href="https://gpuweb.github.io/gpuweb/#privacy-machine-artifacts">machine-specific artifacts</a> as a privacy consideration. Similarly, the WebNN API's compute unit scheduling may under certain circumstances introduce a fingerprint. However, as with WebGPU, such fingerprints are identical across most or all of the devices of each vendor, mitigating the concern. Furthermore, software implementations can be used to further eliminate such artifacts.

The WebNN API defines two developer-settable preferences to help inform [[#programming-model-device-selection]] and allow the implementation to better select the most appropriate underlying execution device for the workload. [=Device type=] normatively indicates the kind of device and is either {{MLDeviceType/"cpu"}} or {{MLDeviceType/"gpu"}}. If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown, thus this type can in some cases add two bits of entropy to the fingerprint. [=Power preference=] indicates preference as related to the power consumption and is considered a hint only and as such does not increase entropy of the fingerprint.

@@ -743,13 +743,6 @@ In both the {{MLContext}}.{{MLContext/compute()}} and {{MLContext}}.{{MLContext/
the input values using {{MLNamedArrayBufferViews}}, binding the input {{MLOperand}}s to their values. The caller
then supplies pre-allocated buffers for output {{MLOperand}}s using {{MLNamedArrayBufferViews}}.

The {{MLCommandEncoder}} interface created by the {{MLContext}}.{{MLContext/createCommandEncoder()}} method supports
a graph execution method that provides the maximum flexibility to callers that also utilize WebGPU in their
application. It does this by placing the workload required to initialize and compute the results of the
operations in the graph onto a {{GPUCommandBuffer}}. The callers are responsible for the eventual submission
of this workload on the {{GPUQueue}} through the WebGPU queue submission mechanism. Once the submitted workload
is completely executed, the result is available in the bound output buffers.

## Device Selection ## {#programming-model-device-selection}

An {{MLContext}} interface represents a global state of neural network execution. One of the important context states is the underlying execution device that manages the resources and facilitates the compilation and the eventual execution of the neural network graph. In addition to the default method of creation with {{MLContextOptions}}, an {{MLContext}} could also be created from a specific {{GPUDevice}} that is already in use by the application, in which case the corresponding {{GPUBuffer}} resources used as graph constants, as well as the {{GPUTexture}} as graph inputs must also be created from the same device. In a multi-adapter configuration, the device used for {{MLContext}} must be created from the same adapter as the device used to allocate the resources referenced in the graph.
@@ -947,135 +940,6 @@ The {{MLActivation}} objects (including the ones passed as input to methods) are
</div>
</details>

## {{MLCommandEncoder}} interface ## {#api-mlcommandencoder}
The {{MLCommandEncoder}} interface represents a method of execution that synchronously records the computational workload of a compiled {{MLGraph}} to a {{GPUCommandBuffer}} on the calling thread. Since the workload is not immediately executed, just recorded, this method allows more flexibility for the caller to determine how and when the recorded commands will be submitted for execution on the GPU relative to other GPU workload on the same or different queue.

<script type=idl>
typedef (GPUBuffer or GPUTexture) MLGPUResource;

typedef record<DOMString, MLGPUResource> MLNamedGPUResources;

[SecureContext, Exposed=(Window, DedicatedWorker)]
interface MLCommandEncoder {};
</script>

<div class=internal-slots>
{{MLCommandEncoder}} has the following internal slots:
<dl dfn-type=attribute dfn-for="MLCommandEncoder">
: <dfn>\[[context]]</dfn> of type {{MLContext}}
::
The context of type {{MLContext}} associated with this {{MLCommandEncoder}}.

: <dfn>\[[implementation]]</dfn>
::
The underlying implementation provided by the User Agent.
</dl>
</div>

### Graph Initialization ### {#api-mlcommandencoder-graph-initialization}
Record the initialization of the {{MLGraph}}. This is a necessary step for optimal performance during graph execution as it gives the platform an opportunity to prepare and optimize constant input data for the subsequent execution of the graph. This method should only be called once per graph.

<script type=idl>
partial interface MLCommandEncoder {
undefined initializeGraph(MLGraph graph);
};
</script>

<div>
**Arguments:**
- *graph*: an {{MLGraph}}. The compiled graph to be initialized with graph constant inputs.

**Returns:** {{undefined}}.
</div>

<details open algorithm>
<summary>
The <dfn method for=MLCommandEncoder>initializeGraph(<var ignore>graph</var>)</dfn> method steps are:
</summary>
<div>
<div class="note">
Graph initialization stage typically involves a process known as "weight preprocessing" where all the constant inputs to the graph are preprocessed and cached at the operating system level for subsequent graph execution calls. The initializing inputs are typically the constant weight data specified through the {{MLGraphBuilder/constant(descriptor, bufferView)|MLGraphBuilder/constant(value, type)}} method as constant operands during graph construction time.
</div>
</div>
</details>

### Dispatch Execution Commands ### {#api-mlcommandencoder-dispatch-commands}
Record the execution of the {{MLGraph}} with the given input and output {{MLNamedGPUResources}}.

<script type=idl>
partial interface MLCommandEncoder {
undefined dispatch(MLGraph graph, MLNamedGPUResources inputs, MLNamedGPUResources outputs);
};
</script>

<div>
**Arguments:**
- *graph*: an {{MLGraph}}. The compiled graph to be executed.
- *inputs*: an {{MLNamedGPUResources}}. The named input resources.
- *outputs*: an {{MLNamedGPUResources}}. The pre-allocated resources for the required outputs.

**Returns:** {{undefined}}.
</div>

<details open algorithm>
<summary>
The <dfn method for=MLCommandEncoder>dispatch(|graph|, |inputs|, |outputs|)</dfn> method steps are:
</summary>
<div class=algorithm-steps>
1. If any of the following requirements are unmet, then [=exception/throw=] a "{{DataError}}" {{DOMException}}.
<div class=validusage>
1. [=map/For each=] |name| &rarr; |input| of |inputs|:
1. |graph|.{{MLGraph/[[inputDescriptors]]}}[|name|] must [=map/exist=].
1. Let |inputDesc| be |graph|.{{MLGraph/[[inputDescriptors]]}}[|name|].
1. If |input| is a {{GPUBuffer}}, then:
1. |input|.{{GPUBuffer/size}} must be equal to the [=byte length=] of |inputDesc|.
1. [=map/For each=] |name| &rarr; |output| of |outputs|:
1. |graph|.{{MLGraph/[[outputDescriptors]]}}[|name|] must [=map/exist=].
1. Let |outputDesc| be |graph|.{{MLGraph/[[outputDescriptors]]}}[|name|].
1. If |output| is a {{GPUBuffer}}, then:
1. |output|.{{GPUBuffer/size}} must be equal to the [=byte length=] of |outputDesc|.
</div>
1. [=map/For each=] |name| &rarr; |input| of |inputs|:
1. Set the input of |graph|.{{MLGraph/[[implementation]]}} that is associated with |name| to |input|.
1. [=map/For each=] |name| &rarr; |output| of |outputs|:
1. Set the output of |graph|.{{MLGraph/[[implementation]]}} that is associated with |name| to |output|.
1. Issue a compute request of |graph|.{{MLGraph/[[implementation]]}}.
1. If there is an error returned by |graph|.{{MLGraph/[[implementation]]}}, then:
1. Throw an "{{OperationError}}" {{DOMException}}.
1. Return {{undefined}}.
</div>
</details>
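The validation in the dispatch() steps above can be sketched outside the spec as plain JavaScript. This is a minimal illustration, not spec text: the {{MLGraph}} internal slots and {{GPUBuffer}} are stood in by plain objects, and the `DTYPE_BYTES` table, `byteLength`, and `validateResources` names are hypothetical.

```javascript
// Hypothetical element sizes for a few MLOperand data types.
const DTYPE_BYTES = { float32: 4, float16: 2, int32: 4, uint8: 1 };

// Byte length of a tensor described by { dataType, dimensions }.
function byteLength(desc) {
  return desc.dimensions.reduce((n, d) => n * d, DTYPE_BYTES[desc.dataType]);
}

// Mirrors the validation steps: every named resource must match a
// descriptor, and a buffer-like resource's size must equal the
// descriptor's byte length. Texture-like resources (no size) are
// skipped, as in the spec's "If ... is a GPUBuffer" branch.
function validateResources(descriptors, resources) {
  for (const [name, resource] of Object.entries(resources)) {
    const desc = descriptors[name];
    if (desc === undefined) {
      throw new Error(`DataError: unknown tensor name "${name}"`);
    }
    if ("size" in resource && resource.size !== byteLength(desc)) {
      throw new Error(`DataError: size mismatch for "${name}"`);
    }
  }
}
```

For example, a `float32` tensor of dimensions [2, 3] has a byte length of 24, so a bound buffer of any other size would fail validation.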

### Generate GPU Command Buffer ### {#api-mlcommandencoder-generate-gpu-command-buffer}
Complete the recording of the ML workload and return a WebGPU-compatible {{GPUCommandBuffer}} containing the recorded workload.

<script type=idl>
partial interface MLCommandEncoder {
GPUCommandBuffer finish(optional GPUCommandBufferDescriptor descriptor = {});
};
</script>

<div>
**Arguments:**
- *descriptor*: an optional {{GPUCommandBufferDescriptor}}. Descriptor of the command buffer.

**Returns:** {{GPUCommandBuffer}}.
</div>

<details open algorithm>
<summary>
The <dfn method for=MLCommandEncoder>finish(|descriptor|)</dfn> method steps are:
</summary>
<div class=algorithm-steps>
1. If any of the following sub-steps fail, [=exception/throw=] an "{{OperationError}}" {{DOMException}}.
1. Make a request to the underlying platform to complete the recording of the ML workload, given |descriptor|.
<div class="note">
See the related <a href="https://www.w3.org/TR/webgpu/#dom-gpucommandencoder-finish">WebGPU steps</a>.
</div>
1. Return a {{GPUCommandBuffer}} containing the recorded workload.
</div>
</details>

## {{MLContext}} interface ## {#api-mlcontext}
The {{MLContext}} interface represents a global state of neural network compute workload and execution processes. Each {{MLContext}} object has associated [=context type=], [=device type=] and [=power preference=].

@@ -1352,19 +1216,6 @@ partial interface MLContext {
</details>
</div>

### WebGPU Interoperability ### {#api-mlcontext-webgpu-interop}
Create an {{MLCommandEncoder}} used to record the ML workload onto a WebGPU-compatible {{GPUCommandBuffer}}, allowing ML workloads to be mixed with other GPU workloads in an application that leverages WebGPU. This method only succeeds on an {{MLContext}} created with a {{GPUDevice}}; otherwise, it [=exception/throws=] an "{{OperationError}}" {{DOMException}}.

<script type=idl>
partial interface MLContext {
MLCommandEncoder createCommandEncoder();
};
</script>

<div algorithm=mlcontext.createcommandencoder>
**Returns:** {{MLCommandEncoder}}. The command encoder used to record ML workload on the GPU.
</div>

## {{MLGraph}} interface ## {#api-mlgraph}
The {{MLGraph}} interface represents a compiled computational graph. A compiled graph once constructed is immutable and cannot be subsequently changed.

@@ -1433,7 +1284,9 @@ interface MLGraphBuilder {
</script>

<div class="note">
Both {{MLGraphBuilder}}.{{MLGraphBuilder/build()}} and {{MLGraphBuilder}}.{{MLGraphBuilder/buildSync()}} methods compile the graph builder state up to the specified output operands into a compiled graph according to the type of {{MLContext}} that creates it. Since this operation can be costly in some machine configurations, the calling thread of the {{MLGraphBuilder}}.{{MLGraphBuilder/buildSync()}} method must only be a worker thread to avoid potential disruption of the user experience. When the {{MLContext/[[contextType]]}} of the {{MLContext}} is set to "[=context type/default=]", the compiled graph is initialized right before the {{MLGraph}} is returned. This graph initialization stage is important for optimal performance of the subsequent graph executions. See [[#api-mlcommandencoder-graph-initialization]] for more detail.
Both {{MLGraphBuilder}}.{{MLGraphBuilder/build()}} and {{MLGraphBuilder}}.{{MLGraphBuilder/buildSync()}} methods compile the graph builder state up to the specified output operands into a compiled graph according to the type of {{MLContext}} that creates it. Since this operation can be costly in some machine configurations, the calling thread of the {{MLGraphBuilder}}.{{MLGraphBuilder/buildSync()}} method must only be a worker thread to avoid potential disruption of the user experience. When the {{MLContext/[[contextType]]}} of the {{MLContext}} is set to "[=context type/default=]", the compiled graph is initialized right before the {{MLGraph}} is returned. This graph initialization stage is important for optimal performance of the subsequent graph executions. It typically involves a process known as "weight preprocessing" where all the constant inputs to the graph are preprocessed and cached at the operating system level for subsequent graph execution calls. The initializing inputs are typically the constant weight data specified through the {{MLGraphBuilder/constant(descriptor, bufferView)|MLGraphBuilder/constant(value, type)}} method as constant operands during graph construction time.

Issue(552): Decide how to specify graph initialization.
</div>
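As a rough illustration of why the one-time initialization described in the note above pays off, the following hypothetical JavaScript models a compiled graph that preprocesses its constant weights once and reuses the result for every execution. The class, its method names, and the doubling stand-in transform are all invented for this sketch; a real implementation would instead repack weights into the device's preferred layout.

```javascript
// Sketch only: models "weight preprocessing" done once per graph.
class CompiledGraph {
  constructor(constants) {
    this.constants = constants;    // raw constant weight data
    this.preprocessed = null;      // filled in by initialize()
  }
  // One-time step, e.g. reordering/packing weights for the device.
  // A doubling transform stands in for that work here.
  initialize() {
    if (this.preprocessed) return; // only once per graph
    this.preprocessed = this.constants.map((w) => w * 2);
  }
  // Every execution reuses the cached preprocessed weights.
  execute(input) {
    if (!this.preprocessed) this.initialize();
    return this.preprocessed.map((w) => w * input);
  }
}
```

Repeated `execute()` calls skip the preprocessing entirely, which is the performance benefit the initialization stage is designed to capture.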

{{MLBufferResourceView}} has the following members:
@@ -1700,6 +1553,8 @@ Build a composed graph up to a given output operand into a computational graph,
1. Implementations MAY preprocess and optimize the tensor data of |operand| for the underlying platform.
1. Register |operand|.{{MLOperand/[[operand]]}} in |graphImpl| as graph output.
1. Register |operand|.{{MLOperand/[[operator]]}} to |graphImpl|.

Issue(552): Decide how to specify graph initialization.
1. Return |graph|.
</div>
</details>