Commit 4c34b9e

inexorabletash, fdwr, and huningxin authored

Clarify interpolation algorithms for resample2d (#816)

* Clarify interpolation algorithms for resample2d

  This gives formal definitions for the `nearest-neighbor` and `linear` interpolation modes. The definitions are based on text given by @fdwr and baseline implementation by @BruceDai and independently verified.

  Resolves #358

* Update index.bs

  Co-authored-by: Dwayne Robinson <[email protected]>

* Update index.bs

  Co-authored-by: Dwayne Robinson <[email protected]>

* Update index.bs

  Co-authored-by: Dwayne Robinson <[email protected]>

* Update index.bs

  Co-authored-by: Ningxin Hu <[email protected]>

* Update index.bs

  Co-authored-by: Ningxin Hu <[email protected]>

* Update index.bs

  Co-authored-by: Ningxin Hu <[email protected]>

* Incorporate review feedback

* Remove parenthetical

---------

Co-authored-by: Dwayne Robinson <[email protected]>
Co-authored-by: Ningxin Hu <[email protected]>
1 parent 9c00304 commit 4c34b9e

1 file changed: +66 -0 lines changed

index.bs

@@ -7134,6 +7134,45 @@ partial dictionary MLOpSupportLimits {
::
The interpolation algorithm used to fill the output tensor values.

Both algorithms start with these inputs, computed for each spatial axis (based on {{MLResample2dOptions/axes}}), where `inputSize` is given by the {{MLGraphBuilder/resample2d(input, options)/input}} tensor's [=MLTensor/shape=], `outputSize` is given by {{MLResample2dOptions/sizes}} or {{MLResample2dOptions/scales}}, and `outputCoordinate` identifies the element in the output tensor being computed.
```
scale = outputSize / inputSize
unclampedCoordinate = (outputCoordinate + 0.5) / scale - 0.5
inputCoordinate = clamp(unclampedCoordinate, 0, inputSize - 1)
```
For a given `outputCoordinate.x` and `outputCoordinate.y` location in the output tensor, the above equations give a rational `inputCoordinate.x` and `inputCoordinate.y`.
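The per-axis coordinate mapping above can be sketched in Python as follows. This is a minimal illustration rather than spec text: the function name `input_coordinate` and its parameter names are hypothetical, and the scale is assumed to be derived from the two sizes as in the first equation.

```python
def input_coordinate(output_coordinate: int, input_size: int, output_size: int) -> float:
    """Map one output index to a clamped, possibly fractional input coordinate.

    Hypothetical helper; mirrors the three equations for a single spatial axis.
    """
    scale = output_size / input_size
    # Sample at pixel centers: shift by 0.5, rescale, then shift back.
    unclamped = (output_coordinate + 0.5) / scale - 0.5
    # Clamp so the coordinate never leaves the valid input index range.
    return min(max(unclamped, 0.0), float(input_size - 1))


# Upsampling an axis of size 4 to size 8 (scale = 2).
coords = [input_coordinate(i, 4, 8) for i in range(8)]
```

For this 4-to-8 case `coords` comes out as 0, 0.25, 0.75, 1.25, 1.75, 2.25, 2.75, 3: the first and last values are clamped from -0.25 and 3.25.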

<dl dfn-type=enum-value dfn-for=MLInterpolationMode>
: <dfn>nearest-neighbor</dfn>
::
The `inputCoordinate.x` and `inputCoordinate.y` computed above are used as inputs to a nearest-neighbor sampling algorithm to compute the output tensor value as follows:
```
x = ceil(inputCoordinate.x - 0.5)
y = ceil(inputCoordinate.y - 0.5)
output tensor value = input tensor value at (x, y)
```

: <dfn>linear</dfn>
::
The `inputCoordinate.x` and `inputCoordinate.y` computed above are used as inputs to a bilinear sampling algorithm to compute the output tensor value as follows:
```
x0 = floor(inputCoordinate.x)
x1 = ceil(inputCoordinate.x)
y0 = floor(inputCoordinate.y)
y1 = ceil(inputCoordinate.y)
vx0y0 = input tensor value at (x0, y0)
vx1y0 = input tensor value at (x1, y0)
vx0y1 = input tensor value at (x0, y1)
vx1y1 = input tensor value at (x1, y1)
tx = inputCoordinate.x - x0
ty = inputCoordinate.y - y0

vy0 = vx0y0 * (1 - tx) + vx1y0 * tx
vy1 = vx0y1 * (1 - tx) + vx1y1 * tx
output tensor value = vy0 * (1 - ty) + vy1 * ty
```
</dl>
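The two sampling definitions can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the specification's algorithm text: `sample_nearest` and `sample_linear` are hypothetical names, the input is assumed to be a plain 2D list indexed as `image[y][x]`, and the coordinates `cx` and `cy` are assumed to be already clamped to the valid input range.

```python
import math


def sample_nearest(image, cx: float, cy: float):
    """Nearest-neighbor sample at an already clamped coordinate (cx, cy)."""
    # ceil(c - 0.5) rounds to the nearest index, with exact halves rounding down.
    x = math.ceil(cx - 0.5)
    y = math.ceil(cy - 0.5)
    return image[y][x]  # assumption: rows are indexed by y, columns by x


def sample_linear(image, cx: float, cy: float) -> float:
    """Bilinear sample at an already clamped coordinate (cx, cy)."""
    x0, x1 = math.floor(cx), math.ceil(cx)
    y0, y1 = math.floor(cy), math.ceil(cy)
    tx, ty = cx - x0, cy - y0
    # Interpolate along x on the two neighboring rows, then along y.
    vy0 = image[y0][x0] * (1 - tx) + image[y0][x1] * tx
    vy1 = image[y1][x0] * (1 - tx) + image[y1][x1] * tx
    return vy0 * (1 - ty) + vy1 * ty


# Hypothetical 2x2 input used only for illustration.
image = [
    [0, 10],
    [20, 30],
]
center_linear = sample_linear(image, 0.5, 0.5)
center_nearest = sample_nearest(image, 0.5, 0.5)
```

At the exact center of the 2x2 input, the bilinear sample is the mean of the four values, while the nearest-neighbor sample resolves the tie toward the lower index in each axis.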

: <dfn>scales</dfn>
::
A list of length 2.
@@ -7224,6 +7263,33 @@ partial dictionary MLOpSupportLimits {
1. Return |output|.
</details>


<div class="note">
The specific sampling algorithms are based on those widely used in existing Machine Learning frameworks. For example, when performing {{MLInterpolationMode/linear}} resampling from the following *[4, 4]* input tensor (considering only spatial dimensions):

```
[ 0 1 2 3 ]
[ 0 1 2 3 ]
[ 12 13 14 15 ]
[ 12 13 14 15 ]
```

For an *[8, 8]* output tensor, the expected values are:

```
[ 0 0.25 0.75 1.25 1.75 2.25 2.75 3 ]
[ 0 0.25 0.75 1.25 1.75 2.25 2.75 3 ]
[ 0 0.25 0.75 1.25 1.75 2.25 2.75 3 ]
[ 3 3.25 3.75 4.25 4.75 5.25 5.75 6 ]
[ 9 9.25 9.75 10.25 10.75 11.25 11.75 12 ]
[ 12 12.25 12.75 13.25 13.75 14.25 14.75 15 ]
[ 12 12.25 12.75 13.25 13.75 14.25 14.75 15 ]
[ 12 12.25 12.75 13.25 13.75 14.25 14.75 15 ]
```

This has the convenient properties that the sampling is evenly distributed, symmetric, robust to image mirroring, and the corner values are aligned.
</div>
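The worked example in the note can be checked with a short Python sketch. It is a plain-list illustration, not a WebNN API call: `resample_linear` is a hypothetical name, only the two spatial dimensions are modeled, and the input is assumed to be indexed as `image[y][x]`.

```python
import math


def resample_linear(image, output_height: int, output_width: int):
    """Bilinear resample of a 2D list, following the linear mode equations."""
    input_height, input_width = len(image), len(image[0])

    def coordinate(out_index: int, input_size: int, output_size: int) -> float:
        # Per-axis mapping from an output index to a clamped input coordinate.
        scale = output_size / input_size
        unclamped = (out_index + 0.5) / scale - 0.5
        return min(max(unclamped, 0.0), float(input_size - 1))

    output = []
    for oy in range(output_height):
        cy = coordinate(oy, input_height, output_height)
        y0, y1 = math.floor(cy), math.ceil(cy)
        ty = cy - y0
        row = []
        for ox in range(output_width):
            cx = coordinate(ox, input_width, output_width)
            x0, x1 = math.floor(cx), math.ceil(cx)
            tx = cx - x0
            # Blend along x on both neighboring rows, then blend along y.
            vy0 = image[y0][x0] * (1 - tx) + image[y0][x1] * tx
            vy1 = image[y1][x0] * (1 - tx) + image[y1][x1] * tx
            row.append(vy0 * (1 - ty) + vy1 * ty)
        output.append(row)
    return output


# The [4, 4] input tensor from the note.
source = [
    [0, 1, 2, 3],
    [0, 1, 2, 3],
    [12, 13, 14, 15],
    [12, 13, 14, 15],
]
result = resample_linear(source, 8, 8)
```

Under these assumptions `result` matches the *[8, 8]* table in the note row for row. Every intermediate value is a multiple of 1/16, so the comparison is exact in binary floating point.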

### reshape ### {#api-mlgraphbuilder-reshape-method}
Alter the shape of a tensor to a new shape. Reshape does not copy or change the content of the tensor. It just changes the tensor's logical shape for the subsequent operations.
<script type=idl>
