Spell out where views are allowed #921
Comments
|
Practical example from scipy: The above code is fine in all the backends we know of. But a backend could have max return |
This won't fly, since views aren't a concept in the standard. There really is no way to fix this problem in the standard; the only way to do it is (a) fix bugs in libraries like the (b) is quite desirable, it's in the works for PyTorch and I hope that will actually materialize at some point. For NumPy we've brainstormed about it a bit recently, since it's also desirable for thread-safety, which is becoming much more relevant with free-threading. Also pragmatically: even if we did in the standard what you suggest, libraries aren't going to follow that and do a whole bunch of work to audit everything and make changes to how functions behave (which would all be BC-breaking changes anyway; I think it's a nonstarter for anything that's not considered a bug).
|
With this, do you mean something like
That would indeed solve the scipy example I posted:
xp = array_namespace(x)
x = xp.asarray(x, writable=False)  # NOTE THIS!
x_max = xp.max(x, axis=axis, keepdims=True)
if x_max.ndim > 0:
    x_max = xpx.at(x_max, ~xp.isfinite(x_max)).set(0)
With this change,
|
Kinda, but without a
>>> import numpy as np
>>> x = np.arange(5)
>>> y = x[::2]
>>> y.data is x.data
False
>>> y.base
array([0, 1, 2, 3, 4])
>>> y.base is x
True
>>> x.base
>>> y[0] += 1  # you, and numpy, can be sure this modifies >1 arrays (because of .base)
>>> x[0] += 1  # you, and numpy, cannot know if this modifies 1 or >1 arrays
Once you can always know, it's straightforward to implement modes (context manager, global setting, etc.) where in-place operations either raise or do copy-on-write if the operation affects >1 array. And over time even migrate the default possibly. The harder part is implementing the machinery for this.
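The asymmetry described in the session above can be checked directly. This is a minimal sketch assuming NumPy; `is_known_view` is a hypothetical helper written for illustration, not a NumPy function, while `np.shares_memory` is a real NumPy call.

```python
import numpy as np


def is_known_view(a: np.ndarray) -> bool:
    """Hypothetical helper: True if `a` records a parent array in `.base`."""
    return a.base is not None


x = np.arange(5)
y = x[::2]  # basic slicing returns a view of x

# Only the view knows about its parent; the parent has no record of its views.
view_flags = (is_known_view(x), is_known_view(y))

# An explicit pairwise check does see the overlap, but it needs both arrays.
overlap = bool(np.shares_memory(x, y))

x[0] += 1  # nothing on x hints that y exists, yet y changes as well
y_first = int(y[0])
```

A raise-or-copy mode for in-place operations would need this information from the parent's side too, which is the machinery the comment refers to.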
In https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html, the Standard says
The above is fine after __getitem__, asarray(..., copy=None), astype(..., copy=False), and similar functions that are explicitly explained by the standard to potentially return views.
However, there are a few corner cases where views could be possible but a normal user is very unlikely to think about them.
I just stumbled on one in data-apis/array-api-compat#298, where array_api_compat.torch.sum(x, dtype=x.dtype, axis=()) was accidentally returning x instead of a copy of it.
There are a few more cases where a library could try to be smart; for example
(min, max, other?) could return a view to the minimum/maximum point
(minimum, maximum, clip, where) could return one of the input arrays when there is nothing to do
(__add__/__sub__ vs. 0, __mul__/__div__ vs. 1, etc.)
In real life, I expect end users to assume that the above functions will always return a copy.
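To make the "nothing to do" case concrete, here is a sketch of how such an optimization could bite user code. `lazy_clip` and `shift_clipped` are hypothetical functions invented for this illustration; no known backend is claimed to behave this way, and NumPy's own `np.clip` is used only as the copying fallback.

```python
import numpy as np


def lazy_clip(x, lo, hi):
    """Hypothetical 'smart' clip: returns the input itself when nothing is out of range."""
    if x.size and x.min() >= lo and x.max() <= hi:
        return x  # no copy: the result aliases the input
    return np.clip(x, lo, hi)  # fresh array


def shift_clipped(x):
    """User code that assumes clip always returns a fresh array."""
    out = lazy_clip(x, 0.0, 1.0)
    out += 1.0  # in-place update meant to touch only `out`
    return out


a = np.array([0.25, 0.5])   # already within [0, 1]: lazy_clip returns `a` itself
b = shift_clipped(a)        # `a` is modified as a side effect

c = np.array([-1.0, 0.5])   # needs clipping: lazy_clip returns a new array
d = shift_clipped(c)        # `c` is left untouched
```

Whether the caller's input is mutated depends on the data values, which is what makes this kind of aliasing hard to spot in testing.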
I think the standard should spell this out, limiting the possibility of views to an explicit list of allowed functions:
__getitem__
asarray
astype
__dlpack__
from_dlpack
reshape
broadcast_to
broadcast_arrays
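One way a library or test suite could audit such a list is to check, per function, whether the result shares memory with the input. The sketch below assumes NumPy as the backend under test and covers only a few of the functions named in this issue; the `candidates` table is an illustrative construction, and the results describe NumPy's behavior, not what the standard mandates.

```python
import numpy as np

x = np.arange(6)

# A few calls from the proposed allow-list, plus two that users expect to copy.
candidates = {
    "__getitem__": lambda a: a[::2],
    "asarray": lambda a: np.asarray(a),
    "astype(copy=False)": lambda a: a.astype(a.dtype, copy=False),
    "reshape": lambda a: np.reshape(a, (2, 3)),
    "max": lambda a: np.max(a, keepdims=True),
    "clip": lambda a: np.clip(a, 0, 10),
}

# True means the result aliases the input's memory.
aliases_input = {
    name: bool(np.shares_memory(x, fn(x))) for name, fn in candidates.items()
}

for name, shared in aliases_input.items():
    print(f"{name:20s} shares memory with input: {shared}")
```

Running the same table against another backend would need that backend's own equivalent of `np.shares_memory`, which the standard does not provide.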