Spell out where views are allowed #921

Open
crusaderky opened this issue Apr 4, 2025 · 5 comments

@crusaderky
Contributor

In https://data-apis.org/array-api/latest/design_topics/copies_views_and_mutation.html, the Standard says

Array API consumers are strongly advised to avoid any mutating operations when an array object may [...] be a “view” [...] It is not always clear, however, when a library will return a view and when it will return a copy. This standard does not attempt to specify this—libraries may do either.

The above is fine after __getitem__, asarray(..., copy=None), astype(..., copy=False), and similar functions that the standard explicitly documents as potentially returning views.
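For illustration, here is a minimal sketch of the cases where aliasing is expected. It assumes NumPy as the backend; other libraries may return copies in the same situations, which is what the quoted passage allows.

```python
import numpy as np

x = np.arange(5)

# Basic slicing (__getitem__) returns a view in NumPy.
sliced = x[1:]
# asarray without a dtype change hands back the very same object.
same = np.asarray(x)
# astype(copy=False) may skip the copy when the dtype already matches.
cast = x.astype(x.dtype, copy=False)

aliases = {
    "getitem": np.shares_memory(sliced, x),
    "asarray": np.shares_memory(same, x),
    "astype": np.shares_memory(cast, x),
}

# Mutating any alias is visible through x.
sliced[0] = 99
```

Here every entry of `aliases` is true under NumPy, and the write through `sliced` changes `x[1]`.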

However, there are a few corner cases where views could be possible but a normal user is very unlikely to think about them.
I just stumbled on one in data-apis/array-api-compat#298, where array_api_compat.torch.sum(x, dtype=x.dtype, axis=()) was accidentally returning x instead of a copy of it.

There are a few more cases where a library could try to be smart; for example

  • search functions (min, max, other?) could return a view to the minimum/maximum point
  • replacement functions (minimum, maximum, clip, where) could return one of the input arrays when there is nothing to do
  • same for arithmetic functions (__add__ / __sub__ vs. 0, __mul__ / __truediv__ vs. 1, etc.)
  • same for sort functions when they realise the input is already sorted
  • possibly more
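To make the hazard concrete, here is a sketch of one such "smart" shortcut. `lazy_clip` and `normalise` are hypothetical names invented for this illustration; no known backend is claimed to behave this way.

```python
import numpy as np

def lazy_clip(x, lo, hi):
    """Hypothetical 'smart' clip: skips the copy when nothing is out of range."""
    if x.size == 0 or (x.min() >= lo and x.max() <= hi):
        return x  # aliases the input
    return np.clip(x, lo, hi)

def normalise(x):
    """Consumer code that assumes clip always returns a fresh array."""
    y = lazy_clip(x, 0.0, 1.0)
    y[0] = 0.0  # intended to touch only the local result
    return y

inside = np.array([0.5, 0.25, 0.75])   # already within [0, 1]
normalise(inside)

outside = np.array([0.5, 2.0, 0.75])   # needs clipping, so a copy is made
normalise(outside)
```

The caller's `inside` array is silently modified, while `outside` is left alone, so the bug only shows up for some inputs.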

In real life, I expect end users to assume that the above functions will always return a copy.
I think the standard should spell this out, limiting the possibility of views to an explicit list of allowed functions:

  • __getitem__
  • asarray
  • astype
  • __dlpack__
  • from_dlpack
  • reshape
  • broadcast_to
  • broadcast_arrays
  • ...more?
@crusaderky
Contributor Author

crusaderky commented Apr 4, 2025

xpx.at(x).set(y) has a copy parameter that defaults to None. This was picked after some discussion to avoid unnecessary copies in writable libraries.
The pattern currently used in scipy is that, when the developer thinks that xpx.at may write back to the input, they need to explicitly pass copy=True. This, however, happens when the input is the unmodified parameter of the function, not the output of some processing/reduction on it.

@crusaderky
Contributor Author

Practical example from scipy:

https://github.com/scipy/scipy/blob/27157ac1db4fc23bef76df50dfd8a4393453153c/scipy/special/_logsumexp.py#L397-L403

The above code is fine in all the backends we know of. But a backend could have max return x[argmax(x)], which would cause the function to write back to its input.
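A sketch of that failure mode, assuming NumPy and a 1-D input. `view_max` is a hypothetical backend behaviour and `shift_by_max` is a simplified stand-in for the pattern, not the linked scipy code.

```python
import numpy as np

def view_max(x):
    """Hypothetical backend max for 1-D input: returns a length-1 view of x."""
    i = int(np.argmax(x))
    return x[i:i + 1]  # basic slicing, so this aliases x

def shift_by_max(x, max_fn):
    """Take the max, zero it where non-finite, then subtract it from x."""
    x_max = max_fn(x)
    x_max[~np.isfinite(x_max)] = 0  # in-place write on the reduction result
    return x - x_max

# With a copying max, the caller's array is untouched.
a = np.array([1.0, np.inf, 2.0])
shift_by_max(a, lambda v: np.max(v, keepdims=True))

# With a view-returning max, the write lands in the caller's array.
b = np.array([1.0, np.inf, 2.0])
shift_by_max(b, view_max)
```

After these calls `a[1]` is still infinite, whereas `b[1]` has been overwritten with 0.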

@rgommers
Member

rgommers commented Apr 4, 2025

In real life, I expect end users to assume that the above functions will always return a copy. I think the standard should spell this out, limiting the possibility of views to an explicit list of allowed functions:

This won't fly, since views aren't a concept in the standard. There really is no way to fix this problem in the standard; the only way to do it is (a) to fix bugs in libraries like the torch.sum one (I'm fairly sure that that is indeed a bug and not a feature), and (b) for libraries to implement ways to return read-only arrays so that any user that uses in-place operations can actually tell the difference between "I'm modifying one array" vs. "I'm modifying >=2 arrays".

(b) is quite desirable; it's in the works for PyTorch, and I hope that will actually materialize at some point. For NumPy we've brainstormed about it a bit recently, since it's also desirable for thread-safety, which is becoming much more relevant with free-threading.

Also pragmatically: even if we did in the standard what you suggest, libraries aren't going to follow that and do a whole bunch of work to audit everything and make changes to how functions behave (these would all be backwards-compatibility-breaking changes anyway; I think that's a nonstarter for anything that's not considered a bug).

...more?

linalg.diagonal is an infamous example in NumPy.

@crusaderky
Contributor Author

(b) for libraries to implement ways to return read-only arrays so that any user that uses in-place operations can actually tell the difference between "I'm modifying one array" vs. "I'm modifying >=2 arrays".

By this, do you mean something like xp.asarray(obj, writable=False)?

That would indeed solve the scipy example I posted:

    xp = array_namespace(x)
    x = xp.asarray(x, writable=False)  # NOTE THIS!

    x_max = xp.max(x, axis=axis, keepdims=True)

    if x_max.ndim > 0:
        x_max = xpx.at(x_max, ~xp.isfinite(x_max)).set(0)

With this change, xp.max can either return

  • a writable brand new array, which causes xpx.at to efficiently write back to it;
  • or a read-only view of x, which causes xpx.at to perform a copy.
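The two outcomes can be sketched with NumPy's existing `flags.writeable` as a stand-in for the proposed `writable=False`. `set_where` is a hypothetical helper written for this illustration; it is an assumption about how a copy=None policy could behave, not array-api-extra's actual implementation.

```python
import numpy as np

def set_where(arr, mask, value):
    """Hypothetical stand-in for an at(...).set(...) call with copy=None:
    write in place when the array is writable, otherwise copy first."""
    if arr.flags.writeable:
        arr[mask] = value
        return arr
    out = arr.copy()  # a fresh copy is writable again
    out[mask] = value
    return out

x = np.array([1.0, np.inf, 2.0])
x.flags.writeable = False  # emulates the proposed writable=False

# Case 1: max returns a brand new, writable array; the write is in place.
fresh = np.max(x, keepdims=True)
r1 = set_where(fresh, ~np.isfinite(fresh), 0)

# Case 2: a view-returning max would hand back a read-only view of x;
# the helper then copies instead of writing through to x.
view = x[1:2]
r2 = set_where(view, ~np.isfinite(view), 0)
```

In the first case `r1` is the same object as `fresh`; in the second `r2` is a new array and `x` is left unchanged.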

@rgommers
Member

rgommers commented Apr 4, 2025

Kinda, but without a writable=False argument; that's too ugly. The idea is to keep better track of views internally, so that the comment on the last line below can become the same as the one on the line above it:

>>> import numpy as np
>>> x = np.arange(5)
>>> y = x[::2]
>>> y.data is x.data
False
>>> y.base
array([0, 1, 2, 3, 4])
>>> y.base is x
True
>>> x.base

>>> y[0] += 1  # you, and numpy, can be sure this modifies >1 arrays (because of .base)
>>> x[0] += 1  # you, and numpy, cannot know if this modifies 1 or >1 arrays

Once you can always know, it's straightforward to implement modes (context manager, global setting, etc.) where in-place operations either raise or do copy-on-write if the operation affects >1 array. And over time possibly even migrate the default. The harder part is implementing the machinery for this.
