ndarray development discussion and dashboard #293

Closed
bluss opened this issue Mar 31, 2017 · 13 comments

@bluss
Member

bluss commented Mar 31, 2017

@Luthaf

Luthaf commented Apr 1, 2017

Is it possible to reconsider #152 (range dimension and negative indexing)?

As indexing and slicing use different syntax, this should not be a breaking change, and it would be really useful for some kinds of code. I am thinking of FFT-like algorithms, where the problem space is naturally split over a symmetric -k .. k range.

If this is not acceptable in ndarray, I think I'll try to implement it on top of standard arrays, but that means paying an additional overhead. I started a PoC for range-based dimensions in mudi, using a Vec<T> as storage.

@bluss
Member Author

bluss commented Apr 1, 2017

We can consider the change when we know how to implement it. Do you have any links that detail how such a thing is implemented efficiently?

I don't understand why your implementation would have different performance implications than ndarray's?

@Luthaf

Luthaf commented Apr 1, 2017

I know of two implementations of this: the Fortran standard and Boost.MultiArray.

  • gfortran implements this directly in the compiler front-end, but I could not find high-level documentation about it. I know a gfortran developer, so I can ask him about this.
  • Boost allows this through the extent_range type, which lets generic ranges be used as index ranges. I did not find a description of the implementation, but the initial Boost review is here. Other than that, the Boost implementation can be worth reading, but it looks very convoluted.

The performance implications might come from the translation needed from (n..m) to (0..m-n) for all the indexes before passing them to ndarray.

Another solution would be to do the index linearisation separately and call array::as_slice to access the underlying data, but then there would be a call to is_standard_layout every time. This also means that it could not be used for arrays with a non-standard layout (which I don't really understand).

Yet another solution would be to store a pointer + dimensions and build an ndarray from this when needed, giving access to its functionality through an as_ndarray function. That would mean re-implementing a lot of code for different storage types.
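To make the index translation discussed above concrete, here is a minimal 1-D sketch on top of a plain Vec<T>. The RangeArray type, its fields, and its methods are hypothetical names invented for this illustration; this is not how mudi, ndarray, Fortran, or Boost implement it. The only point is that every access subtracts the lower bound of the range before touching storage.

```rust
use std::ops::{Index, IndexMut, Range};

/// Hypothetical 1-D array indexed over an arbitrary signed range such as -k..k.
/// Storage is a plain Vec<T>; each access subtracts the lower bound.
struct RangeArray<T> {
    start: isize,
    data: Vec<T>,
}

impl<T: Clone> RangeArray<T> {
    /// Create an array covering `range`, with every element set to `value`.
    fn from_elem(range: Range<isize>, value: T) -> Self {
        assert!(range.start <= range.end, "range must not be reversed");
        let len = (range.end - range.start) as usize;
        RangeArray { start: range.start, data: vec![value; len] }
    }
}

impl<T> RangeArray<T> {
    /// Translate a logical index in start..end to a storage offset in 0..len.
    fn offset(&self, i: isize) -> usize {
        let off = i - self.start;
        assert!(
            off >= 0 && (off as usize) < self.data.len(),
            "index {} out of range",
            i
        );
        off as usize
    }
}

impl<T> Index<isize> for RangeArray<T> {
    type Output = T;
    fn index(&self, i: isize) -> &T {
        &self.data[self.offset(i)]
    }
}

impl<T> IndexMut<isize> for RangeArray<T> {
    fn index_mut(&mut self, i: isize) -> &mut T {
        let off = self.offset(i);
        &mut self.data[off]
    }
}
```

With this sketch, RangeArray::from_elem(-2..3, 0.0) holds five elements, and a[-2] refers to storage slot 0. The per-access cost is one subtraction plus a bounds check, which is the overhead the comment above is worried about.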

@bluss
Member Author

bluss commented Apr 1, 2017

It's intriguing, thanks for the links. Need to investigate how compatible it is with custom stride arrays. (To give a simple motivation for strided arrays: We want to be able to cut an array into array views that are rectangular pieces.)

(The overhead of is_standard_layout should go away at some point, arrays should carry their layout information as an additional field. For a low-dimensional array, it's not much of an overhead, it's just comparing a pair.)
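As a rough illustration of why such a layout check is cheap for low-dimensional arrays, the sketch below compares a stride vector against the row-major strides implied by the shape. The function names are hypothetical and the strides are unsigned element counts for simplicity; this is not ndarray's actual implementation of is_standard_layout.

```rust
/// Row-major (C order) strides, counted in elements, for the given shape.
fn c_order_strides(shape: &[usize]) -> Vec<usize> {
    let mut strides = vec![0; shape.len()];
    let mut step = 1;
    // Walk the axes from last to first; the last axis has stride 1.
    for (s, &dim) in strides.iter_mut().zip(shape).rev() {
        *s = step;
        step *= dim;
    }
    strides
}

/// Hypothetical layout check: do the strides describe a contiguous
/// row-major array of this shape? Axes of length 1 may have any stride,
/// and an array with a zero-length axis is treated as contiguous.
fn is_c_contiguous(shape: &[usize], strides: &[usize]) -> bool {
    assert_eq!(shape.len(), strides.len(), "one stride per axis");
    if shape.iter().any(|&d| d == 0) {
        return true;
    }
    let expected = c_order_strides(shape);
    shape
        .iter()
        .zip(strides)
        .zip(&expected)
        .all(|((&d, &s), &e)| d == 1 || s == e)
}
```

For a 4 × 6 array the expected strides are [6, 1]. A rectangular 2 × 3 piece cut out of it keeps the parent's strides [6, 1], while a contiguous 2 × 3 array would have [3, 1], so the piece fails the check after comparing just two numbers.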

@bluss
Member Author

bluss commented Apr 7, 2017

Ok, there will be an ndarray 0.9 shortly after 0.8. I don't expect any actual breakage or difficulty with the upgrades, and the library is not really used for “interchange” much, so I think it's unproblematic. We're evidently still in exploration and development.

@jonathanstrong

I'm using ndarray heavily for a project right now for the first time and the biggest difficulty I'm encountering is how to write functions that can handle arrays of different shapes/forms. For instance, is there a way to take an Array or an ArrayView? I'm assuming there is - but haven't been able to figure it out as of yet. Perhaps you could provide additional guidance on how to use arrays in code outside the library. Thanks!

@SuperFluffy
Contributor

@jonathanstrong If you want to have a function that takes any sort of Array, you have to work with ArrayBase. To know what that looks like, have a look at the definition of zip_mut_with. If it was a free-standing function outside of an impl block, it would look like this:

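use ndarray::{ArrayBase, Data, DataMut, Dimension};
fn zip_mut_with<A, B, S1, S2, E, F>(lhs: &mut ArrayBase<S1, E>, rhs: &ArrayBase<S2, E>, f: F)
where
    S1: DataMut<Elem = A>,
    S2: Data<Elem = B>,
    E: Dimension,
    F: FnMut(&mut A, &B),
{
    lhs.zip_mut_with(rhs, f) // forward to the method of the same name
}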

Let's say you wanted to have a function that took a matrix and a vector, then you would use Ix1 and Ix2 instead of the two E above.

As you mentioned arrays of different shapes/forms: unfortunately, we don't have compile-time integer generics yet. So if you want to ensure that, say, a function takes two arrays of exactly the same number of elements/shape, you need to assert that at runtime.

@bluss
Member Author

bluss commented May 20, 2017

hi @jonathanstrong, I think accepting arrays of varying ndim is the least nice part of the library right now. Is that part of what you are doing?

@jonathanstrong

thanks for the example @SuperFluffy.

@bluss that's one pain point. The one that prompted me to write was when I looped through the rows of a matrix with axis_iter and couldn't pass the "vector" view to a function that accepted an Array1. Another was an Ix1 can dot an Ix2 but not vice versa. I also had a hard time figuring out Zip::fold_while without any examples in the docs (I didn't pay attention to the FoldWhile enum initially). These aren't criticisms - just sharing, as I know it's helpful to know how someone who isn't familiar with the library experiences using it.

However, coming from a numpy/theano background this is definitely the rust math lib I feel most comfortable/productive working in. Thanks for all your hard work on it.

@SuperFluffy
Contributor

SuperFluffy commented May 21, 2017

@jonathanstrong

The one that prompted me to write was when I looped through the rows of a matrix with axis_iter and couldn't pass the "vector" view to a function that accepted an Array1.

Did my ArrayBase example, which shows how to write a function that can also take ArrayViews, help?

I recently realized why @bluss chose to introduce ArrayViews rather than working with shared Arrays: it enforces invariance of the data structures! Let's say you pass a mut_a: &mut Array to some function, then you could simply replace the mut_a by a new Array with, e.g., a different shape, causing an error somewhere down the line. However, if you pass an ArrayViewMut, then all you can do is manipulate the underlying &mut [T], keeping the original shape intact.

I suppose once compile-time integer generics land, you might be able to pass around &mut Arrays, with the type system ensuring immutability of the shape.
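The argument above can be illustrated with the standard library alone, using Vec<f64> and &mut [f64] as stand-ins for an owned array and a mutable view. This is only an analogy under that assumption; the function names are hypothetical and nothing here uses ndarray's API.

```rust
/// Takes the owning container by mutable reference: the callee may replace
/// it wholesale, which can change its length behind the caller's back.
fn reset_owned(v: &mut Vec<f64>) {
    *v = vec![0.0; 1];
}

/// Takes a mutable slice: the elements can be overwritten, but the length
/// of the caller's buffer cannot change.
fn reset_borrowed(v: &mut [f64]) {
    for x in v.iter_mut() {
        *x = 0.0;
    }
}
```

After reset_borrowed(&mut a) a three-element vector still has three elements, all zero. After reset_owned(&mut a) it has one, which is the kind of surprise the borrowed form rules out.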

@bluss

Another was an Ix1 can dot an Ix2 but not vice versa.

Maybe it makes sense to do it like numpy, and implicitly assume that an Ix1 array is a row (column) vector depending on whether the matrix M is dotted on the left (right), and implement a fn dot(ArrayBase<S, Ix1>, ArrayBase<S, Ix2>) -> Array<S, Ix1> (dotted on the left, which we don't have) along with the fn dot(ArrayBase<S, Ix2>, ArrayBase<S, Ix1>) -> Array<S, Ix1> (dotted on the right, which we have)?
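To spell out the two products being discussed, here is a std-only sketch in which a matrix is a slice of rows. The function names are hypothetical and this is not ndarray's dot; it only shows the numpy-style convention that a 1-D operand acts as a row vector on the left of the matrix and as a column vector on the right, with shapes checked at runtime.

```rust
/// Row vector times matrix: v (length m) · M (m × n) gives a vector of length n.
fn vec_dot_mat(v: &[f64], m: &[Vec<f64>]) -> Vec<f64> {
    assert_eq!(v.len(), m.len(), "vector length must equal the row count");
    let ncols = m.first().map_or(0, |row| row.len());
    let mut out = vec![0.0; ncols];
    for (&vi, row) in v.iter().zip(m) {
        assert_eq!(row.len(), ncols, "matrix rows must have equal length");
        // Accumulate vi times row i into the output.
        for (o, &mij) in out.iter_mut().zip(row) {
            *o += vi * mij;
        }
    }
    out
}

/// Matrix times column vector: M (m × n) · v (length n) gives a vector of length m.
fn mat_dot_vec(m: &[Vec<f64>], v: &[f64]) -> Vec<f64> {
    m.iter()
        .map(|row| {
            assert_eq!(row.len(), v.len(), "row length must equal the vector length");
            row.iter().zip(v).map(|(&a, &b)| a * b).sum::<f64>()
        })
        .collect()
}
```

For the 2 × 3 matrix [[1, 2, 3], [4, 5, 6]], the left product with [1, 1] yields [5, 7, 9] and the right product with [1, 0, 1] yields [4, 10]. Both results are 1-D, so neither call needs an explicit row or column shape.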

@bluss
Member Author

bluss commented May 21, 2017

Right, it's convenient to write functions in terms of concrete types (like Array1<T>), and I do that too, but the problem is that one then can't pass just a view. The function could be written in terms of ArrayView1 instead.

@SuperFluffy

  1. Array views have their own shape, which is mutable, which I think is cool. One can also have an array view of a different shape or dimension than the data it is a view of.
  2. Yes that makes sense

@bluss
Member Author

bluss commented May 21, 2017

I guess it's important for ndarray to explain that Array, ArrayView, ArrayViewMut are meant to mirror the ownership and borrowing semantics of Vec<T>, &[T], &mut [T] with only minor differences that come from the fact that [T] is a dynamically sized type.
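The standard-library side of that analogy can be sketched as follows: a function written against the borrowed form &[f64] accepts a whole Vec, a sub-slice, or one row of a flat row-major buffer. The names mean and row_means are hypothetical helpers for this illustration, not part of ndarray.

```rust
/// Read-only function written against the borrowed form.
/// Returns None for an empty slice.
fn mean(xs: &[f64]) -> Option<f64> {
    if xs.is_empty() {
        None
    } else {
        Some(xs.iter().sum::<f64>() / xs.len() as f64)
    }
}

/// Treat `data` as a row-major matrix with `ncols` columns and pass each
/// row, borrowed as a slice, to `mean`.
fn row_means(data: &[f64], ncols: usize) -> Vec<Option<f64>> {
    assert!(ncols > 0, "ncols must be positive");
    data.chunks(ncols).map(mean).collect()
}
```

Because mean takes &[f64], the caller can pass &owned for a Vec<f64>, &owned[1..3] for a part of it, or a row produced while iterating. A function written against Vec<f64> would reject the last two, which mirrors the Array1 versus ArrayView1 situation described earlier in the thread.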

@bluss
Member Author

bluss commented Mar 29, 2021

This issue is superseded by the discussion board (GitHub) and Matrix.

bluss closed this as completed Mar 29, 2021