[2.0] Proposal: A standard policy for vector dimension mismatches
The goal: A consistent policy for vector operations
With the introduction of
A core concept in these libraries is broadcasting, a standard set of rules for handling operations between vectors, matrices, or tensors of different dimensions. These rules are used everywhere from math libraries like math.js, to machine-learning libraries like TensorFlow.js.
This issue proposes that p5.js adopt the standard broadcasting rules to ensure our math library is consistent, predictable, and extensible for future features like p5.Matrix. Let's explore what that means.
How broadcasting works
For now, it will help to consider how broadcasting works in the special case of vectors. Here, broadcasting tells us that two vectors can be operated on if they have matching dimensions, or if one of the vectors is 1D. That's it. Let's look at some examples of why these rules are so useful.
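As a minimal sketch of the vector rule just stated, the compatibility check fits in one line. Vectors are modeled here as plain arrays of numbers, and `canBroadcast` is a hypothetical name for illustration, not part of the p5.js API.

```javascript
// Hypothetical helper (not a p5.js function): vectors are plain arrays.
// Two vectors are compatible if their dimensions match or one of them is 1D.
function canBroadcast(a, b) {
  return a.length === b.length || a.length === 1 || b.length === 1;
}

console.log(canBroadcast([10, 10, 10], [2]));    // true: one operand is 1D
console.log(canBroadcast([1, 2, 3], [4, 5, 6])); // true: dimensions match
console.log(canBroadcast([2, 3], [4, 5, 6]));    // false: 2D vs. 3D
```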
Addition and subtraction:
createVector(10, 10, 10).add(2) produces components [12, 12, 12].
The 2 in the 1D vector [2] is broadcast to higher dimensions, so that we're really adding [10, 10, 10] and [2, 2, 2]. This is a useful operation in data processing, statistics, etc. For example, given a list of exam grades like [92, 83, 61, 97, 72, 75, 64, 95, 100, 82], we can center it around zero by subtracting the average (82.1) from every number in the list. This makes it clear which scores are below average and which are above average.
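The exam-grade example can be sketched with plain arrays. This is only an illustration of what broadcasting a single value means; it does not use p5.Vector, and the variable names are made up for the example.

```javascript
// Exam grades from the example above.
const grades = [92, 83, 61, 97, 72, 75, 64, 95, 100, 82];

// Average of the grades.
const mean = grades.reduce((sum, g) => sum + g, 0) / grades.length;

// Broadcasting the 1D value [mean] amounts to subtracting it from every entry.
const centered = grades.map((g) => g - mean);

// Positive entries are above average, negative entries are below.
const aboveAverage = grades.filter((g, i) => centered[i] > 0);
console.log(aboveAverage); // [92, 83, 97, 95, 100]
```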
Multiplication and division:
createVector(10, 10, 10).mult(2) produces components [20, 20, 20].
As before, the 1D vector [2] becomes [2, 2, 2], and the operation is applied elementwise. This isn't just predictable. It's also very useful. This is scalar multiplication: the vector is scaled by 2, making it twice as long.
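To check the "twice as long" claim numerically, here is a small plain-array sketch. The `scale` and `magnitude` helpers are hypothetical stand-ins for illustration, not the p5.Vector methods.

```javascript
// Hypothetical helpers on plain arrays (not p5.Vector methods).
function scale(v, s) {
  // The scalar s is applied to every component, as if [s] were broadcast.
  return v.map((c) => c * s);
}

function magnitude(v) {
  // Euclidean length of the vector.
  return Math.hypot(...v);
}

const original = [10, 10, 10];
const doubled = scale(original, 2);

console.log(doubled); // [20, 20, 20]
console.log(magnitude(doubled) / magnitude(original)); // approximately 2
```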
The problem: Current behavior is inconsistent
Currently, p5.js does not follow standard broadcasting rules, which creates several problems:
- Internal inconsistency: p5.js is inconsistent with itself: mult(2) applies to all components (correct), but add(2) applies only to the first component. It offers no such shortcut for the second component.
- External inconsistency: This behavior is different from every major math and ML library, making p5.js less of an on-ramp and forcing users to unlearn p5's rules to advance.
- Confusing padding rules: For multi-element multipliers like in createVector(1, 1, 1).mult(2, 2), v1.x pads the missing component with 1 (resulting in [2, 2, 1]), which is an unpredictable special case. Users might guess [2, 2, 2].
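The inconsistencies listed above can be reproduced with a small plain-array model. This is only a model of the results described in this issue, written for illustration; it is not the p5.js source, and the function names are hypothetical.

```javascript
// Model of the v1.x results described above (NOT the actual p5.js code).
function addAsDescribed(v, ...args) {
  // Missing arguments act as 0, so add(2) changes only the first component.
  return v.map((c, i) => c + (args[i] ?? 0));
}

function multAsDescribed(v, ...args) {
  // A single argument scales every component...
  if (args.length === 1) return v.map((c) => c * args[0]);
  // ...but with several arguments, missing ones act as 1.
  return v.map((c, i) => c * (args[i] ?? 1));
}

console.log(addAsDescribed([10, 10, 10], 2));  // [12, 10, 10]
console.log(multAsDescribed([10, 10, 10], 2)); // [20, 20, 20]
console.log(multAsDescribed([1, 1, 1], 2, 2)); // [2, 2, 1]
```

Seen side by side, the same argument `2` is treated as "all components" by one method and "first component only" by the other.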
How could p5 work? The options.
As we stabilize the p5.Vector feature set for p5.js 2.0, we have the opportunity to reassess the rules that p5 should follow. Several options have been described by @limzykenneth:
1. Refuse to operate on incompatible vectors (i.e., throw an error when this is tried)
2. Perform broadcasting where possible and refuse to operate thereafter
3. Automatically convert all vectors to the highest common dimension with 0 padding before operating
4. Some combination of the above
Weighing the options: Options 2 and 4 seem most viable
The first option is likely not viable, as it would disallow common operations like scalar multiplication. That leaves Options 2, 3, and 4. Option 3 introduces additional forms of complexity, as noted previously, and it goes against the original reason for introducing
The trouble with Option 4
For Option 4, it seems sensible to at least follow standard broadcasting rules when one of the vectors is 1D. Then the question is, how do we handle mismatches where neither of the vectors is 1D? If we look at a concrete example, we start to see how confusing it might be. In the example below, there is no obvious way to proceed, and users are left guessing.
Example: createVector(2, 3).mult(4, 5, 6)
- Do we extend [2, 3] to [2, 3, 0], since that's the most natural way to extend a 2D vector to a 3D vector?
- Do we extend [2, 3] to [2, 3, 3], extending the broadcasting approach by repeating the last entry?
- Do we extend [2, 3] to [2, 3, 1], so that only the first two components of [4, 5, 6] are changed?
- Does the user want the vector [2, 3] to turn into a 3D vector at all?
This is just one simple vector example. If we consider matrices or tensors, the situation may become more complicated.
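To make the ambiguity concrete, the following plain-array sketch applies the three candidate extension strategies to the example above. The strategy names are invented for this illustration; none of them is an existing p5.js option.

```javascript
const a = [2, 3];
const b = [4, 5, 6];

// Three hypothetical ways to extend a shorter vector to length n.
const strategies = {
  padWithZero: (v, n) => [...v, ...Array(n - v.length).fill(0)],
  repeatLast: (v, n) => [...v, ...Array(n - v.length).fill(v[v.length - 1])],
  padWithOne: (v, n) => [...v, ...Array(n - v.length).fill(1)],
};

// Extend a with each strategy, then multiply elementwise by b.
const results = {};
for (const [name, extend] of Object.entries(strategies)) {
  const extended = extend(a, b.length);
  results[name] = extended.map((c, i) => c * b[i]);
}

console.log(results);
// padWithZero: [8, 15, 0], repeatLast: [8, 15, 18], padWithOne: [8, 15, 6]
```

Each strategy is defensible, yet the third component comes out as 0, 18, or 6 depending on which one is picked.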
Proposal: Adopt Standard Broadcasting (Option 2)
Based on the analysis above, I propose that p5.js adopt the standard, widely-used broadcasting rules:
- Operations are allowed if vector dimensions match.
- Operations are allowed if one operand is a scalar (a 1D vector or a single number).
- All other dimension mismatches will throw an error.
This approach is simple, consistent, and avoids the ambiguity of custom padding rules (as shown in the "trouble with Option 4" example). It also aligns with the original motivation for p5.Matrix and even p5.Tensor.
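One way the three proposed rules could look in code is sketched below, again on plain arrays. `broadcastOp`, the choice of `RangeError`, and the error message are assumptions made for illustration; the actual p5.Vector implementation and error handling would be decided separately.

```javascript
// Hypothetical sketch of the proposed policy (not the p5.js implementation).
// a and b are plain arrays or single numbers; op combines two components.
function broadcastOp(a, b, op) {
  // Treat a bare number as a 1D vector.
  const x = typeof a === "number" ? [a] : a;
  const y = typeof b === "number" ? [b] : b;

  // Rule 3: any mismatch that doesn't involve a 1D operand is an error.
  if (x.length !== y.length && x.length !== 1 && y.length !== 1) {
    throw new RangeError(
      `Cannot operate on vectors of dimension ${x.length} and ${y.length}`
    );
  }

  // Rules 1 and 2: apply elementwise, repeating the single entry of a 1D operand.
  const n = Math.max(x.length, y.length);
  const at = (v, i) => (v.length === 1 ? v[0] : v[i]);
  return Array.from({ length: n }, (_, i) => op(at(x, i), at(y, i)));
}

const add = (p, q) => p + q;
const mult = (p, q) => p * q;

console.log(broadcastOp([10, 10, 10], 2, add));       // [12, 12, 12]
console.log(broadcastOp([1, 2, 3], [4, 5, 6], mult)); // [4, 10, 18]
// broadcastOp([2, 3], [4, 5, 6], mult) throws a RangeError
```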
This would be a breaking change for some sketches, but major releases are the appropriate time to fix confusing or inconsistent APIs.
Discussion
What do you think, everyone? How would you handle dimension mismatches? Are there any use cases I didn't cover that you think are important?
Invitation for comment
Many other community members have been actively involved in related discussions, and I'd love to hear their thoughts. These include @ksen0, @limzykenneth, @inaridarkfox4231, @sidwellr, @Ahmed-Armaan, @davepagurek, @holomorfo, @nickmcintyre, @RandomGamingDev, and many others. Everyone is welcome to share their ideas!