[2.0] Proposal: A standard policy for vector dimension mismatches #8159

@GregStanton

The goal: A consistent policy for vector operations

With the introduction of $n$-dimensional vectors in p5.js 2.0, we have an exciting opportunity to align p5.js with modern math and machine-learning libraries. This can provide an accessible, creative onramp for users to learn fundamental data-science concepts. In fact, this was the original motivation for introducing $n$-dimensional vectors to p5.

A core concept in these libraries is broadcasting, a standard set of rules for handling operations between vectors, matrices, or tensors of different dimensions. These rules are used everywhere from math libraries like math.js, to machine-learning libraries like TensorFlow.js.

This issue proposes that p5.js adopt the standard broadcasting rules to ensure our math library is consistent, predictable, and extensible for future features like p5.Matrix. Let's explore what that means.

How broadcasting works

For now, it will help to consider how broadcasting works in the special case of vectors. Here, broadcasting tells us that two vectors can be operated on if they have matching dimensions, or if one of the vectors is 1D. That's it. Let's look at some examples of why these rules are so useful.

Addition and subtraction:
createVector(10, 10, 10).add(2) produces components [12, 12, 12]

The 2 in the 1D vector [2] is broadcast to higher dimensions, so that we're really adding [10, 10, 10] and [2, 2, 2]. This is a useful operation in data processing, statistics, etc. For example, given a list of exam grades like [92, 83, 61, 97, 72, 75, 64, 95, 100, 82], we can center it around zero by subtracting the average (82.1) from every number in the list. This makes it clear which scores are below average and which are above average.
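To make the grade-centering example concrete, here is a minimal sketch in plain JavaScript. It works on arrays rather than p5.Vector objects, and `addBroadcast` is a hypothetical helper, not an existing p5.js function.

```javascript
// Hypothetical helper (not part of p5.js): add a single number to
// every component, i.e. broadcast the scalar across the array.
function addBroadcast(components, scalar) {
  return components.map((c) => c + scalar);
}

const grades = [92, 83, 61, 97, 72, 75, 64, 95, 100, 82];
const mean = grades.reduce((sum, g) => sum + g, 0) / grades.length; // 82.1

// Subtracting the mean is the same as adding its negation to each score.
const centered = addBroadcast(grades, -mean);

// Round for display: positive entries are above average, negative below.
console.log(centered.map((c) => c.toFixed(1)).join(", "));
```

The rounding is only for display, since subtracting 82.1 in floating point leaves tiny errors in the last digits.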

Multiplication and division:
createVector(10, 10, 10).mult(2) produces components [20, 20, 20]

As before, the 1D vector [2] becomes [2, 2, 2], and the operation is applied elementwise. This isn't just predictable. It's also very useful. This is scalar multiplication: the vector is scaled by 2, making it twice as long.
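As a quick check of the "twice as long" claim, here is a small sketch using plain arrays. The helpers `multBroadcast` and `magnitude` are hypothetical names for illustration, not p5.js API, and the vector [3, 4, 12] is chosen only because its length is a whole number.

```javascript
// Hypothetical helpers (not part of p5.js), operating on plain arrays.
function multBroadcast(components, scalar) {
  // The scalar is applied to every component (elementwise).
  return components.map((c) => c * scalar);
}

function magnitude(components) {
  // Euclidean length: square root of the sum of squared components.
  return Math.sqrt(components.reduce((sum, c) => sum + c * c, 0));
}

const v = [3, 4, 12];
const scaled = multBroadcast(v, 2); // [6, 8, 24]

console.log(magnitude(v));      // 13
console.log(magnitude(scaled)); // 26
```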

The problem: Current behavior is inconsistent

Currently, p5.js does not follow standard broadcasting rules, which creates several problems:

  • Internal inconsistency: mult(2) applies to all components (correct), but add(2) changes only the first component, and there is no comparable shortcut for changing only the second component.
  • External inconsistency: This behavior is different from every major math and ML library, making p5.js less of an onramp, and forcing users to unlearn p5's rules to advance.
  • Confusing padding rules: For multi-element multipliers, as in createVector(1, 1, 1).mult(2, 2), p5.js 1.x pads the missing component with 1 (resulting in [2, 2, 1]), which is an unpredictable special case. Users might reasonably guess [2, 2, 2].
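The sketch below models the current behavior described in the list above, so the inconsistencies can be seen side by side. It is not p5.js source code: `legacyAdd` and `legacyMult` are hypothetical functions on plain arrays, and the padding values (0 for addition, 1 for multiplication) are assumptions inferred from the examples in this issue.

```javascript
// Model of the reported behavior, NOT the actual p5.js implementation.
// Assumption: missing arguments to add() act like 0.
function legacyAdd(components, ...args) {
  return components.map((c, i) => c + (i < args.length ? args[i] : 0));
}

// Assumption: a single argument to mult() scales every component,
// while several arguments are padded with 1.
function legacyMult(components, ...args) {
  if (args.length === 1) {
    return components.map((c) => c * args[0]);
  }
  return components.map((c, i) => c * (i < args.length ? args[i] : 1));
}

console.log(legacyAdd([10, 10, 10], 2));   // [12, 10, 10]
console.log(legacyMult([10, 10, 10], 2));  // [20, 20, 20]
console.log(legacyMult([1, 1, 1], 2, 2));  // [2, 2, 1]
```

In this model, a lone `2` means "every component" for `mult` but "first component only" for `add`, which is the internal inconsistency in question.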

How could p5 work? The options.

As we stabilize the p5.Vector feature set for p5.js 2.0, we have the opportunity to reassess the rules that p5 should follow. Several options have been described by @limzykenneth:

  1. Refuse to operate on incompatible vectors (i.e., throw an error when this is attempted)
  2. Perform broadcasting where possible and refuse to operate thereafter
  3. Automatically convert all vectors to the highest common dimension with 0 padding before operating
  4. Some combination of the above

Weighing the options: Options 2 and 4 seem most viable

The first option is likely not viable, as it would disallow common operations like scalar multiplication. That leaves Options 2, 3, and 4. Option 3 introduces additional forms of complexity, as noted previously, and it goes against the original reason for introducing $n$-dimensional vectors to p5, since advanced math and machine-learning libraries do not work this way. That leaves Options 2 and 4. Perhaps, in a creative-coding context, Option 4 might be useful?

The trouble with Option 4

For Option 4, it seems sensible to at least follow standard broadcasting rules when one of the vectors is 1D. Then the question is, how do we handle mismatches where neither of the vectors is 1D? If we look at a concrete example, we start to see how confusing it might be. In the example below, there is no obvious way to proceed, and users are left guessing.

Example: createVector(2, 3).mult(4, 5, 6)
Do we extend [2, 3] to [2, 3, 0], since that's the most natural way to extend a 2D vector to a 3D vector?
Do we extend [2, 3] to [2, 3, 3], extending the broadcasting approach by repeating the last entry?
Do we extend [2, 3] to [2, 3, 1], so that only the first two components of [4, 5, 6] are changed?
Does the user want the vector [2, 3] to turn into a 3D vector at all?
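To show how far apart these guesses land, here is a short sketch that computes the result under each of the three padding choices. It uses plain arrays, and `padTo` and `multiplyElementwise` are hypothetical helpers, not p5.js functions.

```javascript
// Hypothetical helpers (not part of p5.js).
function padTo(components, length, fill) {
  // Copy the components, then append `fill` until the target length.
  const padded = components.slice();
  while (padded.length < length) padded.push(fill);
  return padded;
}

function multiplyElementwise(a, b) {
  return a.map((x, i) => x * b[i]);
}

const a = [2, 3];
const b = [4, 5, 6];

const zeroPadded = multiplyElementwise(padTo(a, 3, 0), b);               // [8, 15, 0]
const lastRepeated = multiplyElementwise(padTo(a, 3, a[a.length - 1]), b); // [8, 15, 18]
const onePadded = multiplyElementwise(padTo(a, 3, 1), b);                // [8, 15, 6]

console.log(zeroPadded, lastRepeated, onePadded);
```

All three rules agree on the first two components and disagree on the third, so a user can't predict the result without memorizing which rule was chosen.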

This is just one simple vector example. If we consider matrices or tensors, the situation may become more complicated.

Proposal: Adopt Standard Broadcasting (Option 2)

Based on the analysis above, I propose that p5.js adopt the standard, widely used broadcasting rules:

  1. Operations are allowed if vector dimensions match.
  2. Operations are allowed if one operand is a scalar (a 1D vector or a single number).
  3. All other dimension mismatches will throw an error.
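The three rules above can be sketched in a few lines. This is only an illustration under stated assumptions, not a proposed implementation: `broadcastOp` is a hypothetical function, it works on plain arrays and numbers rather than p5.Vector, and it returns a new array instead of modifying its input.

```javascript
// Hypothetical sketch of the proposed policy (not p5.js API).
function broadcastOp(a, b, op) {
  // Treat a bare number as a 1D vector.
  const left = typeof a === "number" ? [a] : a;
  const right = typeof b === "number" ? [b] : b;

  // Rule 1: matching dimensions, operate elementwise.
  if (left.length === right.length) {
    return left.map((x, i) => op(x, right[i]));
  }
  // Rule 2: one operand is 1D, so broadcast its single value.
  if (right.length === 1) return left.map((x) => op(x, right[0]));
  if (left.length === 1) return right.map((y) => op(left[0], y));

  // Rule 3: any other mismatch is an error.
  throw new RangeError(
    `Cannot broadcast dimensions ${left.length} and ${right.length}`
  );
}

const add = (x, y) => x + y;
const mult = (x, y) => x * y;

console.log(broadcastOp([10, 10, 10], 2, add));  // [12, 12, 12]
console.log(broadcastOp([10, 10, 10], 2, mult)); // [20, 20, 20]

try {
  broadcastOp([2, 3], [4, 5, 6], mult);
} catch (e) {
  console.log(e.message); // Cannot broadcast dimensions 2 and 3
}
```

Because the same function handles both `add` and `mult`, a lone `2` means the same thing for every operation.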

This approach is simple, consistent, and avoids the ambiguity of custom padding rules (as shown in the "trouble with Option 4" example). It also aligns with the original motivation for $n$-dimensional vectors, by preparing users for advanced math and machine-learning libraries. And it ensures our API will be extensible to p5.Matrix and even p5.Tensor.

This would be a breaking change for some sketches, but major releases are the appropriate time to fix confusing or inconsistent APIs.

Discussion

What do you think, everyone? How would you handle dimension mismatches? Are there any use cases I didn't cover that you think are important?

Invitation for comment

Many other community members have been actively involved in related discussions, and I'd love to hear their thoughts. These include @ksen0, @limzykenneth, @inaridarkfox4231, @sidwellr, @Ahmed-Armaan, @davepagurek, @holomorfo, @nickmcintyre, @RandomGamingDev, and many others. Everyone is welcome to share their ideas!
