Drop `.~` syntax entirely #825

yebai · 2025-02-28T10:33:14Z

The `.~` syntax allows one to define parameters in a vectorised manner:

    x = Vector(undef, 4)
    x .~ Distribution()

#804 introduced an extra constraint on the RHS: it has to be a univariate distribution. This makes the need for the broadcasting `.` questionable, since `.~` is now much less flexible than Julia's broadcasting and may actually be confusing. I think we should simplify the vectorised tilde further into

    x = Vector(undef, 4)
    x ~ UnivariateDistribution()

This is semantically equivalent to `x ~ filldist(UnivariateDistribution(), 4)` but with a cleaner syntax. It also provides a clean syntax for the (univariate) independent and identically distributed (IID) distributions discussed previously.

This proposal is also consistent with the Stan syntax for vectorised random variables.

mhauru · 2025-02-28T11:00:19Z

I would find it semantically confusing to say that a vector is distributed according to a univariate distribution. This also goes strongly against the Julia convention that you have to use broadcasted `.` operations to, e.g., apply a binary operation with a scalar element-wise, or assign to the elements of a vector as in `a = Vector(undef, 4); a .= 0.0`. Many other languages allow things like `randn(3) + 1.0`, but Julia has made the explicit decision to forbid it, and I think that has been a good decision. I don't think the `.` in `.~` complicates our interface much, and it keeps the distinction between arrays/scalars and univariates/multivariates clearer and much more in line with the rest of Julia.

I appreciate that our `.~` doesn't implement the full power of usual Julia broadcasting with `.`, but it does implement a subset of the functionality one would expect from `.`, and when it works it behaves consistently with Julia's standard `.` broadcasting. In other cases it errors in a helpful manner. I think this is preferable to having a syntax that goes against the conventions of how broadcasting and mixing scalars and vectors work in Julia.
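
The Julia convention referred to above can be checked without Turing. The following is a minimal sketch in plain Julia (no packages); the variable names are illustrative only.

```julia
# Adding a scalar to a vector without a dot is a MethodError in Julia.
v = randn(3)
undotted_fails = try
    v + 1.0
    false
catch err
    err isa MethodError
end

# The dotted form broadcasts the scalar element-wise.
shifted = v .+ 1.0

# Dotted assignment fills an existing array in place.
a = Vector(undef, 4)   # a Vector{Any} whose elements are not yet assigned
a .= 0.0
```

Here `undotted_fails` ends up `true`, while the dotted variants succeed, which is the distinction that `.~` mirrors for `~`.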

penelopeysm · 2025-02-28T11:42:50Z

I'm not opposed to the syntax change, but I'm unsure how to implement it, because at macro time we can't determine whether we need to use `filldist` or not:

    @model function f()
        y ~ Normal()
        x = Array{Float64}(undef, 2, 3)
        x ~ Normal()
    end

Here we don't want to use `filldist` for `y`, but we do want to expand the `x` statement to `filldist(Normal(), 2, 3)`, and I don't immediately see how we can distinguish between the two cases at macro time. One possible solution would be to always use `filldist`, i.e. for `y` we would expand to `filldist(Normal(), 1)`, but that might be undesirable from a performance point of view.

`.~` lets us escape this ambiguity because we assume that anything on the LHS of `.~` needs `filldist` and anything on the LHS of a plain `~` doesn't.

I haven't thought about it super deeply yet, though.
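
The macro-time limitation described above can be seen in plain Julia: a macro receives only the unevaluated expression, so the left-hand side is a `Symbol` whether the variable will hold a scalar or an array at run time, whereas the operator itself is visible. This is only a sketch; the macro names `@lhs_type` and `@tilde_op` are hypothetical and are not part of DynamicPPL.

```julia
# Hypothetical helper: the type of the left-hand side as the macro sees it.
macro lhs_type(ex)
    # `lhs ~ rhs` parses as a call expression: Expr(:call, :~, lhs, rhs)
    lhs = ex.args[2]
    return QuoteNode(typeof(lhs))
end

# Hypothetical helper: the operator of the expression as the macro sees it.
macro tilde_op(ex)
    return QuoteNode(ex.args[1])
end

# The right-hand side is never evaluated, so `Normal` need not be defined here.
y_lhs = @lhs_type(y ~ Normal())    # y is meant to be a scalar
x_lhs = @lhs_type(x ~ Normal())    # x is meant to be an array
plain_op = @tilde_op(y ~ Normal())
# The exact AST form of dotted operators has varied between Julia versions,
# so no particular value is claimed for this one.
dotted_op = @tilde_op(x .~ Normal())
```

Both `y_lhs` and `x_lhs` are `Symbol`, so the two statements cannot be told apart by their left-hand sides, while `plain_op` and `dotted_op` differ.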

penelopeysm · 2025-02-28T11:44:17Z

If we had @parameters struct that might be doable ;D

yebai · 2025-02-28T13:17:27Z

We could consider contributing a new iid_distribution(dist::UnivariateDistribution, N::Int) to the Distributions.jl package or keep it inside DynamicPPL initially. Then, we can have

    y ~ Normal()  # standard univariate RV
    x ~ iid_distribution(Normal(), 2, 3)  # similar to `.~`

This would allow us to replace `.~` with `iid_distribution` and customise `condition` / `fix` / etc. model operations for `iid_distribution`.

torfjelde · 2025-02-28T13:29:27Z

> I appreciate that our `.~` doesn't implement the full power of usual Julia broadcasting with `.`, but it does implement a subset of the functionality one would expect from `.`, and when it works it behaves consistently with Julia's standard `.` broadcasting.

Not particularly relevant to the full discussion, but to add to your comment here @mhauru: it's also worth noting that some things that would make sense if we fully followed Julia's broadcasting semantics don't quite make sense in a `@model`.

An example is

    x = Matrix(undef, 2, 1)
    x .~ reshape(fill(Normal(), 2), 1, 2)

This would technically work in Turing, but it results in sampling both of the components of `x` twice and accumulating both draws into logp. Similarly, during logp evaluation you'd double-count `x[1, 1]` and `x[2, 1]` in your logp.
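
The double counting follows from the ordinary broadcasting shape rule, which can be checked in plain Julia without Turing. In this sketch, strings stand in for the elements of `x` and for the distributions; the labels and variable names are illustrative only.

```julia
# Stand-ins: a 2×1 "x" and a 1×2 array of "distributions".
x_labels = reshape(["x[1,1]", "x[2,1]"], 2, 1)
dist_labels = reshape(["Normal #1", "Normal #2"], 1, 2)

# Broadcasting a 2×1 array against a 1×2 array yields a 2×2 result,
# so each element of x is paired with both distributions.
pairings = broadcast(tuple, x_labels, dist_labels)

# Count how often each element of x appears among the pairings.
visits = Dict(label => count(p -> p[1] == label, pairings) for label in x_labels)
```

Each label is visited twice, which corresponds to the two log-density contributions per element described above.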

yebai mentioned this issue on Feb 28, 2025: Change .~ to use filldist rather than a loop #824 (Closed)
