Integer overflow on mean #861

Open
@DanDeepPhase

Taking the elementwise mean of a vector of integer-valued DimArrays overflows the integer element type.

Base:

julia> M = fill(UInt16(32000), 2)
2-element Vector{UInt16}:
 0x7d00
 0x7d00

julia> mean([M, M])               # sum < maxval
2-element Vector{Float64}:
 32000.0
 32000.0

julia> mean([M, M, M])          # sum > maxval
2-element Vector{Float64}:
 32000.0
 32000.0

With plain Vectors the answer is correct whether or not the elementwise sum exceeds the integer limit.

Equivalent math with DimArrays:

julia> D = DimArray(M, X(1:2))
╭──────────────────────────────╮
│ 2-element DimArray{UInt16,1} │
├──────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Int64} 1:2 ForwardOrdered Regular Points
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 1  0x7d00
 2  0x7d00

julia> mean([D, D])          # sum < maxval
╭───────────────────────────────╮
│ 2-element DimArray{Float64,1} │
├───────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Int64} 1:2 ForwardOrdered Regular Points
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 1  32000.0
 2  32000.0

julia> mean([D, D, D])          # sum > maxval
╭───────────────────────────────╮
│ 2-element DimArray{Float64,1} │
├───────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Int64} 1:2 ForwardOrdered Regular Points
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 1  10154.7
 2  10154.7

The mean of two 32000s and the mean of three 32000s come out different. The maximum value of a UInt16 is 65535, so the sum of two (64000) fits but the sum of three (96000) does not.

I am guessing here, but is the order of operations as follows?
Base: convert to float |> sum |> divide by N
DimensionalData: sum |> convert to float |> divide by N
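
The arithmetic is consistent with that guess. Here is a minimal sketch using only Base scalars (no DimensionalData, so it shows nothing about the package internals): summing three `UInt16(32000)` values in `UInt16` wraps modulo 65536, and dividing the wrapped sum by 3 gives the 10154.7 reported above.

```julia
x = UInt16(32000)

# UInt16 + UInt16 stays UInt16 and wraps modulo 2^16 = 65536.
wrapped = x + x + x                    # 96000 - 65536 = 30464
@show wrapped == UInt16(30464)         # true

# Sum first, then convert and divide: matches the DimArray result.
sum_then_float = round(wrapped / 3; digits = 1)
@show sum_then_float                   # 10154.7

# Convert first, then sum and divide: matches the Base result.
float_then_sum = (Float64(x) + Float64(x) + Float64(x)) / 3
@show float_then_sum                   # 32000.0
```

With only two elements the sum is 64000, which still fits in a UInt16, so both orders give 32000.0.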

Taking the mean of a single DimArray does not show any overflow.

julia> D3 = DimArray(fill(UInt16(32000),3), X(1:3))
╭──────────────────────────────╮
│ 3-element DimArray{UInt16,1} │
├──────────────────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
   X Sampled{Int64} 1:3 ForwardOrdered Regular Points
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
 1  0x7d00
 2  0x7d00
 3  0x7d00

julia> mean(D3)
32000.0
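
For comparison, a rough sketch in plain Base of why a scalar reduction and an elementwise reduction can behave differently. `sum` over a vector of small unsigned integers widens the accumulator, while `sum` over a vector of arrays adds the arrays with `+`, which keeps the `UInt16` element type. Whether the DimArray `mean` path reduces this way is an assumption, not something checked against the DimensionalData source.

```julia
v = fill(UInt16(32000), 3)

# Scalar reduction: Base widens small unsigned integers, so no wrap.
s = sum(v)
@show s typeof(s)          # 96000, a UInt rather than a UInt16

# Elementwise reduction over a vector of arrays: array + keeps UInt16 and wraps.
M = fill(UInt16(32000), 2)
t = sum([M, M, M])
@show t eltype(t)          # both elements 0x7700 (30464), eltype UInt16
```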

Labels: bug