Differences between CPU and GPU compilation regarding rounding floats #228
-
When using

fn main() {
    let x = 1743028480i32;
    let y = (x as f64) * (1.0 / (i32::MAX as f64));
    println!("{} {}", x, (y * (i32::MAX as f64)) as i32);
}

This produces
Replies: 3 comments 8 replies
-
Rust:
SPIR-V:
fn main() {
    let x = 1743028480i32;
    let y = (x as f64) * (1.0 / (i32::MAX as f64));
    // Without explicit rounding, differs:
    let without = (y * (i32::MAX as f64)) as i32;
    // Using `trunc()` forces Rust's truncation:
    let with = (y * (i32::MAX as f64)).trunc() as i32;
    println!("x = {}. Without rounding = {}. With trunc() = {}.", x, without, with);
}
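To make the sensitivity concrete, here is a minimal CPU-only sketch (not taken from the thread). The product `y * MAX` lands within a tiny fraction of 1 of the integer `x`, so a truncating conversion can return either `x` or `x - 1` depending on which side of the integer the last rounding step falls. Rounding to nearest before the cast removes that dependence, assuming the accumulated error stays well below 0.5, which holds for f64 on the CPU but is only an assumption for any particular GPU. The function names are hypothetical.

```rust
/// Scale an i32 down by i32::MAX and back up, converting with a plain `as`
/// cast, which truncates toward zero in Rust.
fn round_trip_truncating(x: i32) -> i32 {
    let max = i32::MAX as f64;
    let y = (x as f64) * (1.0 / max);
    (y * max) as i32
}

/// Same round trip, but round to the nearest integer before the cast so that
/// an error of a few ulps cannot change the result.
fn round_trip_rounding(x: i32) -> i32 {
    let max = i32::MAX as f64;
    let y = (x as f64) * (1.0 / max);
    (y * max).round() as i32
}

fn main() {
    let x = 1743028480i32;
    println!("x          = {}", x);
    println!("truncating = {}", round_trip_truncating(x));
    println!("rounding   = {}", round_trip_rounding(x));
}
```

Whether `round()` is a suitable fix for a given shader depends on the target; the point of the sketch is only that truncation amplifies a last-bit difference into an off-by-one integer.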
-
Oh hmm... can you upload the SPIR-T dump somewhere? https://github.com/Rust-GPU/rust-gpu/blob/ac0c7035d53ae0bf87fbff12cc8ad4e6f6628834/docs/src/codegen-args.md#--dump-spirt-passes-dir
-
I know @schell and @Firestar99 have written a fair amount of code that runs on both CPU and GPU; I wonder if they have had to work around this.
In general, floating-point results on two different machines need not always be the same. Nowadays we generally use no-fast-math on the CPU and have the compiler emit exactly the floating-point operations we write, which costs some optimizations but gives more consistent results across compilers and machines. But even that excludes complex fp operations such as sin or cos, which only need to be accurate to some degree.
On the GPU it's an entirely different story again. The default is fast-math: optimize fp as much as possible and merge multiply-adds into FMAs. Even the hardware itself may respect the rounding mode or may just not do any rounding at all. Also, denormals are not supported, all in t…
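As an illustration of the FMA point, here is a small CPU-side sketch (not from the thread) showing that a fused multiply-add and a separate multiply followed by an add can give different answers for the same inputs. It uses Rust's `f64::mul_add`, which rounds once, and plain `a * b + c`, which rounds twice. The inputs are chosen for the demonstration and the function names are hypothetical; the sketch assumes standard IEEE 754 double arithmetic on the host.

```rust
/// Multiply, round, then add, round: two rounding steps.
fn unfused(a: f64, b: f64, c: f64) -> f64 {
    a * b + c
}

/// Fused multiply-add: the product is kept exact and rounded only once,
/// after the addition.
fn fused(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}

fn main() {
    // eps = 2^-30, exactly representable in f64.
    let eps = 1.0 / ((1u64 << 30) as f64);
    let a = 1.0 + eps;
    let b = 1.0 - eps;

    // The exact product is 1 - 2^-60, which rounds to 1.0 in f64,
    // so the unfused result is 0.
    println!("unfused = {:e}", unfused(a, b, -1.0));
    // The fused version keeps the exact product, so the result is -2^-60.
    println!("fused   = {:e}", fused(a, b, -1.0));
}
```

A compiler or driver that is free to contract `a * b + c` into an FMA can therefore change results in the last bits, which is enough to flip a truncating float-to-int conversion.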