Add intrinsics for float arithmetic with `fast` flag enabled #32256

bluss · 2016-03-15T00:44:01Z

Add intrinsics for float arithmetic with fast flag enabled

`fast`, a.k.a. UnsafeAlgebra, is the flag for enabling all "unsafe"
(according to LLVM) float optimizations.

See LangRef for more information: http://llvm.org/docs/LangRef.html#fast-math-flags

Providing these operations with relaxed associativity rules (for example)
is useful to numerical applications.

For example, the summation loop:

let mut sum = 0.;
for element in data {
    sum += *element;
}

Using the default floating point semantics, this loop expresses that the
floats must be added in a sequence, one after another. This constraint
is usually completely unintended, and it means that no auto-vectorization
is possible.
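To illustrate what reassociation would permit, here is a minimal sketch (not code from this PR) that splits the sum across four independent accumulators by hand. This is roughly the shape a vectorizer might produce if it were allowed to reorder the additions. The function names `sum_sequential` and `sum_four_lanes` are hypothetical.

```rust
/// Strictly sequential sum: each addition depends on the previous one.
pub fn sum_sequential(data: &[f32]) -> f32 {
    let mut sum = 0.0;
    for element in data {
        sum += *element;
    }
    sum
}

/// Hand-reassociated sum: four independent accumulators ("lanes") that are
/// only combined at the end. The additions happen in a different order than
/// in `sum_sequential`, so the rounding may differ.
pub fn sum_four_lanes(data: &[f32]) -> f32 {
    let mut lanes = [0.0f32; 4];
    let mut chunks = data.chunks_exact(4);
    for chunk in chunks.by_ref() {
        for i in 0..4 {
            lanes[i] += chunk[i];
        }
    }
    // Combine the lanes, then add whatever did not fill a whole chunk.
    let mut sum = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    for x in chunks.remainder() {
        sum += *x;
    }
    sum
}
```

For inputs whose partial sums are all exactly representable (small integers, say) both functions agree; for general data the two results can differ in the last bits, which is exactly the change the default semantics forbid the compiler from making on its own.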

bluss · 2016-03-15T00:44:12Z

Related to #21690

rust-highfive · 2016-03-15T00:44:15Z

r? @nikomatsakis

(rust_highfive has picked a reviewer for you, use r? to override)

aturon · 2016-03-15T05:37:38Z

cc @rust-lang/libs

alexcrichton · 2016-03-15T05:49:58Z

Sounds reasonable to me, and API-wise also seems fine for now!

hanna-kruppe · 2016-03-15T08:37:16Z

So IIUC this "fast" flag allows all transformations: no NaNs, no infinities, no signed zeros, reassociation, etc. C compilers typically expose more fine-grained control, allowing one to enable any subset of those assumptions (e.g., -ffinite-math-only). I don't have a concrete use case for this, but it seems like the "right" thing to do. For example, I know a couple of algorithms that probably depend on infinities to correctly handle (rare but important) edge cases. On the other hand, it might substantially complicate the API. Thoughts?
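One familiar pattern that relies on infinities is seeding a minimum reduction with positive infinity. The sketch below is ordinary Rust with default float semantics and is not taken from this PR; `min_of` is a hypothetical name. It is only meant to show the kind of code that could stop being reliable if its float operations were compiled under a no-infinities assumption.

```rust
/// Returns the smallest element of `data`, or +infinity for an empty slice.
/// The infinity seed acts as the identity element for `min`.
pub fn min_of(data: &[f64]) -> f64 {
    data.iter().copied().fold(f64::INFINITY, f64::min)
}
```

With finer-grained flags, code like this could presumably opt in to reassociation while still keeping infinities meaningful.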

bluss · 2016-03-15T12:46:49Z

@rkruppe One question I have is whether any of the flags except fast allow the compiler to reassociate operations. GCC has a similar set of detailed float optimization flags (though for the whole compilation unit). They have a flag called -fassociative-math too, but I can't see from LLVM's docs whether anything other than fast can do the same thing.

bluss · 2016-03-15T21:11:31Z

To answer my own question: the fast (a.k.a. UnsafeAlgebra) flag is the only way to have general operations reassociate, going by the code and documentation here: http://llvm.org/docs/doxygen/html/Reassociate_8cpp_source.html
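As a reminder of why reassociation counts as "unsafe", here is a small sketch showing that float addition is not associative under the default semantics. The function names are hypothetical and the inputs in the usage note are chosen only to make the effect obvious.

```rust
/// Adds three values grouped as (a + b) + c.
pub fn left_grouped(a: f64, b: f64, c: f64) -> f64 {
    (a + b) + c
}

/// Adds the same three values grouped as a + (b + c).
pub fn right_grouped(a: f64, b: f64, c: f64) -> f64 {
    a + (b + c)
}
```

With `a = 1e100`, `b = -1e100`, `c = 1.0`, the left grouping cancels the large terms first and yields `1.0`, while the right grouping absorbs `c` into `b` and yields `0.0`. A compiler that is allowed to reassociate may pick either grouping.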

DemiMarie · 2016-03-16T23:24:03Z

Perhaps there should be a feature request against LLVM for more fine-grained control.

alexcrichton · 2016-03-17T18:57:24Z

The libs team discussed this during triage yesterday and the conclusion was that this seems good to merge. We liked the idea of having tagged types that are "fast math" as opposed to the C mode of globally turning it on/off for now. Note that this is also quite related to the checked arithmetic story!

@bors: r+ 04d03a68ce1377b664441aaf164d052b00ee403e

bluss · 2016-03-17T19:24:06Z

Thanks. It's a good step to put this into the unstable toolbox so we can find out where using these relaxed semantics helps and where it doesn't matter.

bors · 2016-03-18T12:24:33Z

⌛ Testing commit 04d03a6 with merge 0af284c...

bors · 2016-03-18T12:29:38Z

⛄ The build was interrupted to prioritize another pull request.

bors · 2016-03-18T13:49:09Z

⌛ Testing commit 04d03a6 with merge 379d18e...

bors · 2016-03-18T13:54:57Z

⛄ The build was interrupted to prioritize another pull request.

bors · 2016-03-18T16:06:12Z

⌛ Testing commit 04d03a6 with merge 493dbab...

bors · 2016-03-18T16:21:39Z

💔 Test failed - auto-win-msvc-32-opt


bluss · 2016-03-18T16:35:54Z

rebased

@bors r=alexcrichton

bors · 2016-03-18T16:35:55Z

📌 Commit 2dbac1f has been approved by alexcrichton

bors · 2016-03-19T04:12:00Z

⌛ Testing commit 2dbac1f with merge b9a93fa...


bors · 2016-03-19T06:59:11Z

Digipom · 2016-03-22T02:29:10Z

src/librustc_trans/trans/builder.rs

@@ -254,6 +263,15 @@ impl<'a, 'tcx> Builder<'a, 'tcx> {
        }
    }

+    pub fn fsub_fast(&self, lhs: ValueRef, rhs: ValueRef) -> ValueRef {
+        self.count_insn("sub");


Should this be 'fsub'?

bluss (reply, Mar 22, 2016): That seems logical. This just repeats what fsub did above; both should be fixed.

maciejkula · 2016-03-22T11:34:55Z

Does this mean that passing llvm-args="-fast" to rustc will enable autovectorization of float operations?

bluss · 2016-03-22T12:05:48Z

This PR just adds intrinsics that you need to use explicitly to get, for example, an "fadd" instruction with the "fast" flag enabled. For example, replace `a + b` with `unsafe { std::intrinsics::fadd_fast(a, b) }`. The intrinsics support f32 and f64.

These are just intrinsics, so it's a small step towards exposing it as a stable feature in libcore/libstd eventually.
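A minimal sketch of how such an intrinsic might eventually sit behind a wrapper type follows. The `Fast` type and `sum_fast` function are hypothetical and not part of this PR. Because the intrinsics are unstable, the sketch uses plain `+` so that it compiles on a stable compiler, and a comment marks where the intrinsic call would go.

```rust
use std::ops::Add;

/// Hypothetical wrapper marking a value as opted in to relaxed float semantics.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Fast(pub f32);

impl Add for Fast {
    type Output = Fast;

    fn add(self, rhs: Fast) -> Fast {
        // On a nightly compiler with `#![feature(core_intrinsics)]`, this body
        // could instead be:
        //     Fast(unsafe { std::intrinsics::fadd_fast(self.0, rhs.0) })
        // Plain `+` is used here (an assumption made for this sketch) so that
        // the example builds on stable; it does NOT relax any semantics.
        Fast(self.0 + rhs.0)
    }
}

/// Sums a slice by routing every addition through the wrapper type.
pub fn sum_fast(data: &[f32]) -> f32 {
    data.iter().fold(Fast(0.0), |acc, &x| acc + Fast(x)).0
}
```

Keeping the opt-in on a type, rather than on a compiler switch, means only the code that explicitly uses the wrapper would get the relaxed behaviour.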

bluss · 2016-03-22T12:08:04Z

Oh, flag. They are called flags (http://llvm.org/docs/LangRef.html#fast-math-flags), and it's a configuration put on the instruction itself; it has nothing to do with the command-line interface.

rust-highfive assigned nikomatsakis Mar 15, 2016

aturon added the T-libs-api label Mar 15, 2016

alexcrichton removed the T-libs-api label Mar 17, 2016

bluss force-pushed the float-fast-math branch from 04d03a6 to 2dbac1f Compare March 18, 2016 16:34

bors merged commit 2dbac1f into rust-lang:master Mar 19, 2016

bluss deleted the float-fast-math branch March 19, 2016 08:29

Digipom reviewed Mar 22, 2016
