Skip to content

Don't set fast-math for the SIMD operations we set it for previously #84274

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 18, 2021

Conversation

nagisa
Copy link
Member

@nagisa nagisa commented Apr 17, 2021

fast-math implies things like functions not being able to accept as an
argument or return as a result, say, inf which made these functions
confusingly named or behaving incorrectly, depending on how you
interpret it. Since the time when these intrinsics have been implemented
the intrinsics user's (stdsimd) approach has changed significantly and
so now it is required that these intrinsics operate normally rather than
in "whatever" way.

Fixes #84268

@rust-highfive
Copy link
Contributor

r? @estebank

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Apr 17, 2021
@nagisa
Copy link
Member Author

nagisa commented Apr 17, 2021

This needs changes to stdarch to build successfully. Somebody else will have to coordinate these, as stdarch is not a subtree and cannot be modified by me here.

@rust-log-analyzer

This comment has been minimized.

@nagisa
Copy link
Member Author

nagisa commented Apr 17, 2021

Another note: I'm not sure what the design philosophy of stdsimd is nowadays. I suspect that there's no longer any policy that stdsimd declares its own LLVM intrinsics and calls them directly afterwards, in which case simply removing the fast/afn for these intrinsics would likely make more sense. I doubt stdsimd cares about these intrinsics being allowed to optimize by replacing them with less accurate implementations. 🤷

@rust-log-analyzer

This comment has been minimized.

sym::simd_fma => ("fma", bx.type_func(&[vec_ty, vec_ty, vec_ty], vec_ty)),
let (intr_name, fn_ty, approx) = match name {
sym::simd_ceil => ("ceil", bx.type_func(&[vec_ty], vec_ty), false),
sym::simd_fabs => ("abs", bx.type_func(&[vec_ty], vec_ty), false),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sym::simd_fabs => ("abs", bx.type_func(&[vec_ty], vec_ty), false),
sym::simd_fabs => ("fabs", bx.type_func(&[vec_ty], vec_ty), false),

abs is for integers only.

@calebzulawski
Copy link
Member

Another note: I'm not sure what the design philosophy of stdsimd is nowadays. I suspect that there's no longer any policy that stdsimd declares its own LLVM intrinsics and calls them directly afterwards, in which case simply removing the fast/afn for these intrinsics would likely make more sense. I doubt stdsimd cares about these intrinsics being allowed to optimize by replacing them with less accurate implementations. 🤷

In stdsimd I think we definitely don't want afn, the intrinsics should optimize the same as plain f32/f64 operations (which I assume do not allow afn?)

Looking at how these intrinsics are used in stdarch (providing NEON intrinsics), fast is definitely wrong and afn probably is as well. core::arch::aarch64::vsqrt_f32 should produce the vsqrt.f32 instruction and not some approximation.

@nagisa
Copy link
Member Author

nagisa commented Apr 17, 2021

Another question is why stdsimd is no longer defining its own llvm intrinsics? And if using the Rust intrinsics is the plan forward, then a further review of all intrinsics that stdsimd depends on today is necessary, because those that have happened to exist don't necessarily do what you might intuitively expect.

@calebzulawski
Copy link
Member

There are two reasons, one is that implementing the intrinsics in the compiler allows them to be exposed as generics (the stdsimd implementation is generic over vector length). But there is also rust-lang/compiler-team#381, we want to ensure that stdsimd is also compatible with cranelift, so we do not use any LLVM intrinsics.

@calebzulawski
Copy link
Member

@workingjubilee this thread may be of interest to you...

@rust-log-analyzer

This comment has been minimized.

`fast-math` implies things like functions not being able to accept as an
argument or return as a result, say, `inf` which made these functions
confusingly named or behaving incorrectly, depending on how you
interpret it. Since the time when these intrinsics have been implemented
the intrinsics user's (stdsimd) approach has changed significantly and
so now it is required that these intrinsics operate normally rather than
in "whatever" way.

Fixes rust-lang#84268
@workingjubilee
Copy link
Member

workingjubilee commented Apr 17, 2021

Another question is why stdsimd is no longer defining its own llvm intrinsics? And if using the Rust intrinsics is the plan forward, then a further review of all intrinsics that stdsimd depends on today is necessary, because those that have happened to exist don't necessarily do what you might intuitively expect.

We have been implementing quite a lot of testing for functions, especially on edge cases and for floats, which is what caught this peculiarity, and we have been paying quite a lot of attention to the compilation outputs (both LLVM IR and assembly), so we definitely are doing that review as we go.

As Caleb mentioned, our intention is to define abstraction layers so as to factor out and thereby reduce the scope of the task from "implement all of LLVM IR" or "handle every possible machine architecture" or "worry about all possible vector lengths" as much as possible for everyone involved.

For std::simd, it might be fine if these intrinsics are a little "wiggly" in their output, I am not sure what afn entails here, but "all fast math flags" is right out since we want to handle "rare" inputs with something resembling correctness. For std::arch, in many cases they should be producing exact assembler instructions so it's not even entirely clear that "using the platform intrinsics" is acceptable.

@workingjubilee
Copy link
Member

In stdsimd I think we definitely don't want afn, the intrinsics should optimize the same as plain f32/f64 operations (which I assume do not allow afn?)

So, @calebzulawski, from what I am gathering, looking at the LangRef, afn allows a calculation function to be "eh, close-enough'd" by LLVM if the optimizer thinks that's better, and it only really applies to the FMA, calculus, and trig functions.

I don't think it would be a problem for vsqrt_f32 in practice, because LLVM should recognize (in theory) that there is an optimal instruction and always emit it. And we probably should use it for simd_fma because that lets LLVM do optimizations like this one: rust-lang/portable-simd#76 (comment)

@programmerjake
Copy link
Member

I think there should be some version of the simd functions without afn set: there are some algorithms (such as double-double arithmetic) where converting a fma to mul followed by add is plain wrong (it reduces double-double arithmetic to just slow double arithmetic, destroying half the precision).

@nagisa nagisa changed the title Set afn (approximate function) flag for simd ops Don't set fast-math for the SIMD operations we set it for previously Apr 18, 2021
@nagisa
Copy link
Member Author

nagisa commented Apr 18, 2021

This was adjusted to expose intrinsics with no flags set at all. If there's ever a need for less precise options, they will need to happen as new intrinsics. This also no longer needs any changes to sibling projects.

@nikic
Copy link
Contributor

nikic commented Apr 18, 2021

Can you please update the PR description to be in line with the updated implementation?

@bors r+ rollup=never (may cause runtime perf regressions)

@bors
Copy link
Collaborator

bors commented Apr 18, 2021

📌 Commit 487e273 has been approved by nikic

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 18, 2021
@bors
Copy link
Collaborator

bors commented Apr 18, 2021

⌛ Testing commit 487e273 with merge ef88434...

@bors
Copy link
Collaborator

bors commented Apr 18, 2021

☀️ Test successful - checks-actions
Approved by: nikic
Pushing ef88434 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Certain simd math operations imply fast-math, but operation naming does not make it apparent
10 participants