-
Notifications
You must be signed in to change notification settings - Fork 287
Intrinsics get compiled even if appropriate target_features are not enabled #323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
This is currently intended, to allow compiling binaries that use features that aren't available at runtime (aka turns out to be dead code at runtime). It's true though that consumers who which for desired performance will need to enable the correct features as well. In any case though I think this is working as intended, so closing. |
@alexcrichton are you sure that this is intended? the AES intrinsics have |
A comment by @newpavlov in the inline assembly tracking issue seems to expand on this issue, but IMO the issue as filled its pretty much ambiguous: @newpavlov where is the problem? Are the intrinsics from If its the first case, then its a bug here. If it is the second, then that code is probably invoking undefined behavior and you are lucky that it works at all. |
This function looks like undefined behavior to me: https://github.com/RustCrypto/block-ciphers/blob/c3619695bac12e03f61a5eafd08d900d4575bfb6/aesni/src/aes128/expand.rs#L28 (if you remove the global |
I was under the wrong impression about how stdsimd supposed to work which was based on how LLVM intrinsics operate. (i.e. compilation fails if intrinsic feature is not enabled) But for runtime detection and switching it's indeed required to have ability to enable feature locally for intrinsic through |
The only mistake I see in that library is opening |
I mean for |
The problem is that the compiler cannot know on which CPUs you might intend to run your binaries. If you are always going to run the binaries on some sets of CPUs you might not need to do (and pay) for the checks. There is an RFC about allowing calling the intrinsics without
You don't need to do that. You can use |
In that case your can just use I think I've seen somewhere proposal for introducing something like this (not precise, just a coarse strawman draft): #[target_features(feature1)]
fn foo1() {
// safe to use intrinsics dependent on "feature1"
}
#[target_features(feature2)]
fn foo2() {
// safe to use intrinsics dependent on "feature2"
}
fn foo_fallback() {}
// will not compile unless `feature1` is enabled for the whole crate
// through `#![cfg(..)]` or something similar
fn func1() {
foo1()
}
fn func2() {
// this macro allows to call feature-gated functions using runtime detection
runtime_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
_ => foo_fallback(),
}
}
// macro will refuse to compile without crate-wise enabled `any(feature1, feature2)` or
// by marking this function `#[target_features(any(feature1, feature2))]`
fn func3() {
runtime_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
}
}
// behaves same as the `runtime_match_feature`, but performs switching at compile time
// depending on the enabled features.
fn func4() {
compile_match_feature!{
"feature1" => foo1(),
"feature2" => foo2(),
_ => foo_fallback(),
}
} Here compiler build sort-of tree of feature restrictions and
Personally I strongly dislike this approach, as it introduces runtime errors for cases which can be detected at compile time, and strong compile time checks is one of the main selling points of Rust for me. |
Something like that is what rust-lang/rfcs#2212 proposes. But
You can also use |
Thank you for the link! Yes, I know about |
Yeah the codegen differences happen because LLVM can't inline across functions which enable different sets of target featues. If the AES feature isn't enabled then no calls to the AES intrinsics can get inlined, which likely destroys performance. It's also technically not undefined behavior I think if you run it on the right CPU, so I think LLVM can't do too much as it has to ensure the program works in that case at least. |
If we'll take
aesni
crate and will remove the following lines fromlib.rs
:Crate will compile even without
RUSTFLAGS="-C target_feature=+aes"
. (although resulting code will be 4 times slower) Is it intended behavior?The text was updated successfully, but these errors were encountered: