-
Notifications
You must be signed in to change notification settings - Fork 215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a Cargo Feature for Enabling SSE #77
Conversation
To my knowledge, My personal opinion (which I think Phil shares) is that we should try and keep as many things in pure Rust as possible, dipping into inline assembly if needed. The Finally, it might be a good idea to pass some information in the |
Yes, is_x86_feature_detected! is available in core. I would've written
it all in Rust, except that:
* The cr4 register is not available in the x86_64 crate nor in rust core
* To my knowledge, there is no way to execute XGETBV and XSETBV in rust
This PR was just a minimal addition of SSE and AVX (just AVX, not AVX2
or AVX-512). I decided to implement these functions in ASM because the
OSDev wiki provided them in ASM, and I didn't want to translate that
into rust at the time.
As for feature flags, yes, making it togglable is a good addition;
however, informing the user of what is available and what is not would
be hard to do in the boot info, since you can't dynamically turn on
and off the use of, say, AVX-512 or FXSAVE in libcore or in your code.
(Something like that would be badass though...)
…On 8/23/19, Matt Taylor ***@***.***> wrote:
To my knowledge, `is_x86_feature_detected!` isn't available in `core`.
My personal opinion (which I think Phil shares) is that we should try and
keep as many things in pure Rust as possible, dipping into inline assembly
if needed. The `x86_64` crate provides abstractions for modifying control
registers, and `core` provides abstractions to access `cpuid`, so you should
be able to get quite far in pure Rust.
Finally, it might be a good idea to pass some information in the `BootInfo`
structure letting the kernel know which features were enabled. SSE/SSE2 can
unconditionally be enabled, but should probably be informed about
XSAVE/SSE3/SSE4/AVX/AVX2/AVX512 being enabled or not. An opt-in or opt-out
mechanism for enabling these things might be nice too.
--
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#77 (comment)
--
Signed,
Ethin D. Probst
|
src/main.rs
Outdated
@@ -87,6 +89,9 @@ extern "C" { | |||
|
|||
#[no_mangle] | |||
pub unsafe extern "C" fn stage_4() -> ! { | |||
if is_x86_feature_detected!("avx") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indentation
Thanks for the pull request! Unconditionally enabling SSE for kernels is a bad idea since it considerably increases the state that the kernel has to save on each context switch, thereby decreasing performance. For that reason, we should only add this feature behind an optional cargo feature. Regarding the implementation: Is there a reason for enabling SSE before switching to 64-bit mode? Otherwise I would prefer to keep the bootstrap assembly unchanged and implement it as a normal Rust function instead (sprinkled with inline assembly), as @64 proposed. If there are missing features in Further, I would like to keep the work of the bootloader to a minimum and move all tasks that can be done in the kernel itself to separate crates instead. Enabling SSE is something that might be useful in the bootloader because it allows to compile the kernel for an SSE target. However, dynamically enabling AVX is not useful for the kernel since the kernel still needs a dynamic check if AVX was enabled or not before using it. So it can just enable AVX itself if desired or call into a crate that performs the initialization. |
I unconditionally enabled SSE because while the OSDev wiki used EAX
for enabling it, I didn't know if I could switch those registers to
RAX instead (for some reason, inline assembly does not allow me to use
EAX in 64-bit mode, despite the register being available, or so says
the intel SDMs). Furthermore, I unconditionally enabled SSE because I
have no idea how you could target libcore to use SSE but not until its
enabled without #UDing. The same goes for AVX.
…On 8/24/19, Philipp Oppermann ***@***.***> wrote:
Thanks for the pull request!
Unconditionally enabling SSE for kernels is a bad idea since it considerably
increases the state that the kernel has to save on each context switch,
thereby decreasing performance. For that reason, we should only add this
feature behind an optional cargo feature.
Regarding the implementation: Is there a reason for enabling SSE before
switching to 64-bit mode? Otherwise I would prefer to keep the bootstrap
assembly unchanged and implement it as a normal Rust function instead
(sprinkled with inline assembly), as @64 proposed. If there are missing
features in `x86_64`, we can of course add them.
Further, I would like to keep the work of the bootloader to a minimum and
move all tasks that can be done in the kernel itself to separate crates
instead. Enabling SSE is something that might be useful in the bootloader
because it allows to compile the kernel for an SSE target. However,
dynamically enabling AVX is not useful for the kernel since the kernel still
needs a dynamic check if AVX was enabled or not before using it. So it can
just enable AVX itself if desired or call into a crate that performs the
initialization.
--
You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub:
#77 (comment)
--
Signed,
Ethin D. Probst
|
If you don’t know how to do something, hop on gitter and I’ll be happy to help out. But yeah, I agree with Phil here. Better to have an option to enable SSE/SSE2 as a cargo feature, and better to stick to pure rust with inline asm where possible. |
I've moved SSE code into rust (in stage 4) and made SSE and AVX a feature. AVX requires SSE features to be enabled (though only in the cargo manifest). Te bootloader will also check before enabling SSE and AVX to ensure they're actually supported. AVX still calls ASM code (I submitted a gitter msg about the problem). In summary, the Intel SDMs say the following on XGETBV:
And this on XSETBV:
I know, that's a lot. But I'm not really sure how to translate this: enable_avx:
push rax
push rcx
xor rcx, rcx
xgetbv
or eax, 7
xsetbv
pop rcx
pop rax
ret into rust. I'll definitely need to use inline ASM either way. |
I have updated the SSE support and removed AVX. I use bit_field for now to set the bits for CR0 and CR4, but anyone is free to change this. I've tested that code though in my kernel and it does work! |
Accidentally closed the PR. I'd love more tests; mine can't be the only ones... :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, looks much better now!
I left a few comments below, otherwise this looks good to me.
Cargo.toml
Outdated
@@ -17,6 +17,7 @@ xmas-elf = { version = "0.6.2", optional = true } | |||
x86_64 = { version = "0.7.2", optional = true } | |||
usize_conversions = { version = "0.2.0", optional = true } | |||
fixedvec = { version = "0.2.4", optional = true } | |||
bit_field = "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wildcard dependencies are not recommended because cargo might select an incompatible version. For example, it can lead to compilation failures when another dependency requires a very old version of bit_field. Just set it to the latest version (0.10.0) instead.
Cargo.toml
Outdated
@@ -17,6 +17,7 @@ xmas-elf = { version = "0.6.2", optional = true } | |||
x86_64 = { version = "0.7.2", optional = true } | |||
usize_conversions = { version = "0.2.0", optional = true } | |||
fixedvec = { version = "0.2.4", optional = true } | |||
bit_field = "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, please make this dependency optional like the other dependencies above.
Cargo.toml
Outdated
@@ -34,6 +35,7 @@ binary = ["xmas-elf", "x86_64", "usize_conversions", "fixedvec", "llvm-tools", " | |||
vga_320x200 = ["font8x8"] | |||
recursive_page_table = [] | |||
map_physical_memory = [] | |||
sse = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After making the bit_field
dependency optional, you need to change this line to
sse = [] | |
sse = ["bit_field"] |
src/main.rs
Outdated
asm!("mov $0, %cr4" :: "r" (cr4) : "memory"); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you move this into an enable_sse
function in a new sse
module? Also, it should be called from main
, not stage_4
.
test-kernel/Cargo.toml
Outdated
@@ -6,3 +6,4 @@ edition = "2018" | |||
|
|||
[dependencies] | |||
x86_64 = "0.3.4" | |||
bootloader = {path = "..", features=["sse"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if this is a good way to test it because we no longer test it without the sse
feature this way. Instead we should create multiple test kernels with different feature combinations like we do for bootimage. This does not need to be part of this PR, though.
I think the best way forward is to merge this without tests (undoing the modifications to the test-kernel
) and add proper tests in a follow-up pull request.
I am confused; where, exactly, is the main function defined? (Unless
you mean the main function of the target kernel...) I say this because
the intent of the SSE feature is for SSE to be enabled *before* the
kernel loads/executes, not after.
…On 9/13/19, Philipp Oppermann ***@***.***> wrote:
phil-opp commented on this pull request.
Thanks for the update, looks much better now!
I left a few comments below, otherwise this looks good to me.
> @@ -17,6 +17,7 @@ xmas-elf = { version = "0.6.2", optional = true }
x86_64 = { version = "0.7.2", optional = true }
usize_conversions = { version = "0.2.0", optional = true }
fixedvec = { version = "0.2.4", optional = true }
+bit_field = "*"
Wildcard dependencies are not recommended because cargo might select an
incompatible version. For example, it can lead to compilation failures when
another dependency requires a very old version of bit_field. Just set it to
the latest version (0.10.0) instead.
> @@ -17,6 +17,7 @@ xmas-elf = { version = "0.6.2", optional = true }
x86_64 = { version = "0.7.2", optional = true }
usize_conversions = { version = "0.2.0", optional = true }
fixedvec = { version = "0.2.4", optional = true }
+bit_field = "*"
Also, please make this dependency optional like the other dependencies
above.
> @@ -34,6 +35,7 @@ binary = ["xmas-elf", "x86_64", "usize_conversions",
> "fixedvec", "llvm-tools", "
vga_320x200 = ["font8x8"]
recursive_page_table = []
map_physical_memory = []
+sse = []
After making the `bit_field` dependency optional, you need to change this
line to
```suggestion
sse = ["bit_field"]
```
> + flags.set_bit(10, true);
+ unsafe {
+ Cr0::write_raw(flags);
+ }
+ // For now, we must use inline ASM here
+ let mut cr4: u64;
+ unsafe {
+ asm!("mov %cr4, $0" : "=r" (cr4));
+ }
+ cr4.set_bit(9, true);
+ cr4.set_bit(10, true);
+ unsafe {
+ asm!("mov $0, %cr4" :: "r" (cr4) : "memory");
+ }
+ }
+
Could you move this into an `enable_sse` function in a new `sse` module?
Also, it should be called from `main`, not `stage_4`.
> @@ -6,3 +6,4 @@ edition = "2018"
[dependencies]
x86_64 = "0.3.4"
+bootloader = {path = "..", features=["sse"]}
I'm not sure if this is a good way to test it because we no longer test it
without the `sse` feature this way. Instead we should create multiple test
kernels with different feature combinations like we do for
[bootimage](https://github.com/rust-osdev/bootimage). This does not need to
be part of this PR, though.
I think the best way forward is to merge this without tests (undoing the
modifications to the `test-kernel`) and add proper tests in a follow-up pull
request.
--
You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub:
#77 (review)
--
Signed,
Ethin D. Probst
|
…emoved botloader dep from test kernel.
Thanks for the updates! Looks good now.
Oh sorry, I meant the I think a good place for the call to |
Pushed new change, SSE feature now fully implemented (to my
knowledge). Lines 147-150 of main.rs.
…On 9/20/19, Philipp Oppermann ***@***.***> wrote:
Thanks for the updates! Looks good now.
> I am confused; where, exactly, is the main function defined?
Oh sorry, I meant the `read_elf` function (we should really rename it to
something more fitting). The `stage_4` function is just a thin wrapper that
sets the `ss` segment and reads the addresses from the extern statics, so
extending it doesn't seem fitting.
I think a good place for the call to `enable_sse` is somewhere at the end of
the `read_elf` function, e.g. right before the `let entry_point = …` line.
--
You are receiving this because you modified the open/close state.
Reply to this email directly or view it on GitHub:
#77 (comment)
--
Signed,
Ethin D. Probst
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Let get this merged.
Published as version 0.8.1 |
SSE and AVX support are now enabled if found. As a result, SSe, SSE2, and XSAVE support are now required and will halt the boot process if they are not found. AVX support is enabled at stage 4 and is not required.