
Tracking Issue for secure random data generation in std #130703


Open
2 of 4 tasks
joboet opened this issue Sep 22, 2024 · 154 comments
Labels
C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@joboet (Member) commented Sep 22, 2024

Feature gate: #![feature(random)]

This is a tracking issue for secure random data generation support in std.

Central to this feature are the Random and RandomSource traits inside core::random. The Random trait defines a method to create a new random value of the implementing type from random bytes generated by a RandomSource. std also exposes the platform's secure random number generator via the DefaultRandomSource type, which can be conveniently accessed via the random::random function.

Public API

// core::random

pub trait RandomSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]);
}

pub trait Random {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
}

impl Random for bool { ... }
impl Random for /* all integer types */ { ... }

// std::random (additionally)

pub struct DefaultRandomSource;

impl RandomSource for DefaultRandomSource { ... }

pub fn random<T: Random>() -> T { ... }

Steps / History

Unresolved Questions

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@joboet added the C-tracking-issue and T-libs-api labels on Sep 22, 2024
@newpavlov (Contributor) commented Nov 29, 2024

Disclaimer: I am one of the getrandom developers.

I think it's important for the RandomSource methods to properly return potential errors. Getting randomness is an I/O operation and it may fail. In some contexts it's important to handle such errors instead of panicking. The error could be either io::Error or something like getrandom::Error (i.e. a thin wrapper around NonZeroU32).

It may be worth adding the following methods to RandomSource:

  • A fill_bytes variant that works with uninitialized buffers, e.g. based on BorrowedBuf. Yes, zeroizing buffers is usually a very small cost compared to a syscall, but it still goes against the zero-cost spirit.
  • Generation of u32 and u64. Some platforms support direct generation of such values (e.g. RDRAND, WASI, etc.), and going through fill_bytes is a bit less efficient in those cases.
  • Methods for potentially "insecure" generation of random values that are less prone to blocking. HashMap seeding is the most obvious use case.

It's also not clear whether it's allowed to overwrite the default RandomSource supplied by std, similarly to GlobalAlloc.

@joboet (Member, Author) commented Nov 29, 2024

Would you consider rust-lang/libs-team#159 to be a better solution? That one used the Read trait to fulfil everything you mention.

@newpavlov (Contributor)

No, I don't think it's an appropriate solution. Firstly, it relies on io::Error, while IIUC you intend for this API to be available in core. Secondly, it does not provide methods for generating u32 and u64; as I wrote, going through the byte interface is not always efficient. Finally, most of io::Read's methods are not relevant here.

For the last point I guess we could define a separate DefaultInsecureRandomSource type.

@bstrie (Contributor) commented Nov 29, 2024

Firstly, it relies on io::Error, while IIUC you intend for this API to be available in core.

I don't think this needs to be a blocker. IMO a lot of std::io should be moved into core--not the OS-specific implementations obviously, but all the cross-platform things like type definitions, same as what happened with core::net.

@wwylele (Contributor) commented Nov 29, 2024

fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;

Would it be better to make this explicit with generics? Like fn random<R: RandomSource + ?Sized>(source: &mut R) -> Self;. This gives user the ability to specify the type name when needed.

@ericlagergren commented Nov 29, 2024

I think it's important for RandomSource methods to properly return potential errors.

I (mostly) disagree. CSPRNGs should almost never fail. When they do, users are almost never qualified to diagnose the problem.

For example: golang/go#66821

A compromise is something like this:

trait RandomSource {
    // Debug bound so the infallible default can unwrap.
    type Error: core::fmt::Debug;

    fn fill_bytes(&mut self, bytes: &mut [u8]) {
        self.try_fill_bytes(bytes).unwrap();
    }

    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), Self::Error>;
}

This allows most CSPRNGs to use Error = Infallible, but still has support for weird HSMs, etc.

@newpavlov (Contributor)

@ericlagergren
I've assumed this trait is for a "system" RNG, which will work together with a #[global_allocator]-like way to register implementation. I don't think that we need a general RNG trait in std/core as I wrote in this comment.

As for design of fallible RNG traits, see the new rand_core crate.

@dhardy (Contributor) commented Nov 30, 2024

pub trait Random {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
}

This trait (and the topic of random value generation) should, in my opinion, be removed from this discussion entirely, focusing only on "secure random data generation" as in the title. Why? Because (1) provision of secure random data is an important topic by itself (many users only want a byte slice, and methods like from_ne_bytes already provide safe conversion), and (2) random value generation is a whole other topic (including uniform ranged samples and much more).

Disclaimer: I am one of the rand developers. rand originally had a similar trait which got removed; the closest surviving equivalent is StandardUniform.

@hanna-kruppe (Contributor) commented Nov 30, 2024

@newpavlov

It's also not clear whether it's allowed to overwrite the default RandomSource supplied with std similarly to GlobalAlloc.

Overriding the default source in an application that already has one from linking std seems questionable. It's not that I can't imagine any use case for it, but the established pattern for such overrides allows any crate in the dependency graph to do it (it's only an error if you link two such crates), instead of putting the leaf binary/cdylib/staticlib artifact in charge. As you articulated in the context of getrandom, that's a security risk for applications. So a RandomSource equivalent should probably be more restrictive in who can override it, but that's not the existing pattern. It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" (rust-lang/rfcs#3632) -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.

If overriding the std source isn't supported, then it could work the same way as #[panic_handler]: you must supply an implementation if you don't link std, but if you do link std then supplying your own is an error. This would still be extremely useful. Currently, every crate that's (optionally) no_std and needs some randomness, most commonly for seeding Hashers, has to cobble together some sub-par ad-hoc solution to try and get some entropy from somewhere. There's a bunch of partial solutions that are better than nothing (const-random, taking addresses of global/local variables and praying that there's some ASLR, a global counter when atomics are available, cfg-gated access to target-specific sources like CPU cycle counters or x86 RDRAND) but:

  1. How well this works ends up highly platform-specific; in particular, none of them work well for the wasm32-unknown-unknown and wasm32v1-none targets.
  2. Applications that have access to a better source of entropy and (directly or transitively) use such libraries don't have a good way to enumerate them and make all of them use the better source.

This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to one specific crate (version) with appropriate hooks, like getrandom, but evidently that's not happening. Putting this capability into core (or a new no_std sysroot crate, comparable to alloc) has a better chance of solving this coordination problem. Well, at least eventually, once everyone's MSRV has caught up.

Edit: almost forgot that even std::collections::Hash{Map,Set} depend on having a source of random seeds. A way to supply such a source without linking std could help with moving those types to alloc, although as #27242 (comment) points out, it's not backwards compatible to make such a source mandatory for no_std + alloc applications.

@newpavlov (Contributor)

@hanna-kruppe

Overriding the default source in an application that already has one from linking std seems questionable.

There are a number of reasons to allow overriding:

  1. An alternative interface may be more efficient than the default one (e.g. reading the RNDR register vs. doing a syscall)
  2. It may help reduce binary size and eliminate potentially problematic fallback paths (e.g. if you know that you do not need the file fallback on Linux)
  3. In some cases it's useful to eliminate non-deterministic inputs (testing, fuzzing)

So a RandomSource equivalent should probably be more restrictive in who can override it, but that's not the existing pattern.

Yes. How about following the getrandom path and allow override only when a special configuration flag is passed to the compiler?

Either way, overriding can probably be left for later. I think we both agree that we need a way to expose the "system" entropy source in std and a way to define this source for std-less targets.

It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.

I agree that ideally we need a unified approach for this kind of problem. I made a similar proposal once upon a time.

But I think it fits fine? Targets with std could implicitly use std_random_impl crate for "external implementation" of the getrandom-like functions and users will be able to override it in application crates if necessary.

How well this works ends up highly platform-specific, in particular none of them work well for wasm32-unknown-unknown and wasm32v1-none targets .

I believe that having std for wasm32-unknown-unknown was a big mistake in the first place and the wasm32v1-none target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.

This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to on one specific crate (version) with appropriate hooks, like getrandom, but evidently that's not happening.

Well, it has happened, sort of. getrandom is reasonably popular in the ecosystem even after excluding rand users.

The problem is that std already effectively includes its own variant of getrandom for HashMap seeding, and people reasonably want access to it. And I think the problem of getting "system" entropy is fundamental enough to belong in std (well, not in std per se; let's say in the sysroot crate set).

A way to supply such a source without linking std could help with moving those types to alloc, although as #27242 (comment) points out, it's not backwards compatible to make such a source mandatory for no_std + alloc applications.

Can we add yet another sysroot crate for HashMap which will depend on both alloc and the hypothetical "system entropy" crate?

@hanna-kruppe (Contributor)

But I think it fits fine? Targets with std could implicitly use std_random_impl crate for "external implementation" of the getrandom-like functions and users will be able to override it in application crates if necessary.

The RFC (and the competing ones I've looked at) only supports a default implementation in the crate that "declares" the externally-implementable thing. If that crate isn't std, then an implementation from std would not count as "default" but conflict with any other definition. So we'd need another special carve-out for std (the very thing we'd want to avoid by adding a general language feature), or the language feature needs to become much more general to support overrideable default implementations from another source.

I believe that having std for wasm32-unknown-unknown was a big mistake in the first place and the wasm32v1-none target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.

I was specifically talking about no_std libraries, for which the two targets are basically equivalent. Both don't have any source of entropy implied by the target tuple (instruction set, OS, env, etc.), and if you want to add one it'll have to involve whatever application-specific interface the wasm module has with its host.

Well, it has happened, sort of. getrandom is reasonably popular in the ecosystem even after excluding rand users.

Not to point any fingers but a counter example that's fresh on my mind because I looked at its code recently is foldhash. As another example, ahash only uses getrandom optionally (though it's on by default). If you're only using ahash indirectly through another library that disables the feature, then it's not gonna use getrandom unless you happen to notice this and add a direct dependency to enable the feature. In that case there is a solution, at least, but it's still not discoverable.

Can we add yet another sysroot crate for HashMap which will depend on both alloc and the hypothetical "system entropy" crate?

Possibly, but people may object to a proliferation of sysroot crates so let's hope there's a better solution.

@newpavlov (Contributor) commented Nov 30, 2024

or the language feature needs to become much more general to support overrideable default implementations from another source.

Yes, it's not as if an RFC is a technical specification which must be followed word for word. There are a number of cases where the original RFC vision changed somewhat during the implementation stages. If anything, I would say it's an oversight/deficiency of the RFC not to cover cases like this.

foldhash

If a crate aims to minimize its number of dependencies as far as possible, even at the cost of code quality and security, it obviously will not depend on getrandom, despite it being the de facto standard for getting system entropy. I think most people will agree that hacks like this have a strong smell. The same applies to fastrand (amusingly, it still uses getrandom for Web WASM) and the other "partial solutions" you listed. As you can see, both hack their way into using std to get system entropy and either use a fixed seed or pile on even more hacks when std is not available.

@hanna-kruppe (Contributor)

As I said, I have no intention of pointing fingers at any crates. They have to navigate tricky trade-offs and complexities due to Rust's standard library (as a whole, not just std) not yet providing any entropy access. Let's keep this issue focused on changing that.

@workingjubilee (Member)

The Random trait seems weakly motivated in terms of coupling it to RandomSource, as its design seems like it will be a much more hotly contested space, and it is (mostly) unrelated to RandomSource.

@theemathas (Contributor)

Note that @dhardy (maintainer of rand) wrote some criticism of this at rust-lang/libs-team#393 (comment)

@abgros commented Feb 6, 2025

Could Random be implemented for arrays? That way, we could write something like:

let random_array: [u64; 100] = random();

which is a lot more convenient than using the fill_bytes() method.

@sorairolake (Contributor)

I think it would be useful to have a data type like rand::rngs::mock::StepRng to represent a source of randomness. This is useful for testing whether Random for an arbitrary data type T is implemented as expected.

@dhardy (Contributor) commented Feb 8, 2025

I think it would be useful to have a data type like rand::rngs::mock::StepRng to represent a source of randomness. This is useful for testing whether Random for an arbitrary data type T is implemented as expected.

I disagree that it is useful for testing implementations of traits like Random, since there should not be any implied restrictions on such an implementation except that outputs are uniformly distributed, given that the outputs of the random source are truly random. Testing the distribution of output values requires statistical tools, e.g. the KS test as we have here.

@hanna-kruppe (Contributor)

As far as I know there’s no plan to make the RandomSource trait sealed, so other implementations can be provided by third party crates without having to debate whether it’s a good idea for std to include them.

@jstarks commented Feb 9, 2025

It seems like one might regret adding Random later. For some types (e.g., u32), there's a natural definition and implementation that produces a uniform distribution, but it happens to correspond to using RandomSource with something like zerocopy/bytemuck/the safe-transmute effort. I.e., there's already a reasonable solution for these types.

For more complicated types (e.g., f32 or (T, U)), Random is under-defined. It is unclear what distribution is expected, and without implementations on tuples or a derive for enums/structs, there's really no "reference implementation" for third-party code to model.

It seems better to leave this out rather than leave it in this poorly specified state. Just stabilize RandomSource.

Although... even RandomSource is lacking, in that it doesn't support writing to an uninitialized buffer... Would it be better to wait until this problem is solved in Read (#78485) before committing to this new interface? Adding BorrowedBuf support later will be a headache.

@hanna-kruppe (Contributor)

Would it be better to wait until this problem is solved in Read (#78485) before committing to this new interface?

I really don't think so. The current interface works perfectly fine, it's just slightly suboptimal in some use cases (fewer than in the context of general byte-centric I/O). Providing only a Read::read_buf-style API would be strictly worse for many common use cases, in that it would be more complicated to use for no benefit. Usually you either want to generate one conveniently-sized integer at a time (probably should have a dedicated trait method for performance reasons), or you want to fill a small fixed-size buffer completely (e.g., cryptographic key material). Scenarios where the cost of buffer initialization is significant, e.g. preparing a large ephemeral buffer but only filling a small part of it before discarding it, are rare.

So I think it's pretty clear that the method for filling a &mut [u8] should exist even if io::Read had already arrived at a stable solution for reading into uninitialized buffers. Adding the latter after stabilizing the former does pose some challenges, but io::Read already has to solve similar challenges -- more of them, actually, because it also has to work well for several layers of I/O adapters, while PRNGs are rarely composed in a way that stresses the performance of data flowing through several layers of composition.

@Kixunil (Contributor) commented Feb 9, 2025

@hanna-kruppe

Perhaps something like this would work:

// Users don't need to concern themselves with this type.
pub struct OutBuf([MaybeUninit<u8>]);

impl OutBuf {
    // various methods to construct it and write to it, no method to read from it
}

// Users don't need to worry about implementing this, just use it
pub unsafe trait AsOutBuf {
    type Init: ?Sized;

    fn as_out_buf(&mut self) -> &mut OutBuf;
    // Only allowed if the entire buffer was overwritten.
    unsafe fn assume_init(&mut self) -> &mut Self::Init;
}

impl AsOutBuf for [u8] { type Init = [u8]; ... }
impl AsOutBuf for [MaybeUninit<u8>] { type Init = [u8]; ... }
impl<const N: usize> AsOutBuf for [u8; N] { type Init = [u8; N]; ... }
impl<const N: usize> AsOutBuf for [MaybeUninit<u8>; N] { type Init = [u8; N]; ... }

// Users can pass in any byte slice, byte array, or their uninitialized counterparts.
// The entire value is guaranteed to be overwritten, so if using the returned value is
// undesirable due to borrowing issues, users can simply use the original value safely
// (if it was an initialized type to begin with), or soundly call `assume_init` on it.
pub fn fill_random_bytes<T: AsOutBuf>(buf: &mut T) -> &mut T::Init { ... }

This way it already supports all reasonable cases, and people are not forced into using the uninitialized API. They can write let mut buf = [0; 32]; fill_random_bytes(&mut buf); exactly as if they were using the simple API or calling Read::read_exact.

@hanna-kruppe (Contributor)

That helps the simple call sites remain simple, but it has several downsides:

  1. The more complicated signature makes it less obvious that it can be used like that for simple cases, both for users who aren't reading the rustdoc page from start to finish, and also to the compiler's type inference in some cases.
  2. The reliance on generics means RandomSource is no longer dyn-compatible. This could be fixed by gating the AsOutBuf-using methods on Self: Sized and providing other methods that are dyn-compatible, but at that point you might as well provide two concrete methods for the initialized and possibly-uninitialized case.
  3. The new abstractions compete with Borrowed{Buf,Cursor} and other proposed APIs for "writing into a possibly-uninitialized buffer". std should minimize the number of subtly different and incompatible abstractions for the same problem space, especially if it's something third party libraries would want to use as vocabulary type/trait. Going in this direction would give more reasons to block stabilization of secure random data generation on other APIs that have been in limbo for a long time.
  4. Even if we ignore the previous point and focus only on the narrow use case of RandomSource, there's considerable design space to be explored there. For example, a downside of the specific approach you outlined is that it forces every RandomSource implementation to use unsafe internally while also requiring every caller who wants to use the MaybeUninit capabilities to use another bit of unsafe justified by "the implementation is correct" (and we'd have to make RandomSource unsafe to implement to make this sound). In contrast, the more complicated interface of Borrowed{Buf,Cursor} provides a bunch of useful functionality as safe APIs.

@newpavlov (Contributor) commented Mar 28, 2025

It's not OS/hardware, so it shouldn't even have std but specific platform like wasi does and that one also has some sort of randomness

I agree, but the unfortunate reality is that Rust does provide std for wasm32-unknown-unknown (I strongly believe it was a mistake, but alas) and will continue to do so because of the backward compatibility guarantees. Yes, we could provide an always-panicking implementation of read_sys_entropy on this target, but considering that we don't have a clear path towards a #[global_allocator]-like ability to override the system entropy source, the usefulness of the introduced function would be limited. People would have to continue to rely on crates like getrandom for a more universal interface.

Imagine a function/method like KeyInit::generate_key which would be documented as "does not work on no_std targets, always panics on some std targets". Would you be happy with such API?

@hanna-kruppe (Contributor)

Libraries that don’t require std or want to work on targets with half-broken std are common, but it’s significantly less common for entire applications to be #[no_std] or to use the exact same main function on, e.g., Linux and wasm32-unknown-unknown. If the program entry point selects the entropy source and explicitly passes it through all libraries, a whole lot of programs written in Rust could still use the std API even if libraries have reasons to avoid calling it directly. It’s different from the model offered by getrandom (a globally accessible source of entropy even in no_std code) and more annoying to write code in this style, but there are also software-engineering advantages to explicit dependency injection for such non-determinism. So while this may not be a drop-in replacement for getrandom, the delta amounts to “provide a way for consumers to plug in their own global source on platforms where std doesn’t have one” — something std could provide as well in theory, but also doesn’t have to, especially since the best interface isn’t clear.

@briansmith (Contributor)

I agree, but the unfortunate reality is that Rust does provide std for wasm32-unknown-unknown (I strongly believe it was a mistake, but alas) and will continue to do so because of the backward compatibility guarantees. Yes, we could provide an always panicking implementation of read_sys_entropy on this target,

I think that is the best thing to do. The real issue is that we haven't deprecated the wasm32-unknown-unknown target yet; it is inevitable, but not everybody sees it yet. wasm32-unknown-unknown stuff shouldn't block progress on this.

@Kixunil (Contributor) commented Mar 28, 2025

will continue to do so because of the backward compatibility guarantees

I don't see any reason not to at least deprecate it and provide a more sensible alternative. std already has a bunch of deprecated stuff that was replaced with something better. I don't know what the specific use case for wasm32-unknown-unknown is, but it seems that any other, more specific target is just better when it comes to std.

"does not work on no_std targets, always panics on some std targets". Would you be happy with such API?

Being unavailable on no_std and not having a stable RNG trait sucks, to be honest, speaking from experience. But that's the current state of things anyway. Some sort of global registry similar to the global allocator would indeed help. Anyway, if the second part were "panics on deprecated std targets that got replaced with something sensible years ago", I wouldn't mind.

it’s significantly less common for entire applications to be #[no_std]

Well, minimizing WASM module size by getting rid of std is nice. But that potentially has other challenges.

@bstrie (Contributor) commented Mar 29, 2025

This issue is for secure random data generation: reading bytes directly from the system's CSPRNG. Everything else should happen in a separate issue.

Seconded. All I want is something like io::read_entropy as an alternative to the getrandom crate. Does the libs team need to accept an ACP first or should someone presumptuously make an issue so we can move discussion there?

@joshtriplett (Member)

We discussed this at length in today's @rust-lang/libs-api meeting. We didn't come to any final conclusions, but here's a summary of the discussion:

We were generally amenable to the idea of exposing low-level randomness in the style of getrandom, if that's something people find useful.

We felt that that was an independent decision, separate from the higher-level interfaces in the style of rand, and we did not discuss treating this as an "instead" rather than an "in addition to". (Whatever interfaces we want for that in core will need more design.)

We discussed whether the low-level method being requested here is effectively the existing std::random::DefaultRandomSource::fill_bytes method.

In discussing the possible differences, we talked about whether RandomSource should expose the IO error or not. Originally, when reviewing and accepting the ACP that led to this, we had leaned against exposing it, because that would make implementations of Random harder to write (if Random doesn't expose the error, people aren't prepared to deal with it). We also read the comment at #130703 (comment). On the other hand, we observed that for a low-level function that's meant to replace getrandom, it may make sense to expose any potential IO error. #[must_use] should ensure that people don't ignore the error.

We did agree that, if we want to expose the IO error, we should do so by having trait RandomSource: Read, and dropping the fill_bytes method. That would then mean people could use read, and for that matter read_buf to work with an uninitialized buffer.

Everyone present in the meeting was generally in favor of exposing the IO error, and doing so by having RandomSource: Read. That will complicate implementations of the current trait Random (making them unwrap), but that didn't seem like a problem. Some in the meeting felt that wasn't a problem because the error should be propagated all the way up (e.g. rolling a die or shuffling an array should return an error); others in the meeting felt it wasn't a problem because the interfaces for higher-level randomness (e.g. rolling a die or shuffling an array) need redesigning anyway and we shouldn't block this change on that.

Summary:

  • Remove RandomSource::fill_bytes
  • Make trait RandomSource: Read
  • The low-level method people are looking for is provided by a call to read on DefaultRandomSource.
  • Random and other higher-level bits still need further design. RandomSource and DefaultRandomSource could be stabilized independently.

@ericlagergren

@joshtriplett Implementing Read provides some useful features. But what is the answer to "users cannot safely handle errors from the system CSPRNG"? There are several problems that arise:

  1. Users fall back to an insecure source of randomness (the system time is a common one)
  2. Users retry the operation, but the system CSPRNG is broken and the results are unpredictable
  3. Users ignore the error (like you mentioned, this is generally solved by #[must_use])

Also: is RandomSource going to be documented as returning cryptographically secure bytes?

@newpavlov (Contributor) commented Apr 8, 2025

Make trait RandomSource: Read

This looks really bad for the potential future move of this trait to core. As a getrandom maintainer, I would strongly advise against tying it to anything std::io-related. Why can't std just add a blanket impl<T: RandomSource> io::Read for T { ... } if we want some compatibility between the traits? As for uninit-buffer support, we could just add a read_buf-like method directly to RandomSource in the future; I don't see the need to reuse Read::read_buf here.

@hanna-kruppe (Contributor)

Another (smaller) issue with reusing io::Read is that its set of methods is a poor fit for the "all or nothing" nature any RNG wants. Read::read permits short reads and ErrorKind::Interrupted. While the standard library could and should promise that the impl Read for DefaultRandomSource never does either (i.e., it always reads as many bytes as the caller wanted or fails with a "real" error), it'll probably still set off my code-review alarm bells to see code like:

let mut buf = [0; N];
random_source.read(&mut buf).expect("CSPRNG shouldn't fail");
something_important(&buf);

Any effort spent flagging this, and either rewriting it with Read::read_exact or looking up again that this specific Read impl doesn't have that problem, would be wasted.

@abgros

abgros commented Apr 9, 2025

Chiming in to note that getrandom is allowed to succeed while only partially filling the buffer (see the man page):

On success, getrandom() returns the number of bytes that were copied to the buffer buf. This may be less than the number of bytes requested via buflen if either GRND_RANDOM was specified in flags and insufficient entropy was present in the random source or the system call was interrupted by a signal.

I guess the question is whether the standard library should try to account for this, or just count it as a failure.

@newpavlov
Contributor

newpavlov commented Apr 9, 2025

@abgros
While talking about getrandom, most people in this thread mean the getrandom crate, not the getrandom syscall. The crate guarantees to either fully fill the buffer or return an error. When the syscall returns a partially filled buffer, the crate tries to fill the remainder by calling the syscall again.

I don't see a practical reason to expose a Read::read-like method in the proposed randomness trait.
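The "fill completely or fail" loop described above can be sketched as follows; `raw_fill` is a hypothetical callback standing in for the raw syscall, not a real API:

```rust
use std::io::{self, ErrorKind};

/// Sketch of a retry loop like the one the getrandom crate is described
/// as using: short reads and EINTR-style interruptions are retried until
/// the buffer is completely filled or a real error occurs.
fn fill_exact(
    mut raw_fill: impl FnMut(&mut [u8]) -> io::Result<usize>,
    buf: &mut [u8],
) -> io::Result<()> {
    let mut filled = 0;
    while filled < buf.len() {
        match raw_fill(&mut buf[filled..]) {
            // Short read: loop and request the remainder again.
            Ok(n) if n > 0 => filled += n,
            Ok(_) => return Err(io::Error::new(ErrorKind::UnexpectedEof, "no bytes")),
            // Interrupted by a signal: retry transparently.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut buf = [0u8; 8];
    // Mock source that returns at most 3 bytes per call, like a short read.
    fill_exact(
        |b| {
            let n = b.len().min(3);
            b[..n].fill(0xAB);
            Ok(n)
        },
        &mut buf,
    )?;
    assert!(buf.iter().all(|&b| b == 0xAB));
    Ok(())
}
```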

@joshtriplett
Member

@joshtriplett Implementing Read provides some useful features. But what is the answer to "users cannot safely handle errors from the system CSPRNG"? There are several problems that arise:

1. Users fall back to an insecure source of randomness (the system time is a common one)

I certainly hope not. I would expect the handling of any such error to be a graceful exit, and the only reason to expose it rather than panicking is the ability to handle it gracefully.

2. Users retry the operation, but the system CSPRNG is broken and the results are unpredictable

I would not in general expect such errors to be recoverable.

Also: is RandomSource going to be documented as returning cryptographically secure bytes?

DefaultRandomSource should be, yes.

@joshtriplett
Member

joshtriplett commented Apr 10, 2025

Make trait RandomSource: Read

This looks really bad for the potential future move of this trait to core.

If we're going to expose errors from the OS, the obvious type for that would be io::Error. I would love to see that move to core, if not for the boxed variant it has. If we don't use io::Error, we'd have to invent our own error type, while ideally being able to store any error from the OS without loss of information, which effectively means we're re-inventing io::Error.

If we decide to panic, or use a different error type, then I agree that we can't use Read.

As for uninit buffer support, we could just add a read_buf-like method directly to RandomSource in future, I don't see the need to reuse Read::read_buf here.

We could absolutely do that, but the question is what we should do about error handling, if anything. If we use io::Error, then Read seems like the obvious choice. If we're avoiding io::Error, then I agree that we should provide a method directly, since it needs a different signature.

@joshtriplett
Member

Another (smaller) issue with reusing io::Read is that its set of methods is a poor fit for the "all or nothing" nature any RNG wants. Read::read permits short reads
...
Any effort spent flagging this and either rewriting it with Read::read_exact or looking up again that it this specific Read impl doesn't have that problem would be wasted.

That's a valid point.

@newpavlov
Contributor

newpavlov commented Apr 10, 2025

If we're going to expose errors from the OS, the obvious type for that would be io::Error

The obvious approach is to use the Error associated type in the trait definition. Then SystemSource defined in std could use Error = io::Error while PRNG implementations in third-party crates could use Error = !. Though I would prefer for SystemSource to use a getrandom::Error-like type.
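The associated-error approach could look like this sketch. The trait shape and names (SystemSource, the Error = ! idea) follow the comment above but are hypothetical; the PRNG is a toy xorshift used only so the example runs, and Infallible stands in for ! on stable Rust:

```rust
use std::convert::Infallible;
use std::io;

// Hypothetical fallible trait with an associated error type.
trait RandomSource {
    type Error;
    fn fill_bytes(&mut self, buf: &mut [u8]) -> Result<(), Self::Error>;
}

// A deterministic user-space PRNG can never fail, which the type
// system records as `Error = Infallible` (`Error = !` once stable).
struct XorShift64(u64); // seed must be nonzero

impl RandomSource for XorShift64 {
    type Error = Infallible;
    fn fill_bytes(&mut self, buf: &mut [u8]) -> Result<(), Infallible> {
        for b in buf {
            self.0 ^= self.0 << 13;
            self.0 ^= self.0 >> 7;
            self.0 ^= self.0 << 17;
            *b = self.0 as u8;
        }
        Ok(())
    }
}

// An OS-backed source would instead pick a "real" error type.
#[allow(dead_code)]
struct SystemSource;
impl RandomSource for SystemSource {
    type Error = io::Error;
    fn fill_bytes(&mut self, _buf: &mut [u8]) -> Result<(), io::Error> {
        unimplemented!("call into the platform CSPRNG here")
    }
}

fn main() {
    let mut buf = [0u8; 16];
    // Unwrapping a Result<_, Infallible> can never panic.
    XorShift64(42).fill_bytes(&mut buf).unwrap();
    assert!(buf.iter().any(|&b| b != 0));
}
```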

Somewhat tangential rant:

I believe that io::Error has a somewhat problematic design and should be avoided whenever possible.

It covers two very distinct cases: "error code from OS" and "any error under the sun abstracted by Box<dyn core::error::Error> and assembled into an error stack". In my opinion, ideally, io::Error should've focused only on the first case, while the second case should be handled by third-party crates like anyhow.

Yes, we need some limited extensibility for things like Error::WRITE_ALL_EOF and potentially for custom user codes, but it's probably a solvable problem (e.g. by using "distributed slices"). Especially considering that targets usually have well-defined ranges of possible system error codes, so we can carve out a range for custom error codes defined on the Rust side.

Making io::Error a "thin wrapper" around RawOsError would've also made it trivial to move it to core, together with most of the io traits. But unfortunately we are stuck with the current design...

@joshtriplett
Member

joshtriplett commented Apr 10, 2025

The obvious approach is to use the Error associated type in the trait definition.

I don't think that's obvious, and it has disadvantages: code that's generic over RandomSource would then have to either box the error or thread that generic through its own error type. I would sooner suggest a std::random::Error type instead, if we don't want to use io::Error.

It covers two very distinct cases: "error code from OS" and "any error under the sun abstracted by Box<dyn core::error::Error> and assembled into error stack".

I agree with you completely, and that's a substantial problem. In the absence of that boxed variant, io::Error (and traits depending on it) would likely have moved into core years ago.

@newpavlov
Contributor

newpavlov commented Apr 10, 2025

I don't think that's obvious, and it has disadvantages: code that's generic over RandomSource would then have to either box the error or thread that generic through its own error type.

Judging by the rand experience, it's not really a problem in practice. Most users rely on infallible PRNGs, while OsRng is mostly used for seeding. In other words, almost all users will see only two options: Error = ! or Error = io::Error/random::Error. We could also introduce an UnwrapErr-like wrapper for users who don't want to bother with handling extremely rare errors.

Cryptographic code is a bit different in this regard since it's more common to use OsRng and even specialized HW RNGs directly (e.g. for key generation) without relying on user-space PRNGs (they are also used, but for less "critical" material such as nonces). In RustCrypto we handle it by providing two sets of methods: fallible and infallible (e.g. see the KeyInit trait).

@jstarks

jstarks commented Apr 10, 2025

In the absence of that boxed variant, io::Error (and traits depending on it) would likely have moved into core years ago.

Is that really true? I had assumed, perhaps incorrectly, that core was meant to contain only things that don't depend on the operating system, kind of the equivalent of what's required from a "freestanding" implementation in C, as opposed to a "hosted" one (which corresponds to what's in std). And then alloc is just "stuff that's not OS-specific but requires an allocator".

But io::Error is certainly OS-specific: its interpretation of the OS error code depends on the environment it's built for. I don't see how it makes sense in core, based on how I understood its definition. And I have no idea how io::Error would behave for targets like x86_64-unknown-none.

Is there anything else in core that is OS specific in this way?

Related, when people talk about having DefaultRandomSource in core, is the idea that it would have the actual batteries-included, OS-specific implementation that we're currently talking about putting in std? Or is the request for some kind of #[global_default_random_source] attribute, or whatever, to allow the binary crate author to plug in what they want?

Because if it's the former then, again, I don't understand what core is supposed to be.

@ericlagergren

@joshtriplett Implementing Read provides some useful features. But what is the answer to "users cannot safely handle errors from the system CSPRNG"? There are several problems that arise:

1. Users fall back to an insecure source of randomness (the system time is a common one)

I certainly hope not. I would expect the handling of any such error to be a graceful exit, and the only reason to expose it rather than panicking is the ability to handle it gracefully.

Unfortunately, users do fall back to insecure sources of randomness. :/

2. Users retry the operation, but the system CSPRNG is broken and the results are unpredictable

I would not in general expect such errors to be recoverable.

Also: is RandomSource going to be documented as returning cryptographically secure bytes?

DefaultRandomSource should be, yes.

If there is going to be a RandomSource trait and if it is not documented as being cryptographically secure then I'd like to push for a CryptoRng (or similar) marker trait.

My goal here is just to make sure these new APIs are misuse resistant.
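The suggested marker trait could look like the following sketch. CryptoRng and all the type names here are hypothetical, and the byte patterns are placeholders so the example runs without an OS dependency:

```rust
// Base trait, as proposed elsewhere in this thread.
trait RandomSource {
    fn fill_bytes(&mut self, buf: &mut [u8]);
}

// Hypothetical marker trait: implementing it is a promise that the
// source's output is cryptographically secure.
trait CryptoRng: RandomSource {}

// Key generation can then demand a secure source at compile time.
fn generate_key<R: CryptoRng>(rng: &mut R) -> [u8; 32] {
    let mut key = [0u8; 32];
    rng.fill_bytes(&mut key);
    key
}

// A fast but insecure PRNG implements only RandomSource, so passing it
// to `generate_key` is a compile error, not a runtime surprise.
struct InsecureRng(u64);
impl RandomSource for InsecureRng {
    fn fill_bytes(&mut self, buf: &mut [u8]) {
        for b in buf {
            self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1);
            *b = (self.0 >> 56) as u8;
        }
    }
}

// Stand-in for an OS-backed source that is allowed to opt in.
struct OsBackedRng;
impl RandomSource for OsBackedRng {
    fn fill_bytes(&mut self, buf: &mut [u8]) {
        buf.fill(0x5A); // placeholder; a real impl calls the OS CSPRNG
    }
}
impl CryptoRng for OsBackedRng {}

fn main() {
    let key = generate_key(&mut OsBackedRng);
    assert_eq!(key.len(), 32);
    // generate_key(&mut InsecureRng(1)); // would not compile
    let mut fast = [0u8; 4];
    InsecureRng(1).fill_bytes(&mut fast);
}
```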

@joshtriplett
Member

In the absence of that boxed variant, io::Error (and traits depending on it) would likely have moved into core years ago.

Is that really true? I had assumed, perhaps incorrectly, that core was meant to contain only things that don't depend on the operating system, kind of the equivalent of what's required from a "freestanding" implementation in C, as opposed to a "hosted" one (which corresponds to what's in std). And then alloc is just "stuff that's not OS-specific but requires an allocator".

But io::Error is certainly OS specific: its interpretation of the OS error code depends on the environment it's built for. I don't see how it makes sense in core based how I understood its definition. And I have no idea how io::Error would behave for targets like x86_64-unknown-none.

(Note: further discussion on this would be off topic for this tracking issue; for that, I'd suggest #t-libs on Zulip, or internals.rust-lang.org.)

core is still built for a particular target, and is specific to that target. It uses the target's ABI, the target's types and sizes (e.g. core::ffi), some target-specific startup machinery, and potentially in the future some SIMD machinery and thread-local storage machinery.

Teasing apart the parts of io::Error that make sense for core would be a complicated process. I would expect, for instance, that the version in core would include io::ErrorKind, but not the boxed variants or raw OS error codes.

Related, when people talk about having DefaultRandomSource in core,

I would not expect DefaultRandomSource to be in core; that belongs in std. I would expect the RandomSource trait to be in core, along with various APIs that use randomness (e.g. shuffling a slice, or generating a random number in a range).

@newpavlov
Contributor

newpavlov commented Apr 10, 2025

@jstarks
(The io::Error discussion is off topic, so I will not reply to those parts, you can search for previous discussions to get more context)

Related, when people talk about having DefaultRandomSource in core, is the idea that it would have the actual batteries-included, OS-specific implementation that we're currently talking about putting in std? Or is the request for some kind of #[global_default_random_source] attribute, or whatever, to allow the binary crate author to plug in what they want?

The idea is to start with the former and maybe move to the latter in the future (in a separate sysroot crate). IIUC, the main blocker for the latter is the lack of a unified way to handle #[global_allocator]-like items.

Having the randomness trait defined in core is desirable not only because of the hypothetical #[global_default_random_source] but also because it would provide a standard API for RNG implementations which would be used by both RNG implementers and users in the wider ecosystem.

@Amanieu Amanieu removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Apr 15, 2025
@Amanieu
Member

Amanieu commented Apr 15, 2025

Regarding error handling, one possibility would be to only produce an error when the DefaultRandomSource is created, while keeping the RandomSource trait infallible. This means that the default random source can be seeded once when it is created (which may fail due to the OS). Once seeded, generating random data is infallible and should still be secure as long as a proper CSPRNG is used.
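A sketch of that "fallible construction, infallible generation" split. The names are hypothetical, and the generator is a toy splitmix64 with a hardcoded seed purely so the example runs; a real implementation would seed from the OS entropy source, use a proper stream-cipher CSPRNG, and worry about fork safety:

```rust
use std::io;

// Hypothetical seeded source: the only fallible step is construction.
struct SeededSource {
    state: u64,
}

impl SeededSource {
    fn new() -> io::Result<Self> {
        // Stand-in seed; a real implementation would read it from the
        // platform entropy source and propagate any io::Error.
        Ok(SeededSource { state: 0x9E3779B97F4A7C15 })
    }

    // After seeding, generation cannot fail (toy splitmix64 here).
    fn fill_bytes(&mut self, buf: &mut [u8]) {
        for chunk in buf.chunks_mut(8) {
            self.state = self.state.wrapping_add(0x9E3779B97F4A7C15);
            let mut z = self.state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
            z ^= z >> 31;
            for (b, s) in chunk.iter_mut().zip(z.to_le_bytes()) {
                *b = s;
            }
        }
    }
}

fn main() -> io::Result<()> {
    let mut src = SeededSource::new()?; // the one point that can fail
    let mut buf = [0u8; 16];
    src.fill_bytes(&mut buf); // infallible from here on
    assert!(buf.iter().any(|&b| b != 0));
    Ok(())
}
```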

@ericlagergren

ericlagergren commented Apr 15, 2025

Regarding error handling, one possibility would be to only produce an error when the DefaultRandomSource is created, but keeping the RandomSource trait infallible. This means at the default random source can be seeded once when it is created (which may fail due to the OS). Once seeded, generating random data is infallible and should still be secure as long as a proper CSPRNG is used.

I strongly disagree with this. Rust should not provide an in-process CSPRNG. Nobody (*) should be using an in-process CSPRNG for cryptographic purposes. See this comment and the responses #130703 (comment)

(I really do appreciate the error handling consideration, though.)

*: Except when it's the only option, or when required to (FIPS, Common Criteria, etc.).

@newpavlov
Contributor

newpavlov commented Apr 16, 2025

@Amanieu
I don't think it's a good option. You would effectively repeat the rand::ThreadRng design, and it's not as straightforward as it seems at first glance. For example, have you thought about fork safety? In cryptographic applications it's also somewhat easier to rely on the OS API than on a user-space re-seeded PRNG. You don't need to review the CSPRNG implementation in std, and in some cases you simply cannot use "non-standard" cryptographic algorithms in your software (e.g. in certified software).

I think it may be worth including a ThreadRng-like type in std, but it should be a separate source in addition to a "system" source.

@ChrisDenton
Member

This thread is long and has gone on a few tangents so I'll attempt to go over some of the possibilities put forward. I'll try to keep it short so sorry if I miss any nuances. I'll list them in an order that roughly reflects my preference.

Names below are open to bikeshedding but for the purposes of designing the API, I feel it important to settle on the shape of the API before figuring out the perfect name.


mod random {
    // Fill some bytes using the OS's cryptographically secure rng.
    fn fill_bytes(buf: &mut [u8]);
}

The argument for this is that it has the least moving parts, is easy to document and documentation is all in one place. It doesn't rely on anything outside of the function (so just read its docs) and can't go wrong which makes auditing cryptographic code easier. No error is returned; either it fills all the bytes or, in the unlikely event it fails, the program does not continue. This means there's no potential for the buffer to be used after the call unless it succeeds. It does also mean that there's no way to recover from an error (similar to HashMap with the default hasher). See #130703 (comment) and golang/go#66821 for a discussion on (in)fallibility.


A slight variation on the first option is to return an error instead of dying.

mod random {
    fn fill_bytes(buf: &mut [u8]) -> Result<(), SecureRngError>;
}

This allows applications to recover from the failure (e.g. maybe they didn't need cryptographically secure random data after all?). SecureRngError could have From/Into implementations for io::Error so ? just works.

Alternatively we could simply use io::Error though I don't think there's anything applications can do with a specific OS error code here, other than maybe log it. Matching on error kinds, for example, would be wrong. However, an io::Error may be more convenient I guess.


Another option is to use the RandomSource trait to be generic over non-OS random sources. This is the original ACP design. A short excerpt from the OP:

// A trait that can be implemented for all kinds of random sources.
trait RandomSource {
    fn fill_bytes(&mut self, buf: &mut [u8]); 
}
// The OS's cryptographically secure rng.
pub struct DefaultRandomSource;
impl RandomSource for DefaultRandomSource { ... }

This would allow it to be used with the proposed Random API. The nature of that API is somewhat off topic for this issue, but it has been brought up that coupling insecure and secure random sources does not provide much benefit while increasing the chances of misuse. And even if we did want to do this, having a free function does not mean we can't later add a type that expands its use. For example, alloc::alloc::alloc and Vec both exist at the same time.
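The "free function now, trait later" coexistence is easy to sketch: the trait impl simply delegates to the free function. Names follow the proposals in this thread but are hypothetical, and the byte pattern is a placeholder so the example runs without the OS:

```rust
mod random {
    // Stand-in for the proposed free function; a real implementation
    // would call the OS CSPRNG and abort (or error) on failure.
    pub fn fill_bytes(buf: &mut [u8]) {
        buf.fill(0x42);
    }
}

// Trait added later, without breaking the free function.
trait RandomSource {
    fn fill_bytes(&mut self, buf: &mut [u8]);
}

struct DefaultRandomSource;

impl RandomSource for DefaultRandomSource {
    fn fill_bytes(&mut self, buf: &mut [u8]) {
        random::fill_bytes(buf); // delegate to the existing free function
    }
}

fn main() {
    let mut a = [0u8; 4];
    random::fill_bytes(&mut a); // free-function path
    let mut b = [0u8; 4];
    DefaultRandomSource.fill_bytes(&mut b); // trait path
    assert_eq!(a, b);
}
```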


The fill_bytes function that returns an io::Error looks quite a bit like the Read trait, or at least the Read::read_exact function. So another option would be to have a type that implements Read.

// Uses the OS's cryptographically secure rng.
struct DefaultRandomSource;
impl Read for DefaultRandomSource {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> { ... }
    fn read_exact(&mut self, buf: &mut [u8]) -> io::Result<()> {
        self.read(buf).map(|_| ())
    }
}

The upside to this is that it uses a familiar trait. The downside is that the Read trait has very loose requirements (e.g. it can partially fill a buffer, it can be interrupted, etc) whereas it's important for people implementing cryptographic code to be confident that the buffer is always fully filled. That can be solved by reading the docs for DefaultRandomSource (and mostly ignoring the docs for Read) but using the generic Read trait for cryptographically secure random bytes does not seem to add any benefit that's worth the hassle.

@ChrisDenton
Member

I would also just add that crates like getrandom and rand will likely always exist. The standard library does not have to fully subsume all ecosystem crates nor provide functions for every possible use case.

And I'd again emphasise that adding a freestanding function does not necessarily prevent adding a type + trait later, so it doesn't have to be an either/or situation.

@newpavlov
Contributor

@ChrisDenton
As the most minimal viable function and as a starting point, I support your first potentially panicking fill_bytes function. As a small amendment, I would relax its docs a bit from "the OS's cryptographically secure rng" to "cryptographically secure non-deterministic rng". In the future it may be reasonable for it to use a ThreadRng-like source.

But this function should not be the end point. I strongly believe that we need RNG traits in core and RNG source structs in std, but I guess it's better to move their discussion into a separate issue.

I would also just add that crates like getrandom and rand will likely always exist.

As a getrandom maintainer, I really hope that getrandom will be eventually fully deprecated in favor of a new sysroot crate.

@ericlagergren

@ChrisDenton As the most minimal viable function and as a starting point, I support your first potentially panicking fill_bytes function.

I very much agree with this. I think it is a good decision.

As a small amendment, I would relax its docs a bit from "the OS's cryptographically secure rng" to "cryptographically secure non-deterministic rng". In future it may be reasonable for it to use a ThreadRng-like source.

But I also very much disagree with this for the reasons stated in my (and your) previous comments. If Rust ever needs an in-process CSPRNG then it should have a separate API that informs the user about the risks, e.g., fork safety, swap safety, reseeding, etc.
