| 1 | +# RFC 0015: Opportunistic `no_std` |
| 2 | + |
| 3 | +## Context |
| 4 | + |
| 5 | +`no_std` is, for most of what libdatadog does, the better Rust. Not because we want to run on bare metal, but because the constraints `no_std` imposes — explicit allocation, no hidden global state, no `std::` machinery dragged in transitively — line up almost perfectly with the constraints we are *already* trying to honour as a library that ships into other people's runtimes. |
| 6 | + |
| 7 | +Concretely, four things make `no_std` attractive for this workspace: |
| 8 | + |
| 9 | +1. **Signal safety by construction.** `core` is made of pure functions, integer math, and stack-allocated data; `alloc` adds heap-backed collections and smart pointers on top of whichever allocator we supply, which must itself be signal-safe. None of `std`'s mutex, thread-local, environment, file-descriptor, or panic-handler machinery is reachable. Code that runs in async-signal contexts (crashtracker, profiling samplers, anything called from a signal handler) is *much* easier to keep correct when `std::` is simply not in the import graph. `no_std` does not prove async-signal safety on its own, since allocation and FFI calls are still possible, but the compiler rejects the most common hazards that code review otherwise has to catch. |
| 10 | +2. **Smaller artifacts.** Embedders linking libdatadog statically pay for everything `std` pulls in, whether they use it or not. `no_std + alloc` lets us ship the same functionality with less code in the final binary, and trimming dependency features tends to shorten compiles in the tree. |
| 11 | +3. **Dependency hygiene.** Once a crate is `no_std`, every new dependency has to be added with `default-features = false` and an explicit story for what it pulls in. This is the dependency review we should be doing anyway; `no_std` makes the friction visible at the point of decision instead of months later when an embedder asks why their binary doubled in size. |
| 12 | +4. **Frequently, it's a mechanical change.** A surprising amount of "make this `no_std`" work is replacing `std::` with `core::` and adding `extern crate alloc;`. yaml/yaml-serde#8 is a recent example: a near-mechanical patch turned an `std` crate into a `no_std + alloc` crate without changing its API. Many of our internal crates are in the same shape. |
| 13 | + |
| 14 | +The first concrete driver in this workspace is `libdd-library-config` (prototyped in the sibling worktree `no-std-library-config`), but the case generalises: data structures, parsers, protocol definitions, error types, and signal-handler-adjacent code all benefit. The exceptions — sockets, files, threads, processes — are real but bounded. |
| 15 | + |
| 16 | +This RFC proposes the policy. |
| 17 | + |
| 18 | +## The thesis |
| 19 | + |
| 20 | +**Prefer `no_std + alloc`. Use `std` only where it is earning its keep.** |
| 21 | + |
| 22 | +Concretely, that means: |
| 23 | + |
| 24 | +- For **new crates**, the default should be `no_std + alloc` unless the crate's reason for existing is OS interaction. |
| 25 | +- For **existing crates**, `no_std` support is added opportunistically: whenever a crate is touched substantially, or whenever a downstream consumer asks, evaluate whether the migration is cheap. If it is — and for many of our crates it will be — do it. |
| 26 | +- For **signal-handler-adjacent code paths** (crashtracker, profiling sample paths, any future async-signal-safe component), `no_std` is the strongly preferred default *for correctness reasons*, not just ergonomics. The compiler refusing to let you call `std::sync::Mutex` from a signal handler is exactly the property we want. |
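
As a rough sketch of what the signal-handler bullet means in practice, the block below formats a crash summary using only `core`: no heap, no locks, no `std`. The `StackBuf` type and the `describe_signal` function are hypothetical names invented for this illustration, not existing libdatadog APIs, and the sketch assumes that formatting integers through `core::fmt` does not allocate.

```rust
use core::fmt::{self, Write};

/// Fixed-capacity, stack-allocated text buffer.
/// Hypothetical helper for illustration; not a libdatadog type.
pub struct StackBuf<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> StackBuf<N> {
    pub const fn new() -> Self {
        Self { bytes: [0; N], len: 0 }
    }

    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so the prefix is valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl<const N: usize> Write for StackBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > N {
            // Refuse rather than grow: there is no allocator to fall back on.
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats a crash summary into a stack buffer of capacity `N`.
/// Returns an error instead of truncating when the text does not fit.
pub fn describe_signal<const N: usize>(
    signum: i32,
    addr: usize,
) -> Result<StackBuf<N>, fmt::Error> {
    let mut buf = StackBuf::new();
    write!(buf, "signal {} at {:#x}", signum, addr)?;
    Ok(buf)
}
```

In a `no_std` crate, replacing this with a `String` or guarding it with `std::sync::Mutex` is a compile error rather than a review comment, which is the property the bullet above is after.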
| 27 | + |
| 28 | +This is opportunistic in the sense that we are not going to stop the world and rewrite the workspace. It is *not* opportunistic in the sense of "only when convenient" — when the opportunity arises, we should take it. |
| 29 | + |
| 30 | +## Crate conventions |
| 31 | + |
| 32 | +Crates that opt in follow the same shape so the workspace stays uniform. |
| 33 | + |
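| 34 | +**Default to `std` for source compatibility.** Every `no_std`-capable crate keeps `std` in its default features. Adding `no_std` support is then a non-breaking change for consumers on default features; they do not need to know. The exception is a consumer that already sets `default-features = false` on the crate: anything that moves behind `std` disappears for them until they enable the feature explicitly. |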
| 35 | + |
| 36 | +```toml |
| 37 | +[features] |
| 38 | +default = ["std"] |
| 39 | +std = [ |
| 40 | + "serde/std", |
| 41 | + "anyhow/std", |
| 42 | + "dep:libc", |
| 43 | + "dep:memfd", |
| 44 | + # ... and any optional deps that only make sense with std |
| 45 | +] |
| 46 | +``` |
| 47 | + |
| 48 | +**Crate root.** Conditional `no_std`, unconditional `alloc`. We rely on a heap; we do not target true bare-metal. |
| 49 | + |
| 50 | +```rust |
| 51 | +#![cfg_attr(not(feature = "std"), no_std)] |
| 52 | +extern crate alloc; |
| 53 | +``` |
| 54 | + |
| 55 | +**Imports.** Use `core::` and `alloc::` everywhere they exist. Gate genuinely `std`-only items behind `#[cfg(feature = "std")]`: |
| 56 | + |
| 57 | +```rust |
| 58 | +use alloc::string::String; |
| 59 | +use alloc::vec::Vec; |
| 60 | +use core::cell::OnceCell; |
| 61 | + |
| 62 | +#[cfg(feature = "std")] |
| 63 | +use std::path::Path; |
| 64 | +``` |
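
A minimal sketch of how the gating rule tends to shape an API: the parsing logic uses only `core` and `alloc`, and a thin filesystem wrapper is compiled in only under the `std` feature. The function names `parse_pairs` and `load_pairs` and the `key=value` format are assumptions made for this example, not the actual `libdd-library-config` API.

```rust
extern crate alloc;

use alloc::string::String;
use alloc::vec::Vec;

/// Parses `key=value` lines, skipping blank lines, `#` comments,
/// and lines without an `=`.
/// Uses only `core` and `alloc`, so it is available with or without `std`.
pub fn parse_pairs(input: &str) -> Vec<(String, String)> {
    input
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (String::from(key.trim()), String::from(value.trim())))
        .collect()
}

/// Filesystem convenience wrapper, compiled only when the `std` feature is on.
#[cfg(feature = "std")]
pub fn load_pairs(path: &std::path::Path) -> std::io::Result<Vec<(String, String)>> {
    let text = std::fs::read_to_string(path)?;
    Ok(parse_pairs(&text))
}
```

Embedders without `std` call `parse_pairs` on bytes they obtained themselves; `std` consumers keep the one-call `load_pairs` convenience. Keeping the gated item as a thin wrapper means the two builds share almost all of their code.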
| 65 | + |
| 66 | +**Dependencies.** Every dependency is declared `default-features = false`. Anything the dependency only exposes under its own `std` feature is forwarded through this crate's `std` feature. Optional dependencies that a crate needs only for its `std`-gated code (`libc`, `memfd`, `prost`, etc., depending on the crate) are marked `optional = true` and live behind `dep:` in the `std` feature list. |
| 67 | + |
| 68 | +**Errors.** `thiserror` v2 and `anyhow` (with `default-features = false`) work in `no_std` and should be preferred over hand-rolled error enums. |
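
To show that error types do not need `std`, here is a sketch that stays free of external crates by writing out by hand roughly what a `thiserror` derive would provide; in an opted-in crate the derive remains the preferred route. `ConfigError` and `check_port` are hypothetical names, and the sketch assumes Rust 1.81 or later, where `core::error::Error` is stable.

```rust
use core::fmt;

/// Hypothetical error type for illustration; not an existing libdatadog type.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    MissingKey(&'static str),
    InvalidPort(u32),
}

// Roughly the `Display` impl that `#[error("...")]` attributes would generate.
impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingKey(key) => write!(f, "missing required key `{}`", key),
            ConfigError::InvalidPort(port) => write!(f, "port {} is out of range", port),
        }
    }
}

// `core::error::Error` is the same trait as `std::error::Error`,
// so `std` consumers can still box this or use `?` with it as usual.
impl core::error::Error for ConfigError {}

/// Accepts ports in 1..=65535 and rejects everything else.
pub fn check_port(port: u32) -> Result<u16, ConfigError> {
    if port == 0 || port > u32::from(u16::MAX) {
        return Err(ConfigError::InvalidPort(port));
    }
    Ok(port as u16)
}
```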
| 69 | + |
| 70 | +## Workspace enforcement |
| 71 | + |
| 72 | +When a crate opts in: |
| 73 | + |
| 74 | +- CI builds it with `--no-default-features` in addition to the default build. Without this, a careless `use std::` lands and silently breaks embedders. |
| 75 | +- The crate's `README.md` documents `no_std` support and how to disable `std`. |
| 76 | +- Reviewers treat a broken `--no-default-features` build the same as a broken default build. |
| 77 | + |
| 78 | +For crates that have not opted in, none of this applies, and reviewers do not block PRs on it. The policy is opt-in, not retroactive. |
| 79 | + |
| 80 | +## Forks of upstream crates |
| 81 | + |
| 82 | +Some migrations require an upstream change. `yaml-serde` is the in-flight example (yaml/yaml-serde#7 for the `no_std` work, yaml/yaml-serde#8 as a smaller mechanical patch). Forking upstream is acceptable on these terms: |
| 83 | + |
| 84 | +- An upstream PR exists and is linked from `Cargo.toml` with a `# TODO: Switch to crates.io once <link> is merged` comment. |
| 85 | +- The fork is pinned by `git` + `rev`, never by branch. |
| 86 | +- The fork lives under DataDog or a maintainer account we control; never a third-party fork. |
| 87 | +- If an upstream PR dies, we either adopt the fork as a maintained crate or drop the `no_std` support that depended on it. We do not let unmaintained forks accrete. |
| 88 | + |
| 89 | +## Initial candidates |
| 90 | + |
| 91 | +Strong candidates, evaluated and migrated in follow-up PRs: |
| 92 | + |
| 93 | +- `libdd-library-config` — already prototyped on `no-std-library-config`. Reference implementation. |
| 94 | +- `libdd-tinybytes` — small, dependency-light building block. |
| 95 | +- `libdd-trace-protobuf` — generated code; should be near-mechanical. |
| 96 | +- `libdd-ddsketch` — pure data structure. |
| 97 | +- `libdd-otel-thread-ctx` — small surface, plausible embedder need. |
| 98 | +- **`libdd-crashtracker` (the collector half).** This is the most interesting case. The crash-time code path runs in a signal handler and must be async-signal-safe. A `no_std` collector half — where the compiler refuses to let you reach for `std::sync::Mutex` or `eprintln!` — is meaningfully *safer by construction* than the current crate, independent of any embedder request. The reporting/serialisation half that runs post-crash in a separate process can stay `std`. Splitting the crate along that line is a separate piece of design work, but the `no_std` argument is the forcing function. |
| 99 | + |
| 100 | +Some crates are out of scope by nature, because their reason for existing is OS interaction: `datadog-sidecar*`, `datadog-ipc*`, `libdd-shared-runtime*`, `libdd-http-client`, `libdd-data-pipeline`, `spawn_worker`, and all `*-ffi` shells. These stay `std`. |
| 101 | + |
| 102 | +## Drawbacks |
| 103 | + |
| 104 | +- **Build matrix grows.** Each opted-in crate adds a `--no-default-features` build to CI. Real but bounded. |
| 105 | +- **Cognitive overhead in opted-in crates.** Contributors have to use `core::`/`alloc::` and gate `std`-only code. We consider this a feature: it forces the same discipline we'd want at code-review time anyway. |
| 106 | +- **Adding a dependency becomes a small research task.** Does it support `no_std`? With which features? Mostly this is good — it discourages casual dependency growth — but it is friction. |
| 107 | +- **Forks accumulate maintenance debt.** Mitigated by the fork rules above, not eliminated. |
| 108 | + |
| 109 | +## Alternatives considered |
| 110 | + |
| 111 | +- **Workspace-wide `no_std` mandate.** Rejected: forces awkward abstractions onto crates whose domain is genuinely OS-bound, with no benefit. |
| 112 | +- **Never go `no_std`.** Rejected: gives up the signal-safety, binary-size, and dependency-hygiene wins; blocks embedder use cases that are already arriving. |
| 113 | +- **Parallel `*-core` crates per opt-in.** Rejected: source duplication, split issue trackers, two places to land every fix. |
| 114 | +- **Defer until customers explicitly demand it.** Rejected: we already have one such request in flight. Deferring means landing one-off `no_std` support per consumer and accumulating no shared conventions. |
| 115 | + |
| 116 | +## Recommended |
| 117 | + |
| 118 | +Adopt the policy: prefer `no_std + alloc`; use `std` only where it is earning its keep. Land `libdd-library-config` `no_std` support as the reference implementation, including the CI shape and the conventions above. Schedule `libdd-crashtracker` as the next target on signal-safety grounds. |