Unrolled build for rust-lang#117534

rust-timer · web-flow · commit 54a6bd0d195a · 2023-11-04T19:12:57.000-04:00
Rollup merge of rust-lang#117534 - RalfJung:str, r=Mark-Simulacrum

clarify that the str invariant is a safety, not validity, invariant

Updates these docs to match rust-lang/reference#792
diff --git a/library/core/src/primitive_docs.rs b/library/core/src/primitive_docs.rs
@@ -291,7 +291,7 @@ mod prim_never {}
 /// Surrogate code points, used by UTF-16, are in the range 0xD800 to 0xDFFF.
 ///
 /// No `char` may be constructed, whether as a literal or at runtime, that is not a
-/// Unicode scalar value:
+/// Unicode scalar value. Violating this rule causes undefined behavior.
 ///
 /// ```compile_fail
 /// // Each of these is a compiler error
@@ -308,9 +308,10 @@ mod prim_never {}
 /// let _ = unsafe { char::from_u32_unchecked(0x110000) };
 /// ```
 ///
-/// USVs are also the exact set of values that may be encoded in UTF-8. Because
-/// `char` values are USVs and `str` values are valid UTF-8, it is safe to store
-/// any `char` in a `str` or read any character from a `str` as a `char`.
+/// Unicode scalar values are also the exact set of values that may be encoded in UTF-8. Because
+/// `char` values are Unicode scalar values and functions may assume [incoming `str` values are
+/// valid UTF-8](primitive.str.html#invariant), it is safe to store any `char` in a `str` or read
+/// any character from a `str` as a `char`.
 ///
 /// The gap in valid `char` values is understood by the compiler, so in the
 /// below example the two ranges are understood to cover the whole range of
@@ -324,11 +325,10 @@ mod prim_never {}
 /// };
 /// ```
 ///
-/// All USVs are valid `char` values, but not all of them represent a real
-/// character. Many USVs are not currently assigned to a character, but may be
-/// in the future ("reserved"); some will never be a character
-/// ("noncharacters"); and some may be given different meanings by different
-/// users ("private use").
+/// All Unicode scalar values are valid `char` values, but not all of them represent a real
+/// character. Many Unicode scalar values are not currently assigned to a character, but may be in
+/// the future ("reserved"); some will never be a character ("noncharacters"); and some may be given
+/// different meanings by different users ("private use").
 ///
 /// [Unicode code point]: https://www.unicode.org/glossary/#code_point
 /// [Unicode scalar value]: https://www.unicode.org/glossary/#unicode_scalar_value
@@ -887,8 +887,6 @@ mod prim_slice {}
 /// type. It is usually seen in its borrowed form, `&str`. It is also the type
 /// of string literals, `&'static str`.
 ///
-/// String slices are always valid UTF-8.
-///
 /// # Basic Usage
 ///
 /// String literals are string slices:
@@ -942,6 +940,14 @@ mod prim_slice {}
 /// Note: This example shows the internals of `&str`. `unsafe` should not be
 /// used to get a string slice under normal circumstances. Use `as_str`
 /// instead.
+///
+/// # Invariant
+///
+/// Rust libraries may assume that string slices are always valid UTF-8.
+///
+/// Constructing a non-UTF-8 string slice is not immediate undefined behavior, but any function
+/// called on a string slice may assume that it is valid UTF-8, which means that a non-UTF-8 string
+/// slice can lead to undefined behavior down the road.
 #[stable(feature = "rust1", since = "1.0.0")]
 mod prim_str {}
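
The distinction drawn in this diff can be illustrated with a short sketch that stays entirely in safe code. It is not part of the commit. The helper names `checked_char` and `checked_str` are hypothetical and exist only for this example; they wrap the standard `char::from_u32` and `std::str::from_utf8`, which reject values that would break the `char` rule or the `str` invariant.

```rust
use std::str;

/// Hypothetical helper: yields a `char` only when `code_point` is a
/// Unicode scalar value (not a surrogate, not above 0x10FFFF).
fn checked_char(code_point: u32) -> Option<char> {
    char::from_u32(code_point)
}

/// Hypothetical helper: yields a `&str` only when `bytes` is valid UTF-8,
/// so the invariant that libraries rely on is upheld.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    // Surrogates and out-of-range values are rejected by the checked API.
    assert_eq!(checked_char(0xD800), None);
    assert_eq!(checked_char(0x110000), None);
    assert_eq!(checked_char(0x2764), Some('\u{2764}'));

    // Bytes that are not valid UTF-8 never become a `&str` through the safe path.
    assert_eq!(checked_str(b"hello"), Some("hello"));
    assert_eq!(checked_str(&[0xFF, 0xFE]), None);

    // Any `char` can be stored in a string and read back as a `char`.
    let mut s = String::new();
    s.push('\u{10FFFF}');
    assert_eq!(s.chars().next(), Some('\u{10FFFF}'));
}
```

The unchecked counterparts (`char::from_u32_unchecked`, `str::from_utf8_unchecked`) skip these checks. Per the updated docs, an invalid `char` causes undefined behavior, while a non-UTF-8 `str` is not immediate undefined behavior but can lead to it in any function that later receives the slice.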