@@ -3572,6 +3572,29 @@ or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
  seq\_cst total orderings of other operations that are not marked
  ``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

+ .. _floatsem:
+
+ Floating-Point Semantics
+ ------------------------
+
+ LLVM floating-point types fall into two categories:
+
+ - half, float, double, and fp128, which correspond to the binary16, binary32,
+   binary64, and binary128 formats described in the IEEE-754 specification.
+ - The remaining types, which do not directly correspond to a standard IEEE
+   format.
+
+ For types that do correspond to an IEEE format, LLVM IR floating-point
+ operations behave like the corresponding operations in IEEE-754, with two
+ exceptions: LLVM makes :ref:`specific assumptions about the state of the
+ floating-point environment <floatenv>` and it implements :ref:`different rules
+ for operations that return NaN values <floatnan>`.
+
+ This means that optimizations and backends must not change the precision of
+ these operations (unless fast-math flags are present), and frontends can rely
+ on these operations deterministically providing correctly rounded results as
+ described in the standard (except when a NaN is returned).
+
  .. _floatenv:

  Floating-Point Environment
@@ -3608,10 +3631,11 @@ are not "floating-point math operations": ``fneg``, ``llvm.fabs``, and
  ``llvm.copysign``. These operations act directly on the underlying bit
  representation and never change anything except possibly for the sign bit.

- For floating-point math operations, unless specified otherwise, the following
- rules apply when a NaN value is returned: the result has a non-deterministic
- sign; the quiet bit and payload are non-deterministically chosen from the
- following set of options:
+ Floating-point math operations that return a NaN are an exception to the
+ general principle that LLVM implements IEEE-754 semantics. Unless specified
+ otherwise, the following rules apply when a NaN value is returned: the result
+ has a non-deterministic sign; the quiet bit and payload are
+ non-deterministically chosen from the following set of options:

  - The quiet bit is set and the payload is all-zero. ("Preferred NaN" case)
  - The quiet bit is set and the payload is copied from any input operand that is
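The two ideas the added text relies on, correctly rounded results and the sign / quiet bit / payload layout of a NaN, can be illustrated outside of LLVM. The following is a minimal Python sketch, not LLVM IR and not part of this change. It assumes CPython on hardware where `float` is IEEE-754 binary64 (the format the text associates with `double`). The helper names `decode_binary64`, `from_bits`, and `sum_is_correctly_rounded` are hypothetical and exist only for this illustration.

```python
import struct
from fractions import Fraction


def decode_binary64(x: float) -> dict:
    """Split a binary64 value into sign, exponent, quiet bit, and payload."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return {
        "sign": bits >> 63,                 # 1 bit
        "exponent": (bits >> 52) & 0x7FF,   # 11 bits, all ones for NaN/inf
        "quiet": (bits >> 51) & 1,          # most significant mantissa bit
        "payload": bits & ((1 << 51) - 1),  # remaining 51 mantissa bits
    }


def from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def sum_is_correctly_rounded(a: float, b: float) -> bool:
    """Compare a + b with the exact rational sum rounded once to binary64.

    Assumption: converting a Fraction to float rounds to nearest,
    ties-to-even, as CPython's integer true division does.
    """
    exact = Fraction(a) + Fraction(b)
    return a + b == float(exact)


# A quiet NaN with an all-zero payload (the "preferred NaN" pattern),
# and a negative quiet NaN carrying a non-zero payload.
preferred_nan = from_bits(0x7FF8000000000000)
payload_nan = from_bits(0xFFF8000000000123)
```

Calling `decode_binary64` on each NaN shows which of the three fields differ, which is exactly the part of the result the NaN rules leave non-deterministic. For non-NaN results, `sum_is_correctly_rounded` checks the property frontends are told they can rely on.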