Commit fa37856

Smeb authored and HyukjinKwon committed
[SPARK-31306][DOCS] update rand() function documentation to indicate exclusive upper bound
### What changes were proposed in this pull request?

A small documentation change to clarify that the `rand()` function produces values in `[0.0, 1.0)`.

### Why are the changes needed?

`rand()` uses `Rand()`, which generates values in `[0, 1)` ([documented here](https://github.com/apache/spark/blob/a1dbcd13a3eeaee50cc1a46e909f9478d6d55177/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/randomExpressions.scala#L71)). The existing documentation suggests that 1.0 is a possible value returned by `rand` (i.e., for a distribution written as `X ~ U(a, b)`, x can be a or b, so `U[0.0, 1.0]` suggests the value returned could include 1.0).

### Does this PR introduce any user-facing change?

Only documentation changes.

### How was this patch tested?

Documentation changes only.

Closes apache#28071 from Smeb/master.

Authored-by: Ben Ryves <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
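
To illustrate why the exclusive upper bound is worth documenting, here is a minimal sketch that does not use Spark at all. It assumes Python's standard-library `random.random()` as a stand-in, since it is also documented to return values in `[0.0, 1.0)`. The helper `bucket_index` is hypothetical and exists only for this illustration: it maps a uniform sample to one of `n_buckets` indices, which is safe only if the sample can never be exactly 1.0.

```python
import random


def bucket_index(u: float, n_buckets: int) -> int:
    """Map a sample u from [0.0, 1.0) to a bucket index in 0..n_buckets-1.

    Hypothetical helper for illustration; not part of Spark or this commit.
    """
    if not 0.0 <= u < 1.0:
        raise ValueError(f"expected u in [0.0, 1.0), got {u}")
    return int(u * n_buckets)


# Stand-in for a column of rand() values: random.random() is half-open too.
rng = random.Random(42)
samples = [rng.random() for _ in range(10_000)]
indices = [bucket_index(u, 4) for u in samples]

# Every index is a valid position in a 4-element list.
in_range = all(0 <= i < 4 for i in indices)

# If 1.0 were a possible sample, the unchecked formula would overflow the range.
overflow_index = int(1.0 * 4)
```

With a closed interval `[0.0, 1.0]`, a caller would have to handle the `u == 1.0` case explicitly (here `overflow_index` is one past the last valid bucket), which is the kind of off-by-one the corrected wording helps readers rule out.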
1 parent 47c810f commit fa37856

3 files changed, +4 -4 lines changed

R/pkg/R/functions.R

Lines changed: 1 addition & 1 deletion
@@ -2975,7 +2975,7 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),

 #' @details
 #' \code{rand}: Generates a random column with independent and identically distributed (i.i.d.)
-#' samples from U[0.0, 1.0].
+#' samples uniformly distributed in [0.0, 1.0).
 #' Note: the function is non-deterministic in general case.
 #'
 #' @rdname column_nonaggregate_functions

python/pyspark/sql/functions.py

Lines changed: 1 addition & 1 deletion
@@ -652,7 +652,7 @@ def percentile_approx(col, percentage, accuracy=10000):
 @since(1.4)
 def rand(seed=None):
     """Generates a random column with independent and identically distributed (i.i.d.) samples
-    from U[0.0, 1.0].
+    uniformly distributed in [0.0, 1.0).

     .. note:: The function is non-deterministic in general case.

sql/core/src/main/scala/org/apache/spark/sql/functions.scala

Lines changed: 2 additions & 2 deletions
@@ -1227,7 +1227,7 @@ object functions {

   /**
    * Generate a random column with independent and identically distributed (i.i.d.) samples
-   * from U[0.0, 1.0].
+   * uniformly distributed in [0.0, 1.0).
    *
    * @note The function is non-deterministic in general case.
    *
@@ -1238,7 +1238,7 @@ object functions {

   /**
    * Generate a random column with independent and identically distributed (i.i.d.) samples
-   * from U[0.0, 1.0].
+   * uniformly distributed in [0.0, 1.0).
    *
    * @note The function is non-deterministic in general case.
    *
