Spark / SPARK-31306

rand() function documentation suggests an inclusive upper bound of 1.0

Details

    • Type: Documentation
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.5, 3.0.0
    • Fix Version/s: 2.4.6, 3.0.0
    • Component/s: PySpark, R, Spark Core
    • Labels: None

    Description

       The rand() function in PySpark, Spark, and R is documented as drawing from U[0.0, 1.0]. This suggests an inclusive upper bound and can be confusing: for a distribution written as `X ~ U(a, b)`, x can be a or b, so writing `U[0.0, 1.0]` suggests that the returned value could be 1.0. The function itself uses Rand(), which is documented as having a result in the range [0, 1).
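
To illustrate why a half-open range [0, 1) is the accurate description, here is a minimal Python sketch. It assumes the underlying generator builds each double from 53 random bits scaled by 2^-53, the way `java.util.Random.nextDouble` is specified to; that Spark's Rand() behaves the same way is an assumption here, not something stated in this issue. The names `next_double`, `DOUBLE_UNIT`, `smallest`, and `largest` are hypothetical and exist only for this example.

```python
import random

# Assumption: doubles are built from 53 random bits scaled by 2^-53,
# mirroring the documented behavior of java.util.Random.nextDouble.
DOUBLE_UNIT = 2.0 ** -53


def next_double(rng: random.Random) -> float:
    """Combine 26 high bits and 27 low bits into a double in [0, 1)."""
    high = rng.getrandbits(26)
    low = rng.getrandbits(27)
    return ((high << 27) + low) * DOUBLE_UNIT


# Extreme outcomes of the construction: all bits zero and all bits one.
smallest = 0 * DOUBLE_UNIT
largest = ((1 << 53) - 1) * DOUBLE_UNIT

# Draw a batch of samples from a seeded generator for a quick sanity check.
rng = random.Random(42)
samples = [next_double(rng) for _ in range(10_000)]

print(smallest == 0.0)                        # lower bound is reachable
print(largest < 1.0)                          # upper bound is not
print(all(0.0 <= s < 1.0 for s in samples))   # every sample is in [0, 1)
```

Under this construction the lower bound 0.0 can be returned but 1.0 cannot, which is why `U[0.0, 1.0)` describes the result more precisely than `U[0.0, 1.0]`.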

            People

              Assignee: bryves Ben
              Reporter: bryves Ben