Spark / SPARK-44728

Improve PySpark documentation

Details

    • Type: Umbrella
    • Status: Reopened
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.5.0, 4.0.0
    • Fix Version/s: None
    • Component/s: PySpark
    • Labels: None

    Description

      An umbrella Jira ticket to improve the PySpark documentation.
       
       

      Sub-Tasks

        1.
        Add canonical links to the PySpark docs page Sub-task Resolved BingKun Pan  
        2.
        Add Spark version drop down to the PySpark doc site Sub-task Resolved BingKun Pan  
        3.
        Add user guide for type mappings between Spark and Python data types Sub-task Resolved Philip Dakin  
        4.
        Add documentation for type casting rules in Python UDFs/UDTFs Sub-task Open Unassigned  
        5.
        Switch languages consistently across docs for all code snippets Sub-task Resolved BingKun Pan (Time Spent - 1h)
        6.
        Make Python the first language in all Spark code snippets Sub-task Resolved Unassigned  
        7.
        Refine DocString of `Union*` Sub-task Resolved Ruifeng Zheng  
        8.
        Refine docstring of `DataFrame.columns` property Sub-task Resolved Allison Wang  
        9.
        Refine docstring of `DataFrame.isEmpty` Sub-task Resolved Allison Wang  
        10.
        Refine docstring of `createDataFrame` Sub-task Resolved Allison Wang  
        11.
        Refine the docstring of `DataFrame.collect` Sub-task Resolved Allison Wang  
        12.
        Fix wildcard import `from pyspark.sql.functions import *` in `Quick Start` Examples Sub-task Resolved Ruifeng Zheng  
        13.
        Fix docstring of `monotonically_increasing_id` Sub-task Resolved Ruifeng Zheng  
        14.
        Enable Doctests of `rand`, `randn` and `log` Sub-task Resolved Ruifeng Zheng  
        15.
        Refine docstring of `approx_count_distinct` Sub-task Resolved Yang Jie  
        16.
        Some directories should be cleared when regenerating files Sub-task Resolved BingKun Pan  
        17.
        Refine docstring for DataFrame.approxQuantile Sub-task Resolved Michael Zhang  
        18.
        There should be a gap at the bottom of the HTML Sub-task Resolved BingKun Pan  
        19.
        Refine docstring of `DataFrame.filter` Sub-task Resolved Allison Wang  
        20.
        Align example order (Python -> Scala/Java -> R) in all Spark Doc Content Sub-task Resolved BingKun Pan  
        21.
        Refine docstring of `asc/desc` Sub-task Resolved Yang Jie  
        22.
        Refine docstring of `Column.between` Sub-task Resolved Allison Wang  
        23.
        Refine DocStrings of `try_{add, subtract, multiply, divide, avg, sum}` Sub-task Resolved Ruifeng Zheng  
        24.
        Refine docstring of `DataFrame.drop` Sub-task Resolved BingKun Pan  
        25.
        Refine docstring of `max` Sub-task Resolved Allison Wang  
        26.
        Refine docstring of `groupBy/rollup/cube` Sub-task Resolved BingKun Pan  
        27.
        Refine docstring of `DataFrame.distinct` Sub-task Resolved Allison Wang  
        28.
        Refine docstrings of `coalesce/repartition/repartitionByRange` Sub-task Resolved Ruifeng Zheng  
        29.
        Refine docstrings of `min_by/max_by` Sub-task Resolved Yang Jie  
        30.
        Refine docstring of `min` Sub-task Resolved Allison Wang  
        31.
        Refine docstring of `explode` Sub-task Resolved Allison Wang  
        32.
        Refine docstrings of `collect_list/collect_set` Sub-task Resolved Yang Jie  
        33.
        Adjust the `versionadded` and `versionchanged` information to the parameters Sub-task Resolved Ruifeng Zheng  
        34.
        Refine docstring of `inline` Sub-task Resolved Allison Wang  
        35.
        Automate updating versions.json Sub-task Open Unassigned  
        36.
        Refine docstring of `ceil/ceiling/floor/round/bround` Sub-task Resolved BingKun Pan  
        37.
        XML: Refine docstring of schema_of_xml Sub-task Resolved Hyukjin Kwon  
        38.
        Update Example with docker official image Sub-task Resolved Ruifeng Zheng (Time Spent - 10m)
        39.
        Refine docstring of `array/array_contains/arrays_overlap` Sub-task Resolved Yang Jie  
        40.
        Refine docstring of `Column.isin` Sub-task Resolved Allison Wang  
        41.
        Refine docstring of `DataFrame.withColumnRenamed` Sub-task Resolved Allison Wang  
        42.
        Refine docstring of `DataFrame.join` Sub-task Resolved Allison Wang  
        43.
        Refine docstring of `DataFrameReader.parquet` Sub-task Resolved Allison Wang  
        44.
        Refine docstring of `DataFrameReader.json` Sub-task Resolved Hyukjin Kwon  
        45.
        Refine docstring of `Column.when` Sub-task Resolved Hyukjin Kwon  
        46.
        Refine docstring of `rand/randn` Sub-task Resolved BingKun Pan  
        47.
        Refine DocString of `regr_*` functions Sub-task Resolved Ruifeng Zheng  
        48.
        Refine docstring of `sum` Sub-task Resolved Hyukjin Kwon  
        49.
        Refine docstring of `count` Sub-task Resolved Hyukjin Kwon  
        50.
        Refine docstring of count_distinct Sub-task Resolved Allison Wang  
        51.
        Configurable error when generating Python docs Sub-task Open Unassigned  
        52.
        Python function categories should be consistent with SQL function groups Sub-task Resolved Ruifeng Zheng  
        53.
        Refine docstring of `create_map/slice/array_join` Sub-task Resolved Yang Jie  
        54.
        Add Matomo analytics to all released docs pages Sub-task Resolved BingKun Pan  
        55.
        Refine docstring of `DataFrame.show` Sub-task Resolved Allison Wang  
        56.
        Refine docstring of `options` for dataframe reader and writer Sub-task Resolved Hyukjin Kwon  
        57.
        Supplementary exception class Sub-task Resolved BingKun Pan  
        58.
        Make code block copyable Sub-task Resolved BingKun Pan  
        59.
        Refine docstring of `SparkSession.builder.config` Sub-task Resolved Allison Wang  
        60.
        Refine docstring of `lit` Sub-task Resolved Hyukjin Kwon  
        61.
        XML: Refine docstring of from_xml Sub-task Resolved Hyukjin Kwon  
        62.
        Add user guide for dataframe creation Sub-task Open Unassigned  
        63.
        Add a self-contained example about creating dataframe from jdbc Sub-task Open Unassigned  
        64.
        Add user guide for basic dataframe operations Sub-task Open Unassigned  
        65.
        Add user guide for column selections Sub-task Open Unassigned  
        66.
        Add user guide for groupby and aggregate Sub-task Open Unassigned  
        67.
        Add user guide for window operations Sub-task Open Unassigned  
        68.
        Refine docstring of `mapInPandas` Sub-task Resolved Allison Wang  
        69.
        Use built-in math constant in math functions Sub-task Resolved Ruifeng Zheng  
        70.
        Refine docstring of `DataFrame.subtract` Sub-task Resolved Hyukjin Kwon  
        71.
        Refine docstring of `DataFrame.intersectAll` Sub-task Resolved Hyukjin Kwon  
        72.
        Refine docstring of `DataFrame.intersect` Sub-task Resolved Hyukjin Kwon  
        73.
        Refine docstring of `DataFrame.dropna/fillna/replace` Sub-task Resolved BingKun Pan  
        74.
        Improve basic datasource examples Sub-task Resolved Allison Wang  
        75.
        Document parameters and examples for RuntimeConf get, set and unset Sub-task Resolved Hyukjin Kwon  
        76.
        Refine docstring of UDTF Sub-task Resolved Hyukjin Kwon  
        77.
        Refine docstring of `concat/array_position/element_at/try_element_at` Sub-task Resolved Yang Jie  
        78.
        Add missing `toDegrees/toRadians/atan2/approxCountDistinct` function descriptions Sub-task Resolved Ruifeng Zheng  
        79.
        Correct the typing of schema_of_{csv, json, xml} Sub-task Resolved Ruifeng Zheng  
        80.
        Refine docstring of `array_prepend/array_append/array_insert` Sub-task Resolved Yang Jie  
        81.
        Refine docstring of `array_intersect/array_union/array_except` Sub-task Resolved Yang Jie  
        82.
        Refine docstring of `array_compact/array_distinct/array_remove` Sub-task Resolved Yang Jie  
        83.
        Refine docstring of `array_min/array_max/array_size/array_repeat` Sub-task Resolved Yang Jie  
        84.
        Refine docstring of `get/array_zip/sort_array` Sub-task Resolved Yang Jie  
        85.
        Refine docstring of `flatten/sequence/shuffle` Sub-task Resolved Yang Jie  
        86.
        Refine docstring for DataFrame.createTempView/createOrReplaceTempView Sub-task Resolved Hyukjin Kwon  
        87.
        Refine docstring for DataFrame.createGlobalTempView/createOrReplaceGlobalTempView Sub-task Resolved Hyukjin Kwon  
        88.
        Refine docstring for DataFrame.schema/explain/printSchema Sub-task Resolved Hyukjin Kwon  
        89.
        Refine docstring of `reverse/map_contains_key` Sub-task Resolved BingKun Pan  
        90.
        Refine docstring of `map_from_arrays/map_from_entries/map_concat` Sub-task Resolved Yang Jie  
        91.
        Refine docstring of `parse_url/url_encode/url_decode` Sub-task Resolved Yang Jie  
        92.
        Refine docstring of `convert_timezone/make_dt_interval/make_interval` Sub-task Resolved BingKun Pan  
        93.
        Refine docstring of `make_timestamp/make_timestamp_ltz/make_timestamp_ntz/make_ym_interval` Sub-task Resolved BingKun Pan  
        94.
        Refine docstring of `from_csv/schema_of_csv/to_csv` Sub-task Resolved Yang Jie  
        95.
        Refine docstring of `aes_encrypt/aes_decrypt/try_aes_decrypt` Sub-task Resolved BingKun Pan  
        96.
        Refine docstring of `map_keys/map_values/map_entries` Sub-task Resolved Yang Jie  
        97.
        Refine docstring of `str_to_map/map_filter/map_zip_with` Sub-task Resolved Yang Jie  
        98.
        Refine docstring of `abs/acos/acosh` Sub-task Resolved Yang Jie  
        99.
        Refine docstring of `bit_and/bit_or/bit_xor` Sub-task Resolved Yang Jie  
        100.
        Refine docstring of `sum_distinct/array_agg/count_if` Sub-task Resolved Yang Jie  
        101.
        Refine docstring of `asc_nulls_first/asc_nulls_last/desc_nulls_first/desc_nulls_last` Sub-task Resolved Yang Jie  
        102.
        Refine docstring of `to_json/from_json` Sub-task Resolved Hyukjin Kwon  
        103.
        Refine docstring of `try_sum`, `try_avg`, `avg`, `sum`, `mean` Sub-task Resolved Hyukjin Kwon  
        104.
        Refine docstrings of try_* Sub-task Resolved Hyukjin Kwon  
        105.
        Improve docstring of mapInPandas Sub-task Resolved Xinrong Meng  

          People

            Assignee: Unassigned
            Reporter: Allison Wang (allisonwang-db)
            Votes: 0
            Watchers: 3

            Dates

              Created:
              Updated:

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 1h 10m