SPARK-46457

Refactor Python Test Suites

Details

    • Type: Test
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.0.0
    • Fix Version/s: None
    • Component/s: PySpark, Tests
    • Labels: None

    Description

      Some Python test suites are large and slow, which makes them hard to run in parallel.

      We want to refactor them by grouping the test cases by topic (like the pandas test suites).
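As a rough illustration of the idea, the sketch below splits one large suite into topic-based mixins, each bound to its own small test class so that a runner can schedule the topics independently. All class and helper names here (`ArithmeticAddMixin`, `ArithmeticMulMixin`, `SharedFixture`, `run_topic`) are hypothetical and are not taken from the PySpark code base; in a real layout each topic class would presumably live in its own module.

```python
import unittest


class SharedFixture:
    """Hypothetical shared setup reused by every topic class."""

    def make_values(self):
        return [1, 2, 3]


class ArithmeticAddMixin:
    """Hypothetical topic: addition cases factored out of a large suite."""

    def test_add(self):
        self.assertEqual(sum(self.make_values()), 6)


class ArithmeticMulMixin:
    """Hypothetical topic: multiplication cases factored out of the same suite."""

    def test_mul(self):
        self.assertEqual([v * 2 for v in self.make_values()], [2, 4, 6])


# Each topic gets its own small TestCase, so a parallel runner can
# schedule the topics separately instead of running one monolithic class.
class ArithmeticAddTests(ArithmeticAddMixin, SharedFixture, unittest.TestCase):
    pass


class ArithmeticMulTests(ArithmeticMulMixin, SharedFixture, unittest.TestCase):
    pass


def run_topic(test_case):
    """Run a single topic class and return its unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_topic(ArithmeticAddTests)` runs only the addition topic, which is the property that makes smaller topic classes easier to distribute across workers.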

      Sub-Tasks

        1. Re-organize the resampling tests (Sub-task, Resolved, Ruifeng Zheng)
        2. Move test_parity_frame_resample and test_parity_series_resample to pyspark.pandas.tests.connect.resample (Sub-task, Resolved, Ruifeng Zheng)
        3. Re-organize `NumOpsTests` (Sub-task, Resolved, Ruifeng Zheng)
        4. Re-organize `StatsTests` (Sub-task, Resolved, Ruifeng Zheng)
        5. Remove unused code in `pyspark.pandas.tests.computation.*` (Sub-task, Resolved, Ruifeng Zheng)
        6. Remove unused code in `pyspark.pandas.tests.frame.*` (Sub-task, Resolved, Ruifeng Zheng)
        7. Remove unused code in `pyspark.pandas.tests.series.*` (Sub-task, Resolved, Ruifeng Zheng)
        8. Remove unused code in `pyspark.pandas.tests.indexes.*` (Sub-task, Resolved, Ruifeng Zheng)
        9. Reorganize `SeriesStringTests` (Sub-task, Resolved, Ruifeng Zheng)
        10. Reorganize EWMTests (Sub-task, Resolved, Ruifeng Zheng)
        11. Move `SeriesInterpolateTests` to `pyspark.pandas.tests.series.*` (Sub-task, Resolved, Ruifeng Zheng)
        12. Reorganize `DiffFramesParityCovCorrWithTests` (Sub-task, Resolved, Ruifeng Zheng)
        13. Reorganize `RollingTests` (Sub-task, Resolved, Ruifeng Zheng)
        14. Reorganize `FrameInterpolateTests` (Sub-task, Resolved, Ruifeng Zheng)
        15. Reorganize `ExpandingParityTests` (Sub-task, Resolved, Ruifeng Zheng)
        16. Reorganize `OpsOnDiffFramesDisabledTests` (Sub-task, Resolved, Ruifeng Zheng)
        17. Reorganize `ReshapeTests` (Sub-task, Resolved, Ruifeng Zheng)
        18. Re-organize `GroupByTests` (Sub-task, Resolved, Haejoon Lee)
        19. Reorganize `DatetimeIndexTests`: Factor out 3 slow tests (Sub-task, Resolved, Ruifeng Zheng)
        20. Move `test_window` to `pyspark.pandas.tests.window.*` (Sub-task, Resolved, Ruifeng Zheng)
        21. Reorganize `OpsOnDiffFramesGroupByTests` (Sub-task, Resolved, Ruifeng Zheng)
        22. Move IO-related tests to `pyspark.pandas.tests.io.*` (Sub-task, Resolved, Ruifeng Zheng)
        23. Reorganize `GroupbyStatTests` (Sub-task, Resolved, Ruifeng Zheng)
        24. Reorganize `OpsOnDiffFramesGroupByRollingTests` (Sub-task, Resolved, Ruifeng Zheng)
        25. Reorganize `OpsOnDiffFramesGroupByExpandingTests` (Sub-task, Resolved, Ruifeng Zheng)
        26. Move `test_series_datetime` to `pyspark.pandas.tests.connect.series.*` (Sub-task, Resolved, Ruifeng Zheng)
        27. Reorganize `OpsOnDiffFramesEnabledTests` (Sub-task, Resolved, Ruifeng Zheng)
        28. Reorganize `FrameParityPivotTests` (Sub-task, Resolved, Ruifeng Zheng)
        29. Move test_default_index to `pyspark.pandas.tests.indexes.*` (Sub-task, Resolved, Ruifeng Zheng)
        30. Factor slow tests out of `IndexesTests` (Sub-task, Resolved, Ruifeng Zheng)
        31. Split `IndexesSlowTests` into multiple tests (Sub-task, Resolved, Ruifeng Zheng)
        32. Move `BasicIndexingTests` to `pyspark.pandas.tests.indexes.*` (Sub-task, Resolved, Ruifeng Zheng)
        33. Reorganize `IndexingTest` (Sub-task, Resolved, Ruifeng Zheng)
        34. Refactor `data_type_ops` tests (Sub-task, Resolved, Ruifeng Zheng)
        35. Check the testing mode (Sub-task, Resolved, Ruifeng Zheng)
        36. Split `FrameTakeTests` (Sub-task, Resolved, Ruifeng Zheng)
        37. Split `GroupbyParitySplitApplyTests` (Sub-task, Resolved, Ruifeng Zheng)
        38. Split `ArithmeticTests` (Sub-task, Resolved, Ruifeng Zheng)
        39. Rebalance `pyspark_pandas` and `pyspark_pandas_slow` (Sub-task, Resolved, Ruifeng Zheng)
        40. Rebalance `pyspark_pandas_connect_part?` (Sub-task, Resolved, Ruifeng Zheng)
        41. Clean up the imports in `pyspark.pandas.tests.computation.*` (Sub-task, Resolved, Ruifeng Zheng)
        42. Clean up the imports in `pyspark.pandas.tests.{frame, series, groupby}.*` (Sub-task, Resolved, Ruifeng Zheng)
        43. Clean up the imports in pyspark.pandas.tests.indexes.* (Sub-task, Resolved, Ruifeng Zheng)
        44. Clean up the imports in `pyspark.pandas.test_*` (Sub-task, Resolved, Ruifeng Zheng)
        45. Split `test_split_apply_basic` (Sub-task, Resolved, Ruifeng Zheng)
        46. Factor session-related tests out of test_connect_basic (Sub-task, Resolved, Ruifeng Zheng)
        47. Factor out tests from `SparkConnectSQLTestCase` (Sub-task, Resolved, Ruifeng Zheng)
        48. Split `pyspark.sql.tests.test_dataframe` (Sub-task, Resolved, Ruifeng Zheng)

          People

            Assignee: Unassigned
            Reporter: Ruifeng Zheng
            Votes: 0
            Watchers: 1

            Dates

              Created:
              Updated: