Spark / SPARK-21488

Make saveAsTable() and createOrReplaceTempView() return a DataFrame of the created table / created view

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.2.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      It would be great if saveAsTable() returned a DataFrame of the created table,
      so that the result could be piped further, for example:

      mv_table_df = (sqlc.sql('''
              SELECT ...
              FROM 
          ''')
              .write.format("parquet").mode("overwrite")
              .saveAsTable('test.parquet_table')
              .createOrReplaceTempView('mv_table')
          )
      

      As expected, the above code currently fails with:

      AttributeError: 'NoneType' object has no attribute 'createOrReplaceTempView'
      

      If this were implemented, we could skip a step like:

      sqlc.sql('SELECT * FROM test.parquet_table').createOrReplaceTempView('mv_table')
      

      We use this pattern very frequently.
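
      Until something like this exists, the pattern can be wrapped in a small helper. The following is only a minimal sketch, not part of Spark: `save_and_register` and its parameter names are hypothetical, `spark` is assumed to be a SparkSession (or the `sqlc` SQLContext used above, which also provides `.table()`), and `df` is assumed to be a DataFrame.

```python
def save_and_register(df, spark, table_name, view_name,
                      fmt="parquet", mode="overwrite"):
    """Write df as a table, register a temp view on it, and return a DataFrame.

    Hypothetical helper, not a Spark API. Both saveAsTable() and
    createOrReplaceTempView() return None, so each step is done explicitly.
    """
    # Persist the data; this call returns None.
    df.write.format(fmt).mode(mode).saveAsTable(table_name)
    # Read the saved table back so the result refers to the stored data
    # rather than to the original query plan.
    saved = spark.table(table_name)
    # Register the temp view; this call also returns None.
    saved.createOrReplaceTempView(view_name)
    return saved

# Usage sketch (query_df stands for the result of the sqlc.sql(...) call above):
# mv_table_df = save_and_register(query_df, sqlc, 'test.parquet_table', 'mv_table')
```

      Reading the table back with `.table()` mirrors the `SELECT * FROM test.parquet_table` step, so the returned DataFrame and the temp view both point at the saved table.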

      A further improvement would be for createOrReplaceTempView() to also return a DataFrame object, so that in one pipeline of calls
      we can

      • create an external table
      • create a reference to this newly created table, both for Spark SQL (as a temp view) and as a Spark variable (a DataFrame).
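
      The view-registration half of this can be approximated in newer releases with `DataFrame.transform`, which is available in PySpark 3.0 and later (not in the 2.2.0 release this issue was filed against). A minimal sketch under that assumption; `with_temp_view` is a hypothetical helper name, not a Spark API, and `spark` in the usage comment is assumed to be an existing SparkSession:

```python
def with_temp_view(view_name):
    """Build a function that registers a DataFrame as a temp view and returns it.

    Hypothetical helper, not part of Spark. The returned function can be
    passed to DataFrame.transform so the call chain keeps a DataFrame.
    """
    def _register(df):
        df.createOrReplaceTempView(view_name)  # returns None
        return df                              # hand the same DataFrame back
    return _register

# Usage sketch (assumes PySpark >= 3.0 and an existing table 'test.parquet_table'):
# mv_table_df = spark.table('test.parquet_table').transform(with_temp_view('mv_table'))
```

      This keeps the chain intact without changing the return type of createOrReplaceTempView() itself.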

          People

            Assignee: Unassigned
            Reporter: Ruslan Dautkhanov (Tagar)
            Votes: 1
            Watchers: 3