Details
Type: Improvement
Status: Resolved
Priority: Minor
Resolution: Incomplete
Affects Version/s: 2.2.0
Fix Version/s: None
Description
It would be great if saveAsTable() returned a DataFrame for the created table,
so that the result could be piped further, for example:
    mv_table_df = (
        sqlc.sql(''' SELECT ... FROM ''')
        .write.format("parquet").mode("overwrite")
        .saveAsTable('test.parquet_table')
        .createOrReplaceTempView('mv_table')
    )
...
As expected, the code above currently fails with:
AttributeError: 'NoneType' object has no attribute 'createOrReplaceTempView'
If this were implemented, we could skip an extra step such as:
    sqlc.sql('SELECT * FROM test.parquet_table').createOrReplaceTempView('mv_table')
We use this pattern very frequently.
A further improvement would be for createOrReplaceTempView to also return the DataFrame object, so that in a single pipeline of calls
we can
- create an external table
- create a reference to the newly created table, both as a view for Spark SQL and as a DataFrame variable in Spark.
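Until something like this exists, the two steps above can be approximated with small user-side helpers. The following is only a sketch of a possible workaround, not part of the Spark API: the names save_as_table_and_read and with_temp_view are hypothetical, and the sketch assumes that `spark` is a SparkSession (or SQLContext) exposing table() and that `df` is a DataFrame.

```python
def save_as_table_and_read(spark, df, table_name, fmt="parquet", mode="overwrite"):
    """Hypothetical helper: write `df` as a table, then return a DataFrame over it.

    saveAsTable() returns None, so the table is read back with spark.table().
    The returned DataFrame reads from the saved table, not from the original plan.
    """
    df.write.format(fmt).mode(mode).saveAsTable(table_name)
    return spark.table(table_name)


def with_temp_view(df, view_name):
    """Hypothetical helper: register `df` as a temp view and return `df` itself.

    createOrReplaceTempView() returns None, so the DataFrame is passed through
    explicitly to keep the pipeline going.
    """
    df.createOrReplaceTempView(view_name)
    return df


def build_mv_table(spark, source_df):
    """Usage sketch: create the table, the 'mv_table' view and the DataFrame in one go."""
    return with_temp_view(
        save_as_table_and_read(spark, source_df, "test.parquet_table"),
        "mv_table",
    )
```

With these helpers, mv_table_df = build_mv_table(spark, source_df) gives both the Spark SQL view and the Python variable, at the cost of wrapping the calls instead of chaining them.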