SPARK-31136 (sub-task of SPARK-31085 Amend Spark's Semantic Versioning Policy)

Revert SPARK-30098 Use default datasource as provider for CREATE TABLE syntax

Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Won't Do
    • Affects Version/s: 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      We need to consider the behavior change introduced by SPARK-30098.
      This is a placeholder to record the discussion and the final decision.

      `CREATE TABLE` syntax changes its behavior silently.
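
      One way to see which provider an unqualified `CREATE TABLE` actually chose is to inspect the table metadata. This is a sketch rather than part of the original report; it assumes a table `t` already exists and relies on the `Provider` row in the detailed table information printed by `DESCRIBE FORMATTED`.

      ```sql
      -- Look for the "Provider" row in the output: a Hive SerDe table reports hive,
      -- while a datasource table reports the configured default source (for example parquet).
      DESCRIBE FORMATTED t;
      ```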

      The following is one example of how this breaks existing user data pipelines.
      Apache Spark 2.4.5

      spark-sql> CREATE TABLE t(a STRING);
      
      spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;
      
      spark-sql> SELECT * FROM t LIMIT 1;
      # Apache Spark
      Time taken: 2.05 seconds, Fetched 1 row(s)
      
      spark-sql> CREATE TABLE t(a CHAR(3));
      
      spark-sql> INSERT INTO TABLE t SELECT 'a ';
      
      spark-sql> SELECT a, length(a) FROM t;
      a  	3
      

      Apache Spark 3.0.0-preview2

      spark-sql> CREATE TABLE t(a STRING);
      
      spark-sql> LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;
      Error in query: LOAD DATA is not supported for datasource tables: `default`.`t`;
      
      spark-sql> CREATE TABLE t(a CHAR(3));
      
      spark-sql> INSERT INTO TABLE t SELECT 'a ';
      
      spark-sql> SELECT a, length(a) FROM t;
      a 	2
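
      The difference above comes from which provider `CREATE TABLE` picks when none is given. A pipeline that depends on Hive table semantics can avoid relying on the default by naming the provider explicitly. The statements below are a minimal sketch under two assumptions: Hive support is enabled in the session, and the table name `t2` is purely illustrative. They have not been verified against 3.0.0-preview2.

      ```sql
      -- Assumption: the session was started with Hive support enabled.
      -- Naming the provider requests a Hive SerDe table regardless of the default source.
      CREATE TABLE t(a STRING) USING hive;
      LOAD DATA INPATH '/usr/local/spark/README.md' INTO TABLE t;

      -- Hive-style syntax with an explicit storage format is another way to request a Hive table.
      -- t2 is a hypothetical table name used only for this example.
      CREATE TABLE t2(a STRING) STORED AS TEXTFILE;
      ```

      Spelling out the provider keeps the statement's meaning stable across versions, at the cost of making every DDL statement slightly more verbose.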
      

            People

              Assignee: Unassigned
              Reporter: Dongjoon Hyun (dongjoon)