SPARK-47444

Validate alter table statements and check the numeric table stats


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.8, 3.0.0
    • Fix Version/s: None
    • Component/s: SQL

    Description

      SPARK-30262 avoided the NumberFormatException that Spark threw when the "totalSize", "numRows", or "rawDataSize" table properties are empty. However, these table stats can still be set (intentionally or by mistake) to an invalid or empty value through Spark SQL with an ALTER TABLE statement:

      scala> spark.sql("alter table t1p set tblproperties('numRows'='', 'STATS_GENERATED_VIA_STATS_TASK'='true')").show()
      

      Spark should validate Spark SQL ALTER TABLE statements and reject non-numeric values for the "totalSize", "numRows", and "rawDataSize" table properties.
      Although the NumberFormatException no longer occurs in Spark 3.x, these table stats are expected to be numeric, and non-numeric values may cause problems in other applications that read them.
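
      A minimal sketch of the kind of check requested here, assuming the validation is done on the property map before the ALTER TABLE is applied. `TableStatsValidator` and `invalidStats` are hypothetical names for illustration only and are not part of Spark. The sketch accepts any value that parses as an integer, which is an assumption about what counts as "numeric":

```scala
import scala.util.Try

// Hypothetical helper, NOT part of Spark: flags non-numeric values for the
// statistics properties named in this issue.
object TableStatsValidator {
  // The three statistics properties this issue asks Spark to validate.
  val NumericStatsKeys: Set[String] = Set("totalSize", "numRows", "rawDataSize")

  /** Returns one message per stats property whose value does not parse as an integer. */
  def invalidStats(props: Map[String, String]): Seq[String] =
    props.toSeq.sortBy(_._1).collect {
      case (key, value) if NumericStatsKeys.contains(key) && !isInteger(value) =>
        s"Table property '$key' must be numeric, got '$value'"
    }

  // Assumption: "numeric" means parseable by BigInt, so "" and "abc" are rejected.
  private def isInteger(value: String): Boolean =
    Try(BigInt(value)).isSuccess
}
```

      With the properties from the statement above, `TableStatsValidator.invalidStats(Map("numRows" -> "", "STATS_GENERATED_VIA_STATS_TASK" -> "true"))` returns a single message for 'numRows', while other properties are left unchecked.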

      Note: beeline/Hive validates alter table statements.

          People

            Assignee: Unassigned
            Reporter: Miklos Szurap (mszurap)
            Votes: 0
            Watchers: 2
