Spark / SPARK-29409

Spark DROP PARTITION always throws an exception


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: SQL
    • Environment: Spark 2.4.0 on YARN 2.7.3

      spark-sql in client mode

      Hive version in use: 2.1.1

      Built-in Hive version: 1.2.1

    Description

      The table is:

      CREATE TABLE `test_spark.test_drop_partition`(
       `platform` string,
       `product` string,
       `cnt` bigint)
      PARTITIONED BY (dt string)
      stored as orc;

      With Hive 2.1.1:

      spark-sql -e "alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08')"

      With the built-in Hive (1.2.1):

      spark-sql --conf spark.sql.hive.metastore.version=1.2.1 --conf spark.sql.hive.metastore.jars=builtin -e "alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08')"

      Both commands log the following exception:

      19/10/09 18:21:27 INFO metastore: Opened a connection to metastore, current connections: 1
      19/10/09 18:21:27 INFO metastore: Connected to metastore.
      19/10/09 18:21:27 WARN RetryingMetaStoreClient: MetaStoreClient lost connection. Attempting to reconnect.
      org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
       at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
       at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:178)
       at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:106)
       at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:70)
       at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.send_get_partitions_ps_with_auth(ThriftHiveMetastore.java:2433)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions_ps_with_auth(ThriftHiveMetastore.java:2420)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsWithAuthInfo(HiveMetaStoreClient.java:1199)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154)
       at com.sun.proxy.$Proxy30.listPartitionsWithAuthInfo(Unknown Source)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2265)
       at com.sun.proxy.$Proxy30.listPartitionsWithAuthInfo(Unknown Source)
       at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2333)
       at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:2359)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:560)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1$$anonfun$16.apply(HiveClientImpl.scala:555)
       at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
       at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
       at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
       at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
       at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply$mcV$sp(HiveClientImpl.scala:555)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:550)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$dropPartitions$1.apply(HiveClientImpl.scala:550)
       at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:275)
       at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:213)
       at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:212)
       at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:258)
       at org.apache.spark.sql.hive.client.HiveClientImpl.dropPartitions(HiveClientImpl.scala:550)
       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply$mcV$sp(HiveExternalCatalog.scala:972)
       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:970)
       at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$dropPartitions$1.apply(HiveExternalCatalog.scala:970)
       at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
       at org.apache.spark.sql.hive.HiveExternalCatalog.dropPartitions(HiveExternalCatalog.scala:970)
       at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropPartitions(ExternalCatalogWithListener.scala:203)
       at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropPartitions(SessionCatalog.scala:846)
       at org.apache.spark.sql.execution.command.AlterTableDropPartitionCommand.run(ddl.scala:546)
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
       at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
       at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
       at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:195)
       at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3365)
       at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
       at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
       at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
       at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3364)
       at org.apache.spark.sql.Dataset.<init>(Dataset.scala:195)
       at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:80)
       at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
       at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
       at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:62)
       at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:371)
       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
       at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:274)
       at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
       at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
       at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
       at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
      19/10/09 18:21:28 INFO metastore: Trying to connect to metastore with URI thrift://127.0.0.1:9083
      19/10/09 18:21:28 INFO metastore: Opened a connection to metastore, current connections: 2
      19/10/09 18:21:28 INFO metastore: Connected to metastore.
      Time taken: 3.715 seconds

      However, the following SQL runs normally:

      spark-sql -e "select * from test_spark.test_drop_partition where dt='2019-10-08' limit 3; alter table test_spark.test_drop_partition drop if exists partition(dt='2019-10-08');"
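
The two invocations above can be wrapped in a small script to make the comparison repeatable. This is a minimal sketch, not a fix: the function names and the DRY_RUN variable are hypothetical helpers introduced here, and the report only observes that the SELECT-then-DROP sequence ran normally. It does not establish why.

```shell
#!/usr/bin/env bash
# Sketch for reproducing the two cases in this report.
# drop_sql, select_then_drop_sql, run_spark_sql and DRY_RUN are
# hypothetical helpers, not part of Spark or Hive.

TABLE="test_spark.test_drop_partition"

# SQL that drops one dt partition (the case that logs the exception).
drop_sql() {
  printf "alter table %s drop if exists partition(dt='%s')" "$TABLE" "$1"
}

# SQL that first selects from the partition, then drops it
# (the case the report describes as running normally).
select_then_drop_sql() {
  printf "select * from %s where dt='%s' limit 3; %s;" \
    "$TABLE" "$1" "$(drop_sql "$1")"
}

# Print the spark-sql command by default; run it only when DRY_RUN=0.
run_spark_sql() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'spark-sql -e "%s"\n' "$1"
  else
    spark-sql -e "$1"
  fi
}
```

To run against a real cluster, set DRY_RUN=0 and call, for example, `run_spark_sql "$(drop_sql 2019-10-08)"`, then compare the log output with `run_spark_sql "$(select_then_drop_sql 2019-10-08)"`.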

       


            People

              Assignee: Unassigned
              Reporter: ant_nebula
              Votes: 0
              Watchers: 4
